Document retention whether you want it or not

Wired on Friday: Microsoft prides itself on selling you software that lets you quickly locate and organise your e-mail. Yet, two years ago, Microsoft employees were instructed by a senior executive, Jim Allchin, to "not be foolish, do not archive your mail".

A prominent Silicon Valley company - not Google - has a business model that focuses on collecting and archiving web pages from around the internet: yet it has a policy of deleting employees' e-mails after a few weeks.

The schizophrenic nature of these tech companies, which create incredible tools for storing and generating data yet now methodically destroy their own digital records, comes from bitter experience.

Companies have suffered under discovery, the legal process in which the private internal communications of their executives are summoned and pored over by opposing attorneys.

Microsoft first learnt the pain of saving too much when discovery uncovered a delicate query from Bill Gates, asking "How much do we need to pay you to screw Netscape?". Microsoft's hyper-competitive internal culture looked far from innocent when ancient e-mails were uncovered and quoted in court.

But Netscape, Microsoft's ancient competitor in the browser wars, had its own discovery woes. Unlike Microsoft, it had a strict document retention policy in place before its battle with the Redmond giant, under which almost every mail was shredded within 90 days. Even so, Microsoft's lawyers discovered and pounced upon an employee's private archive of a mailing list called "Really Bad Attitude", consisting entirely of Netscape workers freely venting about their work conditions.

So what should you do? The standard legal suggestion, which Microsoft, Netscape and others have - against their geekish instincts - adopted, is to delete all messages other than those they are legally bound to preserve.

But that is easier said than done. Data oozes everywhere, and with every computer logging so promiscuously, attorneys will often dig far further than lax destruction policies reach. Allchin's missive to delete mail was itself preserved.

And the closer to watertight a document retention policy gets, the worse the discovery can become. A deposition in a class suit against Boeing revealed that the company, although it had a retention policy, still had 13,000 e-mail backup tapes stored in a Washington DC warehouse. Decoding, restoring and picking over the tapes took thousands of hours of employee time, and the contents were so damning as to contribute to the $92.5 million (€76.5 million) for which the company settled.

The problem is this: for most computer applications, the cost of storing data that may be useful one day is so low, and the price of deleting the wrong thing so high, that their creators err on the side of logging everything.

E-mail clients like Outlook save every mail by default, even when it is impossible for anyone to rationally pick through the resulting archives.

Your web browser idly stores the address of every single page you visit in its history file for anything up to a month, on the assumption that such a surfeit of data might one day be useful. Back-up programs back up everything, for fear that some vital bit of data might be lost.

That you will want to refer to this history is unlikely: the chances are that the data is only used occasionally to suggest the ending of a website address, or to turn those hyperlinks that you have clicked on purple.

Unless, of course, you are a jealous spouse, an investigating attorney or a nosy company tech guy. Retaining data is cheap: until it becomes very expensive indeed.

But merely having a data retention policy can cause its own problems: the American Civil Liberties Union, whose investigations often revolve around an expectation of meticulous record-keeping practices of government, was embroiled in scandal when its archivist resigned. Her complaint was that despite a careful retention policy, key officials had "personal shredders" in their offices to selectively destroy documents.

And the recent overturning of a criminal conviction against Arthur Andersen by the US Supreme Court revolved around how and when, precisely, it was all right for the firm to destroy archives. In the end, the Supreme Court decided that Andersen's decision to enforce a previous retention policy more strictly was acceptable, even though lower courts had decided that this was incriminating behaviour.

So, delete mail or not? For most of us, a lawsuit against us is barely imaginable, and with the threat so far off we have neither the time nor the organisation to purge our files successfully. We act in exactly the opposite way to what lawyers - and perhaps our best interests - would recommend: we delete sporadically and let our applications collect data where they may. Our computers file away everything for a determined investigator, but nothing we can casually use ourselves.

No matter how many times their company or their profits are affected, deleting data that might one day be useful, no matter how unnecessary it seems now, is against every instinct a technologist treasures. So they - and we - log away, hoping contradictorily that this data will stay safe and stay out of the wrong hands.