The Dark Side of Data Retention Policies

When people weren't buying their story, they elaborated:

Prior to the eruption of the IRS controversy last spring, the IRS had a policy of backing up the data on its email server (which runs Microsoft Outlook) every day. It kept a backup of the records for six months on digital tape, according to a letter sent from the IRS to Sens. Ron Wyden (D-Ore.) and Orrin Hatch (R-Utah). After six months, the IRS would reuse those tapes for newer backups. So when Congressional committees began requesting emails from the agency, its records only went back to late 2012.

This is vaguely plausible, in the sense that, yes, sometimes backups get overwritten on a tape backup system, and there is often a rotating media policy. But 6 months is a laughably short data retention period for a government agency. Normally data retention policies cover years (3-5 years is typical of a large organization). Using tape backups is another anachronistic element; while there have been many storage technology improvements since then, archiving data on writable CDs or DVDs is cheaper, stores more data, and is frankly a lot simpler than using tape drives.
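To make the rotation arithmetic concrete, here is a minimal Python sketch of how far back a rotating tape set can reach. It assumes "six months" means roughly 183 days and uses a hypothetical request date of May 15, 2013; neither figure comes from the IRS letter, and the function name is made up for illustration.

```python
from datetime import date, timedelta

# Assumption: "six months" is treated as 183 days.
SIX_MONTHS_DAYS = 183

def oldest_recoverable(request_date, retention_days=SIX_MONTHS_DAYS):
    """Return the date of the oldest daily backup still on tape.

    With daily backups and tapes reused after `retention_days`,
    anything older than this date has already been overwritten.
    """
    return request_date - timedelta(days=retention_days)

# Hypothetical date on which a records request arrives.
request = date(2013, 5, 15)
cutoff = oldest_recoverable(request)
print(cutoff)  # 2012-11-13

# The same request under a 5-year policy (ignoring leap days for simplicity).
print(oldest_recoverable(request, retention_days=5 * 365))
```

Under those assumptions the cutoff lands in late 2012, which is consistent with the quoted explanation. It also shows how little of the 2009-2011 period a six-month window could ever reach.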

I know this because I have history with tape drives. I'm not fond of them. They have been obsolete for years. I've done hot-backups to spinning disk arrays since, well, roughly the same time Lerner was getting wild and crazy with the rubber hoses on the Tea Party. The storage size of available disks has grown faster than my data has over that time period.

The IRS also had two other policies that complicated things. The first was a limit on how big its employees' email inboxes could be. At the IRS, employees could keep 500 megabytes of data on the email server. If the mailbox got too big, email would need to be deleted or moved to a local folder on the user's computer.

This part I can sort of believe, too, but there's a problem with it. I know quite well how annoying it can be for email admins to deal with users who send and receive huge file attachments on a regular basis. While today's computers can handle that without breaking a sweat, computers in 2009-2011 were only just getting close to the point where worrying about how much data users kept in their on-server files could be put behind us.

Except for one thing.

We're talking about a government agency. Neither a 6-month "reuse cycle" for tape backups nor an "archive your personal email to your personal workstation" policy is acceptable for a government agency which has to produce records on demand, particularly if the personal copy is expected to be the permanent, long-term archive of the data.
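The combined effect of the two policies can be sketched in a few lines of Python. This is a simplified model, not a description of the IRS's actual systems: it assumes a message leaves the server on the day it is moved to a local folder, that the last tape containing it is reused 183 days later, and that the workstation copy lasts until the disk fails. All dates and names below are hypothetical.

```python
from datetime import date, timedelta

def surviving_copies(moved_local_on, as_of, tape_retention_days=183,
                     disk_failed_on=None):
    """List where a copy of a message still exists on the date `as_of`.

    Assumption: the last backup containing the message was taken on the
    day it was moved off the server, and that tape is reused after
    `tape_retention_days`.
    """
    copies = []
    if as_of < moved_local_on + timedelta(days=tape_retention_days):
        copies.append("backup tape")
    if disk_failed_on is None or as_of < disk_failed_on:
        copies.append("workstation")
    return copies

moved = date(2010, 6, 1)    # hypothetical: message archived to a local folder
crash = date(2011, 6, 15)   # hypothetical: workstation disk fails

print(surviving_copies(moved, date(2010, 9, 1), disk_failed_on=crash))
print(surviving_copies(moved, date(2011, 1, 1), disk_failed_on=crash))
print(surviving_copies(moved, date(2013, 5, 15), disk_failed_on=crash))
```

In this model the workstation becomes the sole copy once the tape window closes, and a single disk failure after that point leaves nothing to produce.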

An insider's view:

In the case of the prime contract and record retention, “The IRS IT projects were fully funded and never lacked for resources. To state ‘Backup tapes were reused after some short period’ is a complete joke. The IRS had thousands and thousands of tapes and ‘Virtual Tape Libraries’ (VTL or non-tape backups based on hard drive storage technologies). There was never a reason to reuse tapes.”

That matches my experience in government, though I wasn't running the email systems. Insane amounts of data were stored and archived in order to enable auditing later on (and obviously, investigations, if necessary).

What are the possible motives for having a 6-month data retention policy supplemented by emails stored on the personal computers of individual employees?
Quoting from the EFF on Data Retention policies:

The best defense against a search or a subpoena is to minimize the amount of information that it can reach. Every organization should have a clear policy on how long to keep particular types of information, for three key reasons:
It’s a pain and an expense to keep everything.
It’s a pain and an expense to have to produce everything in response to subpoenas.
It’s a real pain if any of it is used against you in court — just ask Bill Gates. His internal emails about crushing Netscape were not very helpful at Microsoft’s antitrust trial.

Think about it — how far back does your email archive go? Do you really need to keep every email? Imagine you got a subpoena tomorrow — what will you wish you’d destroyed?

I mentioned data retention policies earlier. They have a good side and a bad side. The good side is that they retain email and similar documents when needed, either for internal investigation purposes or responding to discovery in lawsuits. The point of having a data retention policy is to do two things.

1) The policy protects you from accusations of spoiling evidence. You can point to your policy and say "We retain data according to this policy. If this policy is not followed, blame the individuals who failed to follow the policy rather than the corporate entity."

2) The policy protects you by deleting data on a specified schedule after a specified time. In other words, the data retention policy exists in order to allow data to NOT be retained past a certain time. If you want to sue, you had better do so within the data retention period, and notify the target of your discovery requests promptly if you want to be able to use any data covered by the policy. If you wait too long, the company can point to the policy and say, "Don't blame us, we had a policy and followed it. Any destruction of evidence was entirely inadvertent and caused by the policy, not improper motives."
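As a rough illustration of the second point, here is a minimal Python sketch of a scheduled purge driven by a written policy. The record types, retention periods, and the `legal_hold` flag are all hypothetical; the hold stands in for the situation where a discovery request has arrived and the data must no longer be deleted on schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods, in days, per record type.
RETENTION_DAYS = {"email": 183, "case_file": 3 * 365}

@dataclass
class Record:
    record_id: str
    kind: str
    created: date
    legal_hold: bool = False  # set once a discovery request or investigation is known

def due_for_deletion(records, today):
    """Return records whose retention period has expired and that are not on hold."""
    due = []
    for r in records:
        expires = r.created + timedelta(days=RETENTION_DAYS[r.kind])
        if today >= expires and not r.legal_hold:
            due.append(r)
    return due

records = [
    Record("a", "email", date(2012, 1, 10)),
    Record("b", "email", date(2012, 1, 10), legal_hold=True),
    Record("c", "email", date(2013, 3, 1)),
]
expired = due_for_deletion(records, today=date(2013, 5, 15))
print([r.record_id for r in expired])  # ['a']
```

Record "a" is past its window and gets purged, "b" is the same age but survives because of the hold, and "c" is still inside the window. The shorter the number in the policy table, the less there is left to find by the time anyone asks.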

I suspect the IRS is relying on the second point here. Their data retention policy is laughably short and probably designed deliberately to destroy evidence promptly and automatically. A six-month retention cycle might make sense in, say, 2000... for a small company. For a large government agency that needs to support internal investigations and audits, in 2009? I'm still not buying it.

Quoting from the EFF's page on data retention:

Your organization should review all of the types of documents, computer files, communications records, and other information that it collects and then develop a policy defining whether and when different types of data should be destroyed. For example, you may choose to destroy case files six months after cases are closed, or destroy Internet logs showing who visited your website immediately, or delete emails after one week. This is called a "document retention policy," and it’s your best defense against a subpoena — they can’t get it if you don’t have it. And the only way to make sure you don’t have it is to establish a policy that everyone follows. Set a clear written policy for the length of time documents are kept (both electronic and paper documents). Having a written policy and following it will help you if you are accused of destroying documents to hide evidence.

As a government agency that is constantly involved in litigation and expects the people it regulates to retain their tax records for years, one would expect the IRS to retain data for years or decades. They claim to retain data for 6 months.

Is it technically possible that the IRS has a retention policy so short (6 months) and so stupid (relies on personal computers to retain email data past the backup expiration)? I suppose so. But if that is the case, it represents incompetence and negligence on a quite-possibly-criminal scale. It represents the deliberate use of a data retention policy that is absurdly, inappropriately short, and more importantly, puts the potential evidence of criminal wrongdoing in the control of the potential criminal, who will often possess the sole copy of that data and can choose to destroy it.

The IRS claims they have already increased the data retention time on their tape backups. This addresses, at least somewhat, the policy issue. We should definitely pursue the intent of policy in another matter, however: we should hold Lerner fully and completely responsible for the data she failed to retain. If it was deliberately deleted to avoid accountability for her actions, that is criminal obstruction of justice.

UPDATE: 6 more individuals involved in targeting the Tea Party had "computer crashes" that destroyed their email records.

Quoting from the EFF again:

Do not destroy evidence. You should never destroy anything after it has been subpoenaed or if you have reason to believe you are under investigation and it is about to be subpoenaed — destruction of evidence and obstruction of justice are serious crimes that carry steep fines and possible jail time, even if you didn’t do the original crime. Nor should you selectively destroy documents — for example, destroying some intake files or emails but not others — unless it’s part of your policy. Otherwise, it may look like you were trying to hide evidence, and again might make you vulnerable to criminal charges. Just stick to your policy.

We have a clear case here of "destroying some emails but not others". They say it was a series of computer crashes, but if so, it was a series of remarkably selective computer crashes that the IRS is only now, after a year's time, telling Congress about.

When did the IRS put this retention policy in place?

I'd like to know if it was something done recently, or a technological anachronism that survived longer than it should have. I want to know when it was put in place, who made the decision to institute it, and what the policy was beforehand. Assuming, of course, that it wasn't simply made up on the spot from thin air.

This entry was published Tue Jun 17 13:58:58 CDT 2014 by TriggerFinger and last updated 2015-03-05 04:11:38.0.

