Data Life Cycle Management

Electronic information management is now a primary business and legal concern. Sarbanes-Oxley, information security, expanded electronic discovery demands, and new penalties for spoliation of evidence have made "document retention" an issue of urgency for general counsel.

The term "document retention," however, is a profound misnomer. "Document" refers to physical media, like paper. And "retention" implies there is either an "original," existing as a unique artifact, or at least physical copies which can be stored in a known place.

But business no longer deals in documents. It deals in "data." And data is given life by ever-flowing electronic records - shared by users on multi-layered networks. Such records are accessed, edited, and transmitted with applications that defy understanding. Electronic data is then saved, not in boxes or files, but in a myriad of "media" that challenge accountability. Data resides in our servers, laptop computers, handheld devices, old back-up tapes, CD-ROM disks, thumb drives, and, now, even cellular phones and pagers. As a result of such dynamics, a new complexity has emerged.

The sudden evolution of such an "information ecosystem" poses fundamental questions: Can we still control information, so as to comply with law and business strategies? How do we know who is changing which records? What is authentic? How is a record "private?"

How Do We Impose Internal Control

Therefore, how do we impose "internal control" over our share of the information ecosystem? Clearly, a new approach to information must evolve. Business should remember that:

  • Electronic information exhibits complex behavior that was absent in the age of documents. When there is an audit, or a dispute, the evidence comprises snapshots of our electronic ecosystem.
  • Data management is not just the domain of the IS department. It is the combined concern of the CEO, General Counsel, CFO, outside counsel and auditors with expertise in the field. Strategy requires teamwork.
  • There are severe penalties for destruction of information, even if inadvertent.
  • With appropriate data management, electronic discovery in litigation is facilitated, and reduced in cost.
  • With appropriate data management, proof of the authenticity of one's own records is possible, whereas before, it was problematic.
  • There are rapidly emerging privacy rules exemplified by HIPAA, California's SB-1386, and directives from Europe that deal not with retention of data, but with access and destruction. Such rules co-exist with retention, thus necessitating a "life cycle" for data.

Missteps in handling information can cause business failures. These are now both more likely and more severe than before, given the new, more dynamic behavior of information. Such issues are legal concerns, because the obligations to control information constitute legal obligations, and the risks of lack of control are legal risks.

Finally, because the enterprise exists only insofar as it can control its information, there is no choice but to adapt.

The Problem

We are awash in electronic information. It is smeared across our technology systems. Managing this morass of information is one of the most serious problems facing business today. Companies, large and small, often don't appreciate the ramifications until it is too late and they are already at risk of serious legal liability. At that point, trying to fix the problem can be dearly expensive, and is often unsuccessful.

During the seemingly ancient days before computers and networks became the predominant business paradigm, most data was kept in the form of laboriously typed paper documents that were neatly filed and, at the end of their life cycle, either thrown in the trash or filed away in document storage facilities. We have personally searched for old documents in places such as an abandoned Colorado mine shaft (hazardous waste suits required), a forgotten storage building in the middle of the Puerto Rican jungle (mosquito netting required), and the garages of long retired engineers (great patience required).

Because paper document preparation and storage were both burdensome and expensive, there was a natural limitation on the volume of documents produced and subsequently preserved. But computers and electronic data storage have changed all of that. Electronic document generation and storage is simple and cheap. The digital equivalent of a warehouse full of paper documents can now be stored in a small server no bigger than the average computer. An encyclopedia can exist on a thin piece of plastic. As a result, rather than cull old files before storage, it is much quicker, easier, and cheaper just to save everything.

The Need for "Data Life Cycle Management"

So, this is a good thing, right? Not really. Data Life Cycle Management requires thoughtful procedures governing what data to keep; what data to discard (we never say "destroy"); and critically, how to control what's left.

First, there is a growing collection of federal and state laws, as well as international rules, that require companies to preserve specific data for prescribed periods of time. For example, Sarbanes-Oxley, HIPAA, and the Internal Revenue Code establish data retention and reporting requirements. The European Union Directives establish stringent policies governing the collection, storage, and use of personal information that will apply to any foray into international commerce. The FTC is following suit. Failure to abide by these regulations can subject companies to fines, penalties, and the forfeiture of the privilege of doing business in the respective jurisdictions. And the federal courts are now imposing strict penalties for spoliation of evidence.

So the answer is simple: just save everything. Wrong. If a business saves all of its electronic data, whether required to or not, litigation opening that data up to review can result in the company drowning in its digital waste. Electronic discovery can be debilitatingly expensive and may turn up difficult-to-explain documents or emails that the company had no obligation to save prior to the litigation. And once litigation is foreseeable, disposing of potentially relevant data can result in court-ordered sanctions or even a default judgment.

We have seen many examples where the failure to properly save or properly dispose of electronic data either saved the ship, or sank it. Arthur Andersen and Enron went down, at least in part, because they illegally attempted to destroy documents after litigation was on the horizon. Likewise, we can't forget the stock broker e-mails in which brokers trashed the very stocks they were trying to sell to the public.

The Beginnings of A New Policy

With thought and planning, businesses can create electronic data life cycle policies that will eliminate potential liability associated with either saving too much or too little of the company's data; or more important, of losing control of what is there. An electronic data life cycle policy can be built around several basic principles.

Identify The Objectives of the Enterprise

First, it is critical that each company self-consciously devise "Objectives" in maintaining its electronic records. This function has become more important now than ever, as it is easier to treat all records the same--just save everything, and let everything go unprotected on the "network." But depending on the business, different types of records have vastly different degrees of importance. Universities treat their academic records as sacrosanct, whereas payments for lawn care might be much lower on the scale. Companies hired to keep track of individuals crossing national borders might have strict record keeping priorities and objectives for certain databases, but not others, such as their own payroll. Health and insurance records are treated one way, and payment for staples and copy services another. Information categories are not the same. Importance and function vary by orders of magnitude in any enterprise.

Accordingly, the first job is to prioritize. What is critical for the business? Devise information life cycle management with such critical records in mind.


Next, data should be discarded unless there is a good business and/or legal reason to retain it. Implementation of this principle requires that a company, once again, take a hard look both at the types of data it collects and the regulatory constraints relating to that data. Data should also be preserved if it is potentially relevant to any ongoing or foreseeable litigation, a duty now known as the Zubulake standard. The overall goal is to comply with law and to achieve business objectives, but not to save data that is not required by law or for business purposes. And given the fact that digital files can be copied ad infinitum onto different media, unless one controls access to data during the time it is stored by a business, one loses control over the ability to discard information. Maintaining such control is no easy task. Achieving it puts one on the cutting edge of business process.
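The "discard unless there is a reason to retain" rule can be written down as a small decision function. What follows is only a sketch in Python: the category names, the retention periods, and the `legal_hold` flag are hypothetical placeholders, not periods prescribed by any statute, and a real schedule would come from counsel.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention schedule: category -> retention period in days.
# Categories and periods are illustrative placeholders only.
RETENTION_DAYS = {
    "tax": 7 * 365,
    "personnel": 3 * 365,
    "routine_email": 30,
}


@dataclass
class Record:
    record_id: str
    category: str
    created: date
    legal_hold: bool = False  # set when litigation is ongoing or foreseeable


def should_retain(record: Record, today: date) -> bool:
    """Return True if the record must be kept, False if it may be discarded."""
    # A litigation hold overrides every schedule.
    if record.legal_hold:
        return True
    period = RETENTION_DAYS.get(record.category)
    # No business or legal reason on file: the default is to discard.
    if period is None:
        return False
    # Otherwise keep the record only until its retention period has run.
    return today < record.created + timedelta(days=period)
```

The design point is the order of the checks: the litigation hold is tested first, so no schedule can ever release data that is potentially relevant to a dispute.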

Training and Simplicity of Procedure

Of course, for everyone, a data life cycle policy must be simple and easy to implement. As with all things corporate, there is a strong tendency for policy initiatives to become increasingly intricate to the point of dysfunction (only interpretable by those with graduate degrees in operations research). Once the policy becomes too complex, it is virtually guaranteed that employees will simply ignore it.

For example, during the beginning of the Internet boom, the National Security Agency created complex internal rules for the transfer of sensitive data from one NSA employee to another. Rather than comply with the rules governing NSA's secured systems, employees discovered it was easier to simply send data to each other, around the NSA systems and through the Internet as email, thus defeating the policy. Therefore, any policy should strive for simplicity by establishing a limited number of broad subject matter categories and functions. While simplicity might result in some over-inclusion of data retained, it nevertheless increases the chances employees will actually comply with the policy.

Adequacy of Infrastructure

There must be adequate hardware and software to accomplish the task, once objectives and policy are identified. It is, indeed, often astounding how "out-of-scale" a company's infrastructure is to accomplish appropriate data management. Here, as elsewhere, teamwork between higher management and IS workers is critical. Hardware is seldom the problem. The problem is the software and human systems infrastructure relating to information security; access control; authenticity; retrievability and auditability.

Information Security

Perhaps no practice can enhance Data Life Cycle Management better than appropriate "information security" procedures. A primer on information security is beyond the scope of this article. But how else can one ensure that shared records are not improperly accessed or edited? How else can a company keep its valuable information from being stolen, for example, and sold to spammers? Given that such a theft incident may now trigger notice obligations, and perhaps liability, information security reigns supreme. It is fundamental to protecting the assets of the enterprise. It is fundamental to Data Life Cycle Management.

Authenticity of One's Own Records

Give thought to how one might prove the authenticity of one's own records if they are ever challenged in court, an administrative proceeding, or an audit. This implicates the need for proactive procedures. Authenticity, which has been stretched to the breaking point by the new information paradigm, should no longer be taken for granted.


Retrievability

One of the major problems with electronic record keeping is that when a request for information does come--for example in discovery in litigation--it can be a six to seven figure chore just finding the data that formerly could easily be retrieved from a set of file cabinets. This is one of the hallmarks of the metamorphosis from a document/record/file keeping culture to a culture of data multiplying on a shared network, edited by many and stored on scalable media.

Accordingly, far more advance planning is now required to ensure future retrievability of data. Law firms, strange to say, are in the vanguard of businesses in this respect. Their handling of huge numbers of different types of electronic files for many different customers has led to databases that facilitate filing by subject matter, with automatic indexing, and easy retrievability. This "subject matter centered" database control of data has yet to make it into the mainstream of businesses' data storage.

Businesses, therefore, must attack a mounting data retrievability issue. Just complying with one discovery request, when one has data stored hodgepodge on various media in various types of systems, could pay for an entirely new infrastructure. Don't be penny wise and pound foolish.

Distribution Controls

The interactive ease of networks, including that network of networks, the Internet, means that once access to specific data is acquired, the data can be transmitted to countless destinations around the globe in a matter of seconds. Once the digital genie is out of the electronic bottle, no amount of wishing can contain it. Every day there are new examples of this phenomenon. It can involve the public release of valuable intellectual property, such as the case of the Norwegian man who released highly confidential DVD source code onto the Internet. Or it can involve the unconsented dissemination of personal and private matters, such as the apparently unauthorized release of Paris Hilton's "homemade movie." (Remember, there is no such thing as bad publicity).

At least when it comes to business data, unauthorized access can largely be eliminated by employing network security controls. A different problem, however, arises regarding the distribution of data by persons who have legitimate access to the data. Whether the situation involves complex project files shared by a team of engineers or a simple email communication, uncontrolled electronic replication can be a disaster. One solution is to employ one of the available software solutions that encrypt the data and allow the sender to specify the degree of republication rights granted to the recipient. Sophisticated companies are beginning to utilize these types of solutions as part of their overall data management strategy.


Auditability

The idea behind the PCAOB's new Auditing Standard No. 2 is "internal control" over information. Public companies, obviously, will need to pass an audit of their financial statements based on Auditing Standard No. 2. Their management of information will need to pass the various types of tests auditors will devise, in the future, to gauge financial data as represented in electronic records, from the transaction level on up to the Income Statement and Balance Sheet. Unless the Data Life Cycle Management system can pass an audit, a company is put in an unfortunate situation indeed. Accordingly, a sound Data Life Cycle Management Plan is at the same time a Sarbanes-Oxley compliance program. Along these lines, all companies should seriously consider the use of "hashing," "digital signatures," and logging of network events to provide a framework of "testability" for the information flowing in their ecosystem at any point in time.
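To make "hashing" and event logging concrete, here is a minimal sketch in Python using only the standard library. It fingerprints a record with SHA-256 and keeps a log in which each entry carries the hash of the entry before it, so that a later alteration of any entry can be detected. The function names, the log fields, and the `GENESIS` value are assumptions made for illustration. A true digital signature additionally requires a private key and is not shown here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first log entry


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a record's bytes as hex."""
    return hashlib.sha256(data).hexdigest()


def _entry_hash(body: dict) -> str:
    # Serialize with sorted keys so the same entry always hashes the same way.
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def append_event(log: list, user: str, action: str, record: bytes) -> dict:
    """Append an event that is chained to the previous entry's hash."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    body = {
        "user": user,
        "action": action,
        "record_sha256": fingerprint(record),
        "prev_hash": prev,
    }
    entry = dict(body, entry_hash=_entry_hash(body))
    log.append(entry)
    return entry


def verify_log(log: list) -> bool:
    """Recompute every hash; return False if any entry was altered or reordered."""
    prev = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or _entry_hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```

An auditor's test then reduces to calling `verify_log` on the log and comparing `fingerprint` of the stored record against the digest logged when the record was created or edited.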


Data retention practices must be consistent. Inconsistent document retention actions will create a taint of intentional spoliation and wrongdoing. It's hard to explain why you discarded data pursuant to your 3-year-old policy for the first time just three days before being served with that antitrust complaint. Therefore, if you have a data retention policy, make sure it is implemented consistently. If there are dates or milestones for data review and disposal, they should be adhered to.


And enforcement must be simple and consistent. The policy should use both automated systems to dispose of unnecessary data and procedures to motivate employees to appropriately deal with the rest of the data that cannot be picked up through automated systems. So, for example, unnecessary email accumulation can be limited either by strictly controlling the size of employee mailboxes, thus forcing employees to delete old emails in order to receive new ones, or by automated systems that dispose of old emails after a set period of time (e.g., 30 days).
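Both email controls just described, the mailbox size limit and the age-based sweep, are simple enough to sketch. The Python below is illustrative only: the `Email` fields, the `on_hold` flag, and the function names are hypothetical, and a real mail system would expose its own administrative interface for this.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Email:
    message_id: str
    received: datetime
    size_bytes: int
    on_hold: bool = False  # hypothetical flag: subject to a litigation hold


def accepts_new_mail(mailbox, incoming_size, quota_bytes):
    """Quota approach: refuse delivery once the mailbox would exceed its limit."""
    used = sum(msg.size_bytes for msg in mailbox)
    return used + incoming_size <= quota_bytes


def sweep(mailbox, now, max_age_days=30):
    """Age approach: split the mailbox into (kept, disposed) lists."""
    cutoff = now - timedelta(days=max_age_days)
    kept, disposed = [], []
    for msg in mailbox:
        # Held messages are never disposed of, whatever their age.
        if msg.on_hold or msg.received >= cutoff:
            kept.append(msg)
        else:
            disposed.append(msg)
    return kept, disposed
```

Running the sweep on a fixed schedule, rather than on demand, is what supplies the consistency the policy depends on.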

Backup tapes, for example, can present a glitch in a data retention policy. Backup tapes are intended primarily for one purpose: the emergency restoration of a computer system following a crash. Unfortunately, many businesses make the mistake of saving multiple sets of backup tapes as data archives. This action, alone, can create a nightmare of enormous litigation discovery costs if it is ever necessary to search those tapes. Therefore, a company should limit itself to no more than two sets of backup tapes, which are consistently recycled at particular intervals to capture the existing computer system. This procedure will make it unlikely that the backup tapes will ever be successfully demanded as a source of old and otherwise disposed of data.
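A two-set rotation of this kind can be expressed in a few lines. The sketch below, in Python, assumes a fixed recycling interval counted from an arbitrary start date. The set labels, the seven-day default, and the `epoch` date are hypothetical choices for illustration.

```python
from datetime import date

TAPE_SETS = ("A", "B")  # hypothetical labels; no more than two sets are kept


def tape_set_for(backup_date: date, interval_days: int = 7,
                 epoch: date = date(2000, 1, 3)) -> str:
    """Return the tape set to overwrite in the cycle containing backup_date."""
    # Count whole recycling intervals since the (arbitrary) start date.
    cycle = (backup_date - epoch).days // interval_days
    # Alternate between the sets so each is overwritten every second cycle.
    return TAPE_SETS[cycle % len(TAPE_SETS)]
```

Because each set is overwritten every second cycle, the tapes never hold anything older than about two intervals, which is the property that keeps them from becoming an archive.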


Information management is a dynamic concept that has changed, and will continue to change, in co-evolution with the gigantic "morph" of information from artifact to ecosystem. Therefore, establishing data life cycle management policies is not a one-time process. The advent of electronic data storage and digital communications has provided business, consumers, and the public with untold benefits, including access to vast amounts of information and incredible speed in analysis and distribution. Implementing and maintaining a data life cycle management system is a small, but necessary, price to pay for continuing to be a player in the marketplace.