An Introduction to Electronic Discovery

With over 90% of new information being stored electronically, and billions of emails being sent daily, key evidence is likely to be stored not on paper, but electronically. Twenty years ago, anyone wishing to correspond with a friend across the country would write a letter. If that letter was suspected to contain information relevant to a lawsuit, a physical search could be conducted for it: one might look in filing cabinets, desks, recycling bins, etc. Now, such correspondence would likely be sent by email. While an email may exist in hardcopy form if it is printed out, most emails are stored electronically, and a search for an email will involve a search of both the sender’s and the recipient’s computers. Moreover, an electronic version of an email, or of any other electronic document or file, contains information beyond that which appears in a hardcopy print-out. Like fingerprints on a handwritten letter, an email carries hidden metadata that provides a wealth of information that is not immediately visible.

Society’s increased reliance on computers poses both advantages and challenges for the legal community, as information that used to be inaccessible is now available. ‘Deleted’ documents may be recovered; GPS devices may pinpoint someone’s exact location. Since this information may prove invaluable in the context of a lawsuit, issues have arisen with regard to how best to access the information, and what duties arise for its preservation and production. Properly conducted electronic discovery may require the aid of computer forensic experts and, depending on the scope of the investigation, may become quite costly. According to Socha Consulting, approximately $2.8 billion was spent on electronic discovery services in the United States in 2007, with that number expected to grow to almost $4.7 billion by 2010.¹

Paper versus Electronically Stored Information (“ESI”) – Is There A Difference?

There are clear and distinct advantages to obtaining the electronic versions of paper documents in discovery requests for production. Although it is more convenient to review a paper document or business record, such a review lacks some of the ‘hidden’ information contained in the electronic version such as the date the document was created, the identity of the author and subsequent editors, the distribution route and the history of editorial changes.

Authenticated computer files are considered ‘best-evidence’ since there are no issues with photocopy versus original document. In addition, ESI can be found in varied and multiple locations and formats, making it difficult for all copies of a given document to be intentionally destroyed. By far the most voluminous type of evidence is email; emails typically exist only in electronic form and are generally ‘unsanitized’.

On the negative side, ESI can be physically harder to locate and capture than paper documents, which may often be limited to only one or two locations. ESI can also be easily and unknowingly destroyed by simply turning on a computer, by saving new files, and by the rotational reuse of back-up tapes, all in the normal course of business. 

 
What is Electronic Evidence and Where is it Found?

Types of Electronic Devices

Today, corporations and their employees can purchase a wide variety of electronic devices, all of which contain potential electronic evidence. In addition to laptop and desktop computers, ESI can be found in access control devices (smart cards, dongles), digital answering machines, digital cameras, PDAs/BlackBerrys, cell phones, portable hard drives, memory cards and thumb drives. The potential evidence that can be found in each of these electronic devices obviously depends on the specific features of that device. Some sources of electronic evidence are obvious: the text of a Word document file that has been saved onto a personal computer is easy to find and recover, and one would expect that anyone with access to this computer may view such a file. Less immediately obvious is the metadata associated with that document.

Metadata

Metadata is information stored about a particular file that is not immediately viewable upon opening the file; in other words, it is hidden data. It is often described as "data about the data." For instance, an email stores the IP address from which the email was sent, as well as information about anyone who was ‘blind copied’ on the email. A Word document stores information about who created the document, who last modified or accessed the document, as well as the dates when the document was created, modified, and accessed. Information about the changes made to the content of a document is also stored as metadata. Most word processing programs have “undo” and “redo” functions, as do most photo editing and other computer programs; these functions rely on the program keeping track of changes, and some of that change history may be saved with the file. Metadata can contain a wealth of information, but it is extremely sensitive, and can be inadvertently modified simply by opening a file. It is important to recognize that metadata does not appear on the printed page. Since metadata could contain information that is valuable in litigation, it is imperative that, where this is the case, little or no reliance be placed on the printed page that is presented as evidence by the opposing party. It is also important to note that different applications collect and store metadata differently; even versions of the same program can treat metadata differently.
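As a rough illustration of the kind of header metadata an email carries, the following sketch uses Python's standard email module to read fields that a print-out would normally omit. The message text, addresses and IP address are invented for this example, and the function name header_metadata is hypothetical; it is not a forensic procedure.

```python
from email import message_from_string

# Hypothetical raw message, invented purely for illustration.
RAW_MESSAGE = """\
Received: from mail.example.com (mail.example.com [192.0.2.10])
\tby mx.example.org; Mon, 05 Mar 2007 09:15:02 -0500
From: alice@example.com
To: bob@example.org
Subject: Quarterly figures
Date: Mon, 05 Mar 2007 09:15:00 -0500
Message-ID: <1234@example.com>

Please see the attached figures.
"""


def header_metadata(raw):
    """Return a few header fields that a printed copy typically omits."""
    msg = message_from_string(raw)
    return {
        "received": msg.get_all("Received", []),  # one entry per relay hop
        "message_id": msg["Message-ID"],
        "date": msg["Date"],
    }


metadata = header_metadata(RAW_MESSAGE)
print(metadata["message_id"])
for hop in metadata["received"]:
    print(hop)
```

In this sketch the Received header is where the sending server's IP address appears; a printed copy of the same message would usually show only the sender, recipient, subject, date and body.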

Deleted Files

Deleted files are an example of electronic evidence that one may not expect to be recoverable. After all, doesn’t deleting a file mean getting rid of it entirely? Not exactly. When you recycle a paper document, the possibility of it being recovered exists while it sits in your recycling bin, when it is being transported to a recycling plant, and even after arrival, while it waits to be turned into pulp. The same is true of an electronic document. Most people are aware that files placed into their computer’s ‘recycle bin’ are easily recoverable by opening the recycle bin and selecting ‘recover file’. Less obvious, perhaps, is that the file may still be recoverable after the ‘empty recycle bin’ function is used. The information stored in the deleted file is not destroyed at this point – rather, it is essentially sitting at the ‘recycling plant’, waiting to be re-used. A file is not entirely ‘deleted’ until it is over-written by another file. Documents which, to the average user, appear to have been erased may in fact be recovered by a computer forensics expert or an unerase/undelete software utility.
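The idea that a deleted file survives until its space is over-written can be shown with a deliberately simplified toy model. The ToyDisk class below is hypothetical and does not reflect how any real file system is laid out; it only assumes that deletion removes a directory entry while leaving the stored data in place.

```python
class ToyDisk:
    """Toy model: a directory maps file names to blocks that hold the data."""

    def __init__(self, block_count):
        self.blocks = [None] * block_count     # raw storage
        self.directory = {}                    # file name -> block index
        self.free = list(range(block_count))   # blocks available for writing

    def write(self, name, data):
        index = self.free.pop(0)       # take the first free block
        self.blocks[index] = data      # over-writes whatever was stored there
        self.directory[name] = index

    def delete(self, name):
        # Only the directory entry is removed; the block contents remain.
        index = self.directory.pop(name)
        self.free.append(index)

    def carve(self):
        """Return data left in blocks that no directory entry refers to."""
        live = set(self.directory.values())
        return [data for i, data in enumerate(self.blocks)
                if i not in live and data is not None]


disk = ToyDisk(block_count=2)
disk.write("memo.doc", "secret memo")
disk.delete("memo.doc")
recovered_after_delete = disk.carve()      # data is still on the toy disk

disk.write("a.doc", "new file A")          # lands in the other free block
disk.write("b.doc", "new file B")          # re-uses the deleted file's block
recovered_after_overwrite = disk.carve()   # nothing left to recover
```

Once the second new file re-uses the block that held the memo, the toy disk no longer contains anything to recover, which mirrors the point that a file is only truly gone when it has been over-written.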

Another location where ‘deleted’ data may be found is on backup tapes. Backup tapes are file archives; copies of all files on an office server are made on a regular basis in order to safeguard against disasters and provide for data recovery. While backup tapes are frequently re-used, with older archives being overwritten by newer ones, companies sometimes store old backup tapes for years. While the files on backup tapes are not stored in a readily usable form, they may be restored at any time by a computer forensics expert, providing access to literally millions of files. Any files that were active when the backup tape was created would be preserved, regardless of whether the files were later modified or deleted.

Files that may be recovered through backup tapes and other recovery methods are not limited to Word documents, spreadsheets and other such files. Files that seem transitory, such as emails and even instant messages, may be stored on the company’s server and copied onto backup tapes. This includes messages sent through the office mail server from a handheld BlackBerry. Even if the BlackBerry user is accessing their email far from their office building, messages are still processed and stored on the company server. Using instant messaging systems carries the same risks. Messages sent through programs such as MSN, ICQ and AOL Instant Messenger are archived on the user’s hard drive if the user has selected the ‘save message history’ option, available with most instant messaging software. If not, the messages may still be logged if the company uses a private instant messenger program, or if it has instant messaging security software installed. With such systems, messages can be archived and saved on the central server. As many employment agreements provide that an employer may access any file on a company-owned computer, employers may be able to review this correspondence whenever they wish.

Files stored on personal computers, servers, and backup tapes are but one type of electronic evidence. There is a multitude of different devices on which to store computer files, including zip drives, USB keys, CDs, and memory cards. Additionally, electronic information is constantly being collected about one’s personal habits and movements. Electronic door keys may monitor when one enters a room or a building. Fax machines may store copies of incoming and outgoing faxes. Using a cell phone creates a record of any text messages sent or received, as well as a record of incoming and outgoing calls. This record may also contain the location of the user when the call was placed. Using credit or debit cards creates a record of one’s geographical location as well as spending habits.

 
What is eDiscovery?

The average personal computer contains thousands of active files. When one adds deleted files and files obtained from backup tapes to this, the number of files can quickly jump into the millions. Obviously, it would be impossible to review this number of documents manually. While the industry-standard eDiscovery process map (EDRM, or Electronic Discovery Reference Model) is relatively complex, eDiscovery is essentially the collection, review, and production of electronic data files. A critical aspect of the eDiscovery process involves the use of specific technologies to pare down the huge number of potential sources of relevant information, reducing them to a smaller number of potentially relevant documents. This is done partially by hiding duplicate files and filtering out files which are clearly not relevant, such as system files. These two steps alone can significantly reduce the amount of data to deal with, often by over 70%.

The first step of the eDiscovery process is collection. Before data can be collected, one must be aware of where relevant data may be stored. Will it be enough to examine the active files on one specific computer? Will email be relevant? Should backup tapes be collected? Depending on the facts of the case, relevant data may be stored in a multitude of different places, including office computers, laptops, CDs, PDAs and even iPods. Once the location of the data has been identified, it can be collected and then reviewed.

The second step of the eDiscovery process is review. Simply producing all data found during the collection phase would be incredibly inefficient and costly. When a hard drive is imaged, every single file is captured, most of which would not be relevant in litigation. Files with “.exe” extensions are executable files, which run computer programs; word processors, web browsers, and games are all started from executable files. Such files generally do not contain data relevant to a legal proceeding, as they are responsible only for running a program and contain no data unique to any particular user. Such files, therefore, may usually be removed from the data set.
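The filtering step described above can be sketched in a few lines. This is a minimal illustration only: the extension list is an assumption (the text names only “.exe” and system files), the file names are invented, and commercial tools typically identify program files by more reliable means than the extension alone.

```python
from pathlib import PurePath

# Assumed set of extensions treated as program or system files.
# Only ".exe" comes from the text; the others are illustrative additions.
EXCLUDED_EXTENSIONS = {".exe", ".dll", ".sys"}


def filter_user_files(paths):
    """Keep only the paths whose extension is not in the excluded set."""
    return [
        path for path in paths
        if PurePath(path).suffix.lower() not in EXCLUDED_EXTENSIONS
    ]


# Hypothetical file names standing in for a collected data set.
collected = ["report.doc", "WINWORD.EXE", "budget.xls", "kernel32.dll"]
remaining = filter_user_files(collected)
print(remaining)
```

The comparison is made on the lower-cased extension so that “WINWORD.EXE” and “winword.exe” are treated alike.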

Another method for reducing the number of files collected is de-duplication. When data is collected from multiple sources, including backup tapes and hard drives from multiple users, there will be a large number of repeat documents. For example, if a file was attached to an email and sent to six different users, that file may appear a total of seven times in the data set. If none of the users made any changes to the file, then including the six additional copies in the data set is redundant; they should be marked as duplicates and hidden.
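One common way to detect exact duplicates is to compare a cryptographic hash of each file's contents. The sketch below assumes SHA-256 and in-memory file contents; the function names and file names are hypothetical, and it is not a description of how any particular eDiscovery product works.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(data).hexdigest()


def mark_duplicates(files):
    """Split (name, contents) pairs into first-seen files and duplicates."""
    seen = set()
    unique, duplicates = [], []
    for name, data in files:
        digest = content_hash(data)
        if digest in seen:
            duplicates.append(name)   # identical contents already seen
        else:
            seen.add(digest)
            unique.append(name)
    return unique, duplicates


# The sender's copy, six unchanged recipient copies, and one edited copy.
files = [("sender/plan.doc", b"original plan")]
files += [(f"user{i}/plan.doc", b"original plan") for i in range(1, 7)]
files.append(("user3/plan_v2.doc", b"revised plan"))

unique, duplicates = mark_duplicates(files)
print(len(unique), "unique,", len(duplicates), "duplicates")
```

Because the hash depends only on the contents, the six unchanged copies are flagged regardless of where they were stored, while the edited copy is kept as a distinct document.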

Once clearly irrelevant files, such as executable files and duplicate files, have been removed from the data set, the remaining files can be reviewed to determine which are relevant to the litigation. Files may be searched with simple ‘keyword’ searches, or with the aid of more advanced concept searches. While a search for the word ‘theft’ in a keyword search would highlight only the instances where the word ‘theft’ appeared, a concept search would also return results where ‘steal’, ‘shoplift’, ‘rob’, ‘nick’, etc. appeared. Files can be reviewed in their native format, or as PDF or TIFF files. Files reviewed in their native format have the benefit of being seen in their original form, in the context of the way they were created and meant to be viewed. A Microsoft PowerPoint file would be reviewed in Microsoft PowerPoint, and therefore, all special effects would be available to the reviewer. PDF and TIFF files are, in effect, images of the original file; they display only the visible information contained in a document, so, for instance, hidden fields in a spreadsheet file would not be captured. With the help of optical character recognition (OCR) technology, these files may be searched for relevant text.
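The difference between a keyword search and a concept search can be illustrated with a small sketch. It is a heavy simplification: the concept is represented by a hand-written list of related words taken from the example above, whereas real concept-search tools derive related terms by other means. The document names and texts are invented.

```python
import re

# Hypothetical concept map built from the example terms in the text.
CONCEPTS = {"theft": {"theft", "steal", "shoplift", "rob", "nick"}}


def tokenize(text):
    """Return the set of lower-cased words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(documents, keyword):
    """Return names of documents containing the exact keyword."""
    return [name for name, text in documents.items()
            if keyword in tokenize(text)]


def concept_search(documents, concept):
    """Return names of documents containing any term tied to the concept."""
    terms = CONCEPTS.get(concept, {concept})
    return [name for name, text in documents.items()
            if terms & tokenize(text)]


documents = {
    "memo1.txt": "The audit uncovered a theft from the warehouse.",
    "memo2.txt": "He planned to steal the inventory list.",
    "memo3.txt": "Lunch is at noon on Friday.",
}

keyword_hits = keyword_search(documents, "theft")
concept_hits = concept_search(documents, "theft")
print(keyword_hits, concept_hits)
```

Here the keyword search finds only the first memo, while the concept search also finds the second. The sketch matches whole words only, so a variant such as ‘stole’ would be missed unless it were added to the list.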

The final step in the eDiscovery process is production. Files that have been deemed to be relevant to the litigation must be provided to the lawyer(s) working on the file. Generally, this is done by creating a PDF or TIFF of the file, and uploading it to a litigation support system, such as CT Summation, CaseMap or Concordance. Opposing counsel must be provided with paper copies of the files, or with a CD containing electronic versions of all relevant files, which may be in TIFF, PDF, or native file format.

Summary

Electronic evidence is fast becoming the critical tool-of-the-trade in litigation – the winning edge. For each lawsuit awaiting those corporations that are unprepared to deal with electronic evidence, there exists an opportunity for those that do prepare themselves for such an eventuality.

 

¹ George J. Socha, Jr., "2008 Socha-Gelbmann 6th Annual Electronic Discovery Survey", online: Socha Consulting www.sochaconsulting.com

This article is excerpted from: Electronic Discovery in Canada: Best Practices and Guidelines, by Oleh Hrycko, CCH Canadian Limited, Toronto, 2007. For more information call H&A eDiscovery @ 416-233-5577. To purchase this book, visit www.cch.ca/eDiscovery.

Copyright 2009 by H&A eDiscovery