E-Discovery Collections, also known as Electronic Evidence Discovery (EED) or Electronic Data Discovery (EDD), can include a review of all the data stored on employee desktop or laptop computers, company servers, camera cards, cell phones, smartphones, GPS devices, digital video recorders, digital answering systems, thumb drives, RAID arrays, and any other form of electronic media capable of storing data.
Types of Electronic Discovery Content
Employee Work Product – Computer Files are by far the most common arrangement for a forensic e-discovery collection. Files (also referred to as loose files or active files) are similar to their paper equivalent. They can be copied, moved, and even “shredded”. Work product could include sales reports, QA reports, product or service information, client lists, engineering designs and much more.
Employee Correspondence – Email has practically replaced letters and interoffice memos. A forensic e-discovery collection of correspondence is often a critical piece and can contain the “smoking gun”. What someone said, to whom, and when are some of the first questions asked in a legal matter. Since emails are a form of documented communication, they comprise highly sought-after data when it comes to legal matters. Emails themselves may be contained in databases, files, or unallocated space.
Customer Relations and Accounting Data – Customer lists, internal notes, and financial records are also a critical component in forensic e-discovery collection or computer forensic investigations. Properly collecting the live database files that store this information can be a challenge. Single entries in a database often require export to another format in order to be useful or even readable by humans. Most databases include this ability.
User Logs – Collecting user logs isn’t always as relevant in an e-discovery collection/review as it is in computer forensics analysis. However, they can be relevant and are worth mentioning. User logs contain entries about the activities performed on a computer and by different user accounts. Attorneys may want to know when emails were sent or received between accounts in case the emails were deleted. Log entries may require conversion into human-readable form before they can be processed.
Raw or Unallocated Data – Unless a forensic image of the source data has been requested, a forensically sound e-discovery collection will focus on “active” files. However, it is helpful to understand the difference between “unallocated” and “active” data. Raw or unallocated data is data that resides in segments of the storage media (hard drive, camera card, etc.) that are not being used by files. This data can contain all or part of files that were once referenced in the file allocation table but were subsequently deleted. Much of this data can even survive a reformatting of the disk itself. Since this data can come from any number of sources that had once been active on the drive, it can make or break a case where it is suspected that deletions may have occurred.
Tools for Forensic E-Discovery Collection
With the exception of unallocated space, tools such as One Click Collect Harvester from Pinpoint Labs can collect loose files, emails, and whole databases, with the added benefit of filtering by keywords, date ranges, domains, and email addresses, among other very useful filters.
Tools for collecting the unallocated space on a drive usually require an experienced forensic examiner in order to get useful interpretations of the data collected. In cases where this is necessary, it is recommended that a certified computer examiner be hired for the collection and analysis of the data.
PST Regeneration is used during electronic discovery processing or even during an ESI collection. A Personal Folder File (PST) is a container file created by Microsoft Outlook which stores email messages and other data (e.g., contacts, calendar entries, tasks, and to-do lists).
How it’s done
Regenerating PSTs refers to the identification, isolation and often deduplication of electronic mail (email) messages that pertain to a specific legal matter in civil litigation cases. The filtered email messages are copied to a new “regenerated” PST file. The resulting PST can be considerably smaller than the original and results in the following benefits:
1) Quicker attorney review
2) Electronic Discovery processing and hosting cost reduction
3) Significantly smaller ESI collection
Practical application
PST regeneration is commonly used when there are dozens of archive (backup) PST files that contain many duplicate messages. It is a common practice for companies to set up Microsoft Outlook or Exchange servers to create daily, weekly or monthly PST backups of employee email messages.
The result is potentially dozens of employee backup PST files which contain duplicate messages. Why? Each backup will contain many of the same messages as the last. Only new emails sent or received (that have not been deleted) since the last backup will be considered “unique” to each PST. Regenerating PSTs with only one copy of each email (deduplication) significantly reduces the number of messages and the size of the PST data to be processed or produced.
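The deduplication idea described above can be sketched in a few lines of Python. This is a minimal illustration and not how Harvester or any other product works: it assumes the messages have already been extracted from the PST backups into standard email objects (the Python standard library cannot read PST files directly), and the choice of fields used for the fingerprint is an assumption made for the example.

```python
import hashlib
from email import message_from_string
from email.message import Message


def message_fingerprint(msg: Message) -> str:
    """Hash the fields that identify a message (an assumed, simplified choice)."""
    body = msg.get_payload(decode=True) or b""  # None for multipart messages
    fields = [
        (msg.get("From") or "").strip().lower(),
        (msg.get("To") or "").strip().lower(),
        (msg.get("Date") or "").strip(),
        (msg.get("Subject") or "").strip(),
    ]
    digest = hashlib.sha256()
    for field in fields:
        digest.update(field.encode("utf-8"))
        digest.update(b"\x00")  # separator so adjacent fields cannot run together
    digest.update(body)
    return digest.hexdigest()


def deduplicate(backups):
    """Keep the first copy of each message seen across several backup sets."""
    seen = set()
    unique = []
    for backup in backups:
        for msg in backup:
            fingerprint = message_fingerprint(msg)
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(msg)
    return unique


# Hypothetical sample data: Tuesday's backup repeats Monday's message.
RAW_A = (
    "From: a@example.com\nTo: b@example.com\n"
    "Date: Mon, 05 Jan 2009 09:00:00 -0500\nSubject: Q4 report\n\nNumbers attached.\n"
)
RAW_B = (
    "From: b@example.com\nTo: a@example.com\n"
    "Date: Tue, 06 Jan 2009 10:00:00 -0500\nSubject: Re: Q4 report\n\nThanks.\n"
)
monday = [message_from_string(RAW_A)]
tuesday = [message_from_string(RAW_A), message_from_string(RAW_B)]
unique_messages = deduplicate([monday, tuesday])
```

Here three stored messages reduce to two unique ones. A real workflow would also need to decide how to treat attachments and would record every duplicate it drops.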
Maintaining defensibility
Significant cost reductions in electronic discovery processing and hosting are gained by deduping the emails in PST files and applying keyword, date range, and email/domain filters to them. However, it is critical to use an application that is designed to regenerate PSTs in a defensible manner and that maintains the chain of custody.
Software such as Harvester from Pinpoint Labs (designed by Certified Computer Examiners (CCEs)) can regenerate PST files at the point of collection or during in-house processing. Harvester also creates an extensive verification log (chain of custody) for all copied and duplicate messages.
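To make the filtering criteria above concrete, here is a small Python sketch that applies keyword, date range, and domain filters and records a decision for every message. The `EmailRecord` class, the function names, and the log layout are all hypothetical and exist only for illustration; they do not represent Harvester's log format or any product's API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EmailRecord:
    """Hypothetical, simplified stand-in for a message read from a PST."""
    message_id: str
    sender: str
    recipients: list
    sent: datetime
    subject: str
    body: str


def is_responsive(record, keywords, start, end, domains):
    """Return True only if the message passes all three filters."""
    text = f"{record.subject}\n{record.body}".lower()
    has_keyword = any(word.lower() in text for word in keywords)
    in_range = start <= record.sent <= end
    wanted = {d.lower() for d in domains}
    addresses = [record.sender, *record.recipients]
    in_domain = any(a.lower().rsplit("@", 1)[-1] in wanted for a in addresses)
    return has_keyword and in_range and in_domain


def filter_with_log(records, keywords, start, end, domains):
    """Split records into kept messages plus a log entry for every decision."""
    kept, log = [], []
    for record in records:
        keep = is_responsive(record, keywords, start, end, domains)
        log.append({"message_id": record.message_id, "kept": keep})
        if keep:
            kept.append(record)
    return kept, log
```

Logging the messages that were excluded, and not only the ones that were kept, is what lets someone later account for every message in the original set.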
What to remember
Creating deduped, targeted PSTs is common practice in the electronic discovery lifecycle because it saves clients a considerable amount of money as well as reducing attorney review time. PST regeneration may be performed onsite (during an ESI collection) or in-house to cull down responsive data.
ESI (Electronically Stored Information) is the general term for all of the data stored on hard drives, camera cards, cell phones, GPS devices, digital video recorders, digital answering systems, thumb drives, RAID arrays, and any other form of electronic media capable of storing data.
Types of Electronically Stored Information:
Files – Files are by far the most common arrangement for ESI data. Files (also referred to as loose files or active files) can be thought of as data containers similar to files in the real world. They can be copied, moved, and distributed freely on a variety of different media from DVDs to hard disk drives.
Emails – Emails are messages sent from one user to another. In their raw form, they are simply a stream of data that contains everything needed to get the message from one user to another user. Since emails are a form of documented communication, they comprise highly sought-after data when it comes to legal matters. Emails themselves may be contained in databases, files, or unallocated space.
Database Entries – Database entries are data stored in a database. This type of data is usually context-specific and may be information pertaining to financial records, personnel entries, or other data that is interrelated. Single entries in a database often require export to another format in order to be useful or even readable by humans. Most databases include this ability.
Log Entries – Log entries are lines in files or entries in databases that contain information about activity on a particular computer. The more commonly useful log entries pertain to users logging into and out of a computer, accessing specific internet sites, the sending or receiving of email or other messages and the moving, copying or accessing of files on the computer. Log entries may require conversion into human-readable form before they can be processed.
Raw or Unallocated Data – Raw or unallocated data is data that resides in segments of the storage media (hard drive, camera card, etc.) that are not being used by files. This data can contain all or part of files that were once referenced in the file allocation table but were subsequently deleted. It can also contain deleted internet history, old information from the computer’s RAM (Random Access Memory), or even old configuration data about the computer itself. Much of this data can even survive a reformatting of the disk itself. Since this data can come from any number of sources that had once been active on the drive, it can make or break a case where it is suspected that deletions may have occurred.
Tools for Collecting ESI
With the exception of unallocated space, tools such as One Click Collect Harvester from Pinpoint Labs can collect loose files, emails, and whole databases, with the added benefit of filtering by keywords, date ranges, domains, and email addresses, among other very useful filters.
Tools for collecting the unallocated space on a drive usually require an experienced forensic examiner in order to get useful interpretations of the data collected. In cases where this is necessary, it is recommended that a certified examiner be hired for the collection and analysis of the data.
Each day, corporate IT managers, computer forensic examiners, and litigation support professionals are tasked with performing ESI collections for relevant files which reside in file shares, on client systems, and other popular data sources. The content may include Microsoft Exchange mailboxes, departmental data, individual custodian files, internet logs, telephone logs, or other critical corporate content.
Over four years ago, Pinpoint Labs released SafeCopy version 2.0 (SafeCopy 2), which alleviated several common problems encountered when using alternative copy utilities to collect client files. Here are a few of those problems that the SafeCopy 2 upgrade addressed:
In September 2009, Pinpoint Labs released One Click Collect – Harvester (Portable/Server), which was a new product that included the proven SafeCopy 2 engine. The Pinpoint Harvester 2.0 ESI collection software includes:
>Great for Legal Holds
>Preserve Metadata and Time Stamps
>Filter by Extension and Date Range
>Select from multiple data sources
>Compatible with all electronic and litigation platforms
>100% File copy verification
>Extensive chain of custody report
>Process file lists
>Resume easily
>Supports path lengths greater than 255 characters
>Transfer licenses quickly to another location
>Create and deploy remote collections
>Keyword Filter MS Outlook PSTs
>Keyword Filter Loose Files
>Keyword Filter Attachments
>Keyword Filter Archives
>Dedupe and Filter Multiple PSTs
>Regenerate New PSTs
>Export Emails to 8 Different Message Formats
>Remove System Files Listed in NSRL (deNISTing)
>Filter by Header Signature
>Create Portable and Automated Collection Jobs
>Preconfigured Work Orders
>Can Be Used for In-House, Production-Level Culling (deNIST/dedupe)
>Scriptable Profiles and Collection Jobs
>Easily Save and Reuse Job Settings
Pinpoint Labs has a proven record of developing defensible, affordable ESI collection software. Many Fortune 500 companies, government agencies, and computer forensic professionals rely on SafeCopy 2 and One Click Collect – Harvester every day.
We will be attending LegalTech New York and I wanted to invite you to come by and visit our booth. We will be in booth #429 on the 1st floor of the exhibit hall. There are several updates to our software applications that we will be demonstrating, and if you attend, I would appreciate the opportunity to meet with you.
I’ve been participating in LegalTech shows for more than ten years, and it’s a great way to see how computer forensics has filtered into the litigation processes. File collections, electronic discovery, and review have been increasingly influenced by computer forensics due to the large amounts of electronically stored information relevant to litigation cases.
When I first attended the LegalTech shows in the mid-90s, the initial popularity of imaging (paper scanning) and managing paper-based documents electronically was evident. A new breed of software emerged to handle the demands, to create full-text searchable versions of the images, to endorse the documents (electronic Bates numbering), and to create load files so the information could be imported into litigation support databases and review tools.
Towards the end of the 90s, “electronic discovery” began to take shape. An increasing proportion of relevant documents were files and emails on computers versus paper stored in filing cabinets. A new wave of applications appeared that could convert the files directly into images (print to TIFF). Harvesting metadata of file contents and indices also emerged, along with creating searchable databases.
Now it is a decade later, and we can see that the majority of relevant documents are electronic and the means to preserve them have become widely recognized. The Federal Rules of Civil Procedure were amended to accommodate the new electronic world we live in. The requirements to properly preserve electronically stored information (ESI), establish timelines, examine metadata, and recover user activity, deleted files, and emails, along with several other critical tasks, have created the need for computer forensics and electronic discovery professionals to work together to conduct electronic-based litigation.
Wow! – The last 15 years have been packed with some drastic changes in the way we collect, filter, and review documents. This new era that requires litigation support and computer forensics professionals to work hand in hand is challenging and rewarding. Our professional goals include developing products, services, and educational materials that help guide legal departments, service bureaus, and computer forensics experts through this changing environment…
If you are in New York next week, please drop by our booth (#429) so we can visit. LegalTech New York runs from February 2-4 at the New York Hilton and Towers. By working together and sharing our knowledge and experiences, we will continue to improve the processes and support tools available.
It’s important to understand that deleted email is not recovered or indexed using common litigation support or electronic discovery software. These applications only process email that is still visible within the email software.
Some email recovery software can also fall short when restoring deleted email records. Why is that? Because it is designed to undelete email records that still have an entry in the mail store index. Unfortunately, many mail stores will remove those entries once the database is compressed. As a result, many people believe that email cannot be recovered once the mail store’s database has been compressed. However, this isn’t always the case.
Deleted email content may still be intact and recoverable. By using software tools designed to ‘carve’ email data, it is still possible to recover the original content. Using the following steps, email can often be recovered even after typical recovery tools fail.
1) Use WinHex, EnCase or other file recovery tools that can recover email fragments
2) Import recovered files (MBOX) through Aid4Mail into Paraben’s Email Examiner
3) Export email and attachments to msg, pst and other formats
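As a rough illustration of the import and export steps above, the sketch below uses Python's standard `mailbox` module to read a recovered MBOX file and write each message out as an individual .eml file. It is a stand-in for the commercial tools named in the steps, not a description of how they work; the function name and output file naming are assumptions, and exporting to msg or pst formats is outside what the standard library can do.

```python
import mailbox
from pathlib import Path


def export_mbox_to_eml(mbox_path, out_dir):
    """Write every message in an MBOX file to its own .eml file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # create=False: fail if the file is missing rather than create an empty one.
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, msg in enumerate(box):
            target = out / f"message_{index:05d}.eml"
            target.write_bytes(msg.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

A call such as `export_mbox_to_eml("recovered.mbox", "exported")` (both paths hypothetical) returns the list of files written. It is safest to run this against a working copy of the recovered file, never the original evidence.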
Using the same approach to recover email as deleted files can often provide better results than doing a recovery on the individual mail store. As mentioned above, when performing recovery on Mozilla Thunderbird mail stores and others, many programs only recover what is still listed in the index files. If these files are missing, corrupted, or no longer contain the email record, you can try Zmeil from Zero Assumption Recovery (http://www.z-a-recovery.com/zmeil-email-recovery.htm). Zmeil doesn’t rely on the mail store index; it parses the data files and is a great tool to use for additional verification of recovered email data. Zmeil also works well as an inexpensive standalone email recovery tool.
Email communication is often a critical piece of the electronic discovery puzzle. Deleted email doesn’t get fully processed with common electronic discovery software. If you believe you may miss critical evidence because a custodian deleted important emails, then a specialized recovery process should be performed by someone with the appropriate training and knowledge of the process.
Copying corporate data and using it at a competing company (intellectual property/corporate asset theft) is a common and serious concern for companies and their legal counsel. When employees leave companies, there are often questions about the security of the information they previously accessed. Will they use the contacts, forms, or product details as a competitive advantage in their new job?
I had previously written about how to use the file activity records located in the index.dat file to identify when files were accessed. This can help determine if files were copied from a corporate file server. I want to expand on a couple of additional artifacts that can be used and then provide an illustration. There are three primary artifacts, plus a hash-based scanning technique, that can be used to help determine if someone accessed and copied specific files using an external drive, CD/DVD, flash device, or other storage media.
1) USBStor Registry Entry – Microsoft Windows uses its registry to track information about the computer’s users, operating system, hardware, applications, security, and other relevant information. When USB devices are plugged into a computer, several key artifacts are captured including the make, model, serial number (if available), and when the device was plugged in.
2) Index.dat Access Record – Microsoft Windows uses the index.dat file to track website activity in Internet Explorer. It also contains when and from where files were accessed. We often have to recover deleted or purged activity using programs like NetAnalysis to do a thorough analysis. NetAnalysis can often recover hundreds of thousands of records that are no longer available in the index.dat files on the system.
3) Link File (.lnk shortcut) – Shortcuts can be created by a user and are commonly stored on the desktop. Microsoft Windows also automatically creates shortcuts for files that are accessed in .lnk files. These files store a wealth of information about the source document, including the path, date and time created, written, last accessed, size, volume serial, and several others. This information is encoded and requires special software to display it in a format that is useful.
4) “File Sniper” – Use a product like Harvester from Pinpoint Labs to create a hash list of the suspect files and scan all locations where the files could be in use. It isn’t uncommon for a computer forensic examiner to be asked if there is a way to create a list of files from a corporate network or an employee’s system and check if they are in use by a competitor.
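The hash list approach in item 4 can be sketched in Python as follows. This is an illustration of the general idea only; the function names are hypothetical and the use of SHA-256 is an assumption, since the text does not say which hash algorithm any particular product uses.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hash_list(source_dir):
    """Map the hash of every file under source_dir to that file's path."""
    return {sha256_of(p): p for p in Path(source_dir).rglob("*") if p.is_file()}


def find_matches(hash_list, scan_dir):
    """Return (original, found) pairs for files whose content is in the hash list."""
    hits = []
    for candidate in Path(scan_dir).rglob("*"):
        if candidate.is_file():
            digest = sha256_of(candidate)
            if digest in hash_list:
                hits.append((hash_list[digest], candidate))
    return hits
```

Because the comparison is on content, a renamed copy still matches, while a file that has been edited by even one byte does not. A hash match therefore finds exact copies only.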
By using the above artifacts, it is possible to determine that files located on a company server or client machine were copied or accessed after a specific date and time. Note that this doesn’t provide the contents of the file, and a thorough review would be necessary to make sure it is the same file. However, if the file name and other relevant metadata match, it does appear suspicious and may be enough to construct a solid argument that the employee copied or burned files, accessed the contents, or used the information. This may lead to criminal and civil charges around possibly benefiting a future employer or a new company that the employee decided to start.
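Part of what makes artifacts such as link files hard to read by eye is how their timestamps are stored. Windows link files store their created, accessed, and written times as 64-bit FILETIME values, which count 100-nanosecond intervals since January 1, 1601 (UTC). The short Python sketch below decodes such a value; the function names are hypothetical, and locating the timestamp inside a given artifact is a separate step not shown here.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals from this moment.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME integer to a timezone-aware UTC datetime."""
    # Ten 100-ns intervals make one microsecond; sub-microsecond detail is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def filetime_from_bytes(raw: bytes) -> datetime:
    """Decode the 8-byte little-endian form in which FILETIME is stored on disk."""
    if len(raw) != 8:
        raise ValueError("a FILETIME value is exactly 8 bytes")
    return filetime_to_datetime(int.from_bytes(raw, "little"))
```

The result is in UTC, so any comparison against local-time records needs an explicit time zone conversion.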
When examining or processing the files on a hard drive, it is extremely important to retain the original file contents and time stamps. Many people don’t realize that just connecting a hard drive to a PC will alter the contents of the hard drive. In order to preserve the original contents of the hard drive, it is important to implement a write blocking mechanism.
Law firms and service bureaus that process native files from hard drives should take the same care as computer forensic examiners. Today, CDs and DVDs will not be altered by common electronic discovery and litigation support applications. However, you should be aware that the process that burned the files to the discs most likely altered the original file system timestamps.
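To show what retaining file contents and time stamps means in practice, here is a minimal Python sketch of a copy that is checked afterwards. It is not a substitute for a write blocker or for purpose-built collection software; the function names and the report layout are assumptions for the example.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_copy(src, dst):
    """Copy src to dst, then confirm the content and modified time survived."""
    before = os.stat(src)
    source_hash = sha256_of(src)
    # Note: reading the source may itself update its last-accessed time
    # unless the source media is write-blocked.
    shutil.copy2(src, dst)  # copy2 also attempts to carry over file metadata
    after = os.stat(dst)
    copy_hash = sha256_of(dst)
    return {
        "source": str(src),
        "destination": str(dst),
        "sha256_source": source_hash,
        "sha256_copy": copy_hash,
        "hash_match": source_hash == copy_hash,
        # Allow for file systems that store times with coarse granularity.
        "mtime_preserved": abs(before.st_mtime - after.st_mtime) < 2,
    }
```

The returned dictionary can be written to a log so that each copied file has a recorded hash and a recorded result for both checks.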
There are several hardware devices that prevent the source media from being altered. There are also some recent software developments that are effective, more affordable, and provide faster throughput. If you need to purchase a write blocker, here are a few choices to review:
Hardware:
>Tableau
>WiebeTech
>ICS Drive Lock
Software:
>Safe Block XP
As you shop for hardware write blockers, you will find that you need to purchase multiple devices for different types of hard drives, flash, or media cards and can easily spend over $1,000. So we were pleased with our recent test of Safe Block XP from Forensic Soft Inc. Safe Block is affordable write blocking software ($219) that runs on Windows XP and allows users to block multiple media types. Additionally, Safe Block XP can provide a significant improvement in copy, deNISTing, or imaging speeds because it works at the speed of the native interface. Hardware write blockers can slow down the process and are often limited to a USB or FireWire connection.
Carving files, which can be performed manually or through an automated process, permits the recovery of a portion of a corrupted or deleted file. During a computer investigation, examiners may encounter deleted files that cannot be fully recovered. However, enough of the file may still be intact and worth restoring.
For example, if a deleted Microsoft Word document called “sales report.doc” contains the keywords “Mr. Smith, Bonus,” but the file cannot be viewed, then it is most likely damaged. Even though not enough of the file is intact to be opened, it may still be possible to carve the useful content. Using specialized software, such as EnCase, FTK, WinHex, ProDiscover and several others, it may be possible to locate, restore and even repair damaged or overwritten files.
If you or a partnering service bureau need to be able to process or review your client’s files from an imaged hard drive, you may be in for a surprise. The results of an imaged hard drive are often stored in a forensic image format or what is referred to as an “evidence file” container. Common evidence file formats include EnCase, DD (RAW), SMART, AFF and SafeBack, just to name a few.
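One common automated form of carving searches raw data for a file type's known header and footer bytes and extracts whatever lies between them. The Python sketch below does this for JPEG images, which begin with the bytes FF D8 FF and end with FF D9. JPEG is used here only because its signatures are simple; it is a swapped-in illustration, not the Word document case above and not how the tools named above work internally. The function name and size limit are assumptions.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus the next marker byte
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw: bytes, max_size: int = 10 * 1024 * 1024):
    """Return (offset, data) pairs for header-to-footer spans found in raw."""
    carved = []
    position = 0
    while True:
        start = raw.find(JPEG_HEADER, position)
        if start == -1:
            break
        # Only look for a footer within max_size bytes of the header.
        end = raw.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            position = start + 1  # no footer in range; skip this header
            continue
        stop = end + len(JPEG_FOOTER)
        carved.append((start, raw[start:stop]))
        position = stop
    return carved
```

This simple version stops at the first footer it finds, so an image with an embedded thumbnail may come out truncated, and a fragmented file will not be reassembled. Those cases are where an experienced examiner and dedicated tools are needed.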
These forensic image formats are designed to allow access to the files from computer forensic software. Most electronic discovery and litigation support applications are unable to access the file contents of an imaged drive that is stored as a forensic image. If you need to access the copied files, you have three options.
It is important to talk to the company or individual performing the collection to ensure that the collected files can be accessed by those performing the electronic discovery processing and review.