LegalTech New York 2013

It’s hard to believe that LegalTech New York is upon us again. Pinpoint Labs will have representatives in New York next week, 28th – 31st, and we would like to invite you to attend one of our sessions at the Sheraton Hotel if you will be in town.

Here is the information for the sessions:

During our sessions we are going to be discussing and demonstrating:

1) SharePoint E-Discovery and Forensic collections

2) ESI Self-Collections

3) New Pinpoint Labs Multi-Threaded Viper collection engine

If none of the sessions fits your schedule and you would like to meet, please let me know and we will do our best to find an alternative time. Please travel safely; we look forward to seeing some of you very soon!


Microsoft SharePoint Forensic and E-Discovery Collections


Microsoft SharePoint forensic and E-Discovery collections have become a growing concern for legal IT professionals and their clients. An extensive amount of potentially relevant discoverable content stored in SharePoint sites can be a minefield of technical challenges. A quick Google search can point to some information about how to copy the document libraries in a SharePoint site.  However, what about the associated metadata? Linking document versions? Verification logs? This does not include the other content a SharePoint site can store like calendars, contacts, tasks, blogs, wikis, and many other “lists” that can also contain discoverable information.

Potential SharePoint Sources

There are several obstacles that will be encountered during a typical SharePoint Forensic or E-Discovery collection:


It becomes apparent very quickly that just linking and copying the document libraries, especially without audit reports, is incomplete and could easily miss discovery that the attorneys need to do their job. If you have been involved in a SharePoint site collection that uses the option to link to document libraries and use a copy utility to collect the data, you realize it is very slow. This is due to the administrative overhead when mapping a drive letter to a SharePoint document library. Not only does it take a lot longer to collect documents from libraries using this method, the SharePoint metadata is not captured during the copy process.  It is much quicker to connect to the native SharePoint web services when “requesting” a copy of a document that resides in a SharePoint site.

Some litigation support service providers have built custom utilities to link metadata with documents. However, this still requires several additional steps including exporting from the MS SQL Server databases and linking the information to documents. After all these steps you still leave behind the “lists” which many custodians use daily.


There are multiple considerations when bridging the gap on how SharePoint data is structured and stored, and how to transfer the content into common E-Discovery and review applications. For example, many items in SharePoint site lists are displayed as part of a web page in SharePoint and the data is stored in SharePoint’s SQL Server database.

To extract and provide an easily viewable version offline, you need a tool that can take several dissimilar sources and “flatten” the content out while linking to the files. This allows you to easily import the content into popular E-Discovery tools. It isn’t enough to just back up or export SharePoint data, because you have to convert it into a format that vendors and legal counsel can use. Delimited text files in various formats (e.g., Concordance DAT) are commonly used to transfer and link data between dissimilar systems.
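As a rough illustration of what “flattening” can look like, the sketch below writes items of different types into one delimited load file. It is not how any particular product works. The delimiter values are the ones commonly seen in Concordance-style DAT files and are an assumption here, as are the field names and the sample items, which are hypothetical.

```python
import io

# Delimiters commonly seen in Concordance-style DAT load files (assumed defaults;
# confirm against the target review platform before relying on them).
FIELD_SEP = "\x14"    # ASCII 20, separates fields
QUOTE = "\xfe"        # ASCII 254, wraps each field value
NEWLINE_SUB = "\xae"  # ASCII 174, stands in for line breaks inside a value


def to_dat(records, columns):
    """Flatten a list of dicts into DAT text: a header row, then one row per item."""
    def cell(value):
        text = "" if value is None else str(value)
        text = text.replace("\r\n", "\n").replace("\n", NEWLINE_SUB)
        return f"{QUOTE}{text}{QUOTE}"

    out = io.StringIO()
    out.write(FIELD_SEP.join(cell(c) for c in columns) + "\r\n")
    for rec in records:
        # Missing fields become empty cells so every row has the same columns.
        out.write(FIELD_SEP.join(cell(rec.get(c)) for c in columns) + "\r\n")
    return out.getvalue()


# Hypothetical items: one document-library file and one calendar list entry.
items = [
    {"ItemType": "Document", "Title": "Q3 report", "Author": "jsmith",
     "NativePath": "files/0001.docx"},
    {"ItemType": "Calendar", "Title": "Vendor call", "Author": "mlee",
     "Notes": "Line one\nLine two"},
]
columns = ["ItemType", "Title", "Author", "Notes", "NativePath"]
dat_text = to_dat(items, columns)
```

The point of the shared column list is that a document and a calendar entry end up as rows in the same table, with a path column linking back to any native file. When saving `dat_text` to disk, the text encoding should match what the receiving application expects.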


At this time, there are only a few products and services offered that address a complete SharePoint site collection focused on defensibility and E-Discovery. Of those that are available, most are priced for larger collections with a large investment, or require using a specific vendor for your E-Discovery processing.

The most reasonably priced tool we have seen is Pinpoint Labs SharePoint Collector (SPC) for Forensic or E-Discovery collections.  When using SPC, you can still use your current E-Discovery processing vendor. You can also purchase affordable software licenses for one time use, unlimited single users, or enterprise-wide use.


Important features in the Pinpoint Labs SharePoint Collector (SPC) tool include:

  • Fully portable (no local installation required)
  • Multi-threaded processing for faster collection
  • Collect documents, lists, calendars, and contacts
  • Capture announcements, attachments, wiki, blogs
  • Extract SharePoint file metadata
  • Extensive chain of custody
  • Collect user data and display file lists
  • Retain folder structures
  • Resume incomplete jobs and reprocess errors
  • Create Concordance DAT file for easy import into E-Discovery processing and review applications

You can obtain more information or request an evaluation license for Pinpoint Labs SPC by visiting the Pinpoint Labs website. You’ll find a short demonstration video below.

Microsoft SharePoint sites will continue to be a frequent and important E-Discovery source. This article provides a basic understanding of the obstacles commonly encountered. In most cases, it isn’t enough to simply map and copy the SharePoint document libraries, especially when there are more affordable, flexible and defensible alternatives available.


Mac Target Disk Mode

Booting an Apple Macintosh in target disk mode allows computer forensic examiners to copy relevant files from the internal drive on a Mac computer. Removing the hard drive from a Mac instead can be time consuming and can damage the system when not performed properly. Target disk mode avoids this by letting forensic examiners access the internal Mac drive as an external hard drive from another Mac or a Windows PC.

Target disk mode requires the host computer (the system accessing the Mac) to connect via a FireWire port (or Thunderbolt with newer Macs) to the target Mac. If there is not a FireWire port on the Windows host computer, you will need to purchase an adapter for the system. Many laptops have PCMCIA or ExpressCard/34 slots that allow insertion of a FireWire card. Make sure the cable being used fits the FireWire connector on the Mac, since the different FireWire standards use differently shaped connectors.


When using a Windows PC to access the target Mac hard drive, there are a few things to prepare:

  1. Make sure you have a FireWire port and the appropriate cable. You can purchase FireWire ExpressCard/34 adapters for laptops that include an ExpressCard slot.
  2. Install software on the host computer that can recognize a Mac file system (HFS, HFS+, etc.). MacDrive from Mediafour is a well-known and useful tool for accessing Mac drives in target mode or Mac formatted hard drives you receive.
  3. SafeCopy 3.0 and Harvester from Pinpoint Labs are popular applications that can copy and even keyword filter the files on a Mac drive once it is connected and mounted on the host computer.

Thunderbolt FireWire Boot Image


Once your host Windows computer is set up and ready to access the target Mac, follow these steps:

  1. Make sure the target Mac computer is turned off.
  2. Connect the FireWire cable from the Windows host to the target Mac.
  3. Turn on the target Mac while holding down the “T” key until you see the FireWire icon. The hard drive of the target computer should become available on the host computer, and you will then be able to access the files from the Mac hard drive.
  4. Use a defensible copy utility or hard drive imaging application to collect the relevant contents.
  5. When the copy is completed, you may turn off the target Mac computer and disconnect the FireWire cable.


Newer Apple Macintosh computers with Thunderbolt ports (faster than FireWire) will also display the Thunderbolt icon when going into target mode. If the target Mac only displays the FireWire icon as shown above, then you will need to use the FireWire port to access the Mac internal drive from the host computer.


MacBook Thunderbolt Port

Thunderbolt ports for Windows PCs are available. However, at this time, Pinpoint Labs has not yet tested access to a Mac in target mode using a Windows PC via a Thunderbolt connection. We will provide a follow-up note at a later date once we have had a chance to physically determine its effectiveness during a computer forensic collection.


Apple Macintosh computers continue to increase in popularity. Corporate IT managers, litigation support professionals, and computer forensic examiners are often tasked with preserving data from Mac hard drives. Booting a Mac in target disk mode is one way to gain access to relevant files without disassembling the system, which in most circumstances would void the warranty and/or damage the system. SafeCopy 3.0 and Harvester from Pinpoint Labs are great utilities for fast and safe copying of files from Macs and Windows PCs.


SafeCopy 3.0: Fast Multi-Threaded Copy Tool from Pinpoint Labs

What is multi-threaded copying and why is it important?

When a large number of computer files need to be copied from their current location to a network or external hard drive, the process could take many hours or even days. For example, during a civil litigation case, employee (“custodian”) files and emails often have to be preserved, which requires large amounts of data to be securely copied (backed up) to another location.

SafeCopy 3.0

In such a case, it is common for the total of the file sizes of all custodians involved to become very large (hundreds of gigabytes up to several terabytes). Reducing the time required to copy this custodian data lowers costs, turnaround time, and frustrations. Completing a defensible preservation process with a minimum impact or drain on a company’s resources should be the operational goal.

SafeCopy 3.0 goes beyond typical multi-threaded copy utilities

The multi-threaded copy support in SafeCopy 3.0 from Pinpoint Labs not only copies multiple files simultaneously, but also optimizes each copy process by incorporating additional processing threads. These added threads speed up reading, writing, and hash verification.  Due to this optimization, many clients have found that SafeCopy 3.0 measurably outperforms other multi-threaded copy utilities.
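The general idea of copying several files at once and hash-verifying each copy can be sketched in a few lines. This is a minimal illustration using Python's standard thread pool, not SafeCopy's implementation. The function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dest_dir):
    """Copy one file, then compare the source and destination hashes."""
    dest = Path(dest_dir) / Path(src).name
    shutil.copy2(src, dest)  # copy2 also carries over the modified time where the OS allows
    src_hash, dest_hash = sha256_of(src), sha256_of(dest)
    return {"source": str(src), "destination": str(dest),
            "source_sha256": src_hash, "dest_sha256": dest_hash,
            "verified": src_hash == dest_hash}


def copy_all(sources, dest_dir, workers=4):
    """Copy several files concurrently; results come back in input order."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: copy_and_verify(s, dest_dir), sources))


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "custodian"
    src_dir.mkdir()
    files = []
    for i in range(5):
        p = src_dir / f"file{i}.txt"
        p.write_text(f"sample content {i}")
        files.append(p)
    log = copy_all(files, Path(tmp) / "preserved")
```

Each entry in `log` records where a file came from, where it went, and both hash values, which is the kind of detail a verification log needs. The sketch copies everything into a single folder, so a real job would also have to retain folder structure and handle name collisions.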

SafeCopy 3.0 was designed to satisfy the demanding needs of litigation IT, corporate IT and computer forensics professionals who need to copy large amounts of active file data related to civil litigation. Because of this, our new version has a more extensive feature set than other multi-threaded copy utilities. Due to the extended duration of many collection processes and the need for defensibility, the following copy processing options are included in SafeCopy 3.0:

1)      Execute fast copy speeds with multi-threading

2)      Perform intelligent hardware optimization

3)      Cancel and resume jobs easily

4)      Detect network outages and wait to resume

5)      Maintain a detailed chain of custody (verification logs)

6)      Provide real-time statistics to help estimate job completion

7)      Save and load job templates

8)      Store snapshots of job settings

9)      Preserve file timestamps and metadata

10)   Support encrypted data volumes

SafeCopy 3.0 has balanced bandwidth utilization

By leveraging the additional CPU cores and memory that are available on faster computers, it is common to see SafeCopy 3.0 capable of maintaining 30-60 gigabytes per hour copy speeds. By using our multi-threaded copy engine, it is also able to balance and fully utilize the bandwidth available on a network, which single copy processes simply cannot do.

If you are in need of a faster file copy engine for large files or large dataset copy jobs, please visit the Pinpoint Labs website for more information on SafeCopy 3.0.


What is deNISTing?

Saving clients money on electronic discovery processing is one of the challenges facing attorneys, service bureaus and their clients. Due to the amount of data collected when imaging custodian hard drives, the resulting processing and labor costs can be significant and potentially prohibitive.

Reduction of 30%+ Through DeNISTing

Many firms have discovered that deNISTing is a relatively easy way to reduce the overall electronic evidence discovery (EED) processing costs for imaged custodian drives by an average of 30%. How do they accomplish this reduction without missing potential evidence? By removing ‘known’ files for Microsoft Windows, Linux, Mac OS and other systems, the overall production is substantially reduced.

The National Software Reference Library (NSRL) list maintained by NIST (National Institute of Standards and Technology) contains more than 115 million known files, and by using this list to filter custodian hard drive files prior to EED processing, a significant reduction can be realized.
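The filtering step itself is a hash lookup: hash each collected file and drop it if the value appears in the known-file list. The sketch below assumes SHA-1 values in uppercase hexadecimal and uses a one-entry stand-in for the reference list. The file names and helper functions are hypothetical, and this is not how Harvester is implemented.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the uppercase hexadecimal SHA-1 of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def denist(paths, known_hashes):
    """Split paths into (kept, removed) by membership in a set of known hashes."""
    kept, removed = [], []
    for path in paths:
        (removed if sha1_of(path) in known_hashes else kept).append(path)
    return kept, removed


with tempfile.TemporaryDirectory() as tmp:
    system_file = Path(tmp) / "notepad.exe"
    user_file = Path(tmp) / "contract_draft.txt"
    system_file.write_bytes(b"pretend operating system binary")
    user_file.write_bytes(b"custodian work product")

    # Stand-in for the real reference list: here it holds a single hash.
    known = {sha1_of(system_file)}
    kept, removed = denist([system_file, user_file], known)
    kept_names = [p.name for p in kept]
    removed_names = [p.name for p in removed]
```

Holding the known hashes in a set keeps each lookup fast regardless of list size, which matters when the list runs to many millions of entries. Matching is by content, so a renamed system file is still removed and a user file that merely shares a system file's name is kept.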

What Brought on DeNISTing’s Recent Popularity?

DeNISTing has become a requested service in just the last few years. Until recently, there haven’t been tools available to handle the processing without significantly increasing the turnaround time and investing in expensive computer forensic software.

Pinpoint Labs’ Harvester Software Makes deNISTing a Reality

Harvester from Pinpoint Labs is an affordable and easy to use application which leverages the more than 115 million known hash values in the NIST list to filter custodian data and dramatically reduce the costs and processing time associated with imaged hard drives. Harvester can also dedupe while creating a chain of custody and safely copy filtered files while deNISTing. By performing these multiple processes simultaneously, Pinpoint Harvester reduces electronic discovery processing costs and labor.


This information is provided by Jon Rowe, a Certified Computer Examiner (CCE) and the President of Pinpoint Labs. Please watch the video below to learn more about affordable and defensible tools for E-Discovery collections.


ESI Self Collection Drives and Kits

Electronically Stored Information (ESI) self collection drives and kits have become popular in the last few years because they offer an affordable means of collecting electronic data for a legal matter without the need to hire expensive forensic experts. This article covers what should be included in an ESI collection drive kit as well as some tips to ensure the collections are completed properly.

ESI Self Collection Tips and Resources

Here are a few tips to help ensure a successful ESI self collection:

1) IT Assistance – Have someone on hand with knowledge of the products, how they work and how to overcome any issues encountered. This could be an individual from the legal department, corporate IT, a forensic computer examiner, or a competent vendor.

2) Hard Drives – If the ESI self collection drive is being connected directly to a custodian PC or server, take a look at the 2.5 inch enclosed external hard drives that are powered from a USB port. If collecting data across a network, a Network Attached Storage (NAS) device should be considered.

3) Software – Require these key features from active file collection software (like SafeCopy 2 or Harvester from Pinpoint Labs):

  1. Preserves file timestamps and metadata – Using Windows Explorer to “drag and drop” files does not preserve critical metadata or confirm that the contents were copied exactly.
  2. Creates electronic chain of custody – Report(s) containing details of what happened, source and destination hash values, MAC times, where files were copied from/to and results are the audit trail required for defensibility.
  3. Hash verifies files – File hashes of the source and destination are verifiable proof of a valid copy.
  4. No local installation – Ideally the software should run from an external device or from the network without installing anything on the host computer.
  5. Automated job tickets – Human involvement opens the risk of human error. Products like Harvester from Pinpoint Labs include features to automate the process with predefined work tickets.
  6. Filtering (Optional) – Filtering at the point of collection reduces the cost of processing the collected data. Some of the filters that can be applied at the point of collection are file types/headers, date ranges, folder names, keywords, deduplication, and deNISTing.
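To make the filtering item concrete, here is a minimal sketch of two of the filters named above, file type and date range, applied while walking a folder. The function name, the sample files, and the use of the last-modified time are assumptions for illustration. It checks file extensions only, not file headers.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def select_files(root, extensions, start, end):
    """Return files under root whose extension and modified time match the filters."""
    wanted = {ext.lower() for ext in extensions}
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if start <= modified <= end:
            matches.append(path)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "memo.docx": datetime(2012, 6, 1),
        "old_memo.docx": datetime(2009, 1, 15),
        "photo.jpg": datetime(2012, 6, 2),
    }
    for name, when in samples.items():
        path = Path(tmp) / name
        path.write_bytes(b"x")
        stamp = when.timestamp()
        os.utime(path, (stamp, stamp))  # set access and modified times for the demo

    hits = select_files(tmp, [".docx", ".xlsx"],
                        datetime(2012, 1, 1), datetime(2012, 12, 31))
    hit_names = [p.name for p in hits]
```

Only `memo.docx` passes both filters here: `old_memo.docx` falls outside the date range and `photo.jpg` is the wrong type. Selection reads file metadata without changing it, but the copy step that follows still has to preserve timestamps as described in item 1.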

4) Evidence Bags – Tamper-proof evidence bags provide additional security and defensibility. Antistatic bags from Packaging Horizons are designed for hard drives.

5) Paper Chain of Custody – Most firms are familiar with transferring evidence and have forms already created. Include this form with the drives used in an ESI collection kit.

Larger Collection Alternatives

Putting together ESI self collection kits can save money and eliminate delay and additional costs. Harvester from Pinpoint Labs is offered at a flat rate (you own it) or per collection.

Unease with ESI Self Collections

There has been some concern over custodian self collections. Relying on untrained employees to find, and then properly collect the relevant data may present a defensibility problem.  This problem is overcome easily with automation features of data collection software. These features minimize the number of human errors that can occur by minimizing the amount of employee interaction with the collection process.

What you should know

ESI self collections and kits are here to stay. They significantly reduce discovery costs, perform targeted collections, and are the modern equivalent of boxing up relevant files. However, it is critical to ensure that the process is defensible by preserving the original content, with the correct process, products, and procedures. For further assistance designing an ESI self collection kit for specific project needs, contact one of the project leaders at Pinpoint Labs.


This information is provided by Jon Rowe, a Certified Computer Examiner (CCE) and the President of Pinpoint Labs. Please watch the video below to learn more about affordable and defensible tools for E-Discovery collections.


What is a Hash Value?

A hash value is the result of a calculation (hash algorithm) that can be performed on a string of text, an electronic file or an entire hard drive’s contents. Hash values are also referred to as checksums, hash codes, or simply hashes. They are used to identify and filter duplicate files (e.g., email, attachments, and loose files) from an ESI collection or to verify that a forensic image or clone was captured successfully.

Each hashing algorithm produces a fixed-size “thumbprint” of the contents. Regardless of the amount of data fed into a specific hash algorithm or checksum, it will return the same number of characters. For example, an MD5 hash is always 32 hexadecimal characters (128 bits), whether the input is a single character in a text file or an entire hard drive. The following is a list of hash values for the same text file.


MD5: 464668D58274A7840E264E8739884247

SHA-1: 4698215F643BECFF6C6F3D2BF447ACE0C067149E

SHA-256: F2ADD4D612E23C9B18B0166BBDE1DB839BFB8A376ED01E32FADB03A0D1B720C7





RIPEMD-128: A868B98EAEC84891A7B7BA620EDDE621

TIGER: F31A22CEED5848E69316649D4BAFBE8F9274DED53E25C02D

PANAMA: 7E703B1798A26A0AF21ECD661CBADB9C72B419455814CA7B82E29EE0C03FA493


CRC16: 117C

CRC32: FA2D47D4


As you can see, there are also various hash lengths within a family (SHA-1, SHA-256, etc.). The most common hash values are MD5, SHA-1 and SHA-256. The longer hash values require more time to calculate and are designed to reduce the probability of a collision (two different inputs producing the same hash value).
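The fixed-length property is easy to check with Python's standard library. The short sketch below hashes a one-byte input and a five-megabyte input with three of the algorithms listed above and compares the lengths of the results. The sample inputs are arbitrary.

```python
import hashlib
import zlib

small = b"A"
large = b"A" * 5_000_000

for name in ("md5", "sha1", "sha256"):
    short_digest = hashlib.new(name, small).hexdigest()
    long_digest = hashlib.new(name, large).hexdigest()
    # Same algorithm, same output length, regardless of input size.
    print(name, len(short_digest), len(long_digest))

# CRC32 is a checksum rather than a cryptographic hash; it is always 8 hex characters.
crc_small = format(zlib.crc32(small), "08X")
crc_large = format(zlib.crc32(large), "08X")

digest_lengths = {name: len(hashlib.new(name, small).hexdigest())
                  for name in ("md5", "sha1", "sha256")}
```

The loop prints 32 characters for MD5, 40 for SHA-1 and 64 for SHA-256 for both inputs, matching the lengths of the sample values listed earlier.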

A few other ways that hash values are used:

-  Verify a downloaded file was created by the publisher (as opposed to a virus-infected version)

-   Identify and filter files on the NSRL/NIST list (“deNISTing”)

-   Locate known contraband (illegal images and videos)

Here are a few reasons why hash values are so widely used as a means to validate and compare content:

1)  Privileged Data – There would be obvious issues storing and providing multiple copies of the contents of a company’s files or an entire hard drive’s data in a database to perform a byte comparison. Not to mention that illegal images and videos (child pornography) would have to be stored and used in each system scan. These scenarios are unacceptable.

2)  Speed – Comparing an indexed hash value is much quicker than comparing what could be billions or trillions of bytes of source data. Optimized hash engines (Pinpoint Harvester) can compare thousands of hash values in a second.

3)  Security  – Hashing data is a one way trip. The original data can’t be recreated or reverse engineered from the hash value. This provides additional security that a person can’t determine the source data from the hash.

The argument that data sources could be different and have the same hash value has raised a lot of concern. There are countless threads related to this issue on the litigation support and computer forensic forums. The bottom line is that the only way to do an exact comparison of the original data is to store it everywhere you need to deduplicate or verify the information. However, as mentioned above, this isn’t a practical alternative.

More complex hashing functions have been introduced (SHA-256, SHA-512 etc.) which will further reduce the likelihood of a collision. It is also worth noting that even in those cases where scientists have created collisions, it was a result of exploiting the weaknesses in a specific hash algorithm. The same alterations would not create a collision in a different hashing algorithm.

So, if you still aren’t satisfied with the incredibly remote possibility that a collision could happen using a single hash value, then the easiest way to implement an extra precaution is to have your processes calculate hash values from two separate algorithms (e.g., MD5 and SHA-256) for each item. Unfortunately, most EED applications and forensic imaging tools don’t support this option, especially in a single pass.
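Calculating two hash values in a single pass is straightforward in principle: read the data once and feed each chunk to both algorithms. The sketch below shows the idea with Python's `hashlib`. The function names are hypothetical, and the sample input is a made-up phone number.

```python
import hashlib
import io


def dual_hash(stream, chunk_size=1024 * 1024):
    """Read the data once and feed every chunk to both algorithms."""
    md5, sha256 = hashlib.md5(), hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def same_content(pair_a, pair_b):
    """Treat two items as identical only when both hash values agree."""
    return pair_a == pair_b


md5_hex, sha256_hex = dual_hash(io.BytesIO(b"555-0100"))
```

Because the file is only read once, the added cost is the extra computation rather than extra disk or network reads. For files on disk, pass a handle opened in binary mode, for example `open(path, "rb")`.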

What to Remember

Hash values are a reliable, fast, and secure way to compare the contents of individual files and media. Whether it’s a single text file containing a phone number or five terabytes of data on a server, calculating hash values is an invaluable process for deduplication and evidence verification in electronic discovery and computer forensics.


This information is provided by Jon Rowe, a Certified Computer Examiner (CCE) and the President of Pinpoint Labs. Please watch the video below to learn more about affordable and defensible tools for E-Discovery collections.


E-Discovery Collection

E-Discovery collections, also known as Electronic Evidence Discovery (EED) or Electronic Data Discovery (EDD), can include a review of all the data stored on employee desktop or laptop computers, company servers, camera cards, cell phones, smart phones, GPS devices, digital video recorders, digital answering systems, thumb drives, RAID arrays and any other form of electronic media capable of storing data.

Types of Electronic Discovery Content

Employee Work Product – Computer files are by far the most common target of a forensic e-discovery collection. Files (also referred to as loose files or active files) are similar to their paper equivalent. They can be copied, moved, and even “shredded”. Work product could include sales reports, QA reports, product or service information, client lists, engineering designs and much more.

Employee Correspondence - Email has practically replaced letters and interoffice memos. A forensic e-discovery collection of correspondence is often a critical piece and can often contain the “smoking gun”. What someone said, to whom, and when are some of the first questions asked in a legal matter. Since emails are a form of documented communication, they comprise highly sought-after data when it comes to legal matters. Emails themselves may be contained in databases, files, or unallocated space.

Customer Relations and Accounting Data – Customer lists, internal notes, and financial records are also a critical component in forensic e-discovery collection or computer forensic investigations. Properly collecting the live database files that store this information can be a challenge. Single entries in a database often require export to another format in order to be useful or even readable by humans. Most databases include this ability.

User Logs – Collecting user logs isn’t always as relevant in an e-discovery collection or review as it is in computer forensics analysis; however, they can be relevant and are worth mentioning. User logs contain entries about the activities performed on a computer and by different user accounts. Attorneys may want to know when emails were sent or received between accounts in case the emails were deleted. Log entries may require conversion into human-readable form before they can be processed.

Raw or Unallocated Data – Unless a forensic image of the source data has been requested, a forensically sound e-discovery collection will focus on “active” files. However, it is helpful to understand the difference between “unallocated” and “active” data. Raw or unallocated data is data that resides in segments of the storage media (hard drive, camera card, etc.) that are not being used by files. This data can contain all or part of files that were once referenced in the file allocation table but were subsequently deleted. Much of this data can even survive a reformatting of the disk itself. Since this data can come from any number of sources that had once been active on the drive, it can make or break a case where it is suspected that deletions may have occurred.

Tools for Forensic E-Discovery Collection

With the exception of unallocated space, tools such as One Click Collect Harvester from Pinpoint Labs have the ability to collect loose files, emails and whole databases, with the added benefit of being able to specify keywords, date ranges, domains and email addresses, among other very useful filters.

Tools for collecting the unallocated space on a drive usually require an experienced forensic examiner in order to get useful interpretations of the data collected. In cases where this is necessary, it is recommended that a certified computer examiner be hired for the collection and analysis of the data.


How Much is a Petabyte, Exabyte, or Zettabyte?

As our electronically stored information (ESI) data universe continues to grow, we are hearing about increasing storage capacities. The size of a project in terabytes (TB; 1 TB = 1,024 gigabytes) comes up frequently and is often the amount of data that has to be collected, culled or processed on a corporate server. However, now you can purchase a 1TB drive that will fit in a laptop computer.

Have you heard of a job that will reach or exceed a petabyte? If not, you most likely will in the near future and the following will help if you aren’t familiar with the larger capacities.

Equivalent Storage in Terabytes

Petabyte = 1,024 TB

Exabyte = 1,048,576 TB

Zettabyte = 1,073,741,824 TB

Yottabyte = 1,099,511,627,776 TB
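The figures above follow the binary convention, where each unit is 1,024 times the one before it. A few lines of Python reproduce the list. The unit names and the helper function are just for this example.

```python
UNITS = ["terabyte", "petabyte", "exabyte", "zettabyte", "yottabyte"]


def in_terabytes(unit):
    """Terabytes per unit, using the binary convention (each step is x 1,024)."""
    return 1024 ** UNITS.index(unit)


for unit in UNITS[1:]:
    print(f"{unit.capitalize()} = {in_terabytes(unit):,} TB")
```

Drive manufacturers often use the decimal convention instead (each step is x 1,000), which is why a drive's advertised capacity can appear smaller once it is connected to a computer.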

As the size of electronic data at client sites increases, so will the need for refined, targeted ESI collections. Many litigation support and computer forensic professionals have encountered collection jobs of several terabytes in which keyword search terms and other criteria are provided to help identify relevant data and decrease the amount being collected, processed and hosted.


Email Collection

Email Collection refers to the identification and isolation of electronic mail (email) messages that pertain to a specific legal matter in civil litigation cases.

What gets collected

What is actually being collected during email collections can be one of two things:

1. Files representing the contents of the transmitted email messages themselves (usually in MSG, HTML, EML or RTF format).

2. Container (or store) files that hold the contents and data associated with multiple email messages, usually all of the emails for a specific custodian.

Whether files for individual emails or container files are collected depends mostly on the type of email system being used by the custodian. If the custodian is a user of Microsoft Outlook, for instance, then either container files or individual email files may be produced. If the custodian is a user of a webmail service, such as Gmail or Yahoo!, then it is likely only individual email files can be collected.

How it’s done

Software such as Harvester from Pinpoint Labs can search the PST store files produced by Microsoft Outlook and Exchange email systems for individual emails matching specific criteria, such as who sent the email, who received it, when these actions occurred and whether the subject, body, or attachments contain specified keywords. It can also output the results as either individual email files or whole, reconstructed container files; the latter is known as PST regeneration.

With other email systems, either the whole container file can be copied and sorted through manually, or the individual emails can be manually identified and exported as individual email files.
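For individually exported messages, the criteria described above (sender, date range, keywords) can be applied with Python's standard `email` package. The sketch below works on plain-text messages in EML form and does not read PST files. The sample messages, addresses, and function name are hypothetical.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def matches(raw_message, sender=None, keywords=(), start=None, end=None):
    """Check one message against sender, date, and keyword criteria."""
    msg = message_from_string(raw_message)
    if sender and parseaddr(msg.get("From", ""))[1].lower() != sender.lower():
        return False
    if start or end:
        date_header = msg.get("Date")
        if date_header is None:
            return False
        # start and end must be timezone-aware to compare with the parsed date.
        sent = parsedate_to_datetime(date_header)
        if (start and sent < start) or (end and sent > end):
            return False
    body = msg.get_payload()
    # Only simple single-part text bodies are searched here; multipart messages,
    # encoded bodies, and attachments would need additional handling.
    text = (msg.get("Subject", "") + "\n" + (body if isinstance(body, str) else "")).lower()
    return all(word.lower() in text for word in keywords)


SAMPLE_A = (
    "From: Jane Smith <jsmith@example.com>\n"
    "To: mlee@example.com\n"
    "Subject: Pricing agreement\n"
    "Date: Mon, 04 Jun 2012 09:30:00 -0500\n"
    "\n"
    "Attached is the revised contract.\n"
)
SAMPLE_B = (
    "From: M. Lee <mlee@example.com>\n"
    "To: jsmith@example.com\n"
    "Subject: Lunch\n"
    "Date: Tue, 03 Mar 2009 12:00:00 -0500\n"
    "\n"
    "Noon works.\n"
)

window_start = datetime(2012, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2012, 12, 31, tzinfo=timezone.utc)
responsive = [m for m in (SAMPLE_A, SAMPLE_B)
              if matches(m, sender="jsmith@example.com", keywords=["contract"],
                         start=window_start, end=window_end)]
```

Only the first sample satisfies all three criteria. Matching messages would still need to be copied out with their metadata intact and hash verified, as covered in the next section.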

What to remember

As with any data being collected, the two concepts to remember are preservation and validation.

Preservation refers to keeping the metadata about the individual messages as well as the metadata contained within each of the messages intact so as to maintain their admissibility. PST regeneration is especially desirable in this case because it maintains both the email data and the data that linked it to contact data, task list data and other data integrated with these types of email messages.

Validation refers to the policy of ensuring, either by hash value comparison (analogous to fingerprints for data) or bit-wise comparison, that the contents of the copy are the same as the contents of the original.

Software such as Harvester and SafeCopy 2, both from Pinpoint Labs, has built-in preservation and validation systems to certify that both of these conditions are always met.
