Data Targeting During Collection and Processing – Data Targeting Series, Part 2

A multi-part series on targeting the right data to reduce your downstream review costs

In the first Part of this series, we reviewed the relationship between data volumes and eDiscovery costs, particularly review costs, and the data targeting steps you can take before litigation arises.  In this Part, we continue our discussion of data targeting with the options available during collection and processing.

Data Targeting During the Collection Phase

The extent to which the right data can be targeted during the collection phase will depend primarily on the specific devices and systems from which you are collecting.  For example, some enterprise systems will have useful integrated search tools, and others will offer only basic exports.  Different types of mobile devices may also have different options for capture.  This is one of the reasons why maintaining a data map is valuable, and why you will need to start your project with some similar research if you don’t currently maintain one.

Common choices for employee computers and devices include capturing full physical images, capturing logical images at the file system level, and performing targeted acquisitions of specific directories or file types.  Common choices for enterprise systems include applying date limitations, exporting by user or mailbox, and exporting based on keyword search responsiveness.

When considering which choices to make, it is important to balance the cost of repeating collection activities later against the cost of sifting through more material during processing and ECA – and to consider whether there will be time to collect again later, if that’s needed.  The industry trend is currently towards narrower, more-targeted collection.

In addition to narrowing your scope of collection using these capture choices, you can also target your collection by engaging with the custodians from whom you intend to collect.  Learning how those custodians actually work, what they generate, and where they store it can keep you from collecting more broadly from each of them than necessary.

Data Targeting During the Processing Phase

Once your material has been collected and you are ready to begin processing the data, another range of data targeting options is available to you.  In addition to standard removal of known system files (“de-NISTing”) and standard removal of duplicate files and messages (“deduplication”), the following options are typically available to you:

Directory or Custodian Filtering

At the start of the processing phase you will have another opportunity to target the right data by custodian, mailbox, or directory, if you did not do so already during collection.  Prioritizing key custodians and focusing on user document directories, for example, are both common.
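
As a rough illustration of custodian and directory filtering, the sketch below keeps only items that belong to a key custodian and sit under a user document directory.  The `CollectedItem` record, the custodian names, and the directory names are all hypothetical; a real processing tool would expose this through its own settings rather than through code like this.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class CollectedItem:
    custodian: str
    path: str  # path of the item as collected (forward slashes assumed)


# Hypothetical scope, for illustration only.
KEY_CUSTODIANS = {"asmith", "bjones"}
USER_DIRECTORIES = ("Documents", "Desktop")


def in_scope(item, custodians=KEY_CUSTODIANS, directories=USER_DIRECTORIES):
    """Keep items from key custodians stored under a user document directory."""
    if item.custodian not in custodians:
        return False
    parts = PurePosixPath(item.path).parts
    return any(directory in parts for directory in directories)


items = [
    CollectedItem("asmith", "Users/asmith/Documents/contract.docx"),
    CollectedItem("asmith", "Windows/System32/drivers/etc/hosts"),
    CollectedItem("cdoe", "Users/cdoe/Documents/notes.txt"),
]

kept = [item for item in items if in_scope(item)]
for item in kept:
    print(item.custodian, item.path)
```

Here only the first item survives: the second is outside the user directories, and the third belongs to a custodian who is not on the key custodian list.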

File-Type Filtering

Beyond standard de-NISTing, you also typically have the option to perform additional filtering by file type.  This is done with the goal of eliminating additional system files not removed by de-NISTing and/or with the goal of narrowly targeting the specific user-generated file types believed most likely to be relevant.

This may be accomplished either through “stop filters” (also called “exclusion filters”) or “go filters” (also called “inclusion filters”).  Stop filters exclude specified file types and include everything else, while go filters do the opposite, including only the specified types and excluding anything else.  The practical difference is what happens to file types you did not anticipate: a stop filter lets them through, while a go filter drops them.

The most common approach is to apply a stop filter designed to clear out system files missed by de-NISTing.
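
The difference between the two filter types can be sketched in a few lines.  This is a simplified illustration that assumes file type is judged by extension alone (processing tools commonly identify types in more robust ways), and the file names and filter lists are hypothetical.

```python
from pathlib import PurePosixPath


def extension(name):
    """Lower-cased file extension, including the leading dot."""
    return PurePosixPath(name).suffix.lower()


def apply_stop_filter(names, excluded):
    """Exclusion filter: drop the listed types and keep everything else."""
    return [name for name in names if extension(name) not in excluded]


def apply_go_filter(names, included):
    """Inclusion filter: keep only the listed types and drop everything else."""
    return [name for name in names if extension(name) in included]


# Hypothetical file set and filter lists, for illustration only.
files = ["memo.docx", "budget.xlsx", "driver.sys", "design.dwg"]
STOP_TYPES = {".sys", ".dll"}
GO_TYPES = {".docx", ".xlsx"}

after_stop = apply_stop_filter(files, STOP_TYPES)
after_go = apply_go_filter(files, GO_TYPES)

print("stop filter keeps:", after_stop)
print("go filter keeps:  ", after_go)
```

Neither list mentions `.dwg`, so `design.dwg` survives the stop filter but is dropped by the go filter.  That is the trade-off in miniature: a stop filter errs toward keeping unanticipated file types, and a go filter errs toward excluding them.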

Date Range Filtering

The processing phase provides another opportunity to apply date range filters to eliminate any collected materials too new or old to be relevant to your matter.  Targeting the right materials by date range requires consideration of a few factors:

  • Files typically carry multiple date/time stamps, including date created, date last modified, date last accessed, date sent, etc., so you may have options as to which to use.
  • Most eDiscovery review and production is done by family group, keeping parents (e.g. emails) and children (e.g. email attachments) together, so you will need to consider whether parent dates override child dates or whether any file can pull in a family.
  • Because of computer errors or collection issues, some files may also have inaccurate, impossible, or nonexistent date stamps, so you may need to manually check any outliers in your chosen date field (e.g. 1/1/1900 or 0/0/0000).
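
The family and outlier considerations above can be sketched as follows.  The `Doc` record, the date range, and the sanity threshold are all hypothetical, and the sketch assumes a single date field has already been chosen for each document.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Doc:
    doc_id: str
    family_id: str
    is_parent: bool
    doc_date: Optional[date]  # the chosen date field; None when missing


# Hypothetical values, for illustration only.
START, END = date(2015, 1, 1), date(2016, 12, 31)
SANITY_FLOOR = date(1990, 1, 1)  # earlier dates are treated as implausible


def in_range(d):
    return d is not None and START <= d <= END


def needs_manual_check(doc):
    """Flag missing or implausibly old dates for manual review."""
    return doc.doc_date is None or doc.doc_date < SANITY_FLOOR


def families_to_keep(docs, parent_overrides=True):
    """Return the family IDs that pass the date filter.

    parent_overrides=True:  only the parent's date decides for the family.
    parent_overrides=False: any in-range member pulls in the whole family.
    """
    keep = set()
    for doc in docs:
        if parent_overrides and not doc.is_parent:
            continue
        if in_range(doc.doc_date):
            keep.add(doc.family_id)
    return keep


docs = [
    Doc("E1", "F1", True, date(2014, 12, 30)),   # email just before the range
    Doc("A1", "F1", False, date(2015, 3, 1)),    # its attachment, in range
    Doc("E2", "F2", True, date(2016, 6, 15)),    # email in range
    Doc("A2", "F2", False, date(1900, 1, 1)),    # attachment with a bad date
]

parent_rule = families_to_keep(docs, parent_overrides=True)
any_member_rule = families_to_keep(docs, parent_overrides=False)
outliers = [doc.doc_id for doc in docs if needs_manual_check(doc)]

print(sorted(parent_rule), sorted(any_member_rule), outliers)
```

Under the parent-date rule only family F2 is kept, while the any-member rule also pulls in F1 because its attachment falls in range.  The attachment dated 1/1/1900 is flagged for a manual check under either rule.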

Keyword Filtering

Finally, you will also typically have the opportunity to apply some form of keyword filtering during the processing phase.  In cases where the search terms to be used are fixed through negotiation or court order, this can be an effective step to take.  In projects where you are using your best judgment to develop keywords through trial and error, there are advantages to waiting on keyword filtering until early case assessment, when the tools available for your use are more sophisticated and the ease of testing and iteration is greater.
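
To make the idea concrete, here is a minimal sketch of keyword filtering with a per-term hit report, which is the kind of feedback that makes testing and iterating on terms practical.  It assumes case-insensitive, whole-word matching against already extracted text; the documents and terms are hypothetical, and real tools typically search an index with richer syntax than this.

```python
import re


def compile_terms(terms):
    """Build a case-insensitive, whole-word pattern for each search term."""
    return {
        term: re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in terms
    }


def hit_report(texts, terms):
    """Count how many documents each term hits."""
    patterns = compile_terms(terms)
    return {
        term: sum(1 for text in texts.values() if pattern.search(text))
        for term, pattern in patterns.items()
    }


def responsive_ids(texts, terms):
    """IDs of documents that hit on at least one term."""
    patterns = list(compile_terms(terms).values())
    return [
        doc_id
        for doc_id, text in texts.items()
        if any(pattern.search(text) for pattern in patterns)
    ]


# Hypothetical extracted text and search terms, for illustration only.
texts = {
    "D1": "Please review the merger agreement before Friday.",
    "D2": "Lunch on Thursday?",
    "D3": "The Merger timeline and the escrow account details are attached.",
}
terms = ["merger", "escrow"]

report = hit_report(texts, terms)
hits = responsive_ids(texts, terms)
print(report)
print(hits)
```

A term that hits on nearly everything, or on nothing, is a signal to revisit it before relying on it as a filter.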

Upcoming in this Series

In the next Part of this series, we will review options for targeting the right data during early case assessment and discuss strategic considerations.

About the Author

Matthew Verga

Director of Education

Matthew Verga is an electronic discovery expert proficient at leveraging his legal experience as an attorney, his technical knowledge as a practitioner, and his skills as a communicator to make complex eDiscovery topics accessible to diverse audiences. A fourteen-year industry veteran, Matthew has worked across every phase of the EDRM and at every level from the project trenches to enterprise program design. He leverages this background to produce engaging educational content to empower practitioners at all levels with knowledge they can use to improve their projects, their careers, and their organizations.
