Investigating the Realities, Identification and Preservation Fundamentals Series Part 4

4 / 6

A multi-part series on the fundamentals eDiscovery practitioners need to know about the identification and preservation of potentially relevant ESI

In “In the Beginning,” we reviewed the importance of effective identification and preservation as well as the triggers for doing so.  In “Legal and Technological Scope,” we reviewed the scope of what must be identified and preserved.  In “Imagining the Possibilities,” we reviewed the first steps for identification.  In this Part, we discuss the investigative aspects of identification.

As we noted in the last part, the identification process breaks down into two parts: imagination and investigation.  Now that you have completed your initial brainstorming of potential materials, properties, and people, you are ready to begin the investigation part of identification.

Investigating Your Assumptions

A variety of investigative options are available for finding out how reality lines up with the brainstorming you’ve done to get started.  The most important are targeted interviewing, data mapping, and sampling.  Which of these will be most useful to you depends on your specific project – in particular, on how large and diverse your brainstorming has led you to believe your project will be.  For example:

  • The larger your project, the more investigative steps you’ll need to take
  • The more systems and sources involved, the more useful a data map is
  • The more custodians involved, the more useful sampling is

We will discuss each of the three primary options in turn.

Targeted Interviews

Targeted interviews are the easiest investigative step and a common first one.  In this context, conducting targeted interviews means conducting a limited number of custodian interviews with a few key players.  This process is typically less formal (i.e., no full script) and less complete (i.e., not all individual custodians are included) than a full custodian interview process, which might come later in the project.

Your goal in the targeted interviews is to review your brainstormed lists of materials and people with individuals who have direct knowledge of what likely exists, where it would be, and who else might know things or possess relevant materials.  This would include talking to individuals with knowledge of any potentially relevant enterprise or departmental systems, as well as any relevant third-party service providers.

Data Mapping

Your next investigative option is data mapping.  Data mapping is the process of “mapping” the various data stores and sources in an organization.  Many organizations do some version of this already for non-legal purposes.  For example, the IT or IS department may have “maps” of the organization’s servers, computers, and enterprise systems, along with directories of installed software.  Ideally, data mapping for legal activities would be undertaken on a proactive, organization-wide basis rather than in response to a specific matter, but engaging in some targeted, reactive data mapping is better than none and well worth doing.

In this context, you would be working your way down your potential materials/hypothetical sources list, reviewing them with relevant individuals (from IT/IS, Records Management, etc.) and reviewing relevant documentation, attempting to flesh out that list with concrete details.  What you will be attempting to build is less a literal “map” than a spreadsheet or matrix.  Your final product will be a searchable, sortable, filterable reference tool listing sources in rows and relevant details about them in columns.  Important things to note about each source during this sort of reactive data mapping include:

  • Owner/manager of source (e.g., specific IT contact, department manager, or custodian)
  • Desired materials expected to be there (including expected formats, dates, etc.)
  • Expected volume of materials from source (e.g., record count, file volume)
  • Available native search and export tools/features, if any, and relevant details
  • Risk to those materials from automated janitorial functions or other normal processes

Gathering and organizing this information (and additional details as time and circumstances permit) will equip you to better plan your needed preservation (and later collection) activities.  Plus, from a data targeting perspective, the more information available to you about what you have and where it is, the more narrowly and accurately you can target what gets preserved and collected for subsequent phases of project work.
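
As a rough illustration of the matrix described above, here is a minimal Python sketch of a reactive data map: one row per source, one column per detail, with simple filtering, sorting, and spreadsheet export.  The class name, field names, and sample sources are all hypothetical placeholders chosen for this example, not a prescribed schema, and in practice an ordinary spreadsheet serves the same purpose.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SourceEntry:
    """One row of the data map. All field names are illustrative assumptions."""
    source: str                # system or data store
    owner: str                 # IT contact, department manager, or custodian
    expected_materials: str    # desired materials, formats, date ranges
    expected_volume_gb: float  # rough volume estimate
    native_export: str         # available search/export features, if any
    at_risk: bool              # exposed to automated janitorial functions?


# Hypothetical entries for illustration only.
data_map = [
    SourceEntry("Email server", "IT contact", "Email and attachments",
                120.0, "Built-in search and export", True),
    SourceEntry("Shared drive", "Department manager", "Contracts; spreadsheets",
                45.5, "None", False),
    SourceEntry("Backup tapes", "IT contact", "Older email",
                800.0, "Requires restoration", True),
]


def at_risk_sources(entries):
    """Filter to sources at risk of automated deletion, largest first."""
    risky = [e for e in entries if e.at_risk]
    return sorted(risky, key=lambda e: e.expected_volume_gb, reverse=True)


def to_csv(entries):
    """Render the data map as CSV text that opens in any spreadsheet tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(SourceEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buf.getvalue()


if __name__ == "__main__":
    for entry in at_risk_sources(data_map):
        print(f"{entry.source}: {entry.expected_volume_gb} GB, owner: {entry.owner}")
    print(to_csv(data_map))
```

Sorting the at-risk sources to the top is just one possible view; the same rows could equally be filtered by owner or by expected materials when planning preservation.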


Sampling

The final investigative option we’ll discuss in this Fundamentals Series is sampling.  In the context of eDiscovery, “sampling” can refer to either judgmental or statistical sampling.  Judgmental sampling is the informal process of looking at parts of something large to get an anecdotal sense of the whole, while statistical sampling is the formal process of sampling to take a defined measurement.  In the identification and preservation phases at the beginning of a matter, you will primarily be engaged in judgmental sampling.

Judgmental sampling is essentially what you’re doing when you select key individuals for targeted interviews, using them as proxies for the whole list of hypothetical custodians.  More importantly, though, judgmental sampling is the way you will learn about what’s actually on various sources and systems that, unlike custodians, cannot self-report to you.  This kind of judgmental sampling might take a variety of forms, such as:

  • Searching electronic mailboxes to test for relevance
  • Indexing some backup tapes to test for unique materials in backups
  • Collecting representative custodians’ devices to check for relevant materials

These efforts will be aided by the time you spent brainstorming potential distinctive characteristics of the materials you are seeking.

Depending on your project’s scale and timeline, you may end up proceeding from judgmental sampling at the beginning to formal statistical sampling before collection (when it may be worth the effort and expense to get some firm measurements for decision-making and process negotiations about what comes next).  This has become especially true in this era of increased focus on proportionality, but those are generally questions to address after effective identification and preservation have already taken place (so that nothing unique is lost in the meantime).
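
To give a sense of what “formal” means in statistical sampling, the sketch below applies one common textbook formula for the sample size needed to estimate a proportion (such as the share of documents in a source that are relevant), with a finite population correction.  This is an illustration under stated assumptions rather than a method prescribed by this series: the function name and the default 95% confidence level (z = 1.96), 5% margin of error, and worst-case proportion of 0.5 are all choices made for the example.

```python
import math


def sample_size(population, margin_of_error=0.05, z=1.96, p=0.5):
    """Estimate how many items to sample to measure a proportion.

    Assumptions (illustrative defaults, not requirements):
      z = 1.96 corresponds to roughly 95% confidence
      p = 0.5 is the most conservative guess for the true proportion
    """
    # Sample size for an effectively unlimited population.
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    # Finite population correction shrinks the sample for smaller sources.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical source sizes, for illustration only.
    for size in (100, 100_000):
        print(size, sample_size(size))
```

Note how little the required sample grows as the source gets larger, which is part of why formal sampling can be worth the effort on large sources.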

Upcoming in this Series

In the next Part, we will continue our discussion of identification and preservation fundamentals with a look at the role of legal holds in preservation.

About the Author

Matthew Verga

Director of Education

Matthew Verga is an electronic discovery expert proficient at leveraging his legal experience as an attorney, his technical knowledge as a practitioner, and his skills as a communicator to make complex eDiscovery topics accessible to diverse audiences. A fourteen-year industry veteran, Matthew has worked across every phase of the EDRM and at every level from the project trenches to enterprise program design. He leverages this background to produce engaging educational content to empower practitioners at all levels with knowledge they can use to improve their projects, their careers, and their organizations.
