In an era of increased cost-consciousness, relying on self-collection can seem like an appealing way to save money, but it can also lead to dramatic downstream complications and costs.
In “A Shortsighted Shortcut,” we discussed why getting collection right is so important. In “Custodian Collection Risks,” we discussed the risks associated with custodian self-collection. In “IT Collection Risks,” we discussed the risks of organization self-collection. In this Part, we review two examples of what the courts have said and done about self-collection.
We’ve now seen – in the abstract – the wide range of risks that come with allowing either individual custodians or IT personnel to carry out collection, but what has happened in actual cases? The risks and consequences of self-collection are not merely hypothetical. For many years, courts have highlighted those risks, taken parties and their lawyers to task for relying on self-collection in spite of them, and imposed significant monetary and evidentiary sanctions for the failures that resulted.
First, with respect to any document with metadata that was modified or deleted in discovery, plaintiffs will be precluded from using the date contained in any such document to argue that the presence of the date alone proves the date on which the document was created. Rather plaintiffs will be required to provide independent evidence (such as witness testimony) to show the document’s creation date. In addition, [defendant] will be permitted to present evidence to a factfinder regarding plaintiffs’ destruction of their metadata so that [defendant] may make arguments that the material presented was not created on the date claimed. [emphasis added]
The second answer to defendants’ question has emerged from scholarship and caselaw only in recent years: most custodians cannot be “trusted” to run effective searches because designing legally sufficient electronic searches in the discovery or FOIA contexts is not part of their daily responsibilities. Searching for an answer on Google (or Westlaw or Lexis) is very different from searching for all responsive documents in the FOIA or e-discovery context. Simple keyword searching is often not enough: “Even in the simplest case requiring a search of on-line e-mail, there is no guarantee that using keywords will always prove sufficient.” There is increasingly strong evidence that “[k]eyword search[ing] is not nearly as effective at identifying relevant information as many lawyers would like to believe.” As Judge Andrew Peck – one of this Court’s experts in e-discovery – recently put it: “In too many cases, however, the way lawyers choose keywords is the equivalent of the child’s game of ‘Go Fish’ . . . keyword searches usually are not very effective.”
There are emerging best practices for dealing with these shortcomings and they are explained in detail elsewhere. There is a “need for careful thought, quality control, testing, and cooperation with opposing counsel in designing search terms or ‘keywords’ to be used to produce emails or other electronically stored information.” And beyond the use of keyword search, parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents. . . . In short, a review of the literature makes it abundantly clear that a court cannot simply trust the defendant agencies’ unsupported assertions that their lay custodians have designed and conducted a reasonable search.
[footnotes omitted; emphasis added]
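The court’s call for “quality control” and “testing” of search terms can be made concrete with a small, purely illustrative sketch. It is not drawn from the opinion or from any review platform. The sample documents, the hand-labeled set of responsive documents, and the function names `keyword_hits` and `recall_and_precision` are all hypothetical. The sketch assumes a reviewer has already labeled a small sample by hand, and then measures how many of the responsive documents a proposed keyword actually finds.

```python
import re


def keyword_hits(documents, keywords):
    """Return the ids of documents containing any keyword as a whole word (case-insensitive)."""
    patterns = [
        re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in keywords
    ]
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(pattern.search(text) for pattern in patterns)
    }


def recall_and_precision(hits, responsive):
    """Compare keyword hits against a hand-labeled set of responsive document ids."""
    true_positives = len(hits & responsive)
    recall = true_positives / len(responsive) if responsive else 0.0
    precision = true_positives / len(hits) if hits else 0.0
    return recall, precision


# Hypothetical four-document sample; a real validation sample would be far larger.
SAMPLE = {
    "doc1": "Please review the merger agreement before Friday.",
    "doc2": "The deal we discussed is moving ahead; keep it quiet.",
    "doc3": "Lunch on Thursday?",
    "doc4": "Merger of the two calendars is done.",
}

# Hypothetical reviewer labels: doc2 is responsive but never uses the word "merger".
RESPONSIVE = {"doc1", "doc2"}

hits = keyword_hits(SAMPLE, ["merger"])
recall, precision = recall_and_precision(hits, RESPONSIVE)
print(f"hits={sorted(hits)} recall={recall:.2f} precision={precision:.2f}")
```

In this toy sample, the single keyword misses a responsive document that uses different wording and pulls in an irrelevant one that happens to contain the term. That is the kind of gap that sampling and testing are meant to expose before a search is certified as reasonable.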
Upcoming in this Series
Up next, in the final Part of this series, we will conclude our discussion of self-collection with a look at three more examples of what the courts have said about relying on it and what penalties they have imposed when things have gone wrong.