How Computers Store ESI – Collection Fundamentals Series, Part 3

A multi-part series on the essentials practitioners need to know about ESI collections

In “Collection and the Duty of Technology Competence,” we discussed lawyers’ duty of technology competence and the importance of understanding collection to fulfilling that duty.  In “The Broad Scope of Collection,” we discussed the potential legal and technological scope of collection.  In this part, we review how computer memory actually stores ESI.

A Tale of Tiers and Types

Modern computers employ a wide range of memory technologies in concert to accomplish tasks, including read-only memory (ROM), multiple levels of cache, random access memory (RAM), and hard drives, including both hard disk drives (HDDs) and solid-state drives (SSDs).  This multiplicity of memory is partly for efficiency and partly for affordability.

To operate efficiently, computers need to be able to access and work with lots of stored information as quickly as possible.  Some information is needed to tell all of the computer’s components how to work together, some is needed to run the operating system and your applications, some is needed to track and respond to inputs, and some is needed to retain all of your activity and files.  Some of that information needs to be stored reliably even when the computer is off, and some of it is only needed temporarily when the computer is on and performing specific operations.  Some of it never changes, and some changes all the time.

As with most things, some memory technologies are fast and expensive, and others are slow and inexpensive.  Some of those technologies are volatile, requiring power to maintain storage; others are non-volatile, maintaining storage without power.  To achieve an effective balance between speed and cost, computers leverage different tiers of memory for different aspects of their operation:

  • ROM
    • Fast, non-volatile memory that contains essential instructions for the operation of the components in the computer
  • Cache
    • Fast, volatile memory that the central processing unit (CPU) and other computer components use to store information for rapid access during operation to speed up tasks
    • Most computers include two levels of CPU cache, and many now include three, as well as caches for the graphics processing unit (GPU) and the hard drives
  • RAM
    • Fast, volatile memory that the computer uses for temporary storage of information in active use, including parts of the operating system and applications and open user files
    • Static RAM (SRAM) is faster but more expensive and is used for the caches described above, and Dynamic RAM (DRAM) is slower but less expensive and is used for the “RAM” component of most personal computers
  • Hard Drives
    • Slow, non-volatile memory that is used for the bulk of information storage, including the operating system, applications, and all user files and data; this is the memory from which collection is most often performed
    • Hard drives can be traditional HDDs, which work like rewritable record players, or newer SSDs, which work like large flash drives
    • Portions of hard drive memory may also be used as an extension of RAM, known as virtual memory, to further enhance operating efficiency
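
For readers who think in code, the tiers above can be sketched as a small Python table.  This is only an illustrative model: the MemoryTier class and the survives_power_off function are hypothetical names, and the "fast" and "slow" labels simply repeat the characterizations in the list rather than any measured speeds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTier:
    name: str
    relative_speed: str  # "fast" or "slow", as characterized in the list above
    volatile: bool       # True if contents are lost when power is removed


# The four tiers described above, in the same order.
TIERS = [
    MemoryTier("ROM", "fast", volatile=False),
    MemoryTier("Cache", "fast", volatile=True),
    MemoryTier("RAM", "fast", volatile=True),
    MemoryTier("Hard drive", "slow", volatile=False),
]


def survives_power_off(tiers):
    """Return the names of tiers that keep their contents without power."""
    return [tier.name for tier in tiers if not tier.volatile]


print(survives_power_off(TIERS))  # ['ROM', 'Hard drive']
```

Of the two non-volatile tiers, only the hard drive holds user files and data, which is why it is the tier from which collection is most often performed.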

This basic model is also applicable to most mobile computing devices.  Smartphones and tablets employ similar tiered memory systems for the same reasons.

Memory in Motion

As your computer or mobile device operates, there is a constant flow of information being read from and written to hard drive storage, RAM, and the caches.  At any given moment, multiple copies of a file or portions of a file may exist in multiple locations.  These temporary copies are known as ephemeral data, since they typically exist only so long as the computer is on and the operation is active.  Collections from individuals’ computers and mobile devices are typically only concerned with the static ESI in hard drive storage, but the ephemeral data generated by enterprise systems has occasionally been implicated in legal matters.

Forgotten but Not Gone

Whether a computer or mobile device is using an HDD, an SSD, or both, it is managing a collection of thousands of discrete files that is constantly evolving as files are read, modified, written, and deleted.  The computer’s file system dictates how this occurs, and although there are a variety of file systems in use in different types of computers and servers, the underlying principles are the same for our purposes.

The immense volume of available storage is divided up into very small physical and logical units.  The smallest physical unit is typically referred to as a sector, and some common systems refer to the smallest logical unit as a cluster.  The specific nomenclature and the specific relationship between physical and logical units depend on the file system in use.  Regardless, the computer tracks all of those sectors and clusters in what is, essentially, an enormous spreadsheet that records what it has put where and where there is free space to put new things.
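
The relationship between physical and logical units can be made concrete with a little arithmetic.  The short Python sketch below assumes 512-byte sectors grouped eight to a cluster (4,096 bytes per cluster); those figures are illustrative assumptions rather than values from any particular file system, and the allocation function is a hypothetical helper.

```python
SECTOR_BYTES = 512        # assumed physical sector size
SECTORS_PER_CLUSTER = 8   # assumed; yields 4,096-byte clusters
CLUSTER_BYTES = SECTOR_BYTES * SECTORS_PER_CLUSTER


def allocation(file_bytes):
    """Return (clusters, sectors, unused_bytes) for a file of the given size."""
    clusters = -(-file_bytes // CLUSTER_BYTES)  # ceiling division
    sectors = clusters * SECTORS_PER_CLUSTER
    unused = clusters * CLUSTER_BYTES - file_bytes
    return clusters, sectors, unused


print(allocation(10_000))  # (3, 24, 2288)
```

Under these assumptions, a 10,000-byte file is allotted three whole clusters (24 sectors), leaving 2,288 bytes of the final cluster unused by that file.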

Almost all files will be large enough to occupy multiple physical sectors, but those sectors will not necessarily all be physically adjacent.  Most of the time, they are spread out across the physical storage, connected only by the entries in the computer’s master storage spreadsheet documenting their relationship.  And when files are deleted, the physical sectors are not wiped clean of their file fragments; the master spreadsheet is just updated to delete the references to that file and to show that those sectors are available once more.
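
The behavior described above can be sketched with a toy model in Python.  This is a deliberately simplified illustration and not how any real file system is implemented: the ToyVolume class, its tiny four-byte clusters, and the caller-supplied placement list are all hypothetical.

```python
class ToyVolume:
    """Toy storage: a list of clusters plus a table recording what is where."""

    def __init__(self, cluster_count, cluster_bytes=4):
        self.cluster_bytes = cluster_bytes
        self.clusters = [b""] * cluster_count  # the "physical" storage
        self.free = set(range(cluster_count))  # clusters available for use
        self.table = {}                        # file name -> ordered cluster numbers

    def write(self, name, data, placement):
        """Store data in the given free clusters, which need not be adjacent."""
        size = self.cluster_bytes
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        distinct = len(set(placement)) == len(placement)
        if len(chunks) != len(placement) or not distinct or not set(placement) <= self.free:
            raise ValueError("placement must list one distinct free cluster per chunk")
        for number, chunk in zip(placement, chunks):
            self.clusters[number] = chunk
        self.free -= set(placement)
        self.table[name] = list(placement)

    def read(self, name):
        """Reassemble a file by following its table entry."""
        return b"".join(self.clusters[n] for n in self.table[name])

    def delete(self, name):
        """Drop the table entry and mark the clusters free; nothing is wiped."""
        self.free |= set(self.table.pop(name))


vol = ToyVolume(8)
vol.write("memo.txt", b"PRIVILEGED!!", placement=[6, 1, 4])
print(vol.read("memo.txt"))  # b'PRIVILEGED!!'
vol.delete("memo.txt")
print("memo.txt" in vol.table)                         # False
print(vol.clusters[6], vol.clusters[1], vol.clusters[4])  # b'PRIV' b'ILEG' b'ED!!'
```

After the deletion, the table no longer knows about memo.txt and all eight clusters are marked as available, yet the three fragments remain in place until something else is written over them.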

This approach to storage of ESI has several implications for collection that we will discuss in the next Part.

Upcoming in this Series

In the next Part of this series, we will review the mechanics of forensic collection and retrieval from these memory systems.

About the Author

Matthew Verga

Director of Education

Matthew Verga is an electronic discovery expert proficient at leveraging his legal experience as an attorney, his technical knowledge as a practitioner, and his skills as a communicator to make complex eDiscovery topics accessible to diverse audiences. A fourteen-year industry veteran, Matthew has worked across every phase of the EDRM and at every level from the project trenches to enterprise program design. He leverages this background to produce engaging educational content to empower practitioners at all levels with knowledge they can use to improve their projects, their careers, and their organizations.