We live in a digital world, and more and more cases deal with eDiscovery. Document review is the most time-consuming, and therefore most costly, portion of an eDiscovery project. Here, we will look at the different methods for document review to help you better understand your options. We will also discuss the best ways to save time and money as you conduct your review.
Tools to Reduce the Document Review Burden
When we manage large document reviews for our clients, we use a number of workflows to minimize data sets. Reducing the review population is the easiest way to save both time and money when it comes to eDiscovery. Our tools leverage metadata fields, search terms, email threading, and technology-assisted review (TAR) to help cull redundant and irrelevant data before attorney review.
Date filters – If the events at issue occurred within a certain time frame, use date filters to limit the review to documents that fall within that range.
Search Terms – Due to the volume of digital data produced by most people and companies, search terms are an excellent way to reduce the data set. These terms are usually negotiated with the opposing side based on the Request for Interrogatories. Any top-level documents (parents) or associated documents (children) that hit on the search terms are flagged for review, and documents without hits are suppressed. This can increase the “richness” of your data set (the proportion of documents that turn out to be responsive) if the terms are thoughtfully chosen.
Email Threading/Near Dupes – New technologies can also help reduce the number of documents in your review. iDS utilizes Brainspace (an analytics tool) to analyze email threading and near duplicates. Many courts call for only inclusive documents in the review. Inclusive emails contain a complete history of a particular email thread (e.g. the original email and all replies). Non-inclusive emails and attachments can be excluded from the attorney review, further reducing cost and time.
Technology-Assisted Review (TAR) – This is the process of creating and training a predictive model to classify or rank documents. TAR combines the specific knowledge of a human reviewer with the efficiency of modern computing.
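The date-filter and search-term steps described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular review platform implements culling: the Doc record, its field names, the sample documents, and the choice to promote a whole family when any member hits are all assumptions made for the example, and the plain substring match stands in for a real search index.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical document record; real platforms expose many more metadata fields.
@dataclass
class Doc:
    doc_id: str
    family_id: str  # a parent email and its attachments share one family_id
    sent: date
    text: str

def cull(docs, start, end, terms):
    """Return the IDs of documents promoted to review after date and term culling."""
    # Step 1: the date filter keeps only documents inside the agreed range.
    in_range = [d for d in docs if start <= d.sent <= end]
    # Step 2: find families in which any member hits on any search term.
    lowered = [t.lower() for t in terms]
    hit_families = {
        d.family_id for d in in_range
        if any(t in d.text.lower() for t in lowered)
    }
    # Step 3 (assumption): promote whole families so parents and children stay together.
    return [d.doc_id for d in in_range if d.family_id in hit_families]

docs = [
    Doc("E1", "F1", date(2020, 3, 1), "Please see the attached merger agreement."),
    Doc("A1", "F1", date(2020, 3, 1), "Exhibit A: schedule of assets."),
    Doc("E2", "F2", date(2020, 4, 2), "Lunch on Friday?"),
    Doc("E3", "F3", date(2018, 1, 5), "Early merger discussions."),
]

review_set = cull(docs, date(2020, 1, 1), date(2020, 12, 31), ["merger"])
print(review_set)  # ['E1', 'A1']
```

Here E3 is dropped by the date filter even though it hits on the term, and E2 is suppressed because nothing in its family hits.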
Types of Review Workflows
Linear review is the simplest and most traditional form of review. After processing, the parents and children are grouped together in a linear manner. Reviewers will then look at one document after another until the entire data set is reviewed. This option is best for small data sets, where a small group of reviewers can look through every document in a short amount of time. These small data sets generally come from cases with a tight timeline for review or very specific search terms. iDS can assist in setting up a linear review process for your data set. For example, we can create review batches (set groups of documents) where the email and associated attachments appear together, organized by date, or help you craft targeted searches to review the most relevant documents first.
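As a rough illustration of the batching idea above (families kept together, organized by date), the sketch below fills fixed-size batches without splitting a family. The tuple layout, the make_batches name, and the batch size are hypothetical; a review platform would handle this through its own batching settings.

```python
from collections import defaultdict
from datetime import date

def make_batches(docs, batch_size):
    """docs: list of (doc_id, family_id, sent_date) tuples. Returns a list of batches."""
    families = defaultdict(list)
    for doc_id, family_id, sent in docs:
        families[family_id].append((sent, doc_id))
    # Order families by their earliest date so batches read chronologically.
    ordered = sorted(families.values(), key=lambda members: min(members))
    batches, current = [], []
    for members in ordered:
        # Members keep their input order (parent listed first in this sample).
        ids = [doc_id for _, doc_id in members]
        # Start a new batch rather than split a family across two batches.
        if current and len(current) + len(ids) > batch_size:
            batches.append(current)
            current = []
        current.extend(ids)
    if current:
        batches.append(current)
    return batches

docs = [
    ("E2", "F2", date(2020, 4, 2)),
    ("E1", "F1", date(2020, 3, 1)),
    ("A1", "F1", date(2020, 3, 1)),
    ("E3", "F3", date(2020, 5, 9)),
    ("A3", "F3", date(2020, 5, 9)),
]

print(make_batches(docs, 3))  # [['E1', 'A1', 'E2'], ['E3', 'A3']]
```

A family larger than the batch size simply becomes its own oversized batch in this sketch, which keeps the email and its attachments in front of the same reviewer.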
Hits only review:
There are several different ways a review team can tackle a “Hits Only” review.
Many cases leverage search terms to cull the datasets. iDS can help you run and analyze the search terms against your dataset and tailor your review around the terms. By combining the search terms with the linear review referenced above, iDS can create review batches to target priority terms, custodians, or specific dates.
When reviewers find responsive documents, they can proceed in one of two ways. Reviewers can either immediately review the other members of that family of documents, or they can continue to other documents with hits and return to the family members after all of the hits have been reviewed.
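The two orderings described above can be compared with a small sketch. The function names and the (doc_id, family_id, is_hit) tuple layout are hypothetical, and the example simplifies one point: it queues the family for every search hit, whereas a review team may only pull the family once the hit has been coded responsive.

```python
def family_first(docs):
    """Review each hit, then immediately review the rest of its family."""
    by_family = {}
    for doc_id, family_id, _ in docs:
        by_family.setdefault(family_id, []).append(doc_id)
    order, seen = [], set()
    for doc_id, family_id, is_hit in docs:
        if is_hit and doc_id not in seen:
            # The hit goes first, followed by its unseen family members.
            for member in [doc_id] + by_family[family_id]:
                if member not in seen:
                    seen.add(member)
                    order.append(member)
    return order

def hits_first(docs):
    """Review every hit first, then return to the family members."""
    hits = [doc_id for doc_id, _, is_hit in docs if is_hit]
    hit_families = {family_id for _, family_id, is_hit in docs if is_hit}
    rest = [doc_id for doc_id, family_id, is_hit in docs
            if not is_hit and family_id in hit_families]
    return hits + rest

docs = [
    ("E1", "F1", True),   # email that hits on a search term
    ("A1", "F1", False),  # its attachment, no hit
    ("E2", "F2", True),
    ("A2", "F2", False),
    ("E3", "F3", False),  # no hit anywhere in the family, so never queued
]

print(family_first(docs))  # ['E1', 'A1', 'E2', 'A2']
print(hits_first(docs))    # ['E1', 'E2', 'A1', 'A2']
```

Both orderings cover the same documents; the difference is whether reviewers see an attachment while its parent email is still fresh in mind or defer it until all hits are coded.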
One of the most efficient ways to conduct a document review is to use a TAR workflow. There are two main flavors of TAR, while a third is still in its infancy. The first workflow is TAR 1.0, a.k.a. predictive coding. A TAR 1.0 workflow takes the coding decisions of a subject matter expert (SME) and uses them to build a predictive model. The SME refines the model through additional document review until further training no longer has a positive effect on the model’s performance. The predictive model is then used to rank and categorize the remaining unreviewed documents into two buckets, often responsive and non-responsive or relevant and not relevant.
The three key takeaways of a TAR 1.0 workflow are:
1. A subject matter expert is required for training the predictive model.
2. The predictive model becomes “static” once training is complete.
3. Much of the most critical work is performed upfront when training the model.
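To make the TAR 1.0 takeaways concrete, here is a toy sketch of the train-then-freeze pattern. Everything in it is an assumption made for illustration: the word-level log-odds scorer stands in for whatever classifier a commercial TAR tool uses, the control set and the "stop when accuracy no longer improves" rule are simplified stand-ins for real validation, and the sample documents are invented.

```python
import math
from collections import Counter

def train(labeled):
    """labeled: list of (text, is_responsive). Returns per-word log-odds weights."""
    pos, neg = Counter(), Counter()
    for text, is_responsive in labeled:
        (pos if is_responsive else neg).update(set(text.lower().split()))
    n_pos = sum(1 for _, is_responsive in labeled if is_responsive)
    n_neg = len(labeled) - n_pos
    # Laplace-smoothed log-odds of a word appearing in responsive documents.
    return {
        w: math.log((pos[w] + 1) / (n_pos + 2)) - math.log((neg[w] + 1) / (n_neg + 2))
        for w in set(pos) | set(neg)
    }

def score(model, text):
    return sum(model.get(w, 0.0) for w in set(text.lower().split()))

def accuracy(model, control):
    correct = sum((score(model, text) > 0) == label for text, label in control)
    return correct / len(control)

# Hypothetical SME coding rounds and a small SME-coded control set.
sme_rounds = [
    [("merger agreement draft", True), ("lunch menu friday", False)],
    [("merger closing schedule", True), ("friday parking notice", False)],
    [("merger press release", True), ("parking garage closed", False)],
]
control = [("merger term sheet", True), ("closing checklist", True),
           ("holiday party menu", False)]

# Upfront phase: keep training until another round stops improving the model.
labeled, model, best, rounds_used = [], {}, -1.0, 0
for batch in sme_rounds:
    labeled.extend(batch)
    candidate = train(labeled)
    acc = accuracy(candidate, control)
    if acc <= best:
        break  # further training no longer helps; the model is now static
    model, best, rounds_used = candidate, acc, rounds_used + 1

# Static phase: the frozen model sorts the unreviewed documents into two buckets.
unreviewed = {"D1": "revised merger agreement", "D2": "friday lunch order",
              "D3": "closing schedule update"}
responsive = [d for d, text in unreviewed.items() if score(model, text) > 0]
non_responsive = [d for d in unreviewed if d not in responsive]
print(rounds_used, responsive, non_responsive)  # 2 ['D1', 'D3'] ['D2']
```

In this toy run the third SME round does not improve control-set accuracy, so training stops after two rounds and the frozen model is applied once to everything that remains.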
The second workflow is TAR 2.0, a.k.a. continuous active learning. A TAR 2.0 workflow also builds a predictive model using coding decisions, but an SME is not required. This is because the predictive model is trained throughout the entirety of the review, or continuously, as the name suggests. This eliminates much of the upfront burden seen in a TAR 1.0 workflow and enables clients to use lower-cost resources for document review. Another major benefit of a TAR 2.0 workflow is how quickly you can begin ranking or categorizing documents. Depending on the software you are using, the predictive model can be created and applied to the unreviewed documents after coding just one document. Of course, the accuracy and performance of the model after only one training document will be extremely low. More realistically, a TAR 2.0 predictive model can prove effective after as few as 200 training documents and, when the workflow is executed properly, will keep improving as more documents are reviewed.
The three key takeaways of a TAR 2.0 workflow are:
1. Lower cost resources can be used to conduct the review.
2. The predictive model is continuously learning as more documents are reviewed.
3. Very little upfront effort is required before the benefits of TAR are realized.
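The continuous loop behind these takeaways can be sketched as follows. This is an illustrative toy, not a vendor implementation: the log-odds scorer, the truth dictionary that simulates reviewer coding, the batch size of two, and especially the stopping rule (stop when a batch contains no responsive documents) are assumptions; real projects validate statistically before ending a review.

```python
import math
from collections import Counter

def train(labeled):
    """labeled: list of (text, is_responsive). Returns per-word log-odds weights."""
    pos, neg = Counter(), Counter()
    for text, is_responsive in labeled:
        (pos if is_responsive else neg).update(set(text.lower().split()))
    n_pos = sum(1 for _, is_responsive in labeled if is_responsive)
    n_neg = len(labeled) - n_pos
    return {
        w: math.log((pos[w] + 1) / (n_pos + 2)) - math.log((neg[w] + 1) / (n_neg + 2))
        for w in set(pos) | set(neg)
    }

def score(model, text):
    return sum(model.get(w, 0.0) for w in set(text.lower().split()))

def continuous_active_learning(corpus, seed_id, reviewer, batch_size=2):
    coded = {seed_id: reviewer(seed_id)}  # a single coded document starts the model
    review_order = []
    while len(coded) < len(corpus):
        # Retrain on everything coded so far, then re-rank what is left.
        model = train([(corpus[d], label) for d, label in coded.items()])
        unreviewed = [d for d in corpus if d not in coded]
        ranked = sorted(unreviewed, key=lambda d: score(model, corpus[d]), reverse=True)
        batch = ranked[:batch_size]
        for d in batch:
            coded[d] = reviewer(d)
            review_order.append(d)
        # Simplistic stopping rule (assumption): stop once a batch has no responsive docs.
        if not any(coded[d] for d in batch):
            break
    return review_order, coded

corpus = {
    "D1": "merger agreement draft", "D2": "lunch menu friday",
    "D3": "merger closing schedule", "D4": "friday parking notice",
    "D5": "closing checklist signed", "D6": "holiday party menu",
    "D7": "parking garage menu", "D8": "friday holiday lunch",
}
# Hypothetical ground truth standing in for the reviewers' coding decisions.
truth = {"D1": True, "D2": False, "D3": True, "D4": False,
         "D5": True, "D6": False, "D7": False, "D8": False}

order, coded = continuous_active_learning(corpus, "D1", truth.get)
print(order)                               # ['D3', 'D2', 'D5', 'D4', 'D6', 'D7']
print([d for d in corpus if d not in coded])  # ['D8']
```

After each batch the model is retrained, so the second responsive document (D3) teaches it the word "closing" and pulls D5 to the top of the next ranking, while the lowest-ranked document is never reviewed.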
The third workflow is TAR 3.0. This is the newest adaptation of a TAR workflow and is not yet widely used. A TAR 3.0 workflow functions similarly to a TAR 2.0 workflow with one major difference: the review population first goes through a clustering algorithm that groups conceptually similar documents together. The document(s) at the center of each cluster are reviewed, and if they are responsive, all documents within that cluster are then used to train the predictive model. Concept clustering is a separate analytics tool from TAR and is worthy of its own discussion in a separate blog post. Expect to see more blog posts and information regarding TAR 3.0 as the workflow matures and is adopted by more practitioners.
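The cluster-then-review-the-center step can be illustrated with a short sketch. It rests on several assumptions: a greedy word-overlap (Jaccard) grouping stands in for a real concept-clustering engine, the "center" is taken to be the member most similar to the rest of its cluster, the reviewer is simulated by a lookup table, and the sample documents are invented. How clusters with a non-responsive center are handled is not covered here.

```python
def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(corpus, threshold=0.2):
    """Greedy single pass: join the first cluster whose founder is similar enough."""
    tokens = {d: set(text.lower().split()) for d, text in corpus.items()}
    clusters = []  # each cluster is a list of doc IDs; the first ID is its founder
    for d in corpus:
        for members in clusters:
            if jaccard(tokens[d], tokens[members[0]]) >= threshold:
                members.append(d)
                break
        else:
            clusters.append([d])  # no similar cluster found, so start a new one
    return clusters, tokens

def center(members, tokens):
    """The member with the highest total similarity to its cluster."""
    return max(members, key=lambda d: sum(jaccard(tokens[d], tokens[o]) for o in members))

def seed_training_set(corpus, reviewer, threshold=0.2):
    clusters, tokens = cluster(corpus, threshold)
    reviewed, training = [], []
    for members in clusters:
        c = center(members, tokens)
        reviewed.append(c)  # only the center of each cluster is reviewed
        if reviewer(c):
            # A responsive center sends the whole cluster to model training.
            training.extend((d, True) for d in members)
    return reviewed, training

corpus = {
    "D1": "merger agreement draft", "D2": "merger agreement",
    "D3": "merger agreement final", "D4": "lunch menu friday",
    "D5": "lunch menu", "D6": "lunch menu monday",
}
# Hypothetical coding decisions for the two cluster centers.
truth = {"D2": True, "D5": False}

reviewed, training = seed_training_set(corpus, truth.get)
print(reviewed)                 # ['D2', 'D5']
print([d for d, _ in training])  # ['D1', 'D2', 'D3']
```

Two review decisions are enough to hand three training documents to the model; from there the review would continue as in a TAR 2.0 workflow.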
Review Workflows at iDS
At iDS, we have a team of experts dedicated to the application of TAR workflows and other analytics-based workflows to assist with complex litigation and investigations. Our Discovery Services Analytics team has over a decade of experience not just advising on and executing TAR workflows, but also defending the process through declarations and expert witness testimony. We work collaboratively with our clients to understand the goals of each document review and to craft a review strategy that maximizes efficiency and reduces the costs associated with document review.
Check back for future blog posts about iDS, TAR, and other review workflows, including iDS’s proprietary review offering, LeanReview™.
iDiscovery Solutions is a strategic consulting, technology, and expert services firm – providing customized eDiscovery solutions from digital forensics to expert testimony for law firms and corporations across the United States and Europe.