Data Reliability Starts With Proper Collection
This is the eighth installment in a blog series on Fact Crashing™, the acceleration of the consideration of ACTION data (Ambient, Contextual, Transactional, IoT, Operational, Navigational) to the benefit of resolving disputes.
There are 9 Principles of Fact Crashing™. Earlier blogs in this series covered the preceding principles.
Now, let’s take a look at the sixth principle.
“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.”
This concept absolutely holds true when it comes to collecting data: plan and prepare before resorting to brute-force efforts. Throughout the collection process, documentation is critical. No detail is too small.
Step 1: Understand the Possible Points of Collection
It’s important to understand the various points of collection, which are often identified as part of Principle 3 (Identify and Prioritize) but not always vetted for collection defensibility. You may find yourself asking simple questions like “What are all of the points of access, points of reporting, and default/canned reports available?” Sometimes the easiest point of collection is a report that is already available; other times, collection requires a custom query and/or access via the application programming interface (API).
Regardless of the point of collection, there are a few items that you want to document as part of the collection process.
- Identify all points of access (e.g., where/how can you get the data)
- Determine the best point of access used for collection
- Document criteria used (e.g., date range, filter(s), etc.)
- If Report, version of that Report
- If Custom Query, document that Query
- Record person(s) involved in collection
Remember, not all points of access are created equal, so you should always understand what may be available via one point of access but not another.
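To make the checklist above concrete, here is a minimal sketch of a collection record that captures each documented item in one place. The class name, field names, and example values are all hypothetical and not part of any particular tool; adapt them to whatever your system of record expects.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date
from typing import Optional


@dataclass
class CollectionRecord:
    """Documentation for one collection (hypothetical schema, for illustration)."""

    source_system: str
    points_of_access: list            # every way the data could be obtained
    chosen_point_of_access: str       # the point of access actually used
    date_range: tuple                 # (start, end) criteria for the collection
    filters: dict = field(default_factory=dict)
    report_version: Optional[str] = None   # fill in if a canned report was used
    custom_query: Optional[str] = None     # fill in if a custom query was used
    collected_by: list = field(default_factory=list)

    def __post_init__(self):
        # The chosen point of access should be one of those identified up front.
        if self.chosen_point_of_access not in self.points_of_access:
            raise ValueError("chosen point of access was never identified")

    def to_json(self) -> str:
        # default=str turns date objects into ISO-formatted strings.
        return json.dumps(asdict(self), default=str, indent=2)


record = CollectionRecord(
    source_system="payroll",                      # hypothetical system name
    points_of_access=["canned report", "API"],
    chosen_point_of_access="API",
    date_range=(date(2020, 1, 1), date(2020, 1, 31)),
    filters={"region": "US"},
    custom_query="SELECT * FROM pay_events",      # hypothetical query
    collected_by=["A. Analyst"],
)
print(record.to_json())
```

Storing the record as JSON next to the export keeps the criteria and the data together, which helps if the collection ever needs to be repeated.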
Step 2: Sample Data
While it may be tempting to “go ahead and just run the report,” first run a sample export/report that will allow you to evaluate the accuracy of the collection before potentially impacting the business. For example, if you have a three-year time period in question, pick a month or two that have the most systems in common and request the data for those months. A few areas that sample data can help with include:
- Identifying additional fields that need to be included
- Identifying unexpected data anomalies in certain fields
- Identifying additional look-up values that need to be defined
Recently, we went through this exact exercise, picking a single month that had data from all of the available systems in question. This allowed us to not only validate the collection queries, but also evaluate how potential analyses would be performed so that we could further prioritize efforts.
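Choosing the sample month can itself be done from data. The sketch below assumes you already have a simple list of (system, month) pairs showing which systems hold data for which months; the function name, input shape, and system names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def pick_sample_months(coverage, top_n=1):
    """Rank months by how many distinct systems have data in them.

    coverage: iterable of (system_name, "YYYY-MM") pairs (assumed input shape).
    Returns the top_n months, each with the sorted list of systems it covers.
    """
    systems_by_month = defaultdict(set)
    for system, month in coverage:
        systems_by_month[month].add(system)

    # Most systems first; ties broken by the earliest month.
    ranked = sorted(systems_by_month.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(month, sorted(systems)) for month, systems in ranked[:top_n]]


coverage = [
    ("timeclock", "2021-03"),
    ("payroll", "2021-03"),
    ("badge", "2021-03"),
    ("payroll", "2021-04"),
    ("timeclock", "2021-04"),
]
print(pick_sample_months(coverage))
```

Here March 2021 is the only month in which all three example systems overlap, so it would be the month to request first.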
Step 3: Collect (aka Extract)
Now, it’s time to get all of the data. Before you hit the big green GO button, always keep in mind that larger queries can impact business operations. Therefore, consideration of the timing of collection should always be part of your process.
Prior to collecting the entire data set, whenever available, establish any KPIs that can help you validate the export (see below for suggestions). While many systems may provide this information and/or error reports, not all are built that way, so being able to validate on your own becomes a critical step.
Remember, document the entire process so that, if needed, it can be repeated. Being able to repeat a collection is not only good for defensibility, but is also useful when updated criteria are provided (e.g., an expanded date range).
Step 4: Validate & Verify
This is when we ask questions like “What do we expect to get? Did we get what we expected? Did the client send us what they think they sent us?”
The goal here is to validate expectation versus reality. There are some very complex routines that can be used, but even the simple ones can help catch simple adjustments that need to be made prior to analysis. Consider the following when looking to validate a collection.
- Date Range
- Record Count
- Field Count
- Entity Count
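The four checks above can be sketched in a few lines. This example assumes the export is a CSV with ISO-formatted dates; the function names, column names, and expected values are hypothetical, and a real export may need different parsing.

```python
import csv
import io
from datetime import date


def summarize_export(csv_text, date_field, entity_field):
    """Compute date range, record count, field count, and entity count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    dates = [date.fromisoformat(row[date_field]) for row in rows]
    return {
        "date_range": (min(dates), max(dates)) if dates else None,
        "record_count": len(rows),
        "field_count": len(reader.fieldnames or []),
        "entity_count": len({row[entity_field] for row in rows}),
    }


def find_discrepancies(expected, actual):
    """Compare documented expectations with what was actually received."""
    problems = []
    if actual["date_range"] is None:
        problems.append("export is empty")
    else:
        low, high = expected["date_range"]
        got_low, got_high = actual["date_range"]
        if got_low < low or got_high > high:
            problems.append(
                f"dates {got_low}..{got_high} fall outside requested {low}..{high}"
            )
    for key in ("record_count", "field_count", "entity_count"):
        if expected[key] != actual[key]:
            problems.append(f"{key}: expected {expected[key]}, got {actual[key]}")
    return problems


# Hypothetical three-row export and the expectations written down beforehand.
sample = (
    "employee_id,shift_date,hours\n"
    "E1,2021-03-01,8\n"
    "E2,2021-03-02,7.5\n"
    "E1,2021-03-03,8\n"
)
expected = {
    "date_range": (date(2021, 3, 1), date(2021, 3, 31)),
    "record_count": 3,
    "field_count": 3,
    "entity_count": 2,
}
actual = summarize_export(sample, "shift_date", "employee_id")
print(find_discrepancies(expected, actual))
```

An empty list means the export matched expectations; anything else is a prompt to go back to the source before analysis begins.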
If validation is ensuring that you properly conveyed the information from point A to point B, then verification is ensuring that you have the right data. Comparing data to contemporary reports (e.g., running an account balance) or historical reports (e.g., end-of-year financials) can provide confidence that you have not over- or under-collected data. This is also why it may be more valuable to collect in a scope sufficient to confirm against these types of reports, then conduct further filtering from the initial extraction corpus.
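As a rough illustration of verification, the sketch below totals the collected amounts and compares them with a figure taken from an independent report. The function name, the amounts, and the reported total are hypothetical; Decimal is used so that currency values are not distorted by floating-point rounding.

```python
from decimal import Decimal


def reconcile_total(amounts, reported_total, tolerance="0.00"):
    """Compare the sum of collected amounts with an independently reported total.

    A negative difference suggests under-collection; a positive one suggests
    over-collection.
    """
    collected = sum((Decimal(a) for a in amounts), Decimal("0"))
    reported = Decimal(reported_total)
    difference = collected - reported
    return {
        "collected": collected,
        "reported": reported,
        "difference": difference,
        "matches": abs(difference) <= Decimal(tolerance),
    }


# Hypothetical amounts from the extract and a total from a year-end report.
result = reconcile_total(["100.10", "200.20", "50.00"], "350.30")
print(result["matches"], result["difference"])
```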
You need to have these expectations documented prior to collection. Remember … documentation is a key element of proper collections.
Step 5: Inventory & Store
All too often we see analysis start immediately. First, document the inventory of everything received, and maintain that inventory in your system of record, with an “original” copy of the data stored for the file.
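One way to build that inventory is to record each received file's name, size, and a cryptographic hash, so the "original" copy can later be shown to be unchanged. Hashing is our assumption here rather than a stated requirement of this principle, and the function names and the folder path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large exports do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(folder):
    """List every file under folder with its relative path, size, and SHA-256."""
    root = Path(folder)
    entries = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            entries.append(
                {
                    "file": path.relative_to(root).as_posix(),
                    "bytes": path.stat().st_size,
                    "sha256": sha256_of(path),
                }
            )
    return entries
```

Calling `build_inventory("received/2021-03-sample")` on the folder of received files returns a list that can be saved alongside the originals; re-running it later and comparing hashes shows whether anything has changed.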
iDS provides consultative data solutions to corporations and law firms around the world, giving them a decisive advantage – both in and out of the courtroom. Our subject matter experts and data strategists specialize in finding solutions to complex data problems – ensuring data can be leveraged as an asset and not a liability.