Data Understanding is Required for Proper Analysis
This is the ninth installment in a blog series on Fact Crashing™, the acceleration of the consideration of ACTION data (Ambient, Contextual, Transactional, IoT, Operational, Navigational) to the benefit of resolving disputes.
There are 9 Principles of Fact Crashing™. Earlier blogs in this series covered the preceding principles.
Now, let’s take a look at the seventh principle.
To truly understand data, you should think of it as a three-legged stool: System Admin (e.g., IT), Business Owner (e.g., Sales), and End User (e.g., Salesperson). Having these three perspectives will help uncover nuances in the answers to questions like:
- Are users using the system as intended?
- Is IT aware of how the business is using the system (reality versus design)?
- What modifications have been made out of business necessity?
These are all simple and yet all too common situations we’ve seen happen through no fault of anybody, just a reality of business. For example, users decide on their own to use a field or workflow differently than it was intended because “it was just easier.” It’s important that this process focus on understanding the realities of the data, not judging the business as to the “why.”
It’s not uncommon to be in a situation where you are missing legs (e.g., the HR person who knows about this legacy system is no longer with the company). That does not mean you can’t proceed; it just means that you need to do so understanding that you are balancing on two legs.
As a quick side note, a client told me recently, “I don’t have any legs. Heck, I don’t have a stool.” While not common, this does happen from time to time. Here, you may have to ask more questions of the data rather than of people. (Hint: It’s always helpful to ask the questions below of the data, which in turn will help to verify what the people say.)
- Is the data in this field what I expected?
- Is the range/variety of data what I expected?
- Is there an unusual or unexpected number of “NULL” values?
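The three questions above can be put to a single field with a few lines of code. What follows is only a minimal sketch: the function name `check_field`, the sample `status` field, and the list of markers treated as “NULL” are all assumptions made for illustration, and each would need to be confirmed against the real system.

```python
from collections import Counter

# Markers treated as "missing". Which markers count as NULL is an assumption
# and should be confirmed for each system.
NULL_MARKERS = (None, "", "NULL", "null", "N/A")


def check_field(records, field, expected_values=None):
    """Summarize one field: how often it is missing and which values appear."""
    values = [row.get(field) for row in records]
    missing = [v for v in values if v in NULL_MARKERS]
    present = [v for v in values if v not in NULL_MARKERS]
    counts = Counter(present)

    # Values outside the list the business says should appear in this field.
    unexpected = {}
    if expected_values is not None:
        unexpected = {v: n for v, n in counts.items() if v not in expected_values}

    return {
        "records": len(values),
        "null_count": len(missing),
        "null_rate": len(missing) / len(values) if values else 0.0,
        "distinct_values": len(counts),
        "unexpected_values": unexpected,
    }


# Hypothetical sample rows, not taken from any real system.
records = [
    {"id": 1, "status": "Open"},
    {"id": 2, "status": "Closed"},
    {"id": 3, "status": "NULL"},
    {"id": 4, "status": "see notes"},
    {"id": 5, "status": None},
]
summary = check_field(records, "status", expected_values={"Open", "Closed"})
print(summary)
```

In this made-up sample, two of five rows are missing and one row holds free text where a status was expected, which is exactly the kind of result worth taking back to whichever legs of the stool you do have.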
Profile the Data
As outlined in Principle 6, there are efficient ways to quickly establish baseline KPIs for validation. The same holds true for profiling a data source for better understanding. Some frequent data points that help establish this profile include:
- Record Count
- Date Range (and any gaps)
- Value Range
- Average Value (and Outliers)
- Common Value(s)
This provides the necessary knowledge to analyze the data, and also provides one more layer of validation via this “sanity check” of looking at the data from a few standard perspectives.
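As a rough illustration, the profile points listed above can be gathered in one pass. This is a sketch under stated assumptions, not a prescribed method: the `profile` function, the `posted` and `amount` field names, the three-day gap threshold, and the outlier rule (more than five median absolute deviations from the median) are all hypothetical choices.

```python
from collections import Counter
from datetime import date
from statistics import mean, median


def profile(records, date_field, value_field, max_gap_days=3, top_n=3):
    """Build a basic profile of a data source from a list of dict records."""
    dates = sorted(row[date_field] for row in records)
    values = [row[value_field] for row in records]

    # Gaps: consecutive record dates further apart than max_gap_days.
    gaps = [
        (earlier, later)
        for earlier, later in zip(dates, dates[1:])
        if (later - earlier).days > max_gap_days
    ]

    # Outliers: an assumed rule of thumb based on the median absolute deviation.
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    outliers = [v for v in values if mad > 0 and abs(v - mid) > 5 * mad]

    return {
        "record_count": len(records),
        "date_range": (dates[0], dates[-1]),
        "gaps": gaps,
        "value_range": (min(values), max(values)),
        "average": mean(values),
        "outliers": outliers,
        "common_values": Counter(values).most_common(top_n),
    }


# Hypothetical sample rows, not taken from any real system.
rows = [
    {"posted": date(2023, 1, 2), "amount": 100},
    {"posted": date(2023, 1, 3), "amount": 120},
    {"posted": date(2023, 1, 4), "amount": 110},
    {"posted": date(2023, 1, 5), "amount": 100},
    {"posted": date(2023, 1, 20), "amount": 5000},
    {"posted": date(2023, 1, 21), "amount": 105},
]
result = profile(rows, "posted", "amount")
for name, value in result.items():
    print(name, value)
```

The thresholds matter more than the code. A three-day gap may be normal for a system that is idle over long weekends and alarming for one that logs around the clock, so these settings are worth confirming with the people who know the system.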
Continuous Active Learning
“Learn continually – there’s always ‘one more thing’ to learn!” ~Steve Jobs
Every interaction you have with the data can teach you something, whether it confirms what you expected or points to the unexpected. I can’t tell you how often we hear, “Oh yeah, I forgot to tell you…” When you are looking back years into a set of data, those nuances are not always top of mind. If these nuances are not taken into account during analysis, your results can be skewed.
Look to learn about these types of scenarios within your data set:
- There was a retention failure at one point because the data transfer to the archive server failed for a few weeks before we caught it.
- We stopped using Field A for X and started using Field D and E last year.
- We generally use Field M to report and run payroll, but raw data is in Field Q, R, S, and T.
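Scenarios like a field falling out of use can often be spotted in the data before anyone remembers to mention them. One simple way, sketched here with assumed names (`field_activity`, `entered`, `field_a`, `field_d`), is to note the first and last date on which each field actually held a value.

```python
from datetime import date


def field_activity(records, date_field, fields):
    """For each field, report the first date, last date, and count of populated rows."""
    activity = {}
    for field in fields:
        populated = sorted(
            row[date_field]
            for row in records
            if row.get(field) not in (None, "")  # treat None and "" as empty
        )
        if populated:
            activity[field] = (populated[0], populated[-1], len(populated))
        else:
            activity[field] = None  # field never held a value
    return activity


# Hypothetical sample rows, not taken from any real system.
records = [
    {"entered": date(2022, 3, 1), "field_a": "x", "field_d": None},
    {"entered": date(2022, 9, 15), "field_a": "y", "field_d": ""},
    {"entered": date(2023, 2, 1), "field_a": None, "field_d": "y"},
    {"entered": date(2023, 6, 30), "field_a": "", "field_d": "z"},
]
activity = field_activity(records, "entered", ["field_a", "field_d"])
for name, window in activity.items():
    print(name, window)
```

In this sample, `field_a` goes quiet in late 2022 just as `field_d` comes to life in 2023. A pattern like that is a prompt for a question to the business, not a conclusion on its own.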
So many nuances creep into data in the regular course of business that it’s safe to assume there will be some in yours. Seek to learn and understand them throughout your endeavor.
iDS provides consultative data solutions to corporations and law firms around the world, giving them a decisive advantage – both in and out of the courtroom. Our subject matter experts and data strategists specialize in finding solutions to complex data problems – ensuring data can be leveraged as an asset and not a liability.