Few people in business today fail to understand the importance of basing decisions on solid data. Big Data, and the methods for managing and using it, are the talk of the town. Companies are building complex “client models” based on statistical analysis of client profiles; recruiters systematically analyze their pools of specialists in order to provide the best possible match for every position; and presentations at corporate meetings are full of tables and graphs.
However, does this make our decisions any better? Are we able to utilize the potential of data to its fullest? What should we be working with: the data itself or the information hidden within it?
Let’s try to answer those questions together in order to better understand the relations between data and information and their influence on the quality of our decisions.
First of all, let’s define those two key concepts and see whether the difference between them has any real meaning. Their basic definitions are simple:
Data is a piece of reality perceived by our sensors. For example:
a) A driver sees a red light
b) A police officer sees the result of a breath alcohol test taken by a driver he has pulled over
c) A manager sees a graph showing a downward trend in sales
Information is an interpreted piece of Data. For example:
a) Seeing the red light, the driver interprets it as “Stop right now”
b) Comparing the result of the breath test with the allowed threshold, the police officer classifies the driver as “drunk”
c) Seeing the downward trend, the manager interprets it as “our sales are slowing down”
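The breath test example can be sketched in a few lines of Python. The reading of 0.08, the limits of 0.05 and 0.10, and the names `Reading` and `interpret` are all assumptions made for illustration; real limits vary by jurisdiction.

```python
from dataclasses import dataclass

# Hypothetical legal limit, chosen only for illustration.
LEGAL_LIMIT = 0.05


@dataclass(frozen=True)
class Reading:
    value: float  # the raw measurement (data)
    unit: str


def interpret(reading: Reading, limit: float = LEGAL_LIMIT) -> str:
    """Turn a raw reading into information by comparing it with a known limit."""
    return "over the limit" if reading.value > limit else "within the limit"


reading = Reading(value=0.08, unit="% BAC")
print(interpret(reading))              # interpreted against the 0.05 limit
print(interpret(reading, limit=0.10))  # same data, different prior knowledge
```

The reading itself never changes; only the threshold brought to it does, and with it the resulting information.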
However, we can immediately see several problems with these definitions in regard to decision-making:
- The quality of the data depends directly on the quality of our sensory apparatus. (How well do we see? How good is our measuring device?)
- Turning data into information requires some very specific prior knowledge. (What does a red light mean? What is the allowed alcohol level?) This means that creating new information depends on interpreting existing information, which, in turn, depends on additional data.
- Even when the decision-maker understands the data correctly, nothing prevents him from choosing the opposite course of action. (Not stopping at a red light; letting the drunk driver go because “he looks OK”.)
- The quality of the interpretation depends on the choice of data to interpret. (What exact period does the sales trend cover? Maybe the trend looks different when other periods are taken into account?)
- There is no distinction between the primary data (the chemical process inside the breath test tube, the Excel file containing the numbers used to create a sales graph) and the secondary data (the numerical results of the breath test, the sales graph).
Naturally, these problems invite the intervention of biases into our decision-making, even when we try to base it on actual data.
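The point about the choice of data to interpret is easy to demonstrate. In the sketch below, the monthly figures are invented for illustration and `trend` is a deliberately crude, hypothetical classifier that only compares the last value of a window with the first.

```python
# Hypothetical monthly sales figures, invented for illustration only.
monthly_sales = [100, 104, 109, 115, 122, 130, 139, 149, 160, 158, 155, 151]


def trend(values):
    """Classify a window by comparing its last value with its first."""
    change = values[-1] - values[0]
    if change > 0:
        return "up"
    if change < 0:
        return "down"
    return "flat"


print(trend(monthly_sales[-3:]))  # only the last three months
print(trend(monthly_sales))       # the full twelve months
```

The same numbers support “sales are slowing down” or “sales are growing”, depending on which window the manager happens to be shown.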
Is there a solution to these problems? Let’s consider a comprehensive example to explore this question:
A company builds a model of its product’s sales based on the day of the week in order to charge full price for high-demand days, while coming up with “special” offers during the low-demand days. A decision is made to base the sales policy on the results of the model. As a result, sales drop on all days of the week. What went wrong?
Let’s assume that the measurement itself was correct and the data is genuine. Still, many of the problems we pointed out above could be at play here: hours of the day and seasons of the year were not taken into account; the graph itself is only a secondary representation of the actual data; prior knowledge about why sales are distributed this way is missing; and so on. Even if the decision-makers are willing to base their decisions on data, most of its practical meaning is lost during its transformation into information.
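One of these failure modes, ignoring the hour of the day, can be sketched as follows. The records are hypothetical and exist only to show how a day-level total can hide what happens inside the day.

```python
from collections import defaultdict

# Hypothetical sales records: (day, part_of_day, units_sold).
records = [
    ("Mon", "morning", 5), ("Mon", "evening", 45),
    ("Sat", "morning", 40), ("Sat", "evening", 40),
]

by_day = defaultdict(int)           # what a day-of-week model sees
by_day_and_part = defaultdict(int)  # the finer-grained picture it discards
for day, part, units in records:
    by_day[day] += units
    by_day_and_part[(day, part)] += units

print(dict(by_day))
print(dict(by_day_and_part))
```

In this made-up data, Monday looks like a low-demand day overall, yet Monday evening is the single busiest slot of the week. A “special” Monday offer would discount exactly the period that needed no help.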
This brings us to a very uncomfortable conclusion: working with information is much closer to speculation than to real analysis. The information flow is too chaotic, with too much noise to be interpreted smoothly. If you want to analyze the situation properly, you should work the opposite way: NOT TURNING DATA INTO INFORMATION, BUT INFORMATION INTO DATA! We are surrounded not by data but by information, because our mind cannot disregard the flow it receives from its sensors and keeps flooding our judgment with interpretations. However, by carefully coding these constant measurements, their interpretations and their context, we can still turn them into reliable data to work with.
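What “coding” information might look like in practice is sketched below. This is only one possible layout, not a prescribed method; the class `CodedObservation` and its four fields are assumptions chosen for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CodedObservation:
    measurement: str     # what was actually observed
    source: str          # where the observation came from
    interpretation: str  # what someone concluded from it
    context: str         # conditions under which it was observed


claim = CodedObservation(
    measurement="sales graph slopes downward",
    source="graph shown to the manager",
    interpretation="our sales are slowing down",
    context="period covered by the graph not stated",
)

row = asdict(claim)  # a plain record that can be stored and analyzed later
print(row)
```

Once the interpretation is stored next to its source and context, it becomes one more field to analyze rather than a conclusion to act on.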
So instead of the primitive process we perform every day inside our minds:
DATA->INFORMATION->KNOWLEDGE->ACTION,
we should go for a much more deliberate process of:
INFORMATION->DATA->SYSTEMATIC ANALYSIS->DECISION-MAKING PROCESS.
Only then may we turn to ACTION (when it is needed).