From digital transformation to data culture
In a recent post, we revisited some questions about the current state of digital transformation and the lessons learned in the years since that wave began and spread globally.
As we mentioned before, the constant, even exponential, increase in the volume of data is an organic counterpart of every business digitization effort.
To give an illustrative example: according to a ProjectPro study, Walmart collects 2.5 petabytes of unstructured data per hour, the equivalent of 167 times all the books in the United States Library of Congress. Walmart is one of the biggest retailers in the world, but it is just one company. If we extend this idea to all the companies, governments, and organizations in the world, the scale becomes absolutely vast.
All these billions of data sources (cash registers, e-commerce platforms, security cameras, temperature sensors, scanners, among other examples) permanently generate "raw" data. But for that data to become "digestible" in some way, it must go through entire processing stages.
This data life cycle, very well explained in an excellent article in the Harvard Data Science Review, runs from collection through processing and storage to management; analysis, visualization, and interpretation come at a later stage.
Let’s stop here for a few minutes. Not all the data collected is useful, nor does it all need to be stored. But the data that is collected, processed, and stored in databases must go through processing to become productive: that is, to be reliable and, therefore, fit to be incorporated into data-driven strategies.
Most of us regularly consume dashboards and data visualizations of different kinds, from the dashboard in our cars to something as widespread as the storage usage graph our mobile phones show when we are running out of space. What we are not aware of is that, between storage and visualization, there are necessary and indispensable processes that make the information we see accurate, reliable, and structured enough to be displayed.
The most common of these are data cleanup (discarding unnecessary information) and data wrangling, the set of processes that make raw information ready to be analyzed: cleaning, structuring, and consolidation.
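Those three wrangling steps can be sketched in a few lines of Python. This is a minimal, purely illustrative sketch: the record layout, field names, and rules are assumptions for this example, not part of any particular product or standard.

```python
from datetime import datetime

# Hypothetical raw sales records from two stores; values are illustrative.
raw_records = [
    {"store": "A", "amount": "125.50", "date": "2023-05-01"},
    {"store": "A", "amount": "", "date": "2023-05-01"},        # missing amount
    {"store": "B", "amount": "80.00", "date": "01/05/2023"},   # different date format
    {"store": "A", "amount": "125.50", "date": "2023-05-01"},  # exact duplicate
]

def clean(records):
    """Cleaning: drop records with missing amounts and exact duplicates."""
    seen, out = set(), []
    for r in records:
        key = (r["store"], r["amount"], r["date"])
        if r["amount"] and key not in seen:
            seen.add(key)
            out.append(r)
    return out

def structure(record):
    """Structuring: coerce fields into consistent types and formats."""
    date = None
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            date = datetime.strptime(record["date"], fmt).date()
            break
        except ValueError:
            continue
    return {"store": record["store"], "amount": float(record["amount"]), "date": date}

def consolidate(records):
    """Consolidation: aggregate per-store totals from the unified records."""
    totals = {}
    for r in records:
        totals[r["store"]] = totals.get(r["store"], 0.0) + r["amount"]
    return totals

wrangled = [structure(r) for r in clean(raw_records)]
print(consolidate(wrangled))  # {'A': 125.5, 'B': 80.0}
```

In practice each of these steps grows far more complex (fuzzy deduplication, schema mapping, currency handling), which is exactly why they are automated rather than done by hand.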
All these steps, clearly detailed in a Harvard Business School article, play a key role in the validation process: the procedure of verifying that the data is consistent and of adequate quality.
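Validation usually takes the form of explicit rules checked against every record. The sketch below assumes a few simple rules and field names for illustration; real validation suites are far richer.

```python
# Hypothetical data-quality rules for a sales record; the rules and the
# set of valid store codes are assumptions made for this example.

VALID_STORES = {"A", "B", "C"}

def validate(record):
    """Return a list of rule violations for one sales record."""
    errors = []
    amount = record.get("amount")
    if amount is None or amount < 0:
        errors.append("amount must be a non-negative number")
    if record.get("store") not in VALID_STORES:
        errors.append("unknown store code")
    if not record.get("date"):
        errors.append("missing date")
    return errors

records = [
    {"store": "A", "amount": 10.0, "date": "2023-05-01"},
    {"store": "Z", "amount": -5.0, "date": ""},  # fails all three rules
]

# Keep only records that pass every rule; log the rest for review.
valid = [r for r in records if not validate(r)]
```

Only data that survives checks like these should ever reach a dashboard.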
In the current state of digitalization of businesses, performing these processes manually is no longer a viable option. The global market already has tools such as Conciliac EDM that can perform these processes not only automatically but completely unattended through technologies such as RPA (Robotic Process Automation).
To illustrate this idea with an example, consider an apparel brand with multiple stores that generates sales and customer data and collects it from various sources: the sales of each physical store in one, the sales of its e-commerce site in another, customer data in a third, and, in yet another, the payments made through different digital means, both in-store and online.
Most likely, when it comes time to optimize the mix of payment options, the stored information is neither consolidated nor reconciled, and cannot yield accurate answers about which payment option each consumer prefers in each channel, or which of them are most convenient for the company. The conclusion, then, is that the stored data, although comprehensive, is not reliable enough to support an informed decision.
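At its core, reconciliation means matching records across sources and flagging what does not line up. The sketch below is a toy version of that idea, with hypothetical order IDs and amounts; dedicated tools do this at scale, automatically and unattended.

```python
# Consolidated sales from the stores and the e-commerce site (illustrative).
sales = [
    {"order_id": "1001", "amount": 50.0, "channel": "store"},
    {"order_id": "1002", "amount": 75.0, "channel": "ecommerce"},
    {"order_id": "1003", "amount": 20.0, "channel": "ecommerce"},
]

# Payments reported by the payment platforms (illustrative).
payments = [
    {"order_id": "1001", "amount": 50.0, "method": "card"},
    {"order_id": "1002", "amount": 70.0, "method": "wallet"},  # amount mismatch
]

def reconcile(sales, payments):
    """Match sales to payments by order_id; flag mismatches and unpaid sales."""
    paid = {p["order_id"]: p for p in payments}
    matched, mismatched, unpaid = [], [], []
    for s in sales:
        p = paid.get(s["order_id"])
        if p is None:
            unpaid.append(s)
        elif p["amount"] != s["amount"]:
            mismatched.append((s, p))
        else:
            matched.append((s, p))
    return matched, mismatched, unpaid

matched, mismatched, unpaid = reconcile(sales, payments)
```

Only the matched records are safe inputs for a decision; the mismatched and unpaid ones are precisely the unreliable data the example above warns about.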
The reality is that today, observing the market leaders, we can say that we are at an advanced stage of digital transformation, yet still at an incipient one when it comes to making data productive. More and more companies and organizations are taking up data management activities, but these are still largely limited to analytics that make the huge amounts of collected data displayable. If, as is increasingly the case, the entire strategy of our companies will be based on the interpretation of this data, perhaps it is time to focus on the processes that, like those mentioned above, can guarantee that what we see in our dashboards is real and reliable because it has been validated.