A few years ago, I found a great research article from Gartner, published in December 2015, stating that most big data projects fail for two reasons: the lack of an objective and miscommunication. The same article mentioned: “A combination of factors usually derails big data implementations. Problems and failures occur due to factors including strategy, people, culture, capacities, inattention to analytics details or the nuances of implemented tools, all exacerbated by the rapid advancement of the digital economy.”
It was also stated that “once strategy and skill priorities are addressed, then you can move on to big data analytics.” I agree: a well-defined strategy and a strong combination of technical and non-technical skills are essential to the success of any data or big data analytics project, regardless of the organization or industry.
It’s difficult to answer the question of why data projects fail. Drawing on my experience in data science, I would like to add something that is usually not addressed in the articles I read.
How Data Projects Differ From Other Types of Projects
In any project, an objective is set up with specific targets that define a scope. Boundaries about what is inside or outside of the scope, as well as a set of tasks and a timeline, are also defined. Resources (material, financial, and human) are commissioned and aligned, and then everything starts. With proper management, and if everything goes according to the original plan, the project concludes. There may be difficulties and problems along the way, but one way or another, the project reaches a conclusion.
One of the worst things that may happen is that the project gets derailed and shifts in a direction that was not originally planned, consuming time and resources with a slightly different objective. This is called ‘scope creep’ and it is one of the biggest issues for project managers.
Any data project could follow the same high-level definition process as any regular project. However, a data project is not like any other project, because the development is based on data. Data is a digital asset that can be stored in many different ‘containers’ (databases, flat files, websites, and so on) and in many ways using different technologies. In addition, the data may have been stored long before either the data team or the leadership team started working with it.
Because data is generated by many systems and people, the way it is stored depends on how each system and each professional understands that data should be stored. This can change the organization’s assumptions about what data it has. That is why it is critical to have proper data governance and a data catalog in place.
Understanding the Past to Predict the Future
There is another angle to this: Data has never been stored with today’s kind of analysis in mind. Organizations and companies have stored data because it is something that they produce. They have been making consolidated analyses in order to understand what happened in the past. So, the usual objective for storing data has been to understand ‘how my revenue was doing’ or ‘how my balance sheet is looking compared to the previous term.’ The bottom line is to understand ‘what happened in the past’ to make decisions for the future.
There are other reasons why we have been storing data, like compliance or because it is something that ‘belongs to the organization,’ for example. Organizational leaders use their human intuition and experience combined with those reports to figure out what should be done today, in order to anticipate a future scenario.
Nowadays, with the rise of big data and artificial intelligence, it is necessary, before starting any data project, to define an objective that goes beyond ‘what happened in the past.’ We need to figure out what can happen in the future based on historical events. Those events can be captured with the data that we have stored.
We are discovering that the work of using data to figure out what can happen in the future can be automated, and that the same automation can recommend actions in the present to anticipate future scenarios. So, we are mechanizing and automating human intuition with data and artificial intelligence. By doing so, organizational leaders can focus on high-level strategy and defining long-term targets.
The Big Challenge of Data Projects
Here is the big challenge: Data has never been stored and organized to answer ‘what can happen in the future.’ It takes a big effort to do what I like to call ‘listening first’ to your data, and only then start asking questions. Very often, data will disclose what can or can’t be done, regardless of your objective.
Unfortunately, we treat data as a ‘crystal ball’ that we can ask anything, expecting it to ‘speak’ and tell us what to do, but that’s not how it works. In fact, it’s the opposite. First, examine and explore what type of questions your data can answer, and then develop the questions. Very likely, data will redefine the main objective of your data project, so be prepared. You may uncover other objectives that you never thought about, and they may return more significant value than your original objective.
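To make ‘listening first’ a little more concrete, here is a minimal sketch in Python of what a first pass over a dataset might look like. It is only an illustration under my own assumptions: the sample data, the column names, and the profile_rows helper are all hypothetical, and a real project would likely rely on a dedicated profiling or data catalog tool. The idea is simply to count, for each column, how many values are missing and how many distinct values exist before deciding which questions the data can support.

```python
import csv
import io


def profile_rows(rows):
    """Summarize each column: total rows, missing values, distinct values.

    `rows` is any iterable of dicts, such as a csv.DictReader.
    This helper is a hypothetical illustration, not a standard API.
    """
    summary = {}
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            if column is None:
                # Extra unnamed fields in a malformed row are ignored here.
                continue
            stats = summary.setdefault(column, {"missing": 0, "distinct": set()})
            if value is None or value.strip() == "":
                stats["missing"] += 1
            else:
                stats["distinct"].add(value.strip())
    return {
        column: {
            "rows": total,
            "missing": stats["missing"],
            "distinct": len(stats["distinct"]),
        }
        for column, stats in summary.items()
    }


# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """customer_id,signup_date,churned
1,2019-01-05,no
2,,yes
3,2019-03-11,
4,2019-03-11,no
"""

profile = profile_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for column, stats in profile.items():
    print(column, stats)
```

In this made-up example, a quarter of the rows have no value in the churned column. If the original objective was to predict churn, that is exactly the kind of thing the data ‘tells’ you before you ask your question.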