“Is your data ready for the AI tsunami being unleashed in your organizations?”
This was an insightful question posed on a Zoom call to a team of business technology leaders preparing to move ERP applications from on-premises systems to the Microsoft Azure cloud. Microsoft’s AI-powered Copilot is a substantial part of this ERP implementation. The promise of sales, product, inventory, and detailed customer insights for this retail manufacturer to power their real-time supply chain is a major motivating factor for the move.
A little later in the call, another critical question came up: “How are we addressing the data migration’s ongoing data quality and governance requirements?”
Awkward pause.
AI doesn’t differentiate between good and bad input data; it works on logic. We all know the phrase “garbage in, garbage out” in reference to data. As business and technology leaders, we have mostly avoided dealing with that problem for years. With AI, it’s different. The machines automate and make decisions based on specific data sets, models, and inputs, often without human verification. Data integrity and quality are non-negotiable when it comes to AI.
The Starting Point: Addressing and Defining Data Quality
With AI as a driving force for investing in this company’s new system, we discussed the issues surrounding poor-quality, unreliable data that we needed to address. The Microsoft ecosystem partner who was on the team emphasized that AI requires a different level of scrutiny than a standard customer database or CRM cleanup.
This effort included critical sales data, product information, supplier pricing, and financial data that had to be delivered to the firm’s retailers and dealers. If this data is off, the company will lose not only margin but also its premium brand position with retailers. So, a screen share was initiated, and we started outlining what we would need to address.
Here are some important definitions before I dive into some of the key points:
- Inaccurate data: Erroneous data, out-of-date values, and data that is no longer valid. This includes output from AI models, which must be checked for bias and hallucinations.
- Inconsistent data: Mismatched data from different systems or datasets that isn’t aligned or synchronized, and duplicate data due to repetitive entries.
- Incomplete data: Truncated data with missing values; only partial data is available, altering intelligence and actions that are taken.
- Non-compliant data: Data that was collected without permission and is not sourced or verified; data whose use does not adhere to industry and government regulations or ethical standards.
- Unstructured data: Data that does not conform to a specified or expected format and does not adhere to pre-defined standards, making it difficult to analyze or interpret.
- Irrelevant data: Data that does not relate to the application or requirement, also known as “noise”; random information that distorts meaning and analysis.
- Damaged data: Data that has been tampered with or altered inappropriately, including security breaches that result in manipulation that impairs data integrity.
- Unverified data: Data that lacks a credible source and has not been checked or validated for accuracy and reliability.
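To make a few of these definitions concrete, here is a minimal sketch of how incomplete, duplicate, malformed, and unverified records might be flagged before a migration. Everything in it is an assumption for illustration: the field names (`sku`, `price`, `supplier`, `source`), the SKU format, and the sample records are hypothetical and are not taken from the company’s actual data or from any Microsoft tooling.

```python
import re
from collections import Counter

# Assumed conventions for this sketch; adjust to the real schema.
REQUIRED_FIELDS = ("sku", "price", "supplier")
SKU_PATTERN = re.compile(r"^[A-Z]{3}-\d{4}$")  # hypothetical SKU format


def is_blank(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def audit_records(records):
    """Map each issue type to the indexes of the records that show it."""
    issues = {"incomplete": [], "duplicate": [], "malformed": [], "unverified": []}
    sku_counts = Counter(
        r.get("sku") for r in records if not is_blank(r.get("sku"))
    )

    for index, record in enumerate(records):
        sku, price = record.get("sku"), record.get("price")

        # Incomplete: a required field is missing or empty.
        if any(is_blank(record.get(field)) for field in REQUIRED_FIELDS):
            issues["incomplete"].append(index)

        # Inconsistent (duplicate): the same SKU appears more than once.
        if not is_blank(sku) and sku_counts[sku] > 1:
            issues["duplicate"].append(index)

        # Unstructured (malformed): a present value breaks the expected format.
        bad_sku = not is_blank(sku) and not SKU_PATTERN.match(str(sku))
        bad_price = price is not None and not (
            isinstance(price, (int, float)) and price > 0
        )
        if bad_sku or bad_price:
            issues["malformed"].append(index)

        # Unverified: the record has no recorded source.
        if is_blank(record.get("source")):
            issues["unverified"].append(index)

    return issues


# Hypothetical sample records, one clean and the rest flawed.
sample = [
    {"sku": "ABC-1001", "price": 19.99, "supplier": "Acme", "source": "erp"},
    {"sku": "ABC-1001", "price": 21.50, "supplier": "Acme", "source": "erp"},
    {"sku": "abc_1002", "price": 5.0, "supplier": "Beta", "source": "crm"},
    {"sku": "XYZ-2001", "price": None, "supplier": "", "source": None},
]

report = audit_records(sample)
for issue, indexes in report.items():
    print(f"{issue}: {indexes}")
```

Checks like these only cover what can be tested mechanically. Inaccurate, irrelevant, non-compliant, or damaged data generally needs reference data, business rules, or human review to detect.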
Weighing Data Quality, Governance for Cloud ERP Migration
We also discussed options for cleaning, migrating, and ensuring data quality as we rolled out the new system inside the company and across its supply chain.
The brainstorming discussion included the following options for maximizing quality:
- Have IT and data teams use templates from Microsoft and other data providers. This amounts to using spreadsheets and macros, which is fine for smaller jobs but not for the sophistication and ongoing data quality processes the company requires.
- Look in the Microsoft partners ecosystem for a data migration, quality, and governance provider with proven AI and data experience. Investigate data services offered by the current ERP implementation partner.
- Identify a focused data provider that offers data cleansing, validation, and enrichment services. This could be for cleansing before ERP data migration and/or an ongoing service.
- Evaluate data quality and governance services offered by our current data warehouse/data lake/database provider.
The team is currently researching options to ensure the data is ready to take full advantage of AI with the new ERP systems and processes.
The Cost of Bad Data and Lack of Governance for AI Apps
Next came the discussion of “When should we address the data quality and readiness component?” and “What happens if we don’t address data integrity now as part of the ERP migration?”
A quick ChatGPT prompt told us bad data costs organizations an estimated $15 million per year on average, according to Gartner. For perspective, this is about what the company is investing in the new ERP system.
Calculating the exact cost of bad data is complex, as there are many ways in which low data quality can impact an organization. Our ERP migration team was most worried about three things: poor decision-making, given how many data points span our supply chain; damage to customer relationships that the brand had spent over 20 years building; and new inefficiencies and operational costs, the very things the team hoped to eliminate by moving to the new cloud-based ERP system.
Bad data was not evaluated just in terms of financial losses but also in missed opportunities. Below are some of the concerns identified in our planning conversations:
- Ineffective decision-making: Relying on incorrect sales, pricing, or shipping data will have a direct impact on what products are shipped where and how they’re shipped. Incorrect inventory data may result in overstocking or understocking of items. Inaccurate customer data can lead to failed deliveries or miscommunication, escalating operational costs.
- Increased operational costs: On this topic, the leaders were most concerned with bad or poor data causing delays, business and system downtime, and increasing workloads on our teams as they would need to stop and fix issues.
- Unpredictable financial forecasting: For a public company, revenue and margin predictability are a huge priority in reporting accurately to shareholders. Poor data quality would be a trust and financial killer, affecting not only the bottom line but also the ability to raise capital when needed.
- Customer and partner trust: Low data quality could result in poor customer experiences, reduced customer loyalty, and increased churn rates. Partners, whose businesses are heavily reliant on ours, could resort to selling other brands in response to data quality issues.
The group also discussed missed opportunities caused by missing or incomplete data, the tech and human costs associated with cleaning up data later, and the risk of regulatory and compliance penalties if data obligations are neglected.
It was eye-opening that everyone on that Zoom call had stories of how much a lack of data quality had previously cost their companies.
Editor’s note: Part 2 of this analysis will lay out specific ways to ensure data quality, integrity, and governance in an ERP migration.