As CIO, I ensure that business leaders have access to the information they need to make the important decisions that affect the company’s future. In its simplest form, this job entails providing reports that summarize various data points.
In the acceleration economy, the thirst for data has grown exponentially. The need has gone well beyond financial statements and sales reports and now includes extensive requests for complicated data analyses that require sophisticated tools including predictive modeling, machine learning, deep learning, and business process automation.
Additionally, many companies are realizing that data is one of their most valuable assets. Customer data can provide insights into existing, as well as potential, customers. Manufacturing data can optimize the supply chain, improve product quality, and reduce costs. Financial data provides critical information on revenue, expenses, debt, profitability, and growth. Not only is all this data critical to the growth and success of the company itself, but also, in many cases, it is valuable to other companies and as such can be monetized as an additional revenue stream.
Three Key Elements of Data Modernization Success
Unfortunately, most companies have amassed large quantities of data in an array of different, disconnected systems, many of which have been built on legacy platforms and technology. The prospect of attempting to pull all that data together to make good use of it can be quite daunting. Also, the various locations of the data — database servers, file systems, applications, spreadsheets, and even paper — make it difficult for people or tools to be granted access to it.
I recommend a three-pronged approach to effect successful data modernization.
Data inventory — Catalog the various data collections. Begin with questions including:
- What data do you have?
- Where does the data live?
- Who needs access to the data?
- How sensitive is the data?
- What is the value of the data?
Data migration — Here the focus is on moving the data to the cloud. Ask:
- What kind of results are you looking for?
- What tools will be needed to analyze or manipulate the data to get those results?
- What platforms or cloud providers are most compatible with the tools of choice?
Iterate — You don’t have to attempt to bring all the data together at once. Start by prioritizing the data based on perceived value. For instance, you might decide that analyzing customer data to optimize marketing efforts is the best place to start. A pilot project to pull together the necessary pieces and re-assemble them in a cloud-based data solution might provide a quick win, as well as valuable insight into future challenges when you tackle other data silos, such as manufacturing or financial.
The first two facets of data modernization — taking inventory and migrating to the cloud — can happen simultaneously, with different teams approaching the problem. One team could be focused on the technical challenges of choosing a cloud data ecosystem that best fits the company’s needs and the other on a deep dive into the data itself.
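The inventory questions and the value-based prioritization described above can be captured in a small catalog structure. What follows is a minimal sketch, not a prescribed tool: the `DataAsset` and `Sensitivity` names, the 1-to-5 value scale, and the sample entries are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity scale; adapt to your own classification policy."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


@dataclass
class DataAsset:
    """One catalog entry; each field answers one of the inventory questions."""
    name: str                 # What data do you have?
    location: str             # Where does the data live?
    consumers: list[str]      # Who needs access to the data?
    sensitivity: Sensitivity  # How sensitive is the data?
    perceived_value: int      # What is the value of the data? (assumed scale: 1 = low, 5 = high)


def prioritize(assets: list[DataAsset]) -> list[DataAsset]:
    """Order assets for migration, highest perceived value first."""
    return sorted(assets, key=lambda asset: asset.perceived_value, reverse=True)


# Illustrative entries only; a real inventory comes from interviewing data owners.
inventory = [
    DataAsset("production_logs", "plant file server", ["operations"], Sensitivity.INTERNAL, 3),
    DataAsset("customer_records", "CRM application", ["marketing", "sales"], Sensitivity.CONFIDENTIAL, 5),
    DataAsset("general_ledger", "legacy ERP database", ["finance"], Sensitivity.CONFIDENTIAL, 4),
]

# The top-ranked asset is a natural candidate for the pilot project.
pilot_candidate = prioritize(inventory)[0]
print(pilot_candidate.name)
```

Even a spreadsheet with these five columns would serve the same purpose. The point is that every data collection gets one row and an explicit value score before any migration work begins.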
Choosing a Cloud Service Provider
The big three cloud providers (Amazon Web Services, Microsoft Azure, and Google Cloud) all offer compelling platforms with data management, analytics, machine learning, application tools, and pay-as-you-go pricing. Ultimately, the best product will largely depend on your unique needs. At the small and medium-sized companies I have worked with, we have generally chosen one platform and stuck with it. For us, it didn’t make sense to learn how to manage multiple platforms, as each can be quite different in its approach. However, larger companies may benefit from spreading their solutions across multiple cloud providers, leveraging the best of each.
Final Thoughts
Here are a few more things to be mindful of as you embark on the data modernization journey:
- Data quality — When you start to bring together data from various siloed systems, one challenge will be to ensure that the data is trustworthy, and that there is a common understanding of its use and meaning. You don’t want data to get misused because terminology and keywords are used differently in the organization’s various parts. Be sure you’re not comparing apples to oranges.
- Data policies — Your company may or may not have policies defined regarding data storage and retention. Be sure that the data migration doesn’t violate any of these policies, keeping in mind that the policies may differ across the organization. Also, be aware of any government restrictions on the use or movement of data within or across state or national boundaries.
- Data security — Many disparate systems manage data security within the applications that create the data. When you migrate that data into cloud-based data facilities, you may lose the security that had previously been assumed. Make sure not to inadvertently expose sensitive data to view or analysis by unauthorized individuals or systems.
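Two of the concerns above, inconsistent terminology and accidental exposure of sensitive fields, can be addressed with a small preparation step that runs before records are loaded into a shared cloud store. The sketch below is illustrative only: the glossary entries, field names, and sample rows are hypothetical. Plain hashing is used just to show the idea and is not strong protection on its own, so a real migration would also rely on the chosen platform's access controls.

```python
import hashlib

# Hypothetical glossary: maps each source system's field name to one shared term.
GLOSSARY = {
    "cust_id": "customer_id",
    "client_no": "customer_id",
    "email_addr": "email",
}

# Hypothetical set of shared terms that should not be stored in the clear.
SENSITIVE_FIELDS = {"customer_id", "email"}


def harmonize(record: dict) -> dict:
    """Rename source-specific fields to the shared glossary term."""
    return {GLOSSARY.get(key, key): value for key, value in record.items()}


def pseudonymize(value: object) -> str:
    """Replace a value with a truncated SHA-256 digest so records can still be joined."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:16]


def prepare_for_migration(record: dict) -> dict:
    """Harmonize field names, then pseudonymize any sensitive fields."""
    clean = harmonize(record)
    return {
        key: pseudonymize(value) if key in SENSITIVE_FIELDS else value
        for key, value in clean.items()
    }


# Two silos describing the same customer under different field names.
crm_row = {"cust_id": 1042, "email_addr": "pat@example.com", "region": "EMEA"}
billing_row = {"client_no": 1042, "amount": 250.0}

crm_clean = prepare_for_migration(crm_row)
billing_clean = prepare_for_migration(billing_row)

# Both rows now share one field name and one pseudonymous key.
print(crm_clean["customer_id"] == billing_clean["customer_id"])
```

Because both silos are mapped to the same term before hashing, analysts can still join customer and billing data without seeing the raw identifiers.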
The Top 20 Data Modernization Providers
I touched on the cloud service provider selection process above, but I want to provide more details on specific products that facilitate data modernization. The table below lists the companies that Acceleration Economy analysts have identified as the best candidates to help you, along with each company’s key product or service and the main benefits it offers. These companies, of course, are the subject of ongoing analysis at the Acceleration Economy site.
Company | Key data modernization product / service | Why they are in the Top 20 |
Alibaba | PolarDB | PolarDB is a cloud-native relational database compatible with MySQL, PostgreSQL, and Oracle. PolarDB provides the performance and availability of traditional enterprise databases and the flexibility and cost-effectiveness of open-source databases. |
Amazon | Amazon Aurora | Amazon Aurora is an affordable and efficient option for running small and medium instances on cloud servers. Aurora offers a choice of additional features designed to make app testing and development easy and efficient. Tens of thousands of customers within many industries rely on Amazon Aurora. |
Cloudera | Cloudera Data Platform | Cloudera quickly became a leader in the big data market after it launched in 2008. They turned Hadoop into an enterprise data hub. |
Cockroach Labs | CockroachDB | CockroachDB is the world’s most evolved cloud SQL database, providing scale, resilience and low latency. Among their customers: Comcast, Lush, Nubank, Bose and Form3 Financial. |
Couchbase | Capella Cloud Database | Impressive roster of customer wins with the likes of Home Depot; primary focus on cloud for modernization; company is growing fast in a challenging economy. |
Databricks | Databricks Lakehouse | More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify data, analytics and AI. |
DataStax | Astra DB | Gives enterprises and developers the simplicity and cloud economics to deploy massive data that powers rich interactions through modern apps. DataStax and Apache Cassandra enable over 90 percent of the Fortune 100 to create transformational outcomes with data. |
Google | BigQuery | AI-powered easy apps. Easy to use, plus there’s a free tier to explore and choose the ideal services that you need. You can start with cloud computing without worrying about complex user interfaces, dashboards or confusing manuals. |
IBM | IBM Cloudant | IBM Cloudant is a distributed database that is optimized for handling heavy workloads that are typical of large, fast-growing web and mobile apps. Available as an SLA-backed, fully managed IBM Cloud service, Cloudant is also available as a downloadable, on-premises installation, and its API and replication protocol are compatible with an open source ecosystem and libraries for the most popular web and mobile development stacks. |
Informatica | Cloud Data Integration for Cloud ETL and ELT | Combines all your business data into a trusted, unified view with zero-code data integration tools; customers include Grant Thornton, British Telecom and Unilever. |
MariaDB | MariaDB | Open source database management system with a strong focus on security. Its cluster database architecture brings multi-master replication. Customers include Nokia, Samsung, and Red Hat. |
Microsoft | Microsoft Azure Synapse | Azure Synapse is an analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless or provisioned resources. Azure Synapse delivers a unified experience to ingest, prepare, manage, and serve data for BI and machine learning applications. |
MongoDB | MongoDB | MongoDB has cultivated a reputation as a versatile, flexible NoSQL database and is currently used as the backend data store of many high-profile businesses and organizations including Forbes, Facebook, Google, IBM, Twitter, and many more. |
Oracle | Oracle Exadata Database Service | Oracle Exadata is software and hardware engineered to support high-performance Oracle databases. These applications are delivered on an integrated development and deployment platform with additional tools to create and extend new services quickly. Users include Dell Technologies, Duracell and Mazda. |
Red Hat | OpenShift Database Access | Open hybrid cloud is Red Hat’s recommended strategy for architecting, developing, and operating a hybrid mix of applications, delivering a flexible cloud experience with the speed, stability, and scale required for digital business transformation. |
Redis | Redis Stack | Redis Stack can handle millions of read/write operations per second at sub-millisecond latencies, running on any cloud platform. Seamlessly and automatically scale data across hybrid, cloud regions, and multiple clouds. |
SAP | SAP BW/4HANA | SAP HANA Cloud offers seamless integration of existing solutions and hybrid systems, rapid deployment, simple development of native business applications, and consistent user experience. |
Snowflake | Snowflake | Snowflake has out-of-the-box features like separation of storage and compute, on-the-fly scalable compute, data sharing, data cloning, and third-party tool support. |
Teradata | Teradata Database | Teradata supports hybrid and multi-cloud capability using the same database across all deployment modes. Customers include Vodafone, Volvo, Groupon and American Airlines. |
VMware | VMware Private Cloud | VMware Private Cloud is well suited to enterprises considering moving away from on-premises infrastructure to a cloud environment. It lets customers allocate resources on-demand and have complete security controls. VMware is broadly considered the ‘go to’ partner for virtualization. |