When it comes to data, one fact has endured from the origin of mankind: it is inextricably linked to the decision-making process. The more data that we can include in our analysis, the more we can understand the past and navigate the future effectively.
Practices in the capture and storage of business data, often from diverse global sources, must evolve in response to the skyrocketing quantity of data that businesses produce and their need to act on it faster than ever. One research firm, Statista, forecasts that there will be 181 zettabytes of data by 2025, up from 97 zettabytes in 2022. A zettabyte is one billion terabytes. The chart below depicts this growth trajectory.
Data Proliferation
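To put those figures in perspective, here is a minimal arithmetic sketch. It uses the decimal (SI) definitions of zettabyte and terabyte and assumes the 97-zettabyte baseline refers to 2022, so the forecast spans three years. The function names are illustrative only.

```python
# Decimal (SI) unit sizes in bytes.
ZB_IN_BYTES = 10**21  # 1 zettabyte
TB_IN_BYTES = 10**12  # 1 terabyte


def zettabytes_to_terabytes(zb):
    """Convert zettabytes to terabytes using SI definitions."""
    return zb * ZB_IN_BYTES / TB_IN_BYTES


def compound_annual_growth(start, end, years):
    """Average yearly growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


tb_per_zb = zettabytes_to_terabytes(1)

# Assumption: the 97 ZB baseline is 2022 and the 181 ZB forecast is 2025.
growth = compound_annual_growth(97, 181, years=3)

print(f"1 ZB = {tb_per_zb:,.0f} TB")
print(f"Implied average annual growth: {growth:.1%}")
```

Under that assumption, the forecast implies data volumes growing by a little over 23% per year.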
Companies can only store, manage, and act on data at the required speed by modernizing their data infrastructure. To do so, they need to move past the legacy construct of monolithic systems, which store a single type of data in siloed fashion — with no movement of data between them. By modernizing such systems in the cloud, companies enable unification of data with robust new functionality and services that don’t exist in legacy systems.
Why Modernize Now
To understand the value of modernizing data in the cloud, it’s helpful to start with this baseline, data-oriented definition of the cloud: a vast network of remote servers hooked together and meant to operate as a single ecosystem. These servers store and manage data, run applications, or deliver content or a service such as streaming videos, web mail, office productivity software, or social media. Users can access files and data from any Internet-capable device, making information available anywhere, anytime.
Because of complexity, silos, and the demand for vast amounts (and sources) of data to be accessible at high velocity, modernizing data infrastructure grows more urgent every day. Moving data to the cloud is the most compelling option because it delivers at least three critical benefits:
- Reduces cost of ownership – This is achieved by avoiding investment in on-premises hardware and software, as well as the requisite real estate, maintenance, and security. Since the billing model is based on usage, traditional CapEx transitions to OpEx.
- Provides elasticity and an ecosystem of services – The virtual nature of the cloud means that storage capacity can easily scale up or down at the fast pace that’s required. In addition, the cloud not only provides storage; it also offers a plethora of services to handle any data-related work or project (examples from Google Cloud include object storage, a data warehouse, virtual machines, managed Kubernetes, and many more), avoiding the need for ongoing maintenance and software licensing.
- Supports collaboration across companies or departments – Because all infrastructure, applications and data can be centralized when you migrate to the cloud, collaboration beyond organizational borders is possible, contextualizing internal data and delivering solutions and insights at higher speed.
The cloud allows any organization to ingest, analyze and contextualize data at high speed. And we all know that fast decision-making and real-time actions are key to capitalizing on business opportunities in the Acceleration Economy.
In addition, the cloud requires little to no maintenance on the part of the customer, and it improves security and protection of data and systems, as well as data recovery after a threat or incident. This is especially important for highly regulated industries, which must retain large volumes of historical data and enforce compliance through business rules that apply to many systems and tools at once.
How to Make Modernization Happen
There is no magic recipe for transitioning from traditional or monolithic data systems to a cloud data system. The transition entails moving from a physical infrastructure designed as a reflection of a traditional, hierarchical organization toward something flatter, more horizontal, and more collaborative, with fewer boundaries and barriers.
However, there are some cloud data modernization recommendations that should hold true in virtually all industries and use cases:
- Prioritize educating people about modernizing data infrastructure – why you’re doing it and the benefits they will experience.
- Evaluate multiple cloud data providers to determine the best fit.
- Identify a small data stack to start as a proof of concept.
- Monitor performance of the migration and be ready to course-correct if needed.
- Identify gaps and issues; implement remediation measures that were created in the planning process.
- Repeat the process with bigger data stacks.
While the points above are ordered based on a logical sequence, the first point, relating to people, must be addressed at the outset. The reason is that moving to the cloud challenges the status quo (data ownership, silos, org structure) of many organizations. With cloud technology, we are moving from a practice of ‘data to report to decision’ to a more streamlined practice of ‘data to decision’; the implications of this new paradigm can be far-reaching.
So, when embarking upon modernization of the data stack, a company should start by educating (or re-educating) the entire workforce, starting from the top of the hierarchy, about being open and transparent, collaborating across teams (including clarity on which team generates and analyzes which types of data), delegating more decisions to others, and learning about new technologies and tools. Once the cultural element has been addressed, let engineers and technical people handle the technical aspects of cloud data modernization.
Once migration and modernization have happened, the tech team must stay in close contact with the cloud infrastructure vendor(s) and have a clear understanding of the responsibilities of each party. It is very important to actively monitor cloud performance, storage, and application usage, as well as vendor billing; the practice of managing cloud spend this way is known as FinOps. Close internal monitoring of billing, combined with good communication with the provider(s), facilitates solid operational results and keeps the cloud provider(s) fully engaged on your behalf.
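As a rough illustration of the billing side of that monitoring, the sketch below compares billed spend against a planned budget per service and flags overruns. It is a minimal example, not a FinOps tool: the service labels, dollar figures, and the 10% tolerance are all made up for illustration, and real billing data would come from your provider's own reports.

```python
from dataclasses import dataclass


@dataclass
class ServiceSpend:
    service: str   # hypothetical service label
    budget: float  # planned monthly spend
    actual: float  # billed monthly spend


def flag_overruns(items, tolerance=0.10):
    """Return names of services billed above budget by more than `tolerance`."""
    return [s.service for s in items if s.actual > s.budget * (1 + tolerance)]


# Made-up figures, for illustration only.
spend = [
    ServiceSpend("object-storage", budget=1000.0, actual=1050.0),
    ServiceSpend("data-warehouse", budget=4000.0, actual=5200.0),
    ServiceSpend("managed-kubernetes", budget=2500.0, actual=2400.0),
]

overruns = flag_overruns(spend)
print(overruns)
```

A tolerance band keeps small month-to-month fluctuations from triggering a review, while a large jump (the hypothetical data warehouse line here) becomes a prompt for a conversation with the provider.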
Technical Considerations
There are numerous vendors offering cloud solutions, but again, each and every organization is unique, with a different vision, strategy, and goals. It is easy to understand, therefore, why each vendor is more suitable for certain use cases, industries, and businesses, so a deep understanding of each vendor’s product offering is critical before adopting one solution over others.
An evaluation of vendor strengths — and alignment with your business goals and culture — must include:
- security architecture, technology, and practices
- definition of responsibilities in case of an incident
- billing system and how it operates
- how scalability and application usage operate from a financial perspective
- training and education on the company products and services
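One way to make such an evaluation comparable across vendors is a simple weighted scoring matrix. The sketch below is only an illustration: the criteria mirror the list above, but the weights, the 1-to-5 scores, and the vendor names are hypothetical and should be replaced with your own assessment.

```python
# Criteria follow the evaluation list; the weights are hypothetical and sum to 1.
WEIGHTS = {
    "security": 0.30,
    "incident_responsibility": 0.20,
    "billing": 0.20,
    "scaling_cost": 0.20,
    "training": 0.10,
}


def weighted_score(scores):
    """Combine per-criterion scores (1 to 5) into a single weighted score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)


# Made-up vendors and scores, for illustration only.
vendors = {
    "vendor_a": {"security": 4, "incident_responsibility": 3, "billing": 5,
                 "scaling_cost": 3, "training": 4},
    "vendor_b": {"security": 5, "incident_responsibility": 4, "billing": 3,
                 "scaling_cost": 4, "training": 2},
}

# Rank vendors from highest to lowest weighted score.
ranked = sorted(vendors, key=lambda v: weighted_score(vendors[v]), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(vendors[name]):.2f}")
```

The weights are where your business goals and culture enter the picture: an organization in a highly regulated industry might weight security and incident responsibilities more heavily than this example does.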
The Top 20 Data Modernization Providers
In the analysis above, I’ve focused on the why and how of data modernization in the cloud and shared important technical considerations.
There’s one more critical technology factor to consider, and that’s the vendor or partner you select to execute on your data modernization goals. In the table below, I’m presenting the companies — from my direct, hands-on experience and ongoing engagement — that are the best candidates to help you, and some key strengths they offer. These companies, of course, are the subject of ongoing analysis at the Acceleration Economy site.
Company | Key data modernization product / service | Why they are in the Top 20 |
Alibaba | PolarDB | PolarDB is a cloud-native relational database compatible with MySQL, PostgreSQL, and Oracle. PolarDB provides the performance and availability of traditional enterprise databases and the flexibility and cost-effectiveness of open-source databases. |
Amazon | Amazon Aurora | Amazon Aurora is an affordable and efficient option for running small and medium instances on cloud servers. Aurora offers a choice of additional features designed to make app testing and development easy and efficient. Tens of thousands of customers within many industries rely on Amazon Aurora. |
Cloudera | Cloudera Data Platform | Cloudera quickly became a leader in the big data market after it launched in 2008. They turned Hadoop into an enterprise data hub. |
Cockroach Labs | CockroachDB | CockroachDB is the world’s most evolved cloud SQL database, providing scale, resilience and low latency. Customers include Comcast, Lush, Nubank, Bose and Form3 Financial. |
Couchbase | Capella Cloud Database | Impressive roster of customer wins with the likes of Home Depot; primary focus on cloud for modernization; company is growing fast in a challenging economy. |
Databricks | Databricks Lakehouse | More than 5,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify data, analytics and AI. |
DataStax | Astra DB | Gives enterprises and developers the simplicity and cloud economics to deploy massive data that powers rich interactions through modern apps. DataStax and Apache Cassandra enable over 90 percent of the Fortune 100 to create transformational outcomes with data. |
Google | BigQuery | AI-powered easy apps. Easy to use, plus there’s a free tier to explore and choose the ideal services that you need. You can start with cloud computing without worrying about complex user interfaces, dashboards or confusing manuals. |
IBM | IBM Cloudant | IBM Cloudant is a distributed database that is optimized for handling heavy workloads that are typical of large, fast-growing web and mobile apps. Available as an SLA-backed, fully managed IBM Cloud service, Cloudant is also available as a downloadable, on-premises installation, and its API and replication protocol are compatible with an open source ecosystem and libraries for the most popular web and mobile development stacks. |
Informatica | Cloud Data Integration for Cloud ETL and ELT | Combines all your business data into a trusted, unified view with zero-code data integration tools; customers include Grant Thornton, British Telecom and Unilever. |
MariaDB | MariaDB | Open-source database management system with a strong focus on security. Its cluster database architecture brings multi-master replication. Customers include Nokia, Samsung, and Red Hat. |
Microsoft | Microsoft Azure Synapse | Azure Synapse is an analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either server-less or provisioned resources. Azure Synapse delivers a unified experience to ingest, prepare, manage, and serve data for BI and machine learning applications. |
MongoDB | MongoDB | MongoDB has cultivated a reputation as a versatile, flexible NoSQL database and is currently used as the backend data store of many high-profile businesses and organizations including Forbes, Facebook, Google, IBM, Twitter, and many more. |
Oracle | Oracle Exadata Database Service | Oracle Exadata is software and hardware engineered to support high-performance Oracle databases. These applications are delivered on an integrated development and deployment platform with additional tools to create and extend new services quickly. Users include Dell Technologies, Duracell and Mazda. |
Red Hat | OpenShift Database Access | Open hybrid cloud is Red Hat’s recommended strategy for architecting, developing, and operating a hybrid mix of applications, delivering a flexible cloud experience with the speed, stability, and scale required for digital business transformation. |
Redis | Redis Stack | Redis Stack can handle millions of read/write operations per second at sub-millisecond latencies, running on any cloud platform. Seamlessly and automatically scale data across hybrid, cloud regions, and multiple clouds. |
SAP | SAP BW/4HANA | SAP HANA Cloud offers seamless integration of existing solutions and hybrid systems, rapid deployment, simple development of native business applications, and consistent user experience. |
Snowflake | Snowflake | Snowflake has out-of-the box features like separation of storage and compute, on-the-fly scalable compute, data sharing, data cloning, and third party tools support. |
Teradata | Teradata Database | Teradata supports hybrid and multi-cloud capability using the same database across all deployment modes. Customers include Vodafone, Volvo, Groupon and American Airlines. |
VMware | VMware Private Cloud | VMware Private Cloud is well suited to enterprises considering moving away from on-premises infrastructure to a cloud environment. It lets customers allocate resources on-demand and have complete security controls. VMware is broadly considered the ‘go to’ partner for virtualization. |