
How to Implement a Data Lakehouse to Maximize ROI

By Wayne Sadin | April 20, 2023 | 5 Mins Read

So, here you are, faced with the fundamental question for a data engineer/data scientist: How do I provide the secure, available, scalable, flexible, accessible, reliable, and cost-effective data ingestion, storage, transformation, reporting, and analytic environment my organization needs to compete in today’s acceleration economy?

Wait . . . go back and read that again, slowly. Look at all those requirements! How the heck do you deliver on all those — sometimes conflicting — demands?

Way back when I was a data engineer, the answer was to license and implement (and patch and upgrade and support and train people to use) a plethora of specialized tools that collectively provided the needed features (except for “flexible” and “cost-effective,” in most cases). And, of course, most of those tools ran on-premises, requiring lots of additional work and cost.

Today, your solution might be to implement a software-as-a-service (SaaS) data lakehouse that combines most, if not all, of the above features. Data lakehouse products can be licensed as a stand-alone toolset, as is the case with Acceleration Economy’s Cloud Wars Top 10 vendor Snowflake’s product, or as part of an analytics toolset, like the products offered by Data Modernization Top 10 Shortlist vendor Qlik. Whether you license the data lakehouse separately from the analytics product mostly depends on the scale and complexity of your needs.

OK, full disclosure: This class of product has a number of stock-keeping units (SKUs) that can be licensed separately, depending on your needs, so you’ll spend some time working through your configuration. And many of the products have “application stores” that allow customers to license additional capabilities from affiliated vendors. Pay close attention to your needs versus your wants, or “cost-effective” can go out the window . . . but a SaaS data lakehouse suite is still far cleaner than any multi-vendor tool amalgam can be.

Which companies are the most important vendors in data? Check out the Acceleration Economy Data Modernization Top 10 Shortlist.

A Proper Data Lakehouse Implementation

Since a data lakehouse combines the features of a data lake with those of a data warehouse — with an analytics and reporting capability, perhaps — its use can dramatically speed decision-making by improving data access and analytics. A proper data lakehouse implementation depends on a set of thoughtful decisions (including, but not limited to):

  • Data Governance. What data are you collecting? How long do you need to store it? Who should have access, and what kind of access should they have? (This incorporates both role-based access controls and the classification of data into categories like “internal use only”; a minimal policy sketch follows this list.) Who can grant access to each data element, and what kind of audit trails are needed? How should data be described so people can find what they need (which gets into taxonomy and metadata)?
    One important part of data governance is data lineage (or data provenance), which means demonstrating (to auditors and perhaps regulators) where data originated and how it was copied and transformed into its end products (reports, dashboards, and so on); a lineage-record sketch also appears after this list. Some data must be pristine: it’s in scope for Sarbanes-Oxley Act (SOX) audits and for external financial reporting. But data quality comes at a cost — especially for “big data” — so not all data needs to be perfect (see “Data Engineering” below).
  • Data Security/Availability. This is your next decision. Start with encryption (in today’s world the answer is “Yes, encrypt” — don’t overthink this). Then layer in data access controls (to implement the governance decisions made above). If you do it right, you’ll find that this is where data security intersects with zero-trust principles. What level of redundancy is needed?
  • Data Engineering. Here you’ll face another set of decisions, closely related to the ones above. Data engineering is largely about cost and the trade-offs needed to balance cost against every other objective. FYI, users always want three things from an information technology (IT) system: that it be free, instant, and all-encompassing. What kind of performance is needed, for example, for real-time data ingestion in Internet of Things (IoT) applications, for low-latency analytics on trading floors and in industrial controls, or for archival storage used in historical comparisons? And what will the desired level of redundancy cost in license fees, bandwidth, latency (for dual-commit transactions), and FTEs (full-time equivalents)? A back-of-envelope cost sketch follows this list.
  • Tool Access. This used to be easy: IT had access to IT tools, and end users consumed the tools’ output. Then the pendulum swung — too far, in my opinion — and shadow IT flourished as users got access to powerful tools and huge datasets without necessarily being subject to, or even aware of, data governance and security controls. This tool/control mismatch created many problems for organizations as multiple sources of truth were created and maintained — with needless cost and inadequate security. As a CIO, I’ve spent years stamping out most shadow IT, but data lakehouse products finally allow IT to embed security and governance controls right in the lakehouse, making it easier to enforce important organizational standards and harder for users to inadvertently cause problems. Data lakehouse tools can also bridge the gap between “citizen developers” (users with cool tools) and “pro developers” (IT specialists with cooler tools), facilitating collaboration among groups that heretofore used different tools and had different controls. Effective tool deployment, governance, and training aren’t automatic: Data lakehouse tools should operate within the organization’s overarching data security and governance frameworks and be deployed following best practices that make it easier for all users to “do the right thing” with data and analytics (which means getting rid of spreadsheets almost everywhere).
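
To make the governance decisions concrete, here is a minimal sketch, in plain Python, of how classification labels and role-based access rules might be expressed and checked. Everything in it (the sensitivity levels, role names, and the “orders” dataset) is hypothetical; a real lakehouse platform would enforce these rules natively through its own grants, tags, and policies rather than in application code.

```python
from dataclasses import dataclass

# Hypothetical classification labels, ordered from least to most sensitive.
SENSITIVITY = ["public", "internal", "confidential", "regulated"]

@dataclass
class DatasetPolicy:
    name: str            # dataset or table name
    classification: str  # one of SENSITIVITY
    owner: str           # who may grant access (a governance decision)
    retention_days: int  # how long the data must be kept

# Each role is cleared up to a maximum sensitivity level (hypothetical roles).
ROLE_CLEARANCE = {
    "analyst": "internal",
    "finance": "regulated",
    "marketing": "public",
}

def can_read(role: str, policy: DatasetPolicy) -> bool:
    """Return True if the role's clearance covers the dataset's classification."""
    clearance = ROLE_CLEARANCE.get(role, "public")
    return SENSITIVITY.index(policy.classification) <= SENSITIVITY.index(clearance)

orders = DatasetPolicy("orders", "internal", owner="data_governance", retention_days=2555)

print(can_read("analyst", orders))    # True:  internal data, internal clearance
print(can_read("marketing", orders))  # False: internal data, public clearance only
```

The point of wiring classification into the access check, rather than into individual reports, is that every tool reading the lakehouse inherits the same rules, which is exactly the issue raised under “Tool Access” above.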
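For the lineage point, the essential idea is that every copy or transformation of a dataset records what it was derived from, so an auditor can walk a report back to its sources. Below is a minimal sketch in plain Python; the dataset names, job names, and fields are hypothetical, and most lakehouse platforms capture this metadata for you.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    output: str                 # dataset produced (e.g., a report table)
    inputs: list[str]           # datasets it was derived from
    transformation: str         # what was done (job name, SQL script, notebook)
    produced_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# A tiny provenance chain: raw ingest -> cleaned table -> finance report.
lineage = [
    LineageRecord("sales_raw", ["s3://landing/sales/2023-04/"], "nightly_ingest_job"),
    LineageRecord("sales_clean", ["sales_raw"], "dedupe_and_validate.sql"),
    LineageRecord("q1_revenue_report", ["sales_clean"], "finance_rollup.sql"),
]

def upstream(dataset: str, records: list[LineageRecord]) -> list[str]:
    """Walk the chain backwards to answer an auditor's 'where did this come from?'"""
    sources = []
    for rec in records:
        if rec.output == dataset:
            for inp in rec.inputs:
                sources.append(inp)
                sources.extend(upstream(inp, records))
    return sources

print(upstream("q1_revenue_report", lineage))
# ['sales_clean', 'sales_raw', 's3://landing/sales/2023-04/']
```

That traceback, from the report all the way to the raw landing files, is what a SOX auditor is really asking for.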
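Finally, for the data engineering bullet, redundancy decisions come down to arithmetic you can do before signing anything. The sketch below is a back-of-envelope model with entirely made-up rates and volumes; substitute your own vendor’s pricing.

```python
def redundancy_cost_per_month(
    tb_stored: float,
    storage_rate_per_tb: float,   # monthly storage price per TB (hypothetical)
    egress_rate_per_tb: float,    # cross-region replication transfer price per TB
    tb_changed_per_month: float,  # how much data gets replicated each month
    replicas: int = 1,            # number of additional copies/regions
) -> float:
    """Rough incremental monthly cost of keeping extra copies of the lakehouse."""
    extra_storage = tb_stored * storage_rate_per_tb * replicas
    replication_egress = tb_changed_per_month * egress_rate_per_tb * replicas
    return extra_storage + replication_egress

# Example with made-up numbers: 200 TB stored, 20 TB changing per month,
# one extra region at $23/TB-month storage and $90/TB replication transfer.
print(redundancy_cost_per_month(200, 23.0, 90.0, 20))  # 6400.0
```

The same exercise applies to the latency of dual-commit transactions and to the FTEs needed to run a second region: price the “desired” redundancy before you commit to it.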

Conclusion

Data lakehouse technology combines powerful tools with access to treasure troves of data. Proper implementation and use of a data lakehouse and its associated analytics tools empowers everyone from top executives to customer-facing employees to make decisions faster and more accurately than ever before. Making smart decisions when designing and implementing the data lakehouse is critical to maximizing return on the organization’s big investment in technology, and its even bigger investment in generating and acquiring data.


Want more insights into all things data? Visit the Data Modernization channel.



Wayne Sadin

CIO, PriceSmart
Cloud Wars Advisory Board Member

Areas of Expertise
  • Board Strategy
  • Cybersecurity
  • Digital Business

Wayne Sadin, a Cloud Wars analyst focused on board strategy, has had a 30-year IT career spanning logistics, financial services, energy, healthcare, manufacturing, direct-response marketing, construction, consulting, and technology. He’s been a CIO, CTO, and CDO, an advisor to CEOs and boards, an angel investor, and an independent director at firms ranging from start-ups to multinationals.
