Simon Nuss, VP of Data and Analytics, Hitachi Solutions, Canada, speaks with Jason Brandt, Managing Partner and Commercial Officer at Stagwell Technologies, in a video interview about data security, data platforms, the shifts in data warehousing, the different roles of data engineers and data scientists, and how data lakehouse platforms are changing industry dynamics.
Hitachi Solutions is a global cloud-services systems integrator. Stagwell Technologies builds digital platforms and products for platform and product companies.
Nuss begins by sharing his opinion on industries or verticals that are better or worse at capturing and securing data. He says that investment funds, SaaS product-driven organizations, and organizations with heavy infrastructure like nuclear power plants are usually very data-driven.
He notes that an organization with data-driven operations naturally creates a data culture.
When asked to shed light on data platforms, Nuss starts with Databricks, describing it as one of the most advanced end-to-end platforms on the market. He cites the ability to write in three different programming languages at once and have them execute efficiently under one query plan. He also highlights the ability to use a non-Databricks engine to query Databricks data.
Next, Nuss discusses data mesh, describing it as a decentralized approach to managing data and analytics within strict guardrails. It embraces Shadow IT (the use of IT systems, devices, software, applications, and services without IT department approval) and establishes guardrails that let business users work with their own tools, in their own way, to achieve their objectives.
Moving on to microservices, Nuss states that they are a way to deploy tools in a compartmentalized manner. He adds that this is done on the application side rather than the analytics side, and that data mesh draws heavily from microservice approaches.
Next, he discusses the three big shifts in data warehousing.
In the first generation, data was taken from a source system and, through ETL (extract, transform, load), loaded into an SQL server or data warehouse. This approach struggled to handle unstructured data such as images or videos.
In the second generation, a data lake was added that could store and process unstructured data.
The lakehouse is the third generation of data warehousing.
According to Nuss, a data lake is the end of the line: once the data is loaded, it cannot be copied or duplicated to another database. He states that the data lake is the real database and the data lakehouse is open storage for all the data, which means that one can use any tool to query the data without being locked in to a vendor.
Thereafter, Nuss distinguishes between the roles of data engineers and data scientists. While data engineers are process-oriented and platform-focused, data scientists are project-focused and have a business-facing role.
Furthermore, Nuss speaks about the evolution of Microsoft Fabric and describes it as the only end-to-end open lakehouse platform on the market.
In conclusion, he adds that the lakehouse is dominating the industry, and that data mesh, although still somewhat conceptual at the moment, will be a slow burn over the next decade. However, he affirms that data mesh has fantastic principles for data governance and advises people to be aware of what it is and to follow how the story unfolds.
CDO Magazine appreciates Simon Nuss for sharing his invaluable insights with our global community.