Opinion & Analysis
Written by: Julia Bardmesser | CEO of Data4Real, LLC
Updated 11:00 AM UTC, Mon July 10, 2023
In our previous two articles in this series, we discussed the emerging transformation that innovative businesses are undertaking right now. They are evolving from data-enabled to data-driven. We described what this means (hint: it’s more than just hiring more data scientists), as well as the business case and drivers for taking the journey. We further discussed new technologies and architecture patterns that enable the data-driven model.
However, becoming a data-driven business involves more than just introducing new data policies or technology. Putting data at the center of the business requires rethinking some basic ideas. Who owns data if it’s meant to be shared by everyone? Who is responsible for cleaning it? How will you keep data safe and secure? Data-driven organizations optimize the processes, roles, accountabilities and responsibilities for how data is created, stored, protected, shared and maintained. As a result, sharing data across the company, as well as with partners, suppliers and customers, becomes both safer and easier.
In this article, we:
Section 1. Defining Data Governance
Since this is an article about data governance written by a data professional, let’s start with a definition. What exactly is data governance?
Definition:
Data governance is a business discipline whose goal is to ensure that data across the enterprise is usable for any business purpose, current or future.
Figure 1. The components of a holistic data governance model.
Data management encompasses the technology platforms, operations processes and business stewardship of the key data disciplines depicted in the illustration above.
Data governance is embedded at the center of the data management discipline. It provides the oversight to ensure implementation and measurable success of data management practices. Data governance’s role is to orchestrate the pieces within the data management ecosystem. It provides policies, standards and business processes that tie together data management technology and operations with the business processes required to enable data to become useful.
In our first article, we identified the four stages of data maturity in organizations, from data-aware to data-driven. We summarize the stages again in Figure 2 below.
Figure 2. The four stages of maturity and readiness inside organizations.
As you can imagine, data governance did not exist as a discipline in “data-aware” enterprises. It first appeared as organizations started to progress toward the “data-informed” stage and mostly concentrated on data privacy and access. However, as organizations have moved into a “data-enabled” stage, they have acknowledged that when data is created as a byproduct of a business process, there needs to be a discipline that makes the data usable for other purposes, especially the ones that require cross-silo data. To accomplish that, data governance creates rules and processes to:
Section 2. Data Governance in Data-Enabled Enterprises
In most data-enabled organizations today, data is governed via a delicate interplay between three types of actors in a company:
This organizational model, while much more effective than the access-focused governance of yesterday’s data-informed organizations, is still subject to significant friction between the main actors. There are two main sources of friction. One is the misalignment of priorities between consumers and producers of data. The other is the lack of real enforcement power in the data governance team.
In the data-enabled, application-centric model:
Let’s look in detail at how these three groups interact across all four areas of data governance:
Section 3. How is Data Governance Different in Data-Driven Enterprises?
Let’s review the definition of the data-driven organization from the first article in our series, then extrapolate how data centricity would inform changes in data governance, structure and practices.
In a data-driven organization, data:
The adoption of data-centric architectures and new collaborative database technology, which we covered in our second article, enables a successful transformation to a data-driven model. The collaborative database is a common platform where (1) producers can save data for their own operational use; (2) consumers can find and use data for their own operational or analytic needs; and (3) governance can apply common enterprise standards and controls.
There are two major implications for data governance:
When data becomes the company’s main asset and product, data quality becomes a priority. The goal of ensuring quality changes how businesses prioritize and fund data, as well as build business and technology architectures. Data governance evolves from a separate, often audit-enforced process into a pervasive business and technology practice. Some current, common data governance practices become obsolete.
Let’s see how this shift in understanding, incentives and technology changes the data governance discipline:
• Producers and consumers utilize data from the same repository as everyone else. They are measured and compensated by the quality and usability of data, which is now one of the enterprise’s main products. Business executives, product managers and supporting technology teams are incentivized to make data useful and usable in ways beyond the immediate application of the producer business. Additionally, for the enterprise that has fully implemented the collaborative database architecture, data quality, data integrity, and data centricity are encoded via self-harmonization properties that autonomously discover and remediate data discrepancies.
In a data-driven organization, the clear line of demarcation between producers and consumers blurs. All functions produce and consume each other’s data, governed by policies around who is allowed to see and change what. Smart contracts, directly embedded in the data itself, enforce policies to prevent one function from accidentally viewing or overwriting data that is critical to another function. Furthermore, all data changes and lineages are natively preserved in the collaborative database. If you need to understand where data came from, or revert to an earlier snapshot, you can do so more easily.
• Governors are still accountable for and measured by how effectively data is leveraged in an organization — how good, available, and well-protected it is. They will continue to coordinate and work with producers and consumers to ensure that data is always available and of high quality. However, in a data-driven organization, they also play a more active and hands-on role for some data management functions, especially around the creation and enforcement of data access and retention policies. In essence, one of the major implications of data-driven transformation is the direct responsibility of the governors over some governance functions that used to live exclusively within the domain of the producers.
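To make the idea of natively preserved changes and lineage concrete, here is a minimal Python sketch. It is not how any particular collaborative database works; the `VersionedRecord` and `Change` names, the record key and the function names are all hypothetical, chosen only for illustration. The sketch assumes an append-only history in which a revert is recorded as one more change.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class Change:
    """One immutable entry in a record's history (hypothetical structure)."""
    version: int
    value: Any
    changed_by: str  # the business function that made the change


@dataclass
class VersionedRecord:
    """A record that never overwrites: every write appends to its history."""
    key: str
    history: List[Change] = field(default_factory=list)

    def write(self, value: Any, changed_by: str) -> int:
        version = len(self.history) + 1
        self.history.append(Change(version, value, changed_by))
        return version

    def as_of(self, version: int) -> Any:
        """Read the value as it stood at an earlier version."""
        if not 1 <= version <= len(self.history):
            raise ValueError(f"no such version: {version}")
        return self.history[version - 1].value

    def current(self) -> Any:
        return self.as_of(len(self.history))

    def lineage(self) -> List[str]:
        """Who changed this record, in order."""
        return [change.changed_by for change in self.history]

    def revert_to(self, version: int, changed_by: str) -> int:
        """A revert is itself a new change, so the history stays complete."""
        return self.write(self.as_of(version), changed_by)


# Hypothetical usage: two functions edit an address, then governance reverts it.
record = VersionedRecord("customer-42/address")
record.write("1 Main St", changed_by="sales")
record.write("9 Elm Ave", changed_by="billing")
record.revert_to(1, changed_by="governance")
```

Because nothing is ever overwritten, questions about provenance ("who changed this, and when did it last hold a given value?") reduce to reading the history rather than reconstructing it from application logs.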
Let’s give an example. Let’s assume that a country has just issued a new data privacy directive. No personally identifiable information about citizens can be exposed to any personnel outside of the country without legal and regulatory approval. A company’s Regulatory Compliance & Legal team has raised the issue, and the governors within the company (the data governance council) modify the company’s data privacy policy. The governors also issue an internal notification socializing the change to the policy, and set an enforcement date. They are accountable for ensuring that the company is compliant with the policy by the enforcement date.
How would this policy be implemented and enforced in a traditional, data-enabled organization? Usually, the task would fall on the producers to execute the change to make the policy real. First, the business leaders in the organization need to know which producers will be impacted by the policy change. Then, the producers will need to request the funding to make changes to their software in how they store or provide access to data about customers in the affected country, so that they can provide the necessary protective controls. They will need to prioritize the effort to implement the change on top of other business priorities. Finally, if originating sources of data about individuals in that country were copied to other operational or analytic data stores, the owners of those systems would also need to invest in changing the access controls in their respective systems to be compliant with the new policy. Those changes will go through the typical software development lifecycle, meaning that producers will modify, test and release code to production systems in line with their respective application lifecycle management processes. Consumers may scan some of their systems or reports to ensure that they flush any affected data based on the policy, but they generally depend on the producers to comply with the policy. Depending on the size and complexity of the organization (and how much personally identifiable information about affected citizens has diffused through the organization), it may take years for the company to be fully compliant.
Let’s contrast this with a data-driven organization that has adopted a data-centric architecture (with a collaborative database). Based on the principles of “defensible data,” governors can directly create a data access policy that limits access and embed it into blocks of data that contain personally identifiable information about individuals from the affected country. Producers and consumers can continue to use the collaborative database, and their applications will be restricted from accessing the data blocks if they do not meet the policy requirements. Governors not only set the enforcement date of the policy—they can directly set the enforcement of the policy itself!
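The contrast can be illustrated with a small Python sketch of a policy that travels with the data. This is an assumption-laden illustration, not the API of any real product: `AccessPolicy`, `Requester`, `DataBlock`, the country codes and the dates are all hypothetical. It models the example directive, under which personnel outside the country may read personal data only with regulatory approval, starting on an enforcement date set by the governors.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy a governor attaches directly to a block of data."""
    allowed_countries: frozenset  # where a reader must be located
    enforced_from: date           # enforcement date set by the governors


@dataclass(frozen=True)
class Requester:
    name: str
    country: str
    has_regulatory_approval: bool = False


@dataclass
class DataBlock:
    payload: dict
    policy: Optional[AccessPolicy] = None  # None means unrestricted

    def read(self, who: Requester, today: date) -> dict:
        """Every read passes through the policy embedded in the block."""
        policy = self.policy
        if policy is not None and today >= policy.enforced_from:
            in_country = who.country in policy.allowed_countries
            if not (in_country or who.has_regulatory_approval):
                raise PermissionError(f"{who.name} may not read this block")
        return self.payload


# Hypothetical usage: the governor embeds the policy once, on the data itself.
block = DataBlock({"name": "A. Citizen", "country": "FR"})
block.policy = AccessPolicy(frozenset({"FR"}), enforced_from=date(2024, 1, 1))

local_agent = Requester("local_agent", country="FR")
analyst_abroad = Requester("analyst_abroad", country="US")
approved_auditor = Requester("approved_auditor", country="US",
                             has_regulatory_approval=True)
```

In this sketch no producer or consumer application changes at all. The check lives in the block's `read` path, so setting `block.policy` is the whole rollout, and the enforcement date is simply a field the governor controls.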
With this example in mind, let’s now take a detailed look at how these three groups interact across all four areas of data governance:
Summary
In new data-centric models, data is treated as the main asset of the enterprise and the driver for its growth and sustainable development. With that shift in focus and incentives, and the corresponding shift in culture, the data governance/data management function evolves from being an overseer and enforcer of (often) unpopular rules into a business function directly accountable for the success of the various business lines. The concept of producers and consumers is blended into a model where business functions participate in a community. Everyone equally creates and consumes data in a virtuous loop. Each business function may still “own” its composable blocks of data. Data governance, however, oversees the policies that bind composable blocks and organize them into a cohesive chain where data can be discovered and consumed by other parties within and across functions.
Over the course of this series, we (1) described the business value for the transformation to a data-driven culture and model, (2) described the underlying technology architecture and innovations that make this possible, and (3) described the governance and management implications of managing data in a data-centric way.
So where do you begin? How do you finance a data-driven business transformation initiative? What skills and readiness must exist within your organization? How do you introduce new technologies in a world where you may already have hundreds of databases, mainframes, warehouses or spreadsheets housing data today?
These subjects will be covered in the fourth and final article in this series.
About the Author
Julia Bardmesser, in her most recent role, was Voya Financial’s Senior Vice President and Head of Data, Architecture, and CRM. She is a board advisor for technology startups The Medici Project and Polymer, and a Women Leaders in Data and AI (WLDA) founding member. Before joining Voya, Bardmesser was Global Head of Data Integration at Deutsche Bank.
With over 20 years of experience in cross-platform solution delivery in the financial industry, architecture, data management, and data governance, she is a much sought-after speaker and mentor. Bardmesser received the 2022 WLDA Changemaker in AI award; was named to CDO Magazine’s List of Global Data Power Women three years in a row (2020-2022); was named one of the Top 150 Business Transformation Leaders by Constellation Research in 2019; and was recognized as the Best Data Management Practitioner by A-Team Data Management Insight in 2017. She holds a Master of Arts in Economics from New York University.