Data Governance by Design: 11 Commandments for Architecting Future-Proof Transformations

Author Willem Koenders, Global Data Strategy Leader at ZS Associates, highlights the importance of data governance by design on the backdrop of the challenges of retrofitting data governance.

Data is an invaluable resource in today's digital world. Properly governed and utilized, it can guide business strategies, enhance operations, and spur innovation. Organizations have tried to implement data governance for their existing systems and processes, but faced with mounting costs and frustration, many simply give up.

Instead, they turn their focus to the future to ensure that at least any new system, process, or modernization will embed data governance by design, straight into the fibers of the data architecture. This article will outline how this can be done by enhancing your organization’s transformation lifecycle methodology with 11 data governance commandments.

The pain of retrofitting data governance

Most data governance challenges stem from systems and processes that were designed without explicit and consistent data management principles. The larger, older, and more complex the organization is, the larger the problem.

The result?

Issues with data findability, sharing, quality, security, and regulatory compliance that undermine trust and hinder the effective use of data.

Organizations have tried to retroactively implement data governance for legacy systems and processes. This approach involves extensive efforts to understand and untangle convoluted data flows, working with a scattered set of (sometimes no longer supported) technologies, and imposing governance standards on outdated or non-compliant systems. The engineers and consultants who were responsible for those systems have long since moved on, further complicating efforts.

I spent several years in the trenches of retroactive data governance programs, trying to track down elusive system owners and digging through outdated ETL scripts and technical metadata. A few years from now, generative AI might be able to analyze unstructured data such as email, meeting recordings, and other sources to unlock so-called “tribal” knowledge and insights.

For the moment, this remains highly manual and resource-intensive work, and the return on investment virtually never materializes, except in highly specific cases of regulatory compliance.

Embracing a future-oriented perspective

Impact-oriented data leaders have begun to focus on “designing in” data governance from the start of new transformation projects, marking a move from remediation to prevention, and from reactive to proactive.

Figure 1 - A critical window during transformation programs where simple data governance steps will prevent costly future issues.

During transformation initiatives, organizations assemble a team of experts comprising project managers, solution and data architects, business analysts, data engineers, infrastructure specialists, data scientists, and more. Together, they possess deep knowledge about the data landscape, its sources, quality, lineage, and the intended usage.

As such projects conclude and experts move onto new roles and contractors leave the organization, the related metadata grows stale rapidly. There is a short window to preserve this in-depth knowledge in data management tooling such as a business glossary, data dictionary, and data catalog, so that it can be updated and maintained over time.
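Capturing that knowledge can be as simple as recording a structured entry per data asset before the team disbands. The sketch below shows a minimal, hypothetical data dictionary entry; the field names are illustrative and not tied to any specific cataloging tool.

```python
# Hypothetical data dictionary entry, captured while the project team still
# holds the "tribal" knowledge. Field names are illustrative assumptions.
entry = {
    "dataset": "customer_segments",
    "description": "Customer segmentation produced by the omnichannel program",
    "source_system": "CRM export, enriched with third-party demographics",
    "owner": "marketing-analytics",
    "refresh_frequency": "monthly",
    "known_quality_issues": ["~2% of records lack a postal code"],
}

def is_complete(e: dict) -> bool:
    """Check that the entry documents the fields a catalog needs to stay useful."""
    required = ("dataset", "description", "source_system", "owner")
    return all(e.get(f) for f in required)
```

Even a lightweight record like this gives whoever maintains the catalog later a starting point that no amount of ETL-script archaeology can reconstruct.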

Embedding data governance into the transformation lifecycle

Migrating data to the cloud, building a new omnichannel capability, and using external data sources to create a new customer segmentation — there are myriad possible transformations where data governance should be considered. Avoid reinventing the wheel for each one.

Figure 2 - A common organizational transformation lifecycle.

The key?

Focus on the organization’s transformation lifecycle or methodology.

You don’t want to create an entirely separate transformation assessment and approach just for data governance – find whatever exists already, and work with that. If none exists (rare for larger organizations, but possible), then create one. Having a centrally supported transformation lifecycle ensures that any transformation of a certain impact or size is automatically brought into scope.

Over the last few years, I interviewed several transformation program managers at large technology companies about how they incorporate data governance into solution requirements and design. They indicated that they had to figure it out themselves. Long, legal-sounding standards were available, and opinions differed on how to implement them in practice. If only there were a checklist of sorts that they could use.

Figure 3 – The 11 data governance commandments for organizational transformation.

In Figure 3 above, 11 specific data governance commandments are mapped against the transformation lifecycle. For the purpose of this point of view, I’d like to highlight four of my personal favorites, because they drive the bulk of data governance by design and, when done right, actually accelerate the transformation:

1. Data products and assets: Data products are the ultimate accelerators as they are ideal, strategic locations to enforce data governance while driving maximum impact across use cases. At the very beginning of any transformation, there is an opportunity to confirm what existing products or assets can be used or created to drive impact, and to contribute to a wider, functioning enterprise data mesh.

2. Interoperability standards: Where data products and assets focus on where data is stored, interoperability standards streamline how data is exchanged. Typically, three to six dominant ‘patterns’ can be defined that specify, depending on the type, frequency, and intended usage of the data, which technologies and standards are to be used.

A powerful “trick” is to tweak data integrations so that “by design” they become discoverable and documentable, for example in an API catalog. Solution architects and program managers welcome this, provided it comes with a ‘menu’ of integration patterns and tools to choose from, as that accelerates development.
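Such a ‘menu’ can be encoded so that teams self-serve the standard pattern for their situation. The sketch below is one way this could look; the pattern names and the two selection criteria (frequency and intended usage) are illustrative assumptions, not an industry standard.

```python
# Hypothetical "menu" of sanctioned integration patterns, keyed by two
# data characteristics. Pattern names and criteria are illustrative.
PATTERNS = {
    ("batch", "analytical"): "File drop to data lake (e.g., Parquet on object storage)",
    ("batch", "operational"): "Scheduled ETL into the operational data store",
    ("real_time", "analytical"): "Event stream with a registered schema",
    ("real_time", "operational"): "Synchronous REST API, registered in the API catalog",
}

def recommend_pattern(frequency: str, usage: str) -> str:
    """Return the sanctioned integration pattern for a data exchange request."""
    try:
        return PATTERNS[(frequency, usage)]
    except KeyError:
        # Anything off-menu goes to an architecture review rather than ad hoc build
        raise ValueError(
            f"No standard pattern for ({frequency}, {usage}); escalate to architecture review"
        )
```

The point of the lookup is governance by default: every exchange that fits a pattern inherits its documentation and catalog registration, and anything off-menu is escalated rather than improvised.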

3. Data quality by design: During the design of the target-state solution architecture, someone with expertise in data quality and integrity can prevent an enormous amount of future pain by advising on how data quality can be integrated by design.

There are many ways this can be done, for example by locking down data entry fields, adopting reference data, and incorporating automated reconciliation checks in data pipelines. The more automated and the closer to the data, the better.
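As one concrete illustration of an automated reconciliation check, the sketch below compares row counts between a pipeline’s source and target within a tolerance; the function and its thresholds are assumptions for illustration, not a prescribed implementation.

```python
def reconcile_row_counts(source_count: int, target_count: int,
                         tolerance: float = 0.0) -> bool:
    """Return True if the target row count matches the source within tolerance.

    A pipeline would call this after each load and fail the run (or raise an
    alert) when it returns False, catching silent data loss close to the data.
    """
    if source_count == 0:
        return target_count == 0
    deviation = abs(source_count - target_count) / source_count
    return deviation <= tolerance
```

Wired into the pipeline itself, a check like this fires on every run, which is what “the more automated and the closer to the data, the better” means in practice.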

4. Minimally required metadata: A clear list of minimally required metadata drives efficiency within the transformation program while at the same time ensuring that the most impactful data governance is enabled. From experience, I recommend keeping this list as short as possible, focusing on the attributes that drive the highest impact across business (e.g., classification, ownership, business requirements), technical (e.g., table names, properties), and operational (e.g., access log, error log) metadata.
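A short required list can also be enforced at registration time rather than policed afterward. The sketch below shows one possible shape for such a record, rejecting assets that omit the required fields; the specific field names are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DataAssetMetadata:
    """Minimal metadata required before a data asset can be registered.

    Field names are illustrative; the point is that the required list is
    short and validated automatically, not negotiated per project.
    """
    # Business metadata
    name: str
    owner: str           # accountable domain owner
    classification: str  # e.g., "public", "internal", "confidential"
    # Technical metadata
    table_name: str
    # Operational metadata, populated by the platform over time
    tags: list = field(default_factory=list)

    def __post_init__(self):
        missing = [f for f in ("name", "owner", "classification", "table_name")
                   if not getattr(self, f)]
        if missing:
            raise ValueError(f"Missing required metadata: {missing}")
```

Keeping the validated list this short is deliberate: every extra mandatory field adds friction to the transformation program while adding little governance value.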

Figure 4.1 – Detailed data governance considerations mapped against a common transformation lifecycle.
Figure 4.2 – Detailed data governance considerations mapped against a common transformation lifecycle.

Managing change

Adopting the lifecycle will not happen overnight. The data governance considerations above might sound heavy and time-consuming. However, the key is to realize that for the most part, these are not new responsibilities – they are merely reinforced more consistently. For example, data quality requirements should always be incorporated into technical specifications, but this can be done much more quickly with help from a specialist.

Using an existing enterprise data model saves the time otherwise spent creating a new model from scratch. If upstream data is not of the required quality for the intended solution, having an identified domain owner can alleviate data quality concerns for the development team. And a lot of work that is now done “additionally,” for example, logical data modeling, can be offered “as a service” by a central data team.

Trust me, the members of your data team will be much happier when engaged to enable new projects, rather than solve the mess caused by previous ones.

Focusing on the transformation lifecycle not only prevents the occurrence of new data governance issues but also yields a higher ROI. Designing systems and processes with data governance in mind from the outset reduces the costs and complexities of retrofitting governance later. Moreover, when data governance is embedded in design, the business users can leverage high-quality, well-governed data as soon as the transformation is complete.

About the Author:

Willem Koenders is a Global Leader in Data Strategy at ZS Associates. He has 10 years of experience advising leading organizations on leveraging data to build and sustain a competitive advantage.

Koenders has served clients across Europe, Asia, the United States, and Latin America. He previously served as Deloitte's Data Strategy and Data Management Practice Lead for Spanish Latin America. Passionate about data-driven transformation, Koenders firmly believes in “data governance by design.”

CDO Magazine
www.cdomagazine.tech