Ashish Haruray

Top 10 Ways to Harness your DG Platform for Data Privacy

Confession: for many years, I subscribed to the theory that if you share data, you can’t protect its privacy. However, having had the opportunity to work closely on Data Privacy over the years, I now think the right answer is two infamous words: “It depends.” I know that’s not the reason you are reading this article, so I want to be very specific and point out that you certainly can have your data cake and eat it too.

One of the key expectations of the CDO organization is to democratize data and make it accessible to its consumers. At the same time, regulatory requirements, specifically those for data privacy, make it appear that data should be available only to those who are entitled to see it. A common question we field is whether these two concepts contradict each other. As it turns out, the two are quite complementary. I’ve spent years building Data Governance platforms, and I am sharing here the lessons learned along the way on how to leverage such a platform to build a strong Data Privacy framework.

Here are the top 10 ways I think your Data Governance Framework can help give you a jump start on your Data Privacy Compliance journey:

1. Cataloging of data


First things first: cataloging data is the basic activity almost every organization undertakes as one of its first Data Governance initiatives. Essentially, it entails creating an enterprise data landscape that helps you understand where your data is. The basic expectation is that all business terms are identified and clearly defined so that everyone speaks the same language. Similarly, for business metrics, the calculations and derivation logic should be clearly documented to drive consistency across the organization. On the technical side, you can take advantage of today’s sophisticated Data Catalogs to explore and discover your existing systems and to catalog your technical metadata. Finally, linking business metadata (stated terms) with technical metadata (current state) gives you a clear understanding of where your data lives.

2. Classification of data


Once the data is in the catalog, the next logical step is to classify it according to your organization’s data policy, which is subject to privacy and other regulations. With most advanced Data Catalogs today, you can leverage Machine Learning to auto-classify data if the rules are simple enough for a model to learn. Some of the more mature organizations take advantage of workflows (see item 8 below) to put structure around classifying data, with approvals from the appropriate authorities (compliance, security, etc.). Ensuring that you have a process for classifying data will score well with the Data Protection Office.
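To make the idea of rule-based classification concrete, here is a minimal Python sketch that tags columns by name pattern. The labels, patterns, and column names are hypothetical and do not reflect any particular catalog product; a real platform would typically combine rules like these with trained models and an approval step.

```python
import re

# Hypothetical classification rules: label -> pattern matched against column names.
RULES = {
    "PII": re.compile(r"(email|phone|ssn|birth|passport)", re.IGNORECASE),
    "FINANCIAL": re.compile(r"(iban|salary|premium|card_number)", re.IGNORECASE),
}


def classify_column(column_name: str) -> str:
    """Return the first matching label, or 'UNCLASSIFIED' to flag manual review."""
    for label, pattern in RULES.items():
        if pattern.search(column_name):
            return label
    return "UNCLASSIFIED"


def classify_table(columns: list[str]) -> dict[str, str]:
    """Classify every column of a table by name."""
    return {name: classify_column(name) for name in columns}


if __name__ == "__main__":
    print(classify_table(["customer_email", "annual_salary", "policy_id"]))
```

Anything left as UNCLASSIFIED is a natural candidate for the steward review and approval workflow described in item 8.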

3. Defining the Roles and Responsibilities for data


This is where you assign people or groups specific responsibilities for data, such as owner or steward. Admittedly, getting people to sign up for these responsibilities is perhaps the hardest task. It might be best to start by defining a RACI (Responsible, Accountable, Consulted, Informed) matrix for your organization. This requires a lot of effort and some creativity in finding incentives for those you are asking to take on the burden. The most common question to be prepared for: What’s in it for me?

4. Lineage and Impact of data

One key duty of any organization is to understand the origin of data and the journey it has made. Your Data Catalog should be able to provide this information easily. Not only is this critical for Data Transparency, but your Data Protection Office might also require that you are able to provide the source of the data.

Just as Lineage looks at where data is coming from, it is equally important to understand where changes will have an impact. Impact assessment can help you see what happens to your data landscape when you make changes to data. Thus, cataloging Report Metadata and linking it to Business and Technical Metadata makes it possible to see the full impact of the changes. You can score bonus points if you capture and catalog your data issues and also link them to other metadata, giving users transparency about the impact those issues have on your data products.
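Lineage and impact are two directions of travel over the same graph of linked metadata. The following Python sketch illustrates this under simple assumptions: the asset names and edges are invented for the example, and a real catalog would expose this through its own lineage features rather than a hand-built dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that consume it directly.
DOWNSTREAM = {
    "crm.customers": ["dwh.dim_customer"],
    "dwh.dim_customer": ["report.churn_dashboard", "report.renewals"],
    "billing.invoices": ["report.renewals"],
}


def upstream_map(downstream):
    """Invert the edge map so that sources can be traced from a consumer."""
    upstream = {}
    for source, targets in downstream.items():
        for target in targets:
            upstream.setdefault(target, []).append(source)
    return upstream


def reachable(start, edges):
    """Breadth-first walk returning every asset reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Impact: what is affected downstream if crm.customers changes?
impact = reachable("crm.customers", DOWNSTREAM)

# Lineage: which sources feed report.renewals?
lineage = reachable("report.renewals", upstream_map(DOWNSTREAM))
```

Walking the edges forward answers the impact question, while walking the inverted edges answers the lineage question, which is why linking report metadata to technical metadata pays off for both.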

5. Data Policy

The Data Landscape of a large organization can feel a bit like a wild forest. It’s hard to put that landscape under a central policy, but unfortunately, that’s exactly what the regulators want. Many organizations are beginning to realize the importance of having a central place to manage and govern Data Policies for effective control of their Governance processes. Your Data Policies should include Security, Regulatory, and Compliance Policies, each linked to the data it governs so that you can show that coverage. This makes your first, second, and third lines of defense feel comfortable about the effectiveness of governance. As mentioned later in this article (see item 8), automating the onboarding of new data policies, or changes to existing ones, via an Approval Workflow can further bolster the case for a well-coordinated Data Governance and Data Privacy Framework.

6. Cataloging and Documenting Business processes


While it may not be obvious, if you can document and catalog your business processes, you can satisfy some critical regulatory requirements such as GDPR’s Records of Processing Activities (ROPA). Additionally, you can define workflows for onboarding new processes to ensure the catalog of business processes stays up to date as your organization implements new ones. The benefits of documenting business processes go a long way, as you can ensure every new business process or project goes through the required Data Protection Impact Assessment (DPIA), which is another pillar of GDPR.

7. Understanding and documenting the usage of data

One of the most common questions from regulators is about who can access your data and what controls are in place to restrict access. One way to answer this question is to document the different types of users and datasets, then connect which datasets are accessible to which types of users, including additional dimensions of permission such as geography and data privacy level. This alone can make your Data Privacy Framework score very high in the eyes of regulators, as you can “show” the access control of data and not simply “tell”. The maintenance of users and access to data adds another layer of comfort for the Data Protection Office, as they can see the control points for themselves. Access Rights Management is a key component of most Information Security policies, and this capability will add another feather in your cap.
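As an illustration of documenting access along several dimensions, the Python sketch below derives a user-type-by-dataset access matrix from geography and privacy level. The privacy scale, user types, and dataset names are assumptions made up for the example, not a prescribed model.

```python
from dataclasses import dataclass

# Hypothetical privacy scale, ordered from least to most sensitive.
PRIVACY_LEVELS = ["public", "internal", "confidential", "restricted"]


@dataclass(frozen=True)
class Dataset:
    name: str
    region: str
    privacy_level: str


@dataclass(frozen=True)
class UserType:
    name: str
    regions: frozenset
    max_privacy_level: str


def can_access(user: UserType, dataset: Dataset) -> bool:
    """Access requires a permitted region and a sufficient privacy clearance."""
    within_region = dataset.region in user.regions
    within_level = (PRIVACY_LEVELS.index(dataset.privacy_level)
                    <= PRIVACY_LEVELS.index(user.max_privacy_level))
    return within_region and within_level


def access_matrix(users, datasets):
    """Build the documented view: (user type, dataset) -> allowed or not."""
    return {(u.name, d.name): can_access(u, d) for u in users for d in datasets}


analyst = UserType("analyst", frozenset({"EU"}), "internal")
privacy_officer = UserType("privacy_officer", frozenset({"EU", "US"}), "restricted")
eu_claims = Dataset("eu_claims", "EU", "confidential")
us_marketing = Dataset("us_marketing", "US", "internal")

matrix = access_matrix([analyst, privacy_officer], [eu_claims, us_marketing])
```

A matrix like this is the kind of artifact that lets you show, rather than tell, who can see what.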

8. Developing Workflows to automate processes


Many Data Governance Platforms can provide a workflow engine that can help automate your data activities such as onboarding data elements. Why is this important? Because it shows that your processes are structured, controlled, and most importantly, repeatable with known outcomes.

One of the most commonly used workflows is onboarding data elements through a controlled approval process. Another good example is a certification process for end-user dashboards and reports, which can then proudly wear the seal of approval from the appropriate groups who vouch for the completeness, accuracy, and validity of the data. This creates a sense of trust in the minds of users. Having trust in the data is the goal of every organization, and this can be accomplished through workflows.
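A controlled approval flow can be thought of as a small state machine with an audit trail. The Python sketch below is one possible shape for it; the state names, actions, and the OnboardingRequest class are hypothetical and are not taken from any specific workflow engine.

```python
# Hypothetical approval workflow: allowed actions per state and the resulting state.
TRANSITIONS = {
    "draft": {"submit": "pending_review"},
    "pending_review": {"approve": "approved", "reject": "draft"},
    "approved": {},
}


class OnboardingRequest:
    """Tracks one data element through the approval workflow."""

    def __init__(self, element_name):
        self.element_name = element_name
        self.state = "draft"
        self.history = []  # audit trail of (action, actor, resulting state)

    def apply(self, action, actor):
        """Perform an action if the current state allows it, else raise."""
        allowed = TRANSITIONS[self.state]
        if action not in allowed:
            raise ValueError(
                f"'{action}' is not allowed from state '{self.state}'"
            )
        self.state = allowed[action]
        self.history.append((action, actor, self.state))
        return self.state


if __name__ == "__main__":
    request = OnboardingRequest("customer_email")
    request.apply("submit", "data_steward")
    request.apply("approve", "compliance")
    print(request.state, request.history)
```

Because only listed transitions are possible and every step is recorded, the outcome is repeatable and can be evidenced afterwards.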

9. Define a Process for Managing Data Sharing Contracts


A strong case can be made for the comprehensiveness of your Data Privacy Framework if you document SLAs (Service Level Agreements) between Data Consumers and Data Producers. This shows that you have a good handle on data availability and accuracy. It also speaks to the maturity of your organization, as some of these are required capabilities to score high on Data Maturity Assessments. Besides, it helps with identification of, and communication between, the departments that are producing or consuming data.

10. Documenting and Linking items 1-9 for better traceability

Last, but not least, the ability to link all the above capabilities in one place as a one-stop-shop experience provides comprehensive, visually intuitive traceability diagrams between different data activities. Sophisticated Data Governance Platforms of today offer the ability to link all types of metadata so you can visually comprehend the impact in one place without the need to jump from application to application.

Having reviewed these points, it should be clear that the two concepts are not mutually exclusive and that they in fact complement each other. A Data Governance Framework bolsters a Data Protection Framework. Data Democratization should not hamper your Data Privacy initiatives. For those organizations that have embraced the concept of “Privacy by Design” and are actively thinking about privacy from the start of each initiative, a strong Data Governance foundation can help achieve the necessary balance.

Data Intelligence or Transparency speaks to the ability to better understand data. Similar to peeling back layers of an onion to reach the inside, Data Intelligence can help you answer questions like what the data means, where it comes from, who owns it, what the quality is, and what issues are affecting it. The ability to answer these questions is also helpful for your regulatory compliance to be able to show what policies govern data, what business processes use it, where the sensitive data lies, what the risks associated with the data are based on the classification of data, and more. 

In fact, GDPR requires organizations to be able to provide consumers transparency on how their personal data is being used and to delete personal data if asked to do so by the consumer. It’s impossible to honor these right-to-information requests without having a robust framework where all this information is accessible to the authorized users.

As you can see from the points made above, Data Transparency and Data Protection work hand in hand to complement each other rather than working against each other.

The points made above are based on real-life experiences from my long professional career. Creating a catalog of all types of metadata was very difficult, and maintaining it was even harder. The major challenges stemmed from the fact that, until fairly recently, there was no single product or platform that could operationalize all 10 points above. I have learned that technology can help tremendously if you can leverage sophisticated data intelligence platforms such as Collibra and Informatica. Not only can these platforms help organize and structure data transparency and data privacy programs, but they can also automate processes using their workflow capabilities. That said, these technologies cannot do things on their own, and a strong top-down approach with support from the C-suite is still needed to operationalize data management.

These capabilities, from cataloging data to visualizing the links between all types of metadata, have helped our organization support the data transparency efforts that complement data privacy programs.




Ashish Haruray, Sr. CoE Leader, Office of CDO, Data Intelligence & Analytics at AXA XL, is an influential and transformational data strategist with over 25 years of experience, including operationalizing data governance, business engagement through innovative strategy, and exceptional team leadership. An Industry thought leader in data governance and data privacy, he is a regular speaker at leading conferences, and he has authored articles on data management and data privacy for print and digital magazines. He has led data strategy and implementation for large organizations utilizing his deep expertise on senior executive buy-in for data projects.