The explosion of available datasets has sparked a new frontier in efficiency. Business leaders have dreamt of using data to better target customers, develop unique product features, drive investment intelligence, and efficiently price products – as well as adopting infonomics principles and practices for managing data as an actual asset.
Unfortunately, nearly all companies lack a regimented way to systematically discover, test, procure, and document their interactions around data. Knowledge and best practices are siloed across a mix of spreadsheets, homegrown tools, and employees’ memories. As a result, what could have been an efficiency boom ultimately becomes a net cost for the company.
Data Relationship Management (DRM) to the rescue
A new concept called ‘Data Relationship Management’ aims to solve these enterprise issues and offer a fresh product category to help ensure data value is being optimized. Data Relationship Management is the process of capturing the details of all the relationships around the usage of data to drive a firm’s continued growth in knowledge.
These relationships include those between a firm and all the data vendors it has interacted with, all the products they sell, all the derivative products created based on them, and the internal stakeholders and knowledge workers that touch the data. It also captures all the knowledge gained through each step in the data journey from sourcing the data to putting it into production.
A Data Relationship Manager collects and catalogs it all, and addresses a wealth of issues that arise from different groups and employees not talking to each other. Nomad Data is among the first vendors to provide a solution in what is becoming a critical area of infrastructure for many organizations.
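To make the relationships described above more concrete, here is a minimal sketch of how they might be recorded. It is an illustration only, not any vendor's actual schema: the class names (`DataProduct`, `Interaction`, `RelationshipLog`), the field names, and the sample vendor and dataset names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    vendor: str  # external vendor name, or "internal" for derivative products
    derived_from: list[str] = field(default_factory=list)  # upstream product names


@dataclass
class Interaction:
    product: str   # name of the DataProduct involved
    employee: str  # stakeholder or knowledge worker who touched the data
    stage: str     # step in the data journey, e.g. "sourcing", "testing", "production"
    note: str      # what was learned at this step


@dataclass
class RelationshipLog:
    products: dict[str, DataProduct] = field(default_factory=dict)
    interactions: list[Interaction] = field(default_factory=list)

    def add_product(self, product: DataProduct) -> None:
        self.products[product.name] = product

    def record(self, interaction: Interaction) -> None:
        # Refuse notes about products nobody has registered yet.
        if interaction.product not in self.products:
            raise KeyError(f"unknown product: {interaction.product}")
        self.interactions.append(interaction)

    def history(self, product_name: str) -> list[Interaction]:
        """Everything recorded about one product, in the order it was logged."""
        return [i for i in self.interactions if i.product == product_name]

    def people_who_touched(self, vendor: str) -> set[str]:
        """Employees who have worked with any product from the given vendor."""
        names = {p.name for p in self.products.values() if p.vendor == vendor}
        return {i.employee for i in self.interactions if i.product in names}


# Hypothetical sample data.
log = RelationshipLog()
log.add_product(DataProduct("foot_traffic_feed", vendor="ExampleVendor"))
log.add_product(
    DataProduct("store_visit_index", vendor="internal", derived_from=["foot_traffic_feed"])
)
log.record(Interaction("foot_traffic_feed", "alice", "testing", "coverage is thin before 2019"))
log.record(Interaction("foot_traffic_feed", "bob", "production", "dedupe on device id first"))

print(log.people_who_touched("ExampleVendor"))
```

Even a structure this small answers the question of who already knows a given vendor's data, which is the kind of lookup that a spreadsheet on one person's computer tends to make impossible.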
The lack of a software-driven approach to Data Relationship Management has led most companies to underperform in their use of data, even though getting value from data is typically the main goal of the Chief Data Officer (CDO). The software helps codify a process. Without a strong process, you see issues like double and triple spending on data, time wasted by employees repeating past efforts, huge setbacks from employee departures, and an inability to achieve any of the firm’s data-driven goals around driving sales, cost efficiencies, and intelligence gathering.
Searching for data relationships and use cases
The process of searching for data has become highly siloed and difficult for most non-expert users. A key feature of data relationship management is facilitating a more efficient process for discovering new data vendors. Lists of data vendors, akin to the Yellow Pages, are not a practical way for most users to discover data. It is often difficult to know specifics about a complex dataset from a description alone.
The data relationship solution provides a simple plain-language search. Users can be very descriptive about the information they are seeking. Users in the company’s central data team can then act on a search using their own knowledge, look through all previous searches for clues, or send the search into the solution provider’s network of data vendors. The simplified search flow means users can go from data-driven ideas to finding the right solutions in a matter of hours.
By providing access to past search details, data relationship management becomes essential to enable firms to grow their knowledge. People throughout an organization often want or need to see the specific data use cases others have pursued. By understanding the past searches along with the data vendors that were evaluated, it becomes much simpler for average employees to make the connection between their needs and the data that can help.
In practice, data requests are often very similar to one another. Being able to quickly find a past request that matches your own lets you skip both the search process and the evaluation process. Seeing past searches also helps spur the imagination around what is possible and provides company-wide visibility into the impact of data on the business.
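As a rough illustration of matching a new request against past ones, the sketch below scores word overlap (Jaccard similarity) between plain-language requests. This is a deliberately crude stand-in chosen for clarity, not a description of how any actual product searches; the function names, the threshold, and the sample requests are all hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased set of words; a crude stand-in for real text matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared words divided by total distinct words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_requests(
    query: str, past: list[str], threshold: float = 0.3
) -> list[tuple[float, str]]:
    """Past requests whose word overlap with the query meets the threshold, best first."""
    q = tokens(query)
    scored = [(jaccard(q, tokens(p)), p) for p in past]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


# Hypothetical past requests logged by colleagues.
past_requests = [
    "weekly foot traffic for US grocery stores",
    "credit card spend by merchant category",
    "satellite imagery of retail parking lots",
]

matches = similar_requests("foot traffic for grocery stores in the US", past_requests)
for score, request in matches:
    print(f"{score:.2f}  {request}")
```

A real system would likely need something more forgiving than exact word overlap, since two people rarely describe the same need in the same words, but the principle of checking past requests before starting a new search is the same.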
Data testing is another major pain point. Users at a company invest significant time in testing datasets. They go through the legal and compliance processes to get access to the data, investigate it, and design and execute a test or a series of tests to understand the efficacy of the data.
In many cases, information regarding the execution of tests and their results is stored in a restricted manner that precludes others from accessing it. In such a scenario, enterprises can end up rerunning the same data tests over and over again in different departments, and sometimes in the same department because no one is aware it has already happened.
Duplicate data relationships
Double and triple purchasing is another symptom of a lack of process. When employees lack a simple way to see what their firm is already purchasing, and how it is being used, they often purchase the data again. This repeats itself across departments and leads to more waste.
We also see a related phenomenon: there is some accounting of what is being purchased, but the firm still ends up buying 10 different datasets that serve a nearly identical function. The problem is further exacerbated at renewal time, when a central procurement team often has little to no visibility into who is actually using a dataset and for what purposes.
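Both kinds of waste described above are simple to surface once purchases are recorded in one place. The sketch below assumes a hypothetical `Purchase` record with a dataset name, the buying department, a short purpose tag, and an annual cost; none of these names or figures come from a real system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    dataset: str
    department: str
    purpose: str        # hypothetical tag describing what the data is used for
    annual_cost: float


def duplicate_purchases(purchases: list[Purchase]) -> dict[str, list[str]]:
    """Datasets that more than one department is paying for."""
    by_dataset = defaultdict(set)
    for p in purchases:
        by_dataset[p.dataset].add(p.department)
    return {d: sorted(depts) for d, depts in by_dataset.items() if len(depts) > 1}


def overlapping_by_purpose(purchases: list[Purchase]) -> dict[str, list[str]]:
    """Different datasets that were bought for the same stated purpose."""
    by_purpose = defaultdict(set)
    for p in purchases:
        by_purpose[p.purpose].add(p.dataset)
    return {u: sorted(ds) for u, ds in by_purpose.items() if len(ds) > 1}


# Hypothetical purchase records.
purchases = [
    Purchase("foot_traffic_feed", "marketing", "store visits", 50_000.0),
    Purchase("foot_traffic_feed", "strategy", "store visits", 50_000.0),
    Purchase("mobile_location_panel", "research", "store visits", 80_000.0),
    Purchase("card_spend_panel", "research", "consumer spend", 120_000.0),
]

print(duplicate_purchases(purchases))
print(overlapping_by_purpose(purchases))
```

The same records, grouped by department, would also give a procurement team the usage picture it needs at renewal time.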
Implementing a dataset in an analysis or a product also comes with a significant amount of learning, which is likewise being lost.
What are the best practices for cleaning the data?
What challenges did colleagues run into in the past?
How did they address them?
Which colleagues are most up to speed and who can offer advice to save time?
Which use cases for a dataset have been most successful?
Rarely is this knowledge even written down, let alone shared.
Beyond just a data catalog
One question that is often asked is how full-fledged data relationship management differs from a data catalog. A data catalog focuses on tracking the structure of datasets: columns, headers, and data types. It does this almost exclusively for internal data, for the benefit of highly technical data engineers.
Data relationship management is about capturing experiences and knowledge surrounding interactions with data, both internal and external, whether that data is stored in a spreadsheet, a research report, or a database. It covers both a vendor you’ve spoken to once and one whose data you’ve used in a core product for a decade.
A good Data Relationship Management solution connects all parties in a company around its datasets, both internal and external. It’s a place where employees of all personas who touch data can discover and quickly see a firm’s relationships around this increasingly important resource. Most importantly, it’s the central knowledge hub for the firm around all data assets.
As individuals unlock new use cases and learnings around a dataset, that knowledge benefits the entire firm. With the solution in place, it becomes much simpler to get new employees up the learning curve around data best practices and also reduce the negative impact of key employee departures. Most companies I have spoken with have a single spreadsheet on a single employee’s computer that catalogs all of their data relationships. Data relationship management removes these single points of failure from an organization’s path to success around data.
Data relationship management is poised to formalize the way organizations learn about data, making them smarter and more process-driven. The software provides the skeleton around which companies build their data processes. This will help usher in a wave of success around leveraging data as an asset, ultimately leading firms that adopt this approach to outcompete those that do not.
Doug Laney is the Data and Analytics Strategy Innovation Fellow at West Monroe. He consults with business, data, and analytics leaders on conceiving and implementing new data-driven value streams.
Laney originated the field of infonomics and authored the best-selling book, “Infonomics,” and the recent follow-up, “Data Juice: 101 Real-World Stories of How Organizations Are Squeezing Value From Available Data Assets.”
He is a three-time Gartner annual Thought Leadership Award recipient, a World Economic Forum advisor, and a Forbes contributing author. Laney co-chairs the annual MIT Chief Data Officer Symposium, is a visiting professor at the University of Illinois and Carnegie Mellon business schools, and sits on various high-tech company advisory boards.