Opinion & Analysis

Top 5 Data Management Priorities to Build Trust in AI

Written by: Priyanka Kharat | VP, Product and Engineering, ScienceLogic

Updated 11:58 AM EST, December 13, 2024

AI adoption continues to surge as self-learning algorithms, generative AI (GenAI), and other advanced processes for computer-driven learning, reasoning, and problem-solving grow more sophisticated. Yet as capabilities expand and AI use cases grow more complex and autonomous, how can we trust that AI results are reliable, explainable, and accurate? The answer lies at the granular level of how we structure and work with data management systems. Organizations must prioritize data management tactics to enhance accuracy and build trust in AI systems and outputs.

As AI capabilities grow, so do trust concerns

Organizations are rightly skeptical of what their AI systems tell them. Faulty inferences can produce inaccurate outputs and sub-standard decision support, and they can expose the business to reputational and security risks. Much of the problem stems from challenges around data quality.

Information gaps created by incomplete, inconsistent, biased, or outdated data sets can mislead AI systems and LLMs, resulting in “hallucinations” and other poor outcomes.

These issues are magnified as AI becomes more autonomous and scalable. As just one example, OpenAI creates 100 billion words every day with its GenAI systems. Ensuring the accuracy and quality of outputs at that scale is a tremendous challenge. Compounding the challenge, many AI systems searching for training data end up training on outputs previously created by other AI. This can lead to a cycle of deteriorating accuracy and output quality.

Addressing these challenges requires trust-building measures such as retrieval-augmented generation (RAG), data-protected private AI, prompt engineering, and other steps. These steps can enhance accuracy and increase business context so AI systems can adapt to evolving information and ensure that outputs are relevant and reliable enough for informed decision-making.

But CDOs and their teams are learning that standing up these measures depends on having the right approach to the underlying data management systems that support AI in the enterprise.

Top 5 data management priorities

AI systems feed on data, and lots of it, drawn from multiple databases across the enterprise and from external sources. This is why proper data management is so important for ensuring that AI does what it’s designed to do and that its outputs are accurate. The key to that trust lies in building the right architectures and processes at the granular, operational level of data management.

Against this backdrop, here are five key data management priorities to ensure AI is working properly and that it’s generating trustworthy results:

Ensure data availability: Making data truly available for AI requires more than just breaking down data silos and rendering all data accessible to AI systems. The data must also be contextualized for business relevance and protected with authentication protocols that allow for seamless and secure access by AI systems and analysts alike. Given the scale of most operations, these processes will most often need to be highly automated.
Maintain consistent, unbiased data standards: Since AI systems are pulling from multiple databases and sources, CDOs and their teams must ensure standards are clear and unchanging, regardless of where the data originated.

Key steps involve applying consistent data standards and metadata across databases and systems so that AI can work with common definitions and data attributes, regardless of where the data comes from or at what point in the development lifecycle it was collected.
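
To make the idea of common definitions concrete, here is a minimal Python sketch, not a production design. It assumes two hypothetical sources, "crm" and "billing", that name the same attributes differently; the field names, the mapping table, and the normalize function are all illustrative assumptions rather than anything from a specific product.

```python
from datetime import datetime, timezone

# Hypothetical common schema: every normalized record carries these fields.
COMMON_FIELDS = ("customer_id", "amount_usd", "recorded_at")

# Hypothetical per-source mappings from local field names to common names.
FIELD_MAPS = {
    "crm": {"cust_id": "customer_id", "amt": "amount_usd", "ts": "recorded_at"},
    "billing": {
        "CustomerID": "customer_id",
        "AmountUSD": "amount_usd",
        "Created": "recorded_at",
    },
}


def normalize(source: str, record: dict) -> dict:
    """Rename source-specific fields to the common schema and tag the origin."""
    mapping = FIELD_MAPS[source]
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    missing = [name for name in COMMON_FIELDS if name not in out]
    if missing:
        raise ValueError(f"{source} record is missing fields: {missing}")
    # Convert ISO 8601 timestamps to UTC so they compare across sources.
    # Note: a timestamp without an offset is interpreted as local time.
    out["recorded_at"] = datetime.fromisoformat(out["recorded_at"]).astimezone(timezone.utc)
    out["source"] = source  # keep provenance alongside the data
    return out
```

Calling normalize("crm", {...}) and normalize("billing", {...}) yields records with identical keys, so downstream AI pipelines see one definition of each attribute however the source system labeled it.
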
Support AI with real-time data: AI capabilities are typically expected to be highly proactive and responsive to changing operational conditions. Ensuring this algorithmic agility requires making data rapidly and reliably available in real time.

And this requirement for real-time goes beyond just availability. Whether processing is happening on-prem, in the cloud, or on the edge, there’s a need for continuous ingestion, transformation, and analysis of data from diverse sources in real time.
Monitor for the unexpected with unsupervised AI: The power of self-learning algorithms is a double-edged sword. Predictive and generative AI allow computers to act autonomously to address novel situations and independently come up with effective solutions. But this sharply limits the usefulness of traditional monitoring approaches that rely on upfront data labeling and other data modeling steps that are often manual.

Unsupervised AI can use algorithms that analyze unlabeled data, without human supervision, to derive emerging structures and patterns as autonomous AI processes unfold.
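
As a rough illustration of monitoring without labels, the Python sketch below flags unusual readings in a stream of unlabeled values using the median absolute deviation (a modified z-score). This is a deliberately simple statistical stand-in, not the unsupervised machine learning a real platform would use; the function name, the sample values, and the 3.5 threshold are assumptions made for the example.

```python
from statistics import median


def flag_outliers(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return the indexes of values whose modified z-score exceeds the threshold."""
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no measurable spread, so nothing can be scored
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Hypothetical latency readings in milliseconds; nothing here is labeled.
latencies = [120, 118, 123, 121, 119, 122, 480, 120]
suspects = flag_outliers(latencies)
```

No one told the function what "normal" looks like; it derives the baseline from the data itself, which is the property that matters when autonomous systems produce situations nobody labeled in advance.
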
Ensure traceability and “chain of logic”: Accurate monitoring of how and why AI systems are analyzing data and coming up with recommendations, particularly if they’re doing so autonomously, can help clarify and document the “chain of logic,” or traceability into the provenance of AI outputs.

This builds trust around new AI use cases, helps ensure accuracy and quality, and supports regulatory compliance and reporting around business processes and data usage.
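
One way to picture a traceable output is as a log entry that ties each result to the model version and a fingerprint of the inputs behind it. The Python sketch below is a minimal illustration under that assumption; TraceRecord, digest, record_output, and the field names are hypothetical and do not describe any particular vendor's format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class TraceRecord:
    """One hypothetical audit entry linking an AI output to its inputs."""
    model_version: str
    input_sources: list
    input_digest: str
    output: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def digest(payload: dict) -> str:
    """Hash a canonical JSON form so identical inputs give identical digests."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_output(model_version: str, inputs: dict, output: str) -> str:
    """Build one JSON log line describing where an output came from."""
    rec = TraceRecord(
        model_version=model_version,
        input_sources=sorted(inputs),  # names of the data sources consulted
        input_digest=digest(inputs),
        output=output,
    )
    return json.dumps(asdict(rec))
```

Because the digest is computed over a canonical form of the inputs, an auditor who later has the same source data can recompute it and confirm that the logged recommendation was based on exactly that data.
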

Conclusion

High-quality data is table stakes for building trust in AI, and CDOs typically contend with disparate and incomplete data sources as the chief obstacles to such quality. While every organization will need to tailor its approach to the unique needs of the business and the use case at hand, the most successful data modernization efforts tend to include the five priorities above to drive stronger outcomes and more value from AI investments.

About the Author:

As VP, Product and Engineering, Priyanka Kharat leads ScienceLogic’s Product Experience Group responsible for delivering the next generation of intuitive insights that will bring a powerful, single-pane-of-glass experience to transform complex IT operations for the company’s growing customer base. Building service-contextualized AI/ML-driven workflows has been at the forefront of her work.

In earlier roles as CIO and VP, Enterprise Solutions at NovaSignal, a medical device company, and at Parkland Health System, Kharat bootstrapped and commercialized cloud-native predictive systems to save lives and taxpayers’ money.

Earlier in her career, at Qualcomm and Intel, Kharat led global teams to deliver full-stack solutions for the automotive and consumer products business groups. She has progressive experience in understanding customers’ journeys and building solutions that address multi-fold business problems while improving efficiency and quality of life of everyone associated with the endeavors she leads.