AI News Bureau

How to Measure AI Success — An Actionable Guide from a Fortune 500 Responsible AI Leader

Written by: CDO Magazine

Updated 1:58 PM UTC, March 11, 2026

As enterprises pour money into generative and agentic AI, leadership conversations are shifting from “Can we do this?” to “What should we expect and what could go wrong?” Fortune 500 data governance and responsible AI leader Patrick McQuillan argues that the most productive executive discussions start with discipline: anchoring AI to customer outcomes, defining “good” before deployment, and building the governance and infrastructure that make success repeatable.

In the final part of this three-part interview series, McQuillan continues the conversation with Peter Geovanes of Juris Tech and lays out how he guides executive teams on setting expectations, managing skepticism, future-proofing investments, and avoiding the mistakes that cause early wins to erode over time.

In Part 1, McQuillan discussed how governance maturity is uneven, often treated as optimization until regulation demands rigor, and is expanding from privacy into metadata, cataloging, and discoverability.

In Part 2, he presented how AI governance and guardrails cut waste, speed delivery, reduce failure rates, and sustain ROI.

Start where value starts: The customer, not the scoreboard

McQuillan says the core leadership task is positioning AI against real problems and measuring success in a way that connects to the business rather than internal optics. He urges executives not to confuse stakeholder pressure with revenue reality.

That is why he pushes a basic, uncomfortable question early: “We have to remind ourselves, do we need this, and why are we doing it?”

For McQuillan, AI expectations become real only when teams commit to measurable outcomes and acceptable risk thresholds in advance, before reporting begins and before narratives harden.

“We need to ensure the right guardrails are in place so that we know what good looks like,” he says. Defining success upfront keeps the organization from retroactively justifying an investment with selective metrics.

McQuillan encourages teams to decide upfront what they will tolerate and what they will not: “What is the outcome that we want to see? What is the revenue number? What is the level of toxicity or hallucination margin of error that we are comfortable with? What is the scalability plan?”

He also flags security and adversarial pressure as part of that baseline definition. The goal, McQuillan says, is operational reliability and the ability to measure outcomes in business terms: “Making sure that there is stability and predictability in this project once it rolls out and that we can measure those outcomes back to the business from a risk, performance, and scalable and sustainable architecture standpoint.”

Move away from “sticky” short-termism

McQuillan’s advice is to stop treating pilots as disposable experiments and start planning as if they will live long enough to create consequences. Even when it’s a proof of concept, he urges teams to think like operators. “Even if it’s a POC, let’s assume it’s going to live through a year. What’s going to happen in six months and nine months, and in 12 months?” he says.

He wants leaders to connect it with financial accountability: “Let’s pull it back and make sure this is hitting our P&L where we need to.”

Questioning standards, measurement, and “secret” answers

When asked what a doubting executive should be skeptical of, McQuillan states: “For emerging technologies like agentic AI, there’s no industry-standard way to measure its performance or its risk, yet there are a lot of recommended ways.”

That gap creates a market for overconfident certainty. “A lot of people will come knocking on your door and say, ‘I have the secret,’” he warns, adding, “You’ll notice, across the board, the answers are inconsistent.”

McQuillan contrasts this with more stabilized toolsets: “We’ll see with a lot of our machine learning or our generative tools, which are starting to become a little more stabilized.”

But for the cutting edge, he advocates transparency about unknowns and deliberate deployment choices. He also urges restraint in promotion until use cases are proven. “We have to be smart about how heavily we’re promoting this until we have a proven use case,” he says.

McQuillan is comfortable acknowledging uncertainty. “I’m very comfortable saying we don’t have all the answers, but we know where to go, and we know how to get there,” he says.

Future-proofing AI spending

Moving forward, McQuillan says “future proofing” is less about predicting model trends and more about building infrastructure and talent plans that prevent reactive chaos when pilots succeed. “Ultimately, it comes down to infrastructure.”

McQuillan argues many companies “overfocus on POCs” and then get blindsided by the operational demands of scaling. He points to recent history as a cautionary pattern. “We saw a lot of this happen from 2020 to 2022, where there was the big surge of data science hiring and then a massive reduction in force we’re seeing across the business,” he says, adding, “It affected a lot of folks’ bottom lines, and a lot of those models aren’t really in play anymore.”

His prescription is a medium-to-long-term investment approach that includes POCs but doesn’t stop there. “If you want long-term return for your value, you need to have a medium to long-term investment plan,” he says.

That means incremental, proactive changes to the underlying system: “Small but meaningful adjustments to your underlying data pipeline and architecture,” and “small and meaningful adjustments to your AI R&D teams, and doing it preemptively.”

McQuillan frames this as both insulation and credibility, proof to stakeholders that AI is becoming part of the company’s operating makeup. “We’ve begun to shift where our talent is placed in the business, the type of talent we’re bringing in, and the underlying infrastructure to support those data and AI pipelines to make this more of a fixture in our business.”

Plan for unknowns, avoid short-term traps, and audit your reporting

In his final guidance, McQuillan urges leaders to plan explicitly for uncertainty rather than assuming smooth adoption. “Understand that there are always going to be unknowns, and map those out and have contingencies for what to do if something doesn’t work out,” he says.

His philosophy is preparedness without pessimism: “I always expect the worst. I’m an optimist, but I expect to be prepared for the worst possible scenario.”

He reiterates, “Make sure that you don’t over-index on short-term goals with short-term gains and short-term wins,” warning, “You are going to get short-term results and short-term value that will be eroded by longer-term patterns and investment.”

Instead, McQuillan recommends planning horizons that match the maturity and nuance of what’s being deployed. His final caution is about measurement integrity, especially when the builders of a system also control the story of its performance. “I have seen use cases fail where the team developing the tool is also accountable for the reporting,” he stresses.

McQuillan argues for independent checks and deeper scrutiny to validate results. In his view, that kind of rigor (consistent evaluation methods, comparable assumptions, and objective verification) is what keeps AI programs stable over time, even amid uncertainty, risk, and rapid innovation.

CDO Magazine thanks Patrick McQuillan for sharing his insights.
