Key Principles
Nov 10, 2025
12 minutes
How to Turn AI Dabbling into Measurable Results with the AI for ROI Framework
Join the 5% of successful GenAI pilots with these 3 phases
TL;DR
A new MIT Report finds that 95% of enterprise GenAI (Generative AI) pilots show no measurable financial impact, while just 5% deliver value at scale.
The issue isn’t the model — it’s how companies choose use cases, integrate them into workflows, and scale what works.
GrowthUP’s AI for ROI™ Framework, with 3 phases (Align → Integrate → Scale), gives you a step-by-step path that bridges the pilot-to-production gap and delivers measurable ROI (Return on Investment).
On August 18, 2025, Fortune ran the headline: “MIT report: 95% of generative AI pilots at companies are failing.” If that makes you pause, good. The Financial Times added further market sentiment, noting that the new MIT report findings spooked investors and quoting the core finding: “Just 5 percent of integrated AI pilots are extracting millions in value, while the vast majority remain stuck with no measurable [P&L] impact.”
Primary Report Finding: "Adoption is high, but transformation is rare. Only 5% of enterprises have AI tools integrated in workflows at scale and 7 of 9 sectors show no real structural change." — MIT NANDA Report, The GenAI Divide: State of AI in Business 2025 (July/August 2025).

What does the MIT report really mean for business leaders?
First, definitions. Enterprises have used Artificial Intelligence (AI) for decades, since 1956 in fact. Think "IF–THEN" rules, forecasting models, and traditional machine learning that classifies, predicts, and optimizes data. Generative AI (GenAI) is different: it creates new content (text, images, code, audio) rather than just analyzing existing data. In the mainstream, GenAI “arrived” when OpenAI publicly released ChatGPT on November 30, 2022.
When most people say "AI" now, they mean both traditional AI and Generative AI collectively.
So after nearly three years of GenAI pilots at the enterprise level, MIT's report, which combines a review of 300+ initiatives, structured interviews with leaders, and survey data, makes one finding very clear: GenAI itself is not the problem. The problem is that companies treat AI as a sideshow, something built in innovation labs as a “demo” or proof of concept, rather than embedding it into the messy, high-friction workflows that actually drive business value. These siloed efforts, also called "showcase pilots," look great in boardroom presentations but then get instantly parked because they are not embedded in the live workflow and have no baseline metrics, no SOPs (Standard Operating Procedures), and no real accountable owner.
"Showcase pilots" can’t move a P&L line because they never touch the real work.
From studying over 300 initiatives, “just 5% of integrated AI pilots are extracting millions in value… the vast majority remain stuck with no measurable [P&L] impact.” — MIT NANDA, 2025 (FT)
There’s another wrinkle: widespread AI adoption does not equal business impact. McKinsey’s most recent survey shows 71% of respondents say their organizations regularly use GenAI in at least one business function, up from 65% in early 2024 (McKinsey 2025 PDF, p.16). That’s real momentum—but as McKinsey also notes, more than 80% still don’t see material enterprise-level EBIT impact yet (same report, pp. 21–22). In other words, usage is up; value is still concentrated among teams that embed AI into everyday processes.
Meanwhile, the Wharton–GBK 2025 enterprise study shows a countervailing trend. ROI tracking has become standard practice and roughly three in four organizations already report positive ROI, with four in five expecting positive returns within two to three years. This is where momentum is headed: from novelty to accountability.
Wharton underscores that daily use is up sharply across industries, with repeatable office tasks like data analysis, doc summarization, and editing rated highly by users. That pattern matters because these are exactly the areas where baselines and changes are easiest to measure.
And the stakes keep rising. Gartner forecasts $644 billion in worldwide GenAI spending in 2025, which means capital will chase the teams that can prove outcomes, not pilot slides.
Why do most GenAI pilots not move the needle?
McKinsey’s latest State of AI survey shows 71% of organizations report regular GenAI use, up from 65% in early 2024. That is further proof that usage does not equal impact: while usage is skyrocketing, if half your employees are just playing around with AI tools, with nothing tied to cost savings, revenue lift, or efficiency metrics, your CFO will still see “zero ROI.”
If you’ve ever watched a promising proof-of-concept fade after the demo, you already know the pattern. A small group builds something clever, but frontline teams don’t adopt it because it’s not in the tools they live in (CRM, help desk, suites); legal freezes it due to governance gaps; or leaders lose interest because there’s no clean before/after baseline to prove results. The lesson from MIT (and from real programs that work) is simple: start where friction is highest, define success in numbers, and embed the change.
That’s exactly why we built the GrowthUP AI for ROI™ Framework—a practical way to do the one thing MIT says separates the 5% from the rest: integration into real workflows with measurable outcomes.
The fix: GrowthUP’s AI for ROI™ Framework
We designed the AI for ROI™ Framework to make measurement the default from day one. It has three phases that any function can run in 30 to 60 days.
Phase 1: Align
Clarify goals and governance, map the current process step by step, and pick one workflow that touches revenue, cost, or risk. Capture the baseline for cycle time per unit, error rate, volume, and review effort in minutes. Create a SMART target like “reduce synthesis time 40 percent while holding error rate under 1 percent.” Set guardrails and approval paths so your pilot is safe by design from the beginning.
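To make the baseline and target concrete, here is a minimal sketch in Python of how a team might record those metrics and check a pilot result against the SMART target above. The field names and numbers are invented for illustration; substitute your own workflow's metrics.

```python
from dataclasses import dataclass

@dataclass
class WorkflowBaseline:
    """Pre-pilot measurements for one workflow (illustrative fields)."""
    cycle_time_min: float  # average minutes per unit of work
    error_rate: float      # fraction of units needing rework
    monthly_volume: int    # units processed per month

def meets_smart_target(baseline: WorkflowBaseline,
                       pilot_cycle_time_min: float,
                       pilot_error_rate: float,
                       time_reduction_goal: float = 0.40,
                       max_error_rate: float = 0.01) -> bool:
    """Example target: cut cycle time 40% while holding errors under 1%."""
    time_ok = pilot_cycle_time_min <= baseline.cycle_time_min * (1 - time_reduction_goal)
    quality_ok = pilot_error_rate < max_error_rate
    return time_ok and quality_ok

# Hypothetical numbers for illustration only.
baseline = WorkflowBaseline(cycle_time_min=50.0, error_rate=0.008, monthly_volume=1200)
print(meets_smart_target(baseline, pilot_cycle_time_min=28.0, pilot_error_rate=0.009))  # True
```

Once the baseline exists, pass or fail is mechanical; all the judgment work happens here in Phase 1.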
Phase 2: Integrate
Activate embedded AI features in your existing stack first or add one AI tool for the gap. Build a one-page pilot plan with inputs, prompts, review checkpoints, and success criteria tied to your baseline. Run for 14 to 30 days and document everything—what worked, what didn't.
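As a sketch of what “one page” can mean in practice, here is a pilot plan expressed as structured data with a completeness check. Every field below is an illustrative assumption, not a required schema; the point is that an incomplete plan fails loudly before the pilot starts.

```python
# A one-page pilot plan as structured data (all values hypothetical).
pilot_plan = {
    "workflow": "weekly customer-research synthesis",
    "inputs": ["call transcripts", "survey exports"],
    "prompts": ["summarize themes", "draft exec summary"],
    "review_checkpoints": ["analyst review", "manager sign-off"],
    "duration_days": 21,  # within the 14-30 day window
    "success_criteria": {
        "cycle_time_reduction": 0.40,  # tied to the Phase 1 baseline
        "max_error_rate": 0.01,
    },
}

def plan_is_complete(plan: dict) -> bool:
    """Check that no section of the one-page plan was skipped."""
    required = ["workflow", "inputs", "prompts",
                "review_checkpoints", "duration_days", "success_criteria"]
    return all(plan.get(key) for key in required)

print(plan_is_complete(pilot_plan))  # True
```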
Phase 3: Scale
Update SOPs, publish a before-after summary, and secure buy-in from leadership to extend the same workflow to adjacent teams or to new variations. Identify opportunities to scale with new features, new metrics, and automations. Then repeat the loop.
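A minimal sketch of that before-after summary, continuing the hypothetical numbers from Phase 1: compute the percent change per metric against the baseline so results read the same way in every team's write-up.

```python
def before_after_summary(before: dict, after: dict) -> str:
    """Render one line per metric with the percent change from baseline."""
    lines = []
    for metric, old in before.items():
        new = after[metric]
        change = (new - old) / old * 100
        lines.append(f"{metric}: {old} -> {new} ({change:+.1f}%)")
    return "\n".join(lines)

# Hypothetical pilot results for illustration only.
before = {"cycle_time_min": 50.0, "error_rate_pct": 0.8}
after = {"cycle_time_min": 28.0, "error_rate_pct": 0.9}
print(before_after_summary(before, after))
# cycle_time_min: 50.0 -> 28.0 (-44.0%)
# error_rate_pct: 0.8 -> 0.9 (+12.5%)
```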
Key insight: Three out of four enterprises that measure the ROI of their AI initiatives already see positive returns. The winners move from usage metrics to business-linked metrics and scale only when acceptance criteria are met. Source: Wharton–GBK 2025.
How Successful Companies Are Internally Testing AI
Companies are seeing the quickest wins by identifying repeatable, document-heavy, tedious work where baselines already exist and AI can speed things up. Decision-makers report daily use climbing in these tasks, and they rate them highly because they slot neatly into existing tools and reviews.
Three patterns show up repeatedly across the research.
They measure what the team doing the work actually feels. Time on task and rework rate change inside the week, not the quarter. That is why high performers prove value fastest in office-task workflows. (McKinsey & Company)
They treat R&D as a portfolio. IT respondents report that about one third of tech budgets go to internal R&D, building custom capability where it matters.
They pair training with process rewiring. Programs that only “teach prompts” underperform. Programs that redesign the workflow and define acceptance criteria at the start show durable ROI. (Deloitte)
Key Insight: Budgets are rising again, but dollars are shifting from experiments to performance-proven programs. Source: Wharton–GBK 2025 and Gartner 2025 forecast. So start in support functions like Sales, HR, Finance, and IT, where measurement is mature and the norm, and run a 14-day pilot to test savings.
If you want templates and facilitation, the AI for ROI Toolkit packages everything from process maps to risk assessments to a pilot results scorecard.
FAQs
1. Why do 95% of generative AI pilots fail?
Most generative AI (GenAI) pilots fail because they are not integrated into real workflows or measured against baseline metrics. The 2025 MIT NANDA Report found that 95 percent of enterprise GenAI pilots show no measurable P&L (Profit and Loss) impact. The issue is not with the model’s accuracy or sophistication, but with execution. (MIT NANDA Report, The GenAI Divide: State of AI in Business 2025; Fortune (Aug 2025) coverage.)
Common pitfalls include:
No baseline to compare “before vs. after” results
Isolated proof-of-concept projects without operational owners
Lack of workflow redesign (teams run AI beside the process, not inside it)
Missing governance, data policies, or measurable success criteria
2. How long does it take to see ROI from generative AI?
Most enterprises report measurable results within three to six months, when projects start with baselines and focus on repeatable workflows, but GrowthUP clients are seeing results in as little as 14 days.
The Wharton–GBK data shows:
74% of companies already see positive ROI
80% expect full ROI within two to three years
ROI arrives faster in document-heavy, digital workflows (finance, HR, IT) than in physical operations (manufacturing, logistics)
A short pilot phase (14–60 days) should show measurable cycle-time reductions or quality improvements. Scaling across functions typically takes 6–18 months, depending on governance maturity and data readiness (Wharton–GBK 2025; Gartner Worldwide GenAI Spending Forecast 2025).
3. What use cases produce the fastest measurable gains?
Document and data workflows such as analysis, meeting and document summarization, and drafting show the highest adoption and performance ratings in enterprise studies, which makes baselines and measurable results the easiest to capture.
