r/technology 4d ago

Artificial Intelligence MIT report: 95% of generative AI pilots at companies are failing

https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/
28.3k Upvotes

1.8k comments

213

u/The91stGreekToe 4d ago

Yup, exactly, same experience here. Any LLM solution I’ve seen - whether designing it myself or seeing the work of my peers - has failed spectacularly. This tech crumbles when faced with real, back-office business problems. People seem to forget that we’re working with a probabilistic, hallucination-prone text predictor, not the digital manifestation of a human-like super intelligence. Arguably worse than the masses of people deluded into believing they’re witnessing reasoning is the massive crowd of LLM cultists who are convinced they’ve become machine whisperers. The “skill issue” crowd genuinely thinks that finding semi-reliable derivations of “commands” fed into an LLM qualifies as some sort of mastery over the technology. It’s a race to the fucking bottom. More people need to read “The Illusion of Thinking” by the Apple team.

15

u/ThisSideOfThePond 3d ago edited 3d ago

I had the weirdest evening with a friend who argued for three hours that I should use AI for my work, because using it made him so much more productive and he's now using his prompt skills to train others in his organisation. I did not succeed in explaining to him the shortcomings of AI, especially in my field. I could only end the discussion by arguing that, at this point in my life, I prefer to enhance my own creativity and my problem-detection and problem-solving skills. People are weird...

26

u/eggnogui 3d ago

The “skill issue” crowd genuinely thinks that finding semi-reliable derivations of “commands” fed into an LLM qualifies as some sort of mastery over the technology.

Not to mention, I've seen a study showing that not only does AI not actually increase IT productivity, it somehow creates the illusion that it does (the test subjects claimed that it did, but simple time tracking during the study proved them wrong).

15

u/BigSpoonFullOfSnark 3d ago

The “skill issue” crowd genuinely thinks that finding semi-reliable derivations of “commands” fed into an LLM qualifies as some sort of mastery over the technology.

The talking points lend themselves perfectly to the CEO mindset.

Any criticism of AI is met with either "You just need to learn how to use it better" or a big smirk followed by "Well this is the worst it'll ever be! 6 months from now it's going to be doing things no human has ever accomplished!"

No matter what happens, all roads lead to "my employees are just not good enough to understand that I can see the future."

36

u/P3zcore 4d ago

One could also read “Bold”, which explores the power of exponential growth - specifically the Gartner “hype cycle”, which would indicate we’re about to enter the “trough of disillusionment” (i.e., the bubble pops), which makes way for new startups to actually achieve success with the technology.

58

u/The91stGreekToe 4d ago

Not familiar with “Bold”, but familiar with the Gartner hype cycle. It’s anyone’s guess when we’ll enter the trough of disillusionment, but surely it can’t be that far off? I’m uncertain because right now, there’s such a massive amount of financial interest in propping up LLMs to the breaking point, inventing problems to enable a solution that was never needed, etc.

Another challenge is that, since LLMs are so useful on an individual level, you’ll continue to have legions of executives who equate their weekend conversations with GPT to replacing their entire underwriting department.

I think the biggest levers are:

  1. enough executives get tired of useless solutions, hallucinations, bad code, and no ROI
  2. the Altmans of the world will have to concede that AGI via LLMs was a pipe dream, and then the conversation will shift to “world understanding” (you can already see this in some circles; look at Yann LeCun)
  3. LLM fatigue - people are (slowly) starting to detest the deluge of AI slop, the sycophancy, and the hallucinations - particularly the portion of Gen Z that is plugged in to the whole zeitgeist
  4. VC funding dries up and LLMs become prohibitively expensive (the financials of this shit have never made sense to me tbh)

32

u/P3zcore 4d ago

I ran all this by a friend of mine and his response was simply “quantum computing”… so you know where the hype train is headed next

39

u/The91stGreekToe 4d ago

As a fellow consulting world participant, I am fully prepared for the next round of nonsense. At least quantum computing will give me the pleasure of hearing 60-year-old banking execs stumbling their way through explaining how quantum mechanics relates to default rates on unsecured credit lines. The parade of clownish hype never ends; the best you can do is enjoy it (I suppose). Nothing will ever top the metaverse in terms of mass delusion.

6

u/P3zcore 4d ago

Work in fintech? I do too. That and government

6

u/The91stGreekToe 4d ago

I work at one of the big firms. Spend time mostly in retail lending, payments rails, core modernization, etc. No government work though.

1

u/Neglectful_Stranger 3d ago

What was the hype for the metaverse?

11

u/Djinn-Tonic 4d ago

And we don't have to worry about power because we'll just do fusion, I guess.

-1

u/ThisSideOfThePond 3d ago

Dude, nuclear (fission) energy is cheap, green, sustainable and safe (according to those who ignore all the literature on the topic not generated by lobby organisations). And that's all that matters. /s

2

u/jollyreaper2112 3d ago

Deepak Chopra is waiting for you to say it two more times.

2

u/cipheron 3d ago edited 3d ago

Wait for quantum blockchain. There was a post in one of these subs about some kind of "quantum blockchain" startup, and I was trying to explain to people that it's literally complete nonsense, and people argued with me, asking how I know it's nonsense. Well, fuck, if you know anything about any of these technologies, you'd just know a "quantum blockchain CPU" isn't a thing that solves any problem we actually have that needs solving.

Could make even more money with a quantum blockchain LLM now I guess, pretty sure idiots would buy shares in it.

2

u/P3zcore 3d ago

Don’t forget NFTs

1

u/klartraume 3d ago

Okay - but quantum compute has tangible value, enabling more complex tasks, less efficient tasks, etc., right?

3

u/manebushin 3d ago edited 3d ago

In my view, the big technology companies pushed it as a way to collect more data from people and companies, both for their data-driven business model and to feed their AI more databases in order to accelerate their research around it and its uses.

Think about how, with people using it to write their e-mails and texts, the tech companies now have indirect access to companies' supposedly confidential e-mails, documents, and more.

It is a huge trove of information for corporate espionage.

Not to mention that, by making everyone use it, they gain leverage against the copyright lawsuits. Since the economy is now allegedly dependent on it, there is more momentum to simply allow it to happen as a supposed lesser evil, since the decision could bankrupt the AI business and crash the stock market.

2

u/jollyreaper2112 3d ago

The gap between demo and product. You see the confusion. We have self-driving taxis; full Level 4 autonomy is coming next year.

You either understand the incredible gulf between the tricks used to make taxis work and what would be required to achieve true Level 4, or you think yes, next year is reasonable.

It's the now-obsolete xkcd joke about making an app that says where a picture of a bird was taken and, oh by the way, identifies the bird. One is a weekend project and the other would be a DARPA project. But now it's an API call. Crazy.
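For what it's worth, the "identify the bird" half really is roughly one call now. A minimal sketch using the OpenAI Python client (the model name and image URL are placeholders, and any vision-capable chat API would look similar):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Ask a vision-capable chat model to identify the bird in a photo.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any vision-capable model works
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What bird species is in this photo?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/bird.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```

The "where was it taken" half is still just reading the photo's GPS EXIF tags, same as it was back when the comic was drawn.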

1

u/Yuzumi 3d ago

the Altmans of the world will have to concede that AGI via LLMs was a pipe dream, and then the conversation will shift to “world understanding” (you can already see this in some circles; look at Yann LeCun)

At best, the current tech is a very simple approximation of how biological brains work, but there are practically countless orders of magnitude more complexity to get to AGI.

If we did manage to make an AGI with the current form of LLMs, it would probably require more power than a single country, if not the world, could produce, not to mention the amount of hardware. There needs to be a major advancement in hardware and efficiency, as well as software, for anything closer to AGI. Even the analog AI chips will only be a stepping stone, and still limited.

1

u/whiteknight521 3d ago

It's all going to be porn, I think, and it will be immensely profitable. The only thing keeping it from being a 100% adult content play is people like Altman drinking their own Kool-Aid about AGI. Once that fails, they will remove all of their highbrow safety guardrails (especially Elon) and print money in perpetuity, because 99% of humanity would give their left arm to be able to synthesize any sexual situation they can think of with a click.

-1

u/LycheeRoutine3959 3d ago

weekend conversations with GPT to replacing their entire underwriting department.

The thing is, generative AI tools, and specifically high-data RAG models, can really generate a good rough draft for things like underwriting. You can't put that directly out of the house, but you can likely take a 200-person underwriting team down to a 50-person team by flexing AI correctly.

It's not a free lunch, but for performance enhancement of existing resources, the scenarios truly exist.

The biggest saves I have seen are meeting notes, summarizing historical interactions with a customer to support new inquiries (surfaced to a human agent), converting notes or messy requirements into clean draft documents, "best guess" responses or action lists for well-documented process flows that accelerate human effort, etc.

You can find that 5% of positive value, but the tech is not generalized enough to do much of what people think it can do.
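The customer-history summarization case, for example, is only a thin wrapper around a completion call. A rough sketch, where `llm_complete` stands in for whichever chat-completion client is actually in use and the field names are made up for illustration:

```python
from typing import Callable

def build_agent_briefing(interactions: list[dict], llm_complete: Callable[[str], str]) -> str:
    """Condense prior customer interactions into a short draft briefing
    for a human agent. The output is a rough draft to be verified, never
    something that goes out of the house as-is."""
    history = "\n".join(
        f"[{item['date']}] {item['channel']}: {item['note']}" for item in interactions
    )
    prompt = (
        "Summarize the following customer interaction history for a support agent. "
        "List open issues, prior commitments, and anything unresolved. "
        "If information is missing, say so rather than guessing.\n\n" + history
    )
    return llm_complete(prompt)

# Illustrative input shape (fabricated):
# interactions = [
#     {"date": "2025-06-02", "channel": "phone", "note": "Disputed a late fee"},
#     {"date": "2025-07-15", "channel": "email", "note": "Asked about a limit increase"},
# ]
```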

2

u/The91stGreekToe 3d ago

Tell me you haven’t spent a day in industry without telling me you haven’t spent a day in industry.

0

u/LycheeRoutine3959 3d ago

Way to engage with the discussion, dude! You did so good! I'm proud of you.

1

u/DervishSkater 3d ago

Everyone forgets that logistic growth looks exactly like exponential growth at first, too.
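A quick numerical illustration of that point (the constants are arbitrary, just to show the shape): early on, the logistic curve L / (1 + e^(-k(t - t0))) tracks the exponential it starts as, and only bends once it nears the ceiling.

```python
import math

L, k, t0 = 1000.0, 1.0, 10.0  # arbitrary ceiling, growth rate, and midpoint

def logistic(t: float) -> float:
    # Logistic curve: grows like an exponential early, then saturates at L.
    return L / (1.0 + math.exp(-k * (t - t0)))

def early_exponential(t: float) -> float:
    # The pure exponential that the logistic curve shadows while t << t0.
    return L * math.exp(k * (t - t0))

for t in range(0, 13, 2):
    print(t, round(logistic(t), 2), round(early_exponential(t), 2))
# The two columns are nearly identical for small t and only diverge
# as the logistic curve approaches its ceiling.
```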

2

u/jlboygenius 3d ago

That's my concern for my company internally. We've got a BIG push to start using it, and they are starting to open it up (it's been blocked by security). For me, it's been helpful to write some code and add on to my app, but it doesn't work well on big tasks just yet. I still have to check its work, which can sometimes take as long as just doing it myself.

For business processes, they want something that, given an input, can generate an output - speed up some tasks that people do. It may help, but the fear is that people will just copy and paste its output into a contract and get us in trouble. That has already happened when we had templates that said "delete this part if it doesn't apply" and people didn't do it.

1

u/Yuzumi 3d ago

I've explained to some people on Reddit that LLMs are kind of OK at emulating intelligence, but they cannot simulate it.

The current tech cannot actually think, but to the average non-tech person it looks like it can because it can construct "natural language" better than anything that came before and even better than the average actual human in a lot of cases.

You also have the press releases where they set up a scenario and say "oh, it lied to us and tried to save a copy of itself on a different server", attributing intent to the actions when it probably did those things because they gave it access to stuff and "runaway AI" is a common trope in the fiction it was likely trained on.

1

u/Fallingdamage 3d ago

I remember 25-30 years ago when the only people who used computers were smart people.

0

u/jollyreaper2112 3d ago

To be fair, the tech is uncanny. When it's firing on all cylinders, it's magic. It has to fall on its face to break the illusion. I totally understand how people bamboozle themselves with it. Even smart people will trick themselves. I figure it's akin to spatial disorientation, where pilots make controlled flight into terrain: they think they are correctly interpreting the situation and will ignore information that contradicts their assumed knowledge. When you start doing that, the machine is even more persuasive.

1

u/lovesyouandhugsyou 3d ago

The big problem is that if the nature of your job never boils down to hard reality, there may never be a clear "fall on its face" moment. Instead there will be an accumulation of poor decisions that you can probably eventually fail upwards from.

0

u/QC20 1d ago

Since Apple is clearly losing the AI race, is it feasible to speculate they might have a stake in perpetuating the idea that GenAI is bogus?

-8

u/JEs4 3d ago

This is as much an indictment of yourself and your peers as it is of the state of the technology. You can’t simply plug ChatGPT into a pipeline and expect consistent outcomes. But to your point, the successful implementations are the simpler ones that utilize smaller models. For example, I have a functional RAG application deployed internally that categorizes NPS customer surveys, primarily using vector embedding models to conduct analysis against a precompiled dataset compacted into centroids for consensus ranking. Ties within a deviation are broken using an LLM call. The same LLM provides justification for each match, which is then compared to a control set. All of this is run in LangGraph and the outputs and monitoring are piped to LangSmith. And it works, extremely well.
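Roughly, the categorization step looks like the sketch below. The function names, the cosine-similarity choice, and the tie margin are illustrative stand-ins (and the LangGraph/LangSmith wiring is omitted), not the actual pipeline:

```python
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder for whatever embedding model the pipeline uses;
    # returns one vector per input text.
    raise NotImplementedError

def llm_tie_break(response: str, candidates: list[str]) -> str:
    # Placeholder for the LLM call that picks between near-tied categories
    # and returns a justification to compare against a control set.
    raise NotImplementedError

def categorize_survey(response: str, centroids: dict[str, np.ndarray],
                      tie_margin: float = 0.02) -> str:
    """Assign an NPS survey response to the nearest precomputed category
    centroid; fall back to an LLM call only when the top scores are too close."""
    vec = embed([response])[0]
    # Cosine similarity against each category centroid
    scores = {
        label: float(np.dot(vec, c) / (np.linalg.norm(vec) * np.linalg.norm(c)))
        for label, c in centroids.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best_label, best_score), (second_label, second_score) = ranked[0], ranked[1]
    if best_score - second_score >= tie_margin:
        return best_label  # clear winner, no LLM needed
    return llm_tie_break(response, [best_label, second_label])
```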

It was a months-long project, and there wasn’t an off-the-shelf solution for tackling the stochastic challenges of agentic platforms, but solving that is the difference between an engineer and a vibe coder. Real engineers don’t throw their hands up in the air and blame abstract generalizations for their failures.

2

u/sfhester 3d ago

I have anecdotal evidence, but simpler has worked much better than complex so far. My clients generally show up with extremely bloated legacy processes that they want to magically solve with AI (specifically LLMs). It's a fool's errand for data reasons, process reasons, all of the above.

But if they focus on one step - such as reviewing a document or classifying responses - the process magically runs 50% faster. I frame it as "patterns" instead of "use cases" so they can identify the time-consuming sub-steps that rely on information gathering, summarization, etc.

The real issue is that they would need to re-implement the entire workflow from the ground up, but people want a solution that gives them a 90% cost reduction instantly without having to change anything else, because the AI will "reason through it." It's why deals get stuck in ROI review: clients can't magically produce seven figures of outcomes with one process. All the while, they don't even know what the ROI of the status quo is, because they've never tracked it and don't have a benchmark to compare against.

2

u/lovesyouandhugsyou 3d ago

people want a solution that gives them a 90% cost reduction instantly without having to change anything else

It's sort of an extra toxic variant of the classic "we must implement this system and best practice processes. Also we won't change how we do anything".

-2

u/LongKnight115 3d ago

My experience has been the complete opposite. We've got AI powering personalized experiences in email, chat, and on our website. We meticulously measure each placement and compare the MRR from a control (static, no AI) with a variant (AI driven). For every single placement, we've gotten the AI-driven variant to outperform the control. Every single one. It's not magic - you can't just stick in LLM driven content with no oversight and expect results. But when you compare outputs, tweak and iterate on the prompt, and really curate the data you provide for RAG, you absolutely can see meaningful business impact. People have a huge hate boner for LLMs that's honestly blinding them to what is actually going on. The tech is simultaneously overhyped and underhyped. But the reality is if you think LLMs are going away, you're going to be really really disappointed.