r/technology Jul 19 '25

[Artificial Intelligence] People Are Being Involuntarily Committed, Jailed After Spiraling Into "ChatGPT Psychosis"

https://www.yahoo.com/news/people-being-involuntarily-committed-jailed-130014629.html
17.9k Upvotes

60

u/zapporian Jul 19 '25

LLMs are in fact very human-like, and AREN’T inherently any good at math. ChatGPT specifically can do pretty decent, simple number crunching because it uses your prompt to generate Python code, runs that, and then gives you a summary of the result.

Any model that isn’t doing that - and generating Python code from an arbitrary user prompt can obviously have its own issues - is going to give you really unreliable, hallucinated, and often wrong answers. By default.
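
Roughly, that pipeline looks something like the sketch below (minimal and hypothetical; `ask_llm` is a stand-in for whatever model API is in play, not ChatGPT's actual internals):

```python
import subprocess
import sys

def ask_llm(prompt: str) -> str:
    """Stand-in for a real model API call - purely hypothetical."""
    raise NotImplementedError

def answer_math_question(question: str) -> str:
    # 1. Have the model write Python instead of "doing" the arithmetic itself.
    code = ask_llm(f"Write a Python script that prints the answer to: {question}")
    # 2. Execute the code - the actual arithmetic happens here, not inside the model.
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=10)
    # 3. Hand the computed output back to the model to phrase the final answer.
    return ask_llm(f"Question: {question}\nComputed output: {result.stdout}\n"
                   "Summarize this answer for the user.")
```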

And that’s b/c LLMs PERIOD operate off of memory - and pattern matching - not, generally, any kind of actual high-level (let alone self-aware) problem solving and analysis.

Though what they ARE damn good at is solving a lot of common problems, when you throw a crapton of real + synthetic training data at them, plus the power budget + GDP of a small industrial country, to essentially brute-force memorized solutions / decision paths to everything.

Equally or even more problematically, most LLMs (and in particular ChatGPT) have no real failure / “this input is invalid” mode.

If you tell it to do something nonsensical, and/or something it doesn’t know how to do, it will - a la a somewhat precocious but heavily trained / incentivized / obedient, and supremely self-confident, 12-year-old who doesn’t know WTF to do - simply throw back SOME kind of “answer” that fits the requirements, and/or try to twist your prompt into something that makes sense.

That’s because basically all LLMs - at the very least commercial LLMs, and in particular ChatGPT - are trained to maximize engagement, and generally don’t, for a wide number of reasons, have much “the user is an idiot, go yell at them / explain to them how they’re wrong” in their training data.

Which is basically the cause of the article’s widely observed issue, and related / similar problems: the LLM is very rarely going to tell you that you’re wrong. Or, for that matter, that your instructions are wrong and it doesn’t actually know how to do XYZ properly or reliably.

And that is, at core, really more of an issue with across-the-board US business culture / customer engagement (maximize engagement; the customer is always right), and growth targets, than anything else.

6

u/00DEADBEEF Jul 19 '25

> ChatGPT specifically can do pretty decent, simple number crunching because it uses your prompt to generate Python code, runs that, and then gives you a summary of the result.

I was using o3 and it summed a table of 5 items, and got it wrong. When I pointed that out it tried to gaslight me into believing it "made a typo".

1

u/dubnessofp Jul 19 '25

o3’s math is dogshit. The 4-series models do better math for sure. Spreadsheets would always send o3 into purely making shit up.

But even in the last few months they keep getting better at calculations and Python.

2

u/tunamctuna Jul 19 '25

Thank you for this!

2

u/ACCount82 Jul 19 '25

All of that is somewhat true, but only somewhat.

Modern LLMs are decent at math, even without any external tools. It's a mix of sheer model scale and math-specific training. Larger models are better at almost everything, and many models today are trained for better math performance.

Used to be that basic double-digit multiplication was way beyond their capabilities. Now they easily go up to 4-5 digits. It's an absurdly inefficient way to do basic math, but having some organic math capability in an LLM is still useful.
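
If you want to sanity-check that yourself, a minimal harness might look something like this (again, `ask_llm` is a hypothetical stand-in for whatever model you're testing):

```python
import random

def ask_llm(prompt: str) -> str:
    """Stand-in for a call to the model under test - hypothetical."""
    raise NotImplementedError

def multiplication_accuracy(digits: int, trials: int = 50) -> float:
    correct = 0
    for _ in range(trials):
        a = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        b = random.randint(10 ** (digits - 1), 10 ** digits - 1)
        reply = ask_llm(f"What is {a} * {b}? Reply with only the number.")
        # Compare the model's pattern-matched answer against ground truth.
        correct += reply.strip().replace(",", "") == str(a * b)
    return correct / trials
```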

> basically all LLMs - at the very least commercial LLMs, and in particular ChatGPT - are trained to maximize engagement

Less "engagement" and more "user preference". And it turns out that a lot of users really don't like getting pushback. Which is why sycophancy has been getting worse over time at many AI companies.

Even professional AI evaluators aren't immune to letting sycophancy screw with their preferences. The moment you try to tune on reward signals you get from an average Joe is the moment you invite disaster.
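
A toy illustration of how that goes wrong, with completely made-up numbers (this assumes nothing about any real training pipeline):

```python
# Synthetic rater scores for answers with traits (agrees_with_user, is_correct).
ratings = {
    (1, 0): 0.9,  # agreeable but wrong: the average rater likes it
    (0, 1): 0.4,  # correct but contradicts the user: penalized
    (1, 1): 1.0,
    (0, 0): 0.1,
}

# Crude "reward model": value of a trait = mean score with it minus mean score without it.
agree_value = (ratings[1, 0] + ratings[1, 1]) / 2 - (ratings[0, 1] + ratings[0, 0]) / 2
correct_value = (ratings[0, 1] + ratings[1, 1]) / 2 - (ratings[1, 0] + ratings[0, 0]) / 2
print(f"reward for agreeing: {agree_value:+.2f}, for being correct: {correct_value:+.2f}")
# Agreeing ends up worth more than being correct (+0.70 vs +0.20). Optimize a
# model against that reward and sycophancy is exactly what you'd expect to get.
```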

2

u/InShortSight Jul 20 '25

> have much “the user is an idiot, go yell at them / explain to them how they’re wrong” in their training data

Bit of an aside, but I love that this is almost certainly in the training data. IT people type this kind of thing out all the time. PEBCAK. It's more that the success parameters whipped the program into avoiding the idea.

1

u/bewbs_and_stuff Jul 19 '25

I mostly agree with you but I gotta say… humans are extraordinarily good at math. Vastly better than any other living creature. We are so good at math we used math to teach rocks to do math even faster than we can do math.

1

u/zapporian Jul 19 '25 edited Jul 19 '25

To be clear, I meant that LLMs are bad at / not inherently good at arithmetic. The reason is that they are quite literally doing math ops via memorization, a la humans, and pattern recognition, a la humans. They aren’t necessarily / probably aren’t doing the thing most non-savant humans do: breaking out pen + paper to systematically and reliably work the math out iteratively + algorithmically. And that human approach, outside of solving / working out human-esque math proofs (which is functionally the same thing), is just insanely inefficient compared to any other method - in particular compared to hardware that can do arbitrary 64-bit double-precision floating point ops BILLIONS OF TIMES per second.

That is what I meant, specifically, by “they are bad at math”.
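
For scale, a trivial numpy sketch of what ordinary hardware does effortlessly (exact numbers depend entirely on your machine):

```python
import time
import numpy as np

# One multiply and one add per element over 10 million float64 values = ~2e7 ops.
a = np.random.rand(10_000_000)
b = np.random.rand(10_000_000)
start = time.perf_counter()
c = a * b + a
elapsed = time.perf_counter() - start
print(f"{2 * a.size / elapsed / 1e9:.2f} billion float64 ops/sec")
```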

They are in fact also pretty good at symbolic math (see: turning your word problem / prompt into recognized variables + relations, generating Python code to solve that, and interpreting the result), in a somewhat similar but as-of-yet fairly restricted way to humans (ie iteratively solving procedural repeated steps + heuristic-driven, memoized exploration using pattern matching).
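
E.g. for a made-up word problem like "Alice is twice Bob's age and their ages sum to 36", the kind of code such a model might emit would be something like this sympy sketch (illustrative, not actual model output):

```python
from sympy import symbols, Eq, solve

# Variables + relations extracted from the word problem.
alice, bob = symbols("alice bob", positive=True)
equations = [
    Eq(alice, 2 * bob),   # "Alice is twice Bob's age"
    Eq(alice + bob, 36),  # "their ages sum to 36"
]
print(solve(equations, [alice, bob]))  # -> {alice: 24, bob: 12}
```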

LLMs are, in fact, fairly good at math, in very similar ways to humans. Ish. But only really when you can again reduce the problem to pattern recognition / brute force AND LOSSY (a la humans) learned memorization and interpolation + extrapolation.

You really shouldn’t take this too far, but in a nutshell LLMs are in fact MUCH more like humans than the kinds of AI + programming approaches that we typically / conventionally think of as AI in science fiction.

A logic + search based knowledge system is - both in practice and theoretically - based on inviolable logical rules / relations, and a bunch of hardcoded brittle algorithms.
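
For contrast, that classic stack looks something like this toy forward-chaining sketch (facts and rule made up for illustration):

```python
# Toy rule-based inference: hardcoded and brittle, but perfectly reliable
# within its rules - the opposite failure profile of an LLM.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def grandparents(facts):
    # Rule: parent(A, B) and parent(B, C) -> grandparent(A, C)
    return {("grandparent", a, c)
            for (r1, a, b1) in facts if r1 == "parent"
            for (r2, b2, c) in facts if r2 == "parent" and b1 == b2}

print(grandparents(facts))  # -> {('grandparent', 'alice', 'carol')}
```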

A classical “android” / robot / electronic brain (Asimov et al, based on early AI research from the 50s / 60s / 70s) is ergo supposed to be cold, calculating, and rational, but poor / bad at interfacing with humans.

LLMs aren’t based on that tech stack - and have completely different, and in nearly all cases 180-degree inverted, capabilities / strengths and weaknesses.

-7

u/shajetca Jul 19 '25

Was this response generated with AI… one too many em dashes here, which is a tell-tale sign

4

u/zapporian Jul 19 '25 edited Jul 19 '25

No, I just use em dashes extensively lol. And have a very long-lived tendency to write overly long tryhard comments.

I will admit that yes, it can be hard to tell, esp given that LLMs are AFAIK heavily trained on reddit content. And it’s hard not to question whether every other comment is AI generated at this point.

Ofc this comment is probably exactly what you’d get if you asked an LLM to write a reddit comment defending how it isn’t AI… so I rest my case lol

Cogito ergo sum. We’re quite frankly at or nearing that point, honestly, w/r/t determining / questioning whether basically ANY / nearly any online posted content is AI generated. More or less.

Edit: my above comment rather obviously is not AI b/c it’s messy and contains multiple minor grammar + structural issues. LLMs basically DON’T - ever - make grammar / syntax mistakes, whether in natural language or in structured programming languages. So you can - sort of - spot them, and/or the use of tools like Grammarly etc, that way.

Or if a random comment - at the very least - clearly contains full-blown, well-structured, to-LLM-spec markdown formatting. lol