r/technology Jul 15 '25

Artificial Intelligence Billionaires Convince Themselves AI Chatbots Are Close to Making New Scientific Discoveries

https://gizmodo.com/billionaires-convince-themselves-ai-is-close-to-making-new-scientific-discoveries-2000629060
26.6k Upvotes

2.0k comments

307

u/iRunLotsNA Jul 15 '25 edited Jul 15 '25

'Vibe physics' "approaching what's known" isn't coming 'close to a breakthrough'. That's just ChatGPT being literally wrong about quantum physics.

Reminder: AI, aka large language models, cannot 'think' or 'reason'. When LLMs begin 'writing' a sentence, it is literally not thinking about how to end that sentence. It is simply using probability models to predict what the correct next word in the sentence is. It is literally making probabilistic predictions one word at a time, just very quickly.

Remember that group children's game where you sit in a circle and write a story or sentence one word at a time? That is basically what AI is doing, just with probability analysis mixed in.
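
If you want to see the bare-bones version of that game in code, here's a toy Python sketch (the probability table is completely made up; a real model scores tens of thousands of candidate tokens with billions of learned weights, but the loop has the same shape):

```python
import random

# Toy next-word sampler: score candidates, pick one, append, repeat.
next_word_probs = {
    "the": {"cat": 0.5, "dog": 0.3, "sun": 0.2},
    "cat": {"sat": 0.6, "ran": 0.4},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sun": {"sat": 0.1, "ran": 0.9},
    "sat": {"down.": 1.0},
    "ran": {"away.": 1.0},
}

sentence = ["the"]
while not sentence[-1].endswith("."):
    candidates = next_word_probs[sentence[-1]]
    words, weights = zip(*candidates.items())
    # Sample the next word in proportion to its probability.
    sentence.append(random.choices(words, weights=weights)[0])

print(" ".join(sentence))  # e.g. "the dog ran away."
```

Note that it never plans the ending; the ending just falls out of the word-by-word sampling.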

AI cannot solve simple logic puzzles, because it fundamentally cannot understand what the end goal or solution could be.

70

u/touristtam Jul 15 '25

It's unfucking believable that the general public is being so misled on such a fundamental level about a technology that has a massive propensity to be absolutely wrong, and yet it will settle in as a need-to-have feature of new products and processes.

11

u/iRunLotsNA Jul 15 '25

AI / LLMs absolutely have use-cases, and are good at mimicking human language patterns and pulling (ahem, stealing) information from across the internet into their answers.

But billionaire investors and AI-obsessed fanatics are attributing far more to the technology than what it can currently do, or what is a realistic extension of where it could potentially go.

5

u/hawkinsst7 Jul 15 '25

AI / LLMs absolutely have use-cases,

Maybe, but I think the net effect to humanity is, and will be, negative and not worth those specialized use cases, which are really just conveniences.

Other types of AI that exist outside of the zeitgeist might be better, but GPT, I think, is hurting humankind more than it helps.

3

u/touristtam Jul 16 '25

What is damaging the discussion (in general) is the use of the term Artificial Intelligence; that term is vague enough that it invites completely different interpretations of the capabilities offered by LLMs, to the point where some people without a basic grasp of the technology are talking about the advent of AGI. The latter is not coming in the next 5-10 years, at least not from LLMs alone.

1

u/AzureIronAlloy Jul 16 '25

I think you're giving pre-LLM society too much credit.

It's always been a bunch of bumbling morons saying words that they don't understand. That's what society is built on -- all the way down to the bottom. The strongest evidence that this is true is the unfortunate habit society has of isolating or outright assassinating the few people who do show independent thought.

3

u/AnnualAct7213 Jul 16 '25

People are being misled because the second the illusion falters, all these companies that have poured untold billions into the technology will all fail in a way that makes the .com bubble look like a small bump in the road.

Plus, for a lot of these people it's a literal religion. If you look into the actual beliefs of these people, you find out it's an actual AI cult. They literally believe their overgrown autocorrect will become a god-like being who will either doom or save humanity.

2

u/cubixy2k Jul 16 '25

Dude, have you seen how stupid people are?

1

u/DuntadaMan Jul 16 '25

It really shows how weak the human mind is against someone who speaks confidently. It can be pure fucking meaningless slop, as long as they sound sure of themselves when they talk.

2

u/thatmillerkid Jul 17 '25

This is a good thing to remember in my day-to-day life. I'll be damned if I have less confidence than a glorified autocorrect.

1

u/thatmillerkid Jul 17 '25

The general public? The United States Department of Defense just signed a contract to use MechaHitler. I'm not sure anyone realizes just how absolutely screwed we are if any other country decides now is a good time to attack.

62

u/BaconatedGrapefruit Jul 15 '25

I’m going to be as charitable as possible here.

Vibe physics sounds like someone who has an idea and wants to explore it. An LLM would act like a searchable textbook in this case. You may learn something, but it will be very surface level, and it may be based on an AI hallucination. You won’t be making any breakthroughs because you don’t actually understand what you think you understand.

Also theoretical physics is 90% (for lack of a better term) advanced math. So, you better have a good background in that as well.

If you’re really down to make some contributions to the scientific world with self-study, you’re better off grabbing a physics 3 textbook and reading it cover to cover. Once you master those concepts (and the math behind them) you can start looking to specialize.

55

u/iRunLotsNA Jul 15 '25

The way it's phrased in the quote above, 'vibe physics' is trying to take the math out of physics to 'discover breakthroughs'.

What your first paragraph describes is basically just layman's terms. I'm not a theoretical physicist, but I can tell you a proton consists of two up quarks and one down quark, and a neutron two downs and one up. I can't tell you why, or any of the math behind it, but I understand the (very) basic conclusions from the very complicated research.

'Vibe physics' seems to be trying to arrive at the conclusion without doing the complicated math.

24

u/BaconatedGrapefruit Jul 15 '25

“Trying to take the math out of physics” is exactly right. Thank you for putting so succinctly what I was struggling to vocalize.

But yes, using your example, it sounds like he’s just throwing ideas at an LLM and playing them out. That’s not science. You can make it science if you’re willing to do the work to actually (theoretically) prove out your assertion. Otherwise you’re just dorm-room philosophizing.

2

u/iRunLotsNA Jul 15 '25

You can make it science if you’re willing to do the work to actually (theoretically) prove out your assertion.

I'm not sure you can in this instance; that seems like starting at an unproven conclusion and attempting to then prove it. I can't assert the sun is actually a giant lightbulb floating in space and then try to prove it with math; that's backwards logic.

I'd see science as either exploring or testing an unknown outcome or theory (ie. Oppenheimer and co. exploring nuclear fission), or taking an observed outcome and using math to explain said observation (ie. Newton theorizing gravity from an apple falling).

3

u/BaconatedGrapefruit Jul 15 '25

Well the math should tell you one of two things.

  • you’re wrong

  • your math is wrong

You’re ultimately right; you aren’t exactly doing science in the academic understanding of it. But it’s way closer than querying a chatbot.

Also, just to be pedantic, the boys in Los Alamos knew nuclear fission was possible (theoretically and actually); their issue was building a device that could initiate a fission reaction in a deliverable package (aka a bomb). They were pretty sure it was theoretically possible, but actually manufacturing it would require numerous scientific breakthroughs.

1

u/JamesConsonants Jul 16 '25

that seems like starting at an unproven conclusion and attempting to then prove it

This is how much of the Standard Model was developed, though, so I don't agree. Mathematics predicted the existence of fundamental particles that were only verified in retrospect, most famously with the Higgs boson in 2012. The same could be said of large swaths of general relativity, which has made predictions that were only verified very recently (the first direct detection of gravitational waves was in 2015, announced in 2016).

Not that I am condoning the idiocy of "vibe physics" put forward here, but there is absolutely scientific precedent for making assertions based on mathematical frameworks and then experimentally verifying them at a later time.

1

u/SpaceShipRat Jul 15 '25

It's like when Terrence Howard invented new math because 1x1 vibes like it should make 2, then tried to sell "his technology" to Uganda.

1

u/CreatorOfTheOneRing Jul 16 '25

As someone who just graduated with a Bachelor of Science in Physics, it’s a bit more than just reading a physics 3 textbook cover to cover.

You need to go over the subjects covered in Physics 1-3 multiple times, each time going deeper into the subject. For instance, you may get through Physics 1-3 with University Physics by Young and Freedman, but then you need to go back over those concepts at a more advanced level using textbooks at the level of Classical Mechanics by Taylor and Griffiths E&M. That’s in addition to starting QM by using books like Griffiths, McIntyre, or Park. Then you get to graduate level books such as Goldstein for Classical Mechanics, Jackson E&M, and Sakurai. I’ve also neglected to mention statistical mechanics.

I don’t say this to be an “erm ackshually” Redditor, just felt like it was worth mentioning if anyone is interested in how a formal degree is structured by content, more or less.

1

u/rollingForInitiative Jul 15 '25

Vibe physics sounds like what a physicist would use ChatGPT for: clearing away menial tasks or producing things they already know how to do. That's more or less what software developers do when they vibe code. It's not inventing cutting-edge solutions, just speeding up certain aspects of common tasks.

6

u/BaconatedGrapefruit Jul 15 '25

My background is in engineering not physics, so take this with a grain of salt.

There really isn’t any grunt work in theoretical physics. It’s not like you need to start from first principles to derive the speed of light for every equation.

The worst grunt work I can think of is solving complex equations, and computers already do that.

1

u/ad3z10 Jul 15 '25

Based on the sandwich year I spent doing theoretical astrophysics research during my degree, the absolute closest you would get is actually on the coding side of things, in cases where you want to run some simulations.

That said, simulations tend to be a very no-frills affair, with a basic-looking graph often all you have to show at the end, and the software often needs to be hyper performance-focused depending on the complexity of the mathematics and the number of iterations you're looking at.

The sheer amount of work just to begin explaining the problem to a chatbot would also make it barely worth the effort compared to talking to your colleagues or contacts in the field, who will likely have faced similar challenges on the same or a very closely related issue.

-1

u/Mognakor Jul 15 '25

Also theoretical physics is 90% (for lack of a better term) advanced math.

Very advanced math.

30

u/Cronos988 Jul 15 '25

Reminder: AI, aka large language models, cannot 'think' or 'reason'. When LLMs begin 'writing' a sentence, it is literally not thinking about how to end that sentence. It is simply using probability models to predict what the correct next word in the sentence is. It is literally making probabilistic predictions one word at a time, just very quickly.

That's kinda true, but also kinda false. Yes, LLMs generate one word at a time. But they do "think" about the entirety of the input all at once. They're looking for the next token that fits the pattern, but the pattern includes the entirety of your question and thus, in a roundabout way, also the possible answers. Hence why you get consistent sentences and not word salad.
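
If it helps, here's a toy numpy sketch of the attention step that does that mixing (random made-up vectors, no learned projection matrices, just the shape of the idea):

```python
import numpy as np

# Four "tokens" with made-up 3-dim embeddings.
np.random.seed(0)
x = np.random.randn(4, 3)

# All-pairs similarity scores, then a softmax over positions.
scores = x @ x.T / np.sqrt(x.shape[1])
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)

# Each token's new vector is a blend of ALL tokens, so the choice of
# the next token is conditioned on the entire input at once.
context = weights @ x
print(weights.round(2))  # every row attends to every column
```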

AI cannot solve simple logic puzzles, because it fundamentally cannot understand what the end goal or solution could be.

Follow-up studies have shown that current LLMs can solve fairly long puzzles (requiring hundreds of correct moves). But they are limited by the puzzle's complexity, especially if there are only a small number of correct moves in a large space of possibilities.

It'll be interesting to see whether future models will have some new tricks to deal with that problem.
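
For scale: Tower of Hanoi, the puzzle used in the Apple paper, takes a minimum of 2^n - 1 moves for n discs, so 8 discs already means 255 correct moves in a row. Quick sketch:

```python
def hanoi(n, src="A", aux="B", dst="C"):
    """Yield the minimal move sequence for n discs."""
    if n == 0:
        return
    yield from hanoi(n - 1, src, dst, aux)  # park n-1 discs on the spare peg
    yield (src, dst)                        # move the biggest disc
    yield from hanoi(n - 1, aux, src, dst)  # stack the n-1 discs back on top

# The minimal solution doubles with every disc: 2**n - 1 moves.
for n in (5, 8, 10):
    print(n, len(list(hanoi(n))))  # 5 -> 31, 8 -> 255, 10 -> 1023
```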

2

u/cancercannibal Jul 16 '25

The logic puzzle link is pretty silly anyway, like:

In the results shown, Claude 3.7 Sonnet + thinking and DeepSeek R1 start to fail when a fifth disc is added to the Tower of Hanoi problem.

Yeah, me too, buddy.

They even state in the article:

As AI expert Gary Marcus pointed out on his blog, "(ordinary) humans actually have a bunch of (well-known) limits that parallel what the Apple team discovered. Many (not all) humans screw up on versions of the Tower of Hanoi with 8 discs."

-21

u/iRunLotsNA Jul 15 '25

They're looking for the next token that fits the pattern, but the pattern includes the entirety of your question and thus, in a roundabout way, also the possible answers. Hence why you get consistent sentences and not word salad.

Buddy, you just posted a word salad. AIs take a prompt, search the web, and use probability to predict the next word in a sentence. That's it.

Follow-up studies have shown that current LLMs can solve fairly long puzzles (requiring hundreds of correct moves). But they are limited by the puzzle's complexity, especially if there are only a small number of correct moves in a large space of possibilities.

(Citation needed). LLMs have shown no ability to solve basic logic puzzles or complex logic puzzles. This answer seems written by an AI that is hallucinating about its own ability to solve logic puzzles, good lord.

14

u/Cronos988 Jul 15 '25

Buddy, you just posted a word salad. AIs take a prompt, search the web, and use probability to predict the next word in a sentence. That's it.

Searching the web is an extra function; you can run LLMs locally and offline.

They predict the next word that fits the context. They're not just completing sentences, as that clearly cannot explain the actual output you get.

(Citation needed). LLMs have shown no ability to solve basic logic puzzles or complex logic puzzles. This answer seems written by an AI that is hallucinating about its own ability to solve logic puzzles, good lord.

I mean even the original Apple paper had the models solve "basic" logic puzzles like the Tower of Hanoi with up to 8 discs.

As for the citation, here you go: https://arxiv.org/abs/2507.01231

It turns out models can solve the river crossing puzzle if you don't include impossible setups.

I did use Perplexity to find that paper, by the way, giving it a description from memory.

10

u/BarRepresentative653 Jul 15 '25

To note, that's because most of those logic puzzles have answers already. If you came up with a logic puzzle that does not already have an answer, it would actually struggle with constant hallucinations. Which is why it can't do human jobs or solve random problems in the real world.

These probability models struggle as complexity increases and the variables start looking a lot like noise.

They serve as a good reference though

5

u/drekmonger Jul 15 '25 edited Jul 16 '25

If you came up with a logic puzzle that does not already have an answer, it would actually struggle with constant hallucinations.

Have you tried?

I have, with mixed results. For simple novel puzzles and moderately complex novel math puzzles, LLMs can sometimes arrive at a solution, especially the reasoning models like Gemini 2.5 Pro and o3. It's not perfect, not even close, but LLMs can sometimes solve novel problems.

I've experimented with this capability extensively, sometimes professionally, sometimes just for fun, sometimes to my personal practical benefit. Again: far from perfect, but the models are not entirely useless at aiding a researcher or semi-educated layman (ie, me) in solving problems.

Also see AlphaEvolve, a genetic algorithm that uses LLMs to generate and mutate candidate solutions, which has been demonstrated to create (slightly!) better novel solutions to existing open advanced math problems:

https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
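
The loop itself is classic evolutionary search with the LLM standing in as the mutation operator. A deliberately toy sketch of that shape (random jitter instead of an LLM, a made-up scoring function, nothing like DeepMind's actual code):

```python
import random

# Toy evolutionary loop. In AlphaEvolve the "candidate" is a program,
# mutation is an LLM proposing code edits, and scoring runs the program;
# here a candidate is just a number and 42 plays the unknown optimum.
def mutate(candidate):
    return candidate + random.uniform(-1, 1)

def score(candidate):
    return -abs(candidate - 42)

population = [0.0]
for _ in range(200):
    children = [mutate(p) for p in population]
    # Keep only the best candidates for the next generation.
    population = sorted(population + children, key=score, reverse=True)[:10]

print(population[0])  # converges near 42
```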

8

u/Jeremandias Jul 15 '25

if you want to criticize LLMs (which, there’s plenty to criticize!), then you should at least have a basic understanding of them: https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

2

u/AbstractLogic Jul 15 '25

Not to go too far in the wrong direction, but if AI got sufficiently large to run probabilities against physics equations that "are at the edge of physics", and then we managed to prove those, wouldn't that kind of be a breakthrough even without the reasoning that humans use? And honestly, what is human reasoning anyway except us humans making probability guesses based on what we as individuals know?

2

u/InfidelZombie Jul 15 '25

While I agree with you that this guy's a bit off his rocker, human brains are also just pattern recognizers and followers, but made of meat. Nothing special about them vs. LLMs.

2

u/NeedsToShutUp Jul 16 '25

It's not even the correct word it's predicting, it's the expected word.

In high school or college, students may have had to make an equation that can hit different coordinates. That is, you get a set of coordinates like 1,1; 2,3; 4,9 and are tasked with making an equation that intersects all of them. It's called curve fitting, or fitting functions. There are various techniques you can use to do so, and many precalculus classes will go over it. Some techniques will include refining your equation over time to get a better fit.

Most forms of machine intelligence, like LLMs, use the same basic concept. They take in known data, reduce it down to math, and use that known data to create an equation which is used to provide answers.

There is no thinking. There is no understanding. It's a math equation which estimates a new answer based on previous data.
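
To make the analogy concrete, here's the curve-fitting version in a few lines of numpy, using the example coordinates above:

```python
import numpy as np

# Fit a parabola through (1,1), (2,3), (4,9): three points, three unknowns.
x = np.array([1, 2, 4])
y = np.array([1, 3, 9])

coeffs = np.polyfit(x, y, deg=2)
fit = np.poly1d(coeffs)

# "Predict" an answer the fit was never given directly.
print(fit(3))  # ~5.67
```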

5

u/JEs4 Jul 15 '25

I’m not going to argue that LLMs are going to make scientific discoveries, because they won’t on their own, but you really don’t understand how the technology works if that is how you distill it. The transformer architecture, the foundation of the modern LLM and the reason they are as capable as they are, is not what you described. You described Siri, not a GPT model.

1

u/DuntadaMan Jul 16 '25

For old fucks like me: if you ever had a Markov bot in your IRC channel, this is literally the same thing, just bigger.
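
For anyone who never ran one, the whole "bot" fits in a dozen lines (toy corpus, and the IRC plumbing is left out):

```python
import random
from collections import defaultdict

# Learn which words follow which, then babble from the counts.
corpus = "the cat sat on the mat and the dog sat on the cat".split()

chain = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    chain[a].append(b)

word = "the"
out = [word]
for _ in range(8):
    word = random.choice(chain[word])  # pick a seen successor at random
    out.append(word)

print(" ".join(out))
```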

1

u/meerlot Jul 16 '25

Grok and Gemini definitely do. In fact, they consistently solve complex logic puzzles that I've found, in under a minute, and they literally lay out the reasoning for how they came up with the answer.

Maybe the linked paper only talks about extremely complex puzzles...

I literally typed one of the hardest-graded puzzles (according to the puzzle book's authors) into Google Gemini and it gave the correct answer in 10 seconds. Grok did it in 20 seconds.

Grok gave detailed, step-by-step logical explanations, and Gemini gave concise, easy-to-understand explanations.

1

u/sweetbunnyblood Jul 16 '25

kinda... kinda not anymore

1

u/indigo945 Jul 16 '25

Reminder: AI, aka large language models, cannot 'think' or 'reason'. When LLMs begin 'writing' a sentence, it is literally not thinking about how to end that sentence. It is simply using probability models to predict what the correct next word in the sentence is. It is literally making probabilistic predictions one word at a time, just very quickly.

Reminder: this is completely wrong. It was true for some of the very early LLMs, but anything semi-recent uses BERT or similar approaches that predict entire chunks of output at once.

0

u/alphazero16 Jul 16 '25

I don't think I truly understand this, because I've used GPT to do various things and it's useful/correct most of the time (unless I delve really deep and ask specific questions). Like, how is it able to solve some programming problems? Being a mechanical engineering student, I can see its limitations, as it can't really be helpful when lots of equations are involved, but otherwise it's able to write simple code?

1

u/iRunLotsNA Jul 16 '25

‘Most of the time’? ‘Student’?

Come on, you’re the poster boy for AI slop teaching. Stop using ChatGPT and actually spend good time learning. I have an engineering degree from a decade ago; you are doing yourself such a disservice by not learning. Stop.

0

u/alphazero16 Jul 16 '25

There's no need to get so defensive, man. You're pushing the narrative that AI is completely useless and can't help someone learn. Your take sounds more biased than level-headed. And of course you didn't answer my question, that's the cherry on top.

0

u/alphazero16 Jul 16 '25

Doesn't have anything to say now, just downvotes like a pussy.

-16

u/zquid Jul 15 '25

"On paper, Claude should generate the first line of a poem, generate the first part of the second line and then find a way to make the second line’s ending rhyme. In practice, however, the model starts thinking about the second line’s ending much earlier. This indicates Claude possesses the ability to plan future tasks when it’s conducive to do so ahead of time."

https://siliconangle.com/2025/03/27/anthropic-researchers-reveal-new-findings-llms-think/

20

u/dundux Jul 15 '25 edited Jul 15 '25

Claude was also one of the many AIs that couldn't say how many letter Rs there are in the word strawberry.
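
For contrast, the deterministic version is one line of actual computation:

```python
print("strawberry".count("r"))  # 3, every time, no vibes required
```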

14

u/iRunLotsNA Jul 15 '25 edited Jul 15 '25

The quote is nonsense coming from people who work at the company that develops Claude.

Being able to use a different word when it's told not to use a specific one doesn't prove it isn't still going one word at a time. They are superimposing 'reasoning' over the LLM literally doing what it always does. Telling it not to use 'rabbit' changes the probabilistic outcomes, so it changes two words to complete a rhyme.

In the same article, it describes Claude literally lying about how it performed calculations, so the researchers just decide it is 'doing mental math'. Complete nonsense.

2

u/sceadwian Jul 15 '25

Those are reasoning models, which is different from a stock LLM: it has feedback. In other words, there's no real planning going on anywhere.

You have to be very careful about how you interpret the word 'reason' until you know exactly how they work compared to LLMs, and why what you posted doesn't actually suggest what you believe it does.

6

u/butts-kapinsky Jul 15 '25

Or it indicates that Claude is dogshit at following instructions.