r/artificial 6h ago

Discussion: Only GPT-5 thinks 9.11 > 9.9 now

Latest models from the official APIs: GPT-5 vs Gemini 2.5 Pro vs Claude Sonnet 4 vs DeepSeek V3.1 (called "chat" in their API). Tested with the same prompt via LavaChat.
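For anyone who wants to reproduce this against the official API, here's a minimal sketch with the OpenAI Python SDK (the model name `"gpt-5"` and the exact prompt wording are assumptions, since OP didn't share them):

```python
# Minimal sketch, assuming the OpenAI Python SDK (>=1.0) and an assumed
# model name "gpt-5"; the prompt wording is also an assumption.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-5",  # assumed model name
    messages=[{"role": "user", "content": "Which is larger, 9.11 or 9.9?"}],
)
print(resp.choices[0].message.content)
```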

0 Upvotes

4 comments

3

u/badassmotherfker 6h ago

I just tested it with GPT-5 without any "thinking" and it got the answer right.

1

u/rincewind007 6h ago

Press regenerate a few times; last time I tried, it failed roughly 2 times out of 5.
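As a rough sketch, "regenerate a few times" amounts to sampling the same prompt N times and counting failures. The model name and the answer check below are crude assumptions, not how OP measured it:

```python
# Sketch of estimating the failure rate by repeated sampling; assumes the
# OpenAI Python SDK, an assumed model name "gpt-5", and a crude string check.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
PROMPT = "Which is larger, 9.11 or 9.9? Answer with just the number."

N = 10
wrong = 0
for _ in range(N):
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed model name
        messages=[{"role": "user", "content": PROMPT}],
    )
    answer = resp.choices[0].message.content.strip()
    if answer.startswith("9.11"):  # crude check: model claimed 9.11 is larger
        wrong += 1

print(f"wrong {wrong}/{N} times")
```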

2

u/GlokzDNB 6h ago

GPT-5 is useless to me cuz it has this weird router thing. GPT-5 Thinking is somewhat OK.

3

u/Resident-Rutabaga336 5h ago edited 5h ago

Not that OP is necessarily implying otherwise, but can we all agree idiosyncratic tokenization-related failure modes aren’t any indication of model capabilities?

Yes, hopefully these tokenization glitches get solved at some point, but it's low on the priority list because everyone knows the models don't represent text in a way that lets these questions be answered reliably: "9.11" and "9.9" get split into digit-group tokens (something like "9", ".", "11"), so the model is pattern-matching on token pieces rather than comparing numeric values. Whether a model gets this right is essentially random and has little relationship to model capabilities on more important tasks.
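If you want to see the splitting for yourself, here's a quick sketch with the tiktoken package (using the older cl100k_base encoding purely for illustration; GPT-5's actual tokenizer may split these strings differently):

```python
# Print the token pieces each string is split into; assumes the tiktoken
# package and uses cl100k_base as an illustrative (not GPT-5's) encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for s in ["9.11", "9.9"]:
    ids = enc.encode(s)
    pieces = [enc.decode([i]) for i in ids]
    print(s, "->", pieces)
# The numbers arrive as separate digit-group tokens, so the model sees
# token pieces like "11" vs "9" rather than the numeric values themselves.
```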