Ars Technica

Technology

New study shows why simulated reasoning AI models don’t yet live up to their billing

Summary
Nutrition label

85% Informative

AI models that purport to "reason" can solve routine math problems with impressive accuracy.

But when faced with formulating deeper mathematical proofs, they often fail.

Researchers presented simulated reasoning models with problems from the 2025 US Math Olympiad.

Most models scored below 5 percent correct on average when generating mathematical proofs.

VR Score: 91
Informative language: 95
Neutral language: 17
Article tone: semi-formal
Language: English
Language complexity: 60
Offensive language: not offensive
Hate speech: not hateful
Attention-grabbing headline: not detected
Known propaganda techniques: not detected
Time-value: long-living
Affiliate links: no affiliate links