Italy · Science · 18 days ago

AI beaten by humans in a difficult math test

In a rigorous mathematical test, four AI models, including ChatGPT 5.5 Pro, were evaluated against human performance. None of the models answered all 10 questions correctly. The best-performing model, developed by ETH Zurich, solved six of the ten problems. The test, part of the independent project First Proof, aimed to assess AI capabilities in mathematical research. The questions were previously unpublished, so that models could not rely on prior training data. A group of 30 mathematicians verified the responses. Only publicly available models participated, which limited involvement to OpenA

Go to the primary sources (1)

The official sources this coverage is built on. Read them directly to bypass framing.

Source document: Nature

1 report

ANSA · Independent · Center · 18 days ago

AI beaten by humans in a difficult math test

Bias read (Center): The article presents factual results of an AI benchmarking test without overtly favoring any side. It describes the methodology, participants, and outcomes neutrally.


