ChatGPT and other AI chatbots based on large language models are known to occasionally make things up, including scientific and legal citations. It turns out that measuring the accuracy of an AI model’s citations is a good way of assessing the model’s reasoning abilities.
Popular AIs head-to-head: OpenAI beats DeepSeek on sentence-level reasoning