Noam Brown, who leads AI reasoning research at OpenAI, says “reasoning” AI models like OpenAI’s o1 could’ve arrived 20 years earlier had researchers “known [the right] approach” and algorithms.

“There were various reasons why this research direction was neglected,” Brown said during a panel at Nvidia’s GTC conference in San Jose on Wednesday. “I noticed over the course of my research that, OK, there’s something missing. Humans spend a lot of time thinking before they act in a tough situation. Maybe this would be very useful [in AI].”

Brown is one of the principal architects behind o1, an AI model that employs a technique called test-time inference to “think” before it responds to queries. Test-time inference entails applying additional computing to running models to drive a form of “reasoning.” In general, so-called reasoning models are more accurate and reliable than traditional models, particularly in domains like mathematics and science.
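OpenAI hasn’t disclosed exactly how o1 “thinks,” but one simple, well-known way to spend extra compute at inference time is self-consistency sampling: draw several candidate answers from a model and take a majority vote. The minimal Python sketch below is illustrative only — `answer_with_extra_compute` and its `generate` callback are hypothetical names, not OpenAI’s API or o1’s actual method.

```python
from collections import Counter
from typing import Callable

def answer_with_extra_compute(
    generate: Callable[[str], str], prompt: str, n_samples: int = 8
) -> str:
    # Sample several candidate answers from a stochastic model call
    # (e.g., an LLM run at temperature > 0), then return the most
    # common one. More samples means more inference-time compute,
    # which often buys a more reliable answer. This is one generic
    # form of test-time compute, not OpenAI's published technique.
    candidates = [generate(prompt) for _ in range(n_samples)]
    return Counter(candidates).most_common(1)[0][0]
```

Any stochastic model call can be plugged in as `generate`; the point is simply that accuracy is being bought with extra computation at query time rather than with a bigger pre-trained model.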

Brown stressed, however, that pre-training — training ever-larger models on ever-larger datasets — isn’t exactly “dead.” AI labs including OpenAI once invested most of their efforts in scaling up pre-training. Now, according to Brown, they’re splitting time between pre-training and test-time inference — two approaches he described as complementary.

Brown was asked during the panel whether academia could ever hope to perform experiments on the scale of AI labs like OpenAI, given institutions’ general lack of access to computing resources. He admitted that it has become tougher in recent years as models have grown more compute-intensive, but said academics can still make an impact by exploring areas that require less computing, like model architecture design.

“[T]here is an opportunity for collaboration between the frontier labs [and academia],” Brown said. “Certainly, the frontier labs are looking at academic publications and thinking carefully about, OK, does this make a compelling argument that, if this were scaled up further, it would be very effective. If there is that compelling argument from the paper, you know, we will investigate that in these labs.”

Brown’s comments come at a time when the Trump administration is making deep cuts to scientific grant-making. AI experts including Nobel Laureate Geoffrey Hinton have criticized these cuts, saying they may threaten AI research efforts both at home and abroad.

Brown called out AI benchmarking as an area where academia could make a significant impact. “The state of benchmarks in AI is really bad, and that doesn’t require a lot of compute to do,” he said.

As we’ve written about before, popular AI benchmarks today tend to test for esoteric knowledge, and their scores correlate poorly with proficiency on the tasks most people care about. That’s led to widespread confusion about models’ capabilities and improvements.
