Ex-OpenAI and DeepMind researchers secure $150M for technology tackling AI hallucinations

An AI research lab has secured $150 million in new funding to develop tools that make artificial intelligence systems less opaque and more controllable, addressing a core challenge in the field. The Series B round values San Francisco-based Goodfire at $1.25 billion, as reported by Tech Funding News.
The investment, led by B Capital, will fuel the creation of a “model design environment.” This platform is intended to allow developers to understand, debug, and deliberately design AI systems at scale, moving away from guesswork about how changes affect a model’s behaviour.
The Interpretability Challenge
Goodfire’s work targets a fundamental issue: most current AI models operate as “black boxes.” They can perform tasks like writing and prediction, but even their creators often cannot see why a specific output is generated. This lack of visibility makes AI difficult to control, repair, and deploy safely.
Led by Eric Ho, the research-focused company aims to build powerful AI by prioritising interpretability, making systems understandable and adjustable much like conventional software, rather than focusing solely on increasing model size. Goodfire plans to continue its research into fundamental model understanding and new methods for interpretation.
Participating investors in the round include existing backers Menlo Ventures, Lightspeed Venture Partners, South Park Commons, and Wing Venture Capital. New investors are DFJ Growth, Salesforce Ventures, and former Google CEO Eric Schmidt. The firm has now raised more than $200 million in total.
Targeting AI From the Inside
Goodfire’s technical approach involves targeting specific internal components within an AI model that drive its behaviour, instead of retraining entire systems from scratch. In one demonstrated application, this method nearly halved the occurrence of hallucinations—factually incorrect or nonsensical outputs—in a large language model by directly adjusting these internal mechanisms.
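To make the general idea concrete, the sketch below shows one simple way an internal component can be adjusted at inference time without retraining: the part of a hidden activation that lies along a chosen direction is removed before the output is computed. This is an illustrative toy, not Goodfire's method. The network, its weights, the `ablate_direction` helper, and the choice of direction are all assumptions made for the example.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, v) for row in m]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def ablate_direction(hidden: Vector, direction: Vector, strength: float = 1.0) -> Vector:
    """Remove (strength=1.0) or dampen (0 < strength < 1) the component of
    `hidden` that lies along `direction`. Hypothetical helper for illustration."""
    norm_sq = dot(direction, direction)
    if norm_sq == 0.0:
        raise ValueError("direction must be non-zero")
    coeff = strength * dot(hidden, direction) / norm_sq
    return [h - coeff * d for h, d in zip(hidden, direction)]


def forward(x: Vector, w_in: Matrix, w_out: Matrix,
            direction: Optional[Vector] = None, strength: float = 1.0) -> Vector:
    """Two-layer toy network; optionally edits the hidden activation in flight.
    The weights themselves are never modified."""
    hidden = relu(matvec(w_in, x))
    if direction is not None:
        hidden = ablate_direction(hidden, direction, strength)
    return matvec(w_out, hidden)


# Toy weights, chosen by hand for the example (not from any real model).
W_IN: Matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_OUT: Matrix = [[1.0, 0.0, 2.0]]

# Assume the third hidden unit is the component driving the unwanted behaviour.
UNWANTED: Vector = [0.0, 0.0, 1.0]

x = [1.0, 2.0]
baseline = forward(x, W_IN, W_OUT)
steered = forward(x, W_IN, W_OUT, direction=UNWANTED)
print(baseline, steered)
```

In this toy the edit changes the output while the weights stay fixed, which is the contrast with retraining that the article describes. Real models are far larger, and finding which internal components matter is the hard part that interpretability research addresses.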
The same technique is being applied to scientific research. By reverse-engineering scientific AI models in collaboration with partners including the Mayo Clinic and the Arc Institute, Goodfire recently helped identify a new class of biomarkers for Alzheimer’s disease.
According to the company, it is part of an emerging group of “neolabs”—research-first AI companies pursuing breakthroughs in areas like model training that have been overlooked by larger “scaling labs” such as OpenAI and Google DeepMind.
“Interpretability, for us, is the toolset for a new domain of science: a way to form hypotheses, run experiments, and ultimately design intelligence rather than stumbling into it,” explained Goodfire CEO Eric Ho.
The sentiment was echoed by investor Yan-David “Yanda” Erlich, General Partner at B Capital, who highlighted the widespread struggle among machine learning teams to understand why their models behave as they do. “Bridging that gap is the next frontier,” he said.
Goodfire’s team comprises researchers with neural network interpretability experience from organisations like OpenAI and DeepMind, academics from institutions including Harvard and Stanford, and engineering talent from OpenAI and Google. Key members include Nick Cammarata, a core contributor to OpenAI’s early interpretability work; co-founder Tom McGrath, who founded the interpretability team at Google DeepMind; and Leon Bergen, a professor at UC San Diego currently on leave.



