Ai2 closes the gap between closed-source and open-source post-training




The Allen Institute for AI (Ai2) claims to have narrowed the gap between closed-source and open-source post-training with the release of its new model training family, Tülu 3, advancing the argument that open-source models will thrive in the enterprise space.

Tülu 3 brings open-source models up to par with OpenAI’s GPT models, Claude from Anthropic and Google’s Gemini. It allows researchers, developers and enterprises to fine-tune open-source models without losing the model’s data and core skills, bringing them close to the quality of closed-source models.

Ai2 said it released Tülu 3 with all of the data, data mixes, recipes, code, infrastructure and evaluation frameworks. The company needed to create new datasets and training methods to improve Tülu’s performance, including “training directly on verifiable problems with reinforcement learning.”
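To make the idea of a "verifiable problem" concrete, here is a minimal sketch of a reward that can be checked programmatically, which is one common way such training signals are built. It is an illustration only, not Ai2's code: the function name `verifiable_reward` and the normalization rules are assumptions made for this example.

```python
def normalize(answer: str) -> str:
    """Strip whitespace, lowercase, and drop trailing periods (hypothetical rules)."""
    return answer.strip().lower().rstrip(".")


def verifiable_reward(model_answer: str, reference_answer: str) -> float:
    """Hypothetical reward: 1.0 only when the answer matches the known-correct one."""
    return 1.0 if normalize(model_answer) == normalize(reference_answer) else 0.0


# A math problem has a single checkable answer, so the reward needs no human judge.
reward_correct = verifiable_reward(" 42. ", "42")
reward_wrong = verifiable_reward("41", "42")
```

In a reinforcement learning loop, a score like this would stand in for a learned reward model on tasks where correctness can be checked directly.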

“Our best models result from a complex training process that integrates partial details from proprietary methods with novel techniques and established academic research,” Ai2 said in a blog post. “Our success is rooted in careful data curation, rigorous experimentation, innovative methodologies and improved training infrastructure.”

Tülu 3 will be available in a range of sizes. 

Open-source for enterprises

Open-source models have often lagged behind closed-source models in enterprise adoption, although more companies have anecdotally reported choosing open-source large language models (LLMs) for projects.

Ai2’s thesis is that improving fine-tuning with open-source models like Tülu 3 will increase the number of enterprises and researchers picking open-source models because they can be confident the models will perform as well as Claude or Gemini.

The company points out that Tülu 3 and Ai2’s other models are fully open source, noting that for big model trainers like Anthropic and Meta, including those that claim to be open source, “none of their training data nor training recipes are transparent to users.” The Open Source Initiative recently published the first version of its open-source AI definition, but some organizations and model providers don’t fully follow the definition in their licenses.

Enterprises care about the transparency of models, but many choose open-source models not so much for research or data openness but because they are the best fit for their use cases.

Tülu 3 offers enterprises more of a choice when looking for open-source models to bring into their stack and fine-tune with their data. 

Ai2’s other models, OLMoE and Molmo, are also open source, and the company said they have started to outperform other leading models like GPT-4o and Claude.

Other Tülu 3 features

Ai2 said Tülu 3 lets companies mix and match their data during fine-tuning. 

“The recipes help you balance the datasets, so if you want to build a model that can code, but also follow instructions precisely and speak in multiple languages, you just select the particular datasets and follow the steps in the recipe,” Ai2 said. 
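As a rough illustration of what balancing datasets can look like in practice, the sketch below samples a fine-tuning mixture from several sources in proportion to chosen weights. It is not taken from the Tülu 3 release: the function `mix_datasets`, the dataset names, the placeholder examples and the weights are all hypothetical.

```python
import random


def mix_datasets(datasets, weights, total, seed=0):
    """Draw about `total` examples, splitting them across datasets by weight."""
    rng = random.Random(seed)
    weight_sum = sum(weights[name] for name in datasets)
    mixed = []
    for name, examples in datasets.items():
        # Share of the final mixture that this dataset contributes.
        quota = round(total * weights[name] / weight_sum)
        # Sampling with replacement lets a small dataset be upsampled.
        mixed.extend(rng.choices(examples, k=quota))
    rng.shuffle(mixed)
    return mixed


# Hypothetical placeholder data standing in for real training examples.
datasets = {
    "coding": [f"code-{i}" for i in range(100)],
    "instruction_following": [f"instr-{i}" for i in range(50)],
    "multilingual": [f"multi-{i}" for i in range(20)],
}
weights = {"coding": 5, "instruction_following": 3, "multilingual": 2}

mixture = mix_datasets(datasets, weights, total=1000)
```

Changing the weights shifts the balance between skills without touching the underlying datasets, which is the kind of adjustment a recipe would spell out.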

Mixing and matching datasets can make it easier for developers to move from a smaller model to a larger one while keeping its post-training settings. The company said the infrastructure code it released with Tülu 3 allows enterprises to build out that pipeline when moving through model sizes.

The evaluation framework from Ai2 offers a way for developers to specify settings for what they want to see from the model.


