ETH Zurich and EPFL will release a fully open-source large language model in late summer 2025, trained on the “Alps” supercomputer at the Swiss National Supercomputing Centre. The model represents a significant milestone in open AI development, offering multilingual fluency in over 1,000 languages and positioning European institutions as credible developers of alternatives to closed commercial systems from the US and China.
What you should know: The LLM will be completely transparent, with source code, weights, and training data publicly available under the Apache 2.0 License.
- Unlike commercial models developed behind closed doors, this approach enables high-trust applications and supports regulatory compliance under frameworks like the EU AI Act.
- The model was developed through collaboration between EPFL, ETH Zurich, CSCS (the Swiss National Supercomputing Centre), and other Swiss universities as part of the Swiss AI Initiative.
- “Fully open models enable high-trust applications and are necessary for advancing research about the risks and opportunities of AI,” says Imanol Schlag, research scientist at the ETH AI Center leading the effort.
The big picture: This release marks Europe’s most ambitious effort to create sovereign AI infrastructure independent of US and Chinese commercial systems.
- The Swiss AI Initiative, launched in December 2023, represents the world’s largest open science effort dedicated to AI foundation models, involving over 800 researchers across 10+ Swiss institutions.
- Training on the “Alps” supercomputer demonstrates how public research institutions can leverage advanced infrastructure for open innovation rather than proprietary development.
Multilingual by design: The model’s defining feature is its fluency across more than 1,000 languages, trained on a dataset spanning over 1,500 languages.
- The training data consists of approximately 60% English and 40% non-English languages, plus code and mathematics data.
- “We have emphasised making the models massively multilingual from the start,” says Antoine Bosselut from EPFL’s AI Center.
- This broad language coverage is intended to make the model useful well beyond English-speaking markets, giving it wide international applicability.
Technical specifications: The model will be released in two versions to meet diverse user needs.
- An 8 billion parameter version for general use and a 70 billion parameter version ranking among the world’s most powerful fully open models.
- Training involved over 15 trillion high-quality tokens, enabling robust language understanding and versatile applications.
- The model was trained using 100% carbon-neutral electricity on “Alps,” equipped with over 10,000 NVIDIA Grace Hopper Superchips.
In plain English: Parameters are like the model’s memory capacity—more parameters mean it can learn and remember more complex patterns. Tokens are the building blocks of language processing, representing words or parts of words that help the AI understand and generate text.
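To make these ideas concrete, here is a minimal back-of-envelope sketch. All numbers and the toy tokenizer are illustrative assumptions on our part, not specifications of the Swiss model: it estimates roughly how much memory the announced 8B and 70B weight sets would occupy at 16-bit precision, and shows a naive whitespace tokenizer (real LLMs use subword tokenizers such as BPE, which split rare words into smaller pieces).

```python
def param_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to store the weights,
    assuming 16-bit precision (2 bytes per parameter)."""
    return n_params * bytes_per_param / 1e9

# The two announced model sizes:
print(f"8B model weights:  ~{param_memory_gb(8e9):.0f} GB")   # ~16 GB
print(f"70B model weights: ~{param_memory_gb(70e9):.0f} GB")  # ~140 GB

def naive_tokenize(text: str) -> list[str]:
    """Toy tokenizer: one token per whitespace-separated word.
    Production tokenizers produce subword tokens instead."""
    return text.split()

print(naive_tokenize("Fully open models enable high-trust applications"))
```

The gap between 16 GB and 140 GB is why the smaller version targets general use while the 70B version demands datacenter-class hardware.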
Responsible development: The project adheres to Swiss data protection laws, copyright regulations, and EU AI Act transparency requirements.
- Recent research by the project leaders demonstrated that respecting web crawling opt-outs during data acquisition produces virtually no performance degradation for most everyday tasks.
- The transparent approach enables organizations to build applications while maintaining compliance with emerging AI regulations.
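Honoring a crawling opt-out typically means checking a site's robots.txt before fetching a page. The sketch below is our own illustration using only Python's standard library, not the project's actual data pipeline; the user-agent name is a placeholder.

```python
from urllib.robotparser import RobotFileParser

def may_crawl(robots_txt: str, url: str,
              agent: str = "research-crawler") -> bool:
    """Return True if the given robots.txt permits `agent` to fetch `url`.
    `agent` is a hypothetical crawler name for illustration."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

# Example robots.txt that opts the /private/ path out of crawling:
robots = """User-agent: *
Disallow: /private/
"""

print(may_crawl(robots, "https://example.org/articles/ai.html"))   # True
print(may_crawl(robots, "https://example.org/private/data.html"))  # False
```

A pipeline that filters URLs this way before downloading is one straightforward way to implement the opt-out respect the researchers describe.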
Strategic infrastructure: The “Alps” supercomputer enables Switzerland to develop sovereign AI capabilities through public-private partnerships.
- “Training this model is only possible because of our strategic investment in ‘Alps’, a supercomputer purpose-built for AI,” says Thomas Schulthess, Director of CSCS and ETH Zurich professor.
- The 15-year collaboration with NVIDIA and HPE/Cray exemplifies how partnerships between public institutions and industry can drive open innovation.
What they’re saying: Leaders emphasize the importance of public institutions driving open AI development.
- “As scientists from public institutions, we aim to advance open models and enable organizations to build on them for their own applications,” says Antoine Bosselut.
- Martin Jaggi, EPFL professor, notes that “by embracing full openness — unlike commercial models that are developed behind closed doors — we hope that our approach will drive innovation in Switzerland, across Europe, and through multinational collaborations.”
Global impact: The initiative positions European research institutions as leaders in trustworthy AI development.
- Both the ETH AI Center and EPFL AI Center serve as regional units of ELLIS (European Laboratory for Learning and Intelligent Systems), a pan-European network focused on fundamental AI research.
- The Swiss AI Initiative receives financial support from the ETH Board for 2025-2028, with access to over 20 million yearly GPU hours on the “Alps” supercomputer.
A language model built for the public good