German AI consulting firm TNG Technology Consulting GmbH has released DeepSeek-TNG R1T2 Chimera, a significantly faster variant of DeepSeek’s popular open-source reasoning model R1-0528. The new model preserves about 90% of the original’s performance on reasoning benchmarks while generating responses with 60% fewer tokens, which translates to roughly 2.5x faster inference per answer and dramatically lower compute costs for enterprises.
What you should know: R1T2 represents a breakthrough in AI model efficiency through TNG’s Assembly-of-Experts (AoE) methodology, which merges multiple pre-trained models without additional training.
- The model combines three parent models: DeepSeek-R1-0528, DeepSeek-R1, and DeepSeek-V3-0324, creating what TNG calls a “Tri-Mind” configuration.
- Unlike traditional training approaches, AoE selectively merges weight tensors from existing models, preserving reasoning capabilities while reducing verbosity (sketched in code after this list).
- R1T2 maintains 90-92% of R1-0528’s performance on reasoning benchmarks like AIME-24, AIME-25, and GPQA-Diamond while using only 40% of the output tokens.
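In weight-space terms, the merge is closer to checkpoint arithmetic than to training. The sketch below shows the basic operation, linear interpolation of matching weight tensors across same-architecture parents; the function name and the mixing coefficients are illustrative assumptions, not TNG’s published recipe.

```python
import torch

def interpolate_state_dicts(parents: list[dict[str, torch.Tensor]],
                            coeffs: list[float]) -> dict[str, torch.Tensor]:
    """Linearly interpolate matching weight tensors across parent checkpoints.

    Assumes every parent shares one architecture, so each tensor name
    appears in all state dicts with identical shapes.
    """
    assert abs(sum(coeffs) - 1.0) < 1e-6, "mixing coefficients should sum to 1"
    merged = {}
    for name in parents[0]:
        merged[name] = sum(c * p[name] for c, p in zip(coeffs, parents))
    return merged

# Illustrative "Tri-Mind" merge; these coefficients are invented for the sketch:
# merged = interpolate_state_dicts(
#     [r1_0528_weights, r1_weights, v3_0324_weights], coeffs=[0.5, 0.3, 0.2])
```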
How Assembly-of-Experts differs from Mixture-of-Experts: AoE is a model merging technique rather than an architectural design, setting it apart from the more common MoE approach.
- MoE models like DeepSeek-V3 conditionally activate different expert components during inference, with only a subset of experts active per token.
- AoE creates new models by interpolating weight tensors from multiple pre-trained models, focusing on merging the routed expert tensors responsible for specialized reasoning.
- TNG’s implementation retains efficient shared and attention layers from faster parent models while incorporating reasoning strength from the more capable ones, as the sketch below illustrates.
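To make the contrast concrete, here is a minimal sketch of a selective merge that interpolates only the routed-expert tensors and keeps everything else from the faster parent. The `.mlp.experts.` name pattern is an assumption modeled on DeepSeek-V3-style checkpoints, not TNG’s actual code.

```python
import torch

def selective_expert_merge(fast: dict[str, torch.Tensor],
                           strong: dict[str, torch.Tensor],
                           alpha: float = 0.6) -> dict[str, torch.Tensor]:
    """Blend routed-expert tensors from a reasoning-strong parent into a
    faster parent, leaving all other layers untouched.

    The ".mlp.experts." pattern is assumed, based on DeepSeek-V3-style naming.
    """
    merged = {}
    for name, tensor in fast.items():
        if ".mlp.experts." in name:
            # Routed expert weights: pull in reasoning strength
            merged[name] = (1 - alpha) * tensor + alpha * strong[name]
        else:
            # Shared experts, attention, embeddings: keep the fast parent
            merged[name] = tensor
    return merged
```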
In plain English: Think of MoE like a large company where different departments handle different tasks as needed—only the relevant teams work on each project. AoE is more like creating a new employee by combining the best skills from three existing employees, without having to train someone from scratch.
Performance benchmarks: The speed improvements come from dramatically reduced output verbosity rather than raw processing acceleration.
- R1T2 generates responses using approximately 40% of the tokens required by R1-0528, directly reducing inference time and compute load.
- The model is 20% more concise than the original DeepSeek-R1 while maintaining similar reasoning quality.
- TNG measures “speed” as output token count per answer, a practical proxy for both cost and latency; the arithmetic below makes the relationship explicit.
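The arithmetic behind those claims is straightforward: at a fixed per-token decode rate, an answer that needs only ~40% of the tokens finishes in ~40% of the time. The numbers below are hypothetical, chosen only to show the relationship.

```python
# Hypothetical average output length per answer (illustrative numbers only)
tokens_r1_0528 = 10_000
tokens_r1t2 = 0.4 * tokens_r1_0528  # R1T2 uses ~40% of the parent's tokens

# Generation time scales with output token count at a fixed decode rate,
# so each R1T2 answer finishes in ~40% of the wall-clock time.
speedup = tokens_r1_0528 / tokens_r1t2
print(f"Effective speedup per answer: {speedup:.1f}x")  # -> 2.5x
```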
What the AI community is saying: Early response from developers has been overwhelmingly positive, with industry leaders praising the technical achievement.
- “DAMN! DeepSeek R1T2 – 200% faster than R1-0528 & 20% faster than R1,” wrote Vaibhav Srivastav, a senior leader at Hugging Face, the popular AI model-sharing platform, in a post on X. “Significantly better than R1 on GPQA & AIME 24, made via Assembly of Experts with DS V3, R1 & R1-0528 — and it’s MIT-licensed, available on Hugging Face.”
Deployment considerations: The model is available under the MIT License, with some important limitations and regulatory considerations; a minimal loading sketch follows the list below.
- R1T2 is not recommended for function calling or tool use applications due to inherited limitations from its DeepSeek-R1 lineage.
- European users must assess compliance with the EU AI Act, whose obligations for general-purpose AI models apply from August 2, 2025.
- U.S. companies operating domestically face no EU AI Act restrictions, though provisions may apply if serving EU users.
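For teams that want to evaluate the model, here is a minimal loading sketch using Hugging Face transformers. The repository id matches TNG’s Hugging Face release but should be verified, and a model of this size realistically needs a multi-GPU serving stack (e.g., vLLM) rather than this toy snippet.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tngtech/DeepSeek-TNG-R1T2-Chimera"  # verify the repo id before use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the checkpoint's native precision
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # DeepSeek-style architectures ship custom code
)

messages = [{"role": "user", "content": "Explain briefly: what is 17 * 24?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```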
About TNG Technology Consulting: The 24-year-old German firm operates as a values-based consulting partnership with over 900 employees, including a high concentration of PhDs and technical specialists.
- Founded in January 2001 and based in Bavaria, TNG serves major enterprise clients across telecommunications, insurance, automotive, e-commerce, and logistics.
- The company actively contributes to open-source communities and research, with previous Chimera variants processing billions of tokens daily through platforms like OpenRouter and Chutes.
- TNG’s unique structure, grounded in operational research and self-management principles, supports a culture of technical innovation.
Why this matters for enterprises: R1T2 offers tangible benefits for technical decision-makers looking to balance AI performance with operational efficiency.
- Lower inference costs through reduced GPU time and energy consumption, especially valuable in high-throughput environments (see the back-of-the-envelope example after this list).
- High reasoning quality without the overhead of verbose responses, ideal for structured tasks requiring concise answers.
- Open MIT licensing allows full deployment control and customization within regulated or air-gapped environments.
- The AoE approach suggests a future where enterprises can build specialized AI variants by recombining existing model strengths rather than training from scratch.
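As a back-of-the-envelope illustration of the cost effect, the price and traffic figures below are placeholders, not real rates:

```python
# Purely hypothetical serving assumptions
price_per_1k_output_tokens = 0.002      # USD, placeholder rate
queries_per_day = 1_000_000
tokens_per_answer_r1_0528 = 8_000
tokens_per_answer_r1t2 = 0.4 * tokens_per_answer_r1_0528  # ~40% of the tokens

def daily_output_cost(tokens_per_answer: float) -> float:
    """Output-side spend for one day of traffic at the placeholder rate."""
    return queries_per_day * tokens_per_answer / 1_000 * price_per_1k_output_tokens

print(f"R1-0528: ${daily_output_cost(tokens_per_answer_r1_0528):,.0f}/day")  # $16,000
print(f"R1T2:    ${daily_output_cost(tokens_per_answer_r1t2):,.0f}/day")     # $6,400
# Cutting output tokens by 60% cuts output-side spend by the same 60%.
```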