Liquid AI’s new vision-language models run 2x faster on smartphones

Liquid AI has launched LFM2-VL, a new family of vision-language foundation models designed for efficient deployment on smartphones, laptops, wearables, and embedded systems. The models promise up to twice the GPU inference speed of comparable vision-language models while maintaining competitive accuracy, addressing the growing demand for on-device AI that can process both text and images without relying on cloud infrastructure.

What you should know: LFM2-VL represents a significant step toward making multimodal AI accessible for resource-constrained devices through architectural innovations that prioritize efficiency.

  • The models can process images at native resolutions up to 512×512 pixels without distortion, using smart patching for larger images that preserves both fine detail and global context.
  • Two variants are available: LFM2-VL-450M, with roughly 450 million parameters for highly constrained environments, and LFM2-VL-1.6B for more capable single-GPU deployment.
  • Both models are built on Liquid AI’s foundation architecture based on dynamical systems and signal processing principles, moving beyond traditional transformer models.

How it works: The models use a modular architecture combining multiple specialized components to achieve their efficiency gains.

  • LFM2-VL integrates a language model backbone, a SigLIP2 NaFlex vision encoder, and a multimodal projector that uses a pixel unshuffle technique to reduce the number of image tokens and improve throughput.
  • Users can adjust parameters like maximum image tokens or patches to balance speed and quality for specific deployment scenarios.
  • Training involved approximately 100 billion multimodal tokens from open datasets and synthetic data.
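The pixel unshuffle step mentioned above is a standard token-reduction trick: it trades spatial resolution for channel depth, so the projector hands the language model fewer image tokens. A minimal NumPy sketch (the actual LFM2-VL implementation details and downscale factor are not published in this article, so this is illustrative only):

```python
import numpy as np

def pixel_unshuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C, H, W) feature map into (C*r*r, H//r, W//r).

    Spatial resolution drops by r along each axis, so the number of
    image tokens downstream shrinks by a factor of r*r, at the cost
    of a proportionally wider channel dimension.
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)       # split each spatial axis
    x = x.transpose(0, 2, 4, 1, 3)               # gather the r*r offsets
    return x.reshape(c * r * r, h // r, w // r)  # fold offsets into channels

# Example: a 512x512 image encoded into a 32x32 grid of patch features
feats = np.random.rand(64, 32, 32)
out = pixel_unshuffle(feats, 2)
print(feats.shape, "->", out.shape)  # (64, 32, 32) -> (256, 16, 16)
# Token count falls from 32*32 = 1024 to 16*16 = 256, a 4x reduction.
```

The same operation is available in deep-learning frameworks (e.g. PyTorch's `nn.PixelUnshuffle`); tightening or loosening the downscale factor is one way a model family can trade accuracy for throughput.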

Performance benchmarks: LFM2-VL achieves competitive results across vision-language evaluations while delivering superior processing speeds.

  • The 1.6B model scores 65.23 on RealWorldQA, 58.68 on InfoVQA, and 742 on OCRBench, showing solid performance on multimodal reasoning tasks.
  • In inference testing, LFM2-VL achieved the fastest GPU processing times in its class on a standard workload of a 1024×1024 image with a short prompt.

The bigger picture: This launch builds on Liquid AI’s broader strategy to decentralize AI execution and reduce cloud dependency.

  • In July 2025, the company launched the Liquid Edge AI Platform (LEAP), a cross-platform SDK enabling developers to run small language models directly on mobile devices.
  • LEAP offers OS-agnostic support for iOS and Android with models as small as 300MB, accompanied by Apollo, a companion app for offline model testing.
  • The approach reflects growing industry interest in privacy-preserving, low-latency AI that operates independently of internet connectivity.

What they’re saying: Liquid AI co-founder and CEO Ramin Hasani emphasized the company’s core value proposition in announcing the release.

  • “Efficiency is our product,” Hasani wrote on X, highlighting the models’ “up to 2× faster on GPU with competitive accuracy” and “smart patching for big images.”

Company background: Liquid AI was founded by former MIT CSAIL researchers focused on building alternatives to transformer-based architectures.

  • The company’s Liquid Foundation Models are based on principles from dynamical systems, signal processing, and numerical linear algebra.
  • Their approach aims to deliver competitive performance using significantly fewer computational resources while enabling real-time adaptability during inference.

Availability and licensing: The models are immediately accessible through standard development channels with custom licensing terms.

  • LFM2-VL models are available on Hugging Face with example fine-tuning code in Colab, compatible with Hugging Face transformers and TRL.
  • They’re released under a custom “LFM1.0 license” based on Apache 2.0 principles, with commercial use permitted under different terms for companies above and below $10 million in annual revenue.