Mistral surcharges voice AI with new models | The Deep View

Created 2/4/2026 at 3:33:00 PM • Edited 2/4/2026 at 3:33:42 PM

https://www.thedeepview.com/articles/mistral-surcharges-voice-ai-with-new-models

Voxtral Realtime - A 4-billion-parameter model aimed at live transcription, achieving “state of the art” transcription with 480ms latency across 13 languages. It is configurable down to sub-200ms latency.

On the FLEURS benchmark, Voxtral Mini Transcribe V2 performs competitively against Gemini and OpenAI models and has the lowest diarization error rate.

speech-to-text llm/audio mistral

Public