Mistral supercharges voice AI with new models | The Deep View

Created 2/4/2026 at 3:33:00 PM; edited 2/4/2026 at 3:33:42 PM

Voxtral Realtime - A 4-billion-parameter model for live transcription that achieves “state of the art” results at 480ms latency across 13 languages. It can be configured for latency below 200ms.

Results on the FLEURS benchmark show that Voxtral Mini Transcribe V2 performs competitively against Gemini and OpenAI models, while also posting the lowest diarization error rate.
