Voice emerges as AI’s next frontier as Deepgram raises $130M | The Deep View
One of Deepgram’s goals for the upcoming year is to pass the Audio Turing Test, which assesses how realistic and human-like AI-generated audio sounds.
Public notes from activescott tagged with both #llm/audio and #speech-to-text
Voxtral Realtime - A 4-billion-parameter model aimed at live transcription, achieving “state of the art” transcription with 480ms latency across 13 languages. Latency is configurable down to sub-200ms.
Performance on the FLEURS benchmark shows that Voxtral Mini Transcribe V2 performs competitively against models from Gemini and OpenAI, with the lowest diarization error rate.