Whisper Large v3 Turbo
Transcriptions
whisper-large-v3-turbo
Whisper large-v3-turbo is a fine-tuned version of a pruned Whisper large-v3. In other words, it is the same model, except that the number of decoder layers has been reduced from 32 to 4. As a result, the model is substantially faster, at the cost of a minor degradation in quality.
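To make the size of the pruning concrete, here is a minimal sketch of the arithmetic behind the 32-to-4 reduction. The `DecoderConfig` class and the `decoder_depth_ratio` helper are hypothetical names introduced for illustration only; they are not part of any Whisper library. The resulting ratio describes decoder depth only and should not be read as a measured speedup, since the other parts of the model still contribute to total latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hypothetical container for the one number the description gives us."""
    name: str
    decoder_layers: int


# Layer counts taken from the description above.
LARGE_V3 = DecoderConfig("whisper-large-v3", decoder_layers=32)
TURBO = DecoderConfig("whisper-large-v3-turbo", decoder_layers=4)


def decoder_depth_ratio(base: DecoderConfig, pruned: DecoderConfig) -> float:
    """Return how many times shallower the pruned decoder is than the base."""
    return base.decoder_layers / pruned.decoder_layers


ratio = decoder_depth_ratio(LARGE_V3, TURBO)
kept_fraction = TURBO.decoder_layers / LARGE_V3.decoder_layers

print(f"{TURBO.name} decoder is {ratio:.0f}x shallower than {LARGE_V3.name}")
print(f"Decoder layers retained: {kept_fraction:.1%}")
```

Because the decoder runs once per generated token, decoder depth is a plausible driver of the speed gain, but actual throughput depends on hardware, batch size, and audio length.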
Throughput
Not enough throughput data
Time-To-First-Token (TTFT)
Not enough TTFT data