Dutch Open Speech Recognition Benchmark

Results of Dutch ASR models, collected by the community

N-Best 2008 Dutch

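The N-Best 2008 Dutch Evaluation corpus was designed to evaluate Dutch and Flemish speech recognition systems in 2008. It consists of four subsets:

- Broadcast News in the Netherlands (NL)
- Conversational Speech in the Netherlands (NL)
- Broadcast News in Belgium (BE)
- Conversational Speech in Belgium (BE)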

For more details about the corpus, click here.


The table below lists the word error rate (WER) of the baseline model, Kaldi_NL, and of several end-to-end models on each subset of the corpus. A dash marks a subset for which no result is reported:

| Model \ Dataset | Broadcast News in NL | Conversational Speech in NL | Broadcast News in BE | Conversational Speech in BE |
|---|---|---|---|---|
| Kaldi_NL | 12.6% | 38.6% | 21.2% | 59.4% |
| Whisper v2 | 12.7% | 25.9% | - | - |
| Whisper v3 | 13.7% | 28.1% | - | - |
| Whisper v2 w/ VAD | 11.6% | 25.3% | - | - |
| Whisper v3 w/ VAD | 14.1% | 26.5% | - | - |
| faster-whisper v2 | 10.6% | 24.1% | 13.0% | 38.5% |
| faster-whisper v3 | 12.5% | 25.5% | 14.9% | 38.4% |
| faster-whisper v2 w/ VAD | 10.0% | 23.9% | 13.6% | 37.9% |
| faster-whisper v3 w/ VAD | 12.3% | 25.1% | 14.6% | 36.9% |
| XLS-R FT on Dutch | 14.8% | 33.5% | 17.0% | 51.7% |
| MMS - 102 languages | 23.4% | 49.0% | 25.0% | 64.2% |
| MMS - 1162 languages | 18.5% | 42.7% | 19.4% | 57.7% |
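
For readers unfamiliar with the metric, here is a minimal sketch of how a WER figure can be computed for a single reference/hypothesis pair: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This page does not say which scoring tool or text normalization produced the numbers above, so the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch: tokenization is plain whitespace splitting, which
    is an assumption and not necessarily what the benchmark used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("de kat zit op de mat", "de kat zat op mat")` counts one substitution and one deletion against six reference words. A corpus-level score is usually obtained by summing the errors over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.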


The following results are for Kaldi_NL and the Whisper models on Broadcast News in the Netherlands, with the foreign-speech lines removed from the dataset:

| Model | WER |
|---|---|
| Kaldi_NL | 12.1% |
| Whisper v2 | 12.3% |
| Whisper v3 | 13.6% |
| Whisper v2 w/ VAD | 11.1% |
| Whisper v3 w/ VAD | 13.9% |

One thing to note about this version of the dataset: Whisper v3 already recognized most foreign speech and ignored it, so its hypothesis files were left mostly unedited. The only notable exception was a German speaker, whose speech was still transcribed. I assume this is because German and Dutch are closely related languages.


The table below shows the total time each model took to process the respective subset:

| Model \ Dataset | Broadcast News in NL | Conversational Speech in NL | Broadcast News in BE | Conversational Speech in BE |
|---|---|---|---|---|
| Kaldi_NL | 0h:08m:58s | 0h:14m:47s | 0h:15m:57s | 0h:20m:07s |
| Whisper v2 | 1h:11m:59s | 0h:53m:55s | - | - |
| Whisper v3 | 1h:09m:00s | 0h:40m:20s | - | - |
| Whisper v2 w/ VAD | 0h:52m:03s | 0h:40m:09s | - | - |
| Whisper v3 w/ VAD | 1h:02m:13s | 0h:37m:50s | - | - |
| faster-whisper v2 | 0h:11m:31s | 0h:09m:30s | 0h:10m:55s | 0h:09m:39s |
| faster-whisper v3 | 0h:11m:21s | 0h:09m:41s | 0h:10m:36s | 0h:09m:54s |
| faster-whisper v2 w/ VAD | 0h:12m:13s | 0h:09m:36s | 0h:11m:01s | 0h:09m:46s |
| faster-whisper v3 w/ VAD | 0h:12m:25s | 0h:09m:13s | 0h:10m:45s | 0h:09m:04s |
| XLS-R FT on Dutch | 0h:07m:36s | 0h:07m:52s | 0h:08m:44s | 0h:08m:21s |
| MMS - 102 languages | 0h:04m:33s | 0h:04m:23s | 0h:04m:38s | 0h:03m:54s |
| MMS - 1162 languages | 0h:05m:26s | 0h:04m:14s | 0h:04m:37s | 0h:03m:55s |
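
To compare these timings numerically, the durations can be converted to seconds and divided. The sketch below is only a convenience for reading the table; the helper names `to_seconds` and `speedup` are hypothetical and not part of the benchmark code.

```python
import re

# Matches the duration format used in the timing table, e.g. "1h:11m:59s".
_DURATION = re.compile(r"(\d+)h:(\d+)m:(\d+)s")


def to_seconds(duration: str) -> int:
    """Convert a duration such as '0h:11m:31s' to a number of seconds."""
    match = _DURATION.fullmatch(duration.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    hours, minutes, seconds = (int(group) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline: str, other: str) -> float:
    """Return how many times faster `other` ran compared with `baseline`."""
    return to_seconds(baseline) / to_seconds(other)
```

For instance, `speedup("1h:11m:59s", "0h:11m:31s")` compares Whisper v2 with faster-whisper v2 on Broadcast News in NL (4319 s versus 691 s) and gives a factor of roughly 6.25.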

Preprocessing, setup, and postprocessing

For more details, click here.