Jasmin-CGN
Corpus description
Jasmin-CGN is a Dutch/Flemish speech corpus that contains speech from underrepresented groups, such as the elderly, children, and non-native speakers.
The corpus is split mainly according to three criteria:
- Dutch/Flemish
- Read/HMI speech
- Speaker groups based on age and native/non-native
Thus, we have:
comp_p
: Speech recordings where a human interacts with a machine (HMI) in a Wizard of Oz setup
comp_q
: Speech recordings of people reading a text
And the speaker groups:
- Native children (ages 7-11)
- Native teenagers (ages 12-16)
- Non-native children (ages 7-16)
- Non-native adults (ages 18-60)
- Native elderly (ages 65+)
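As a rough illustration of how the split criteria listed above combine, the following sketch enumerates every (language variant, component, speaker group) triple. Only the component names comp_p and comp_q come from the corpus description; the other labels and the function name `enumerate_subsets` are hypothetical, and the corpus does not necessarily contain recordings for every combination.

```python
from itertools import product

# Language variants and speaker-group labels are illustrative strings,
# not identifiers used by the corpus itself.
VARIANTS = ("Dutch", "Flemish")

# Component names as given in the description above.
COMPONENTS = {
    "comp_p": "HMI speech (Wizard of Oz setup)",
    "comp_q": "read speech",
}

SPEAKER_GROUPS = (
    "native children (7-11)",
    "native teenagers (12-16)",
    "non-native children (7-16)",
    "non-native adults (18-60)",
    "native elderly (65+)",
)


def enumerate_subsets():
    """Return all (variant, component, speaker group) combinations."""
    return list(product(VARIANTS, COMPONENTS, SPEAKER_GROUPS))


if __name__ == "__main__":
    for variant, component, group in enumerate_subsets():
        print(f"{variant:8s} {component:7s} {group}")
```

A recipe would typically select one or more of these combinations, for example only Flemish read speech from native children, when building training and test sets.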
For more details about the corpus, check the documentation. The corpus, including its documentation, can be downloaded from here. The paper released for the corpus can be found here.