Total Size: 4.2G

Dataset Overview

Dataset Type: ASR speech corpus

Language: English and Czech

Speech Style: spontaneous conversation

Content: themed conversations

Audio Parameters

File Format: WAV (PCM); TXT (UTF-8)

Recording Equipment:

Recording Environment:

License: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)

Third Party

ASR Corpus

ASR-Vystadial: An English and Czech Telephone Conversational Corpus from the Vystadial Project

The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions.

About this resource:

The corpus consists of transcribed telephone conversations in English and Czech.

The data collection process and the development of these training scripts were partly funded by the Ministry of Education, Youth and Sports of the Czech Republic under grant agreement LK11221 and by core research funding from Charles University in Prague.

You can cite the data using the following BibTeX entry:

@inproceedings{korvas_2014,
  title={{Free English and Czech telephone speech corpus shared under the CC-BY-SA 3.0 license}},
  author={Korvas, Mat\v{e}j and Pl\'{a}tek, Ond\v{r}ej and Du\v{s}ek, Ond\v{r}ej and \v{Z}ilka, Luk\'{a}\v{s} and Jur\v{c}\'{i}\v{c}ek, Filip},
  booktitle={Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014)},
  pages={To Appear},
  year={2014},
}
