Total Size: 4.2G

Dataset Overview

Dataset Type: ASR speech corpus
Language: English and Czech
Speech Style: spontaneous conversation
Content: themed conversations

Audio Parameters

File Format: WAV (PCM), TXT (UTF-8)

Recording Equipment

Recording Environment

/

English and Czech telephone conversation data from Vystadial

The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions.

This open-source dataset consists of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts.
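
As a rough illustration of working with the WAV (PCM) audio and UTF-8 text transcriptions, the sketch below pairs each recording with its transcript and sums the audio duration. It uses only the Python standard library. The directory layout and the transcript naming (a text file with the same stem as the WAV file) are assumptions for illustration, not something this page documents, so the `transcript_suffix` parameter may need adjusting to match the files actually shipped.

```python
import wave
from pathlib import Path


def wav_duration_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def pair_audio_with_transcripts(root, transcript_suffix=".txt"):
    """Yield (wav_path, transcript_text, duration_seconds) for each WAV under root.

    Assumption (hypothetical layout): every transcript sits next to its WAV
    file and shares its stem, e.g. utt1.wav and utt1.txt.
    WAV files without a matching transcript are skipped.
    """
    for wav_path in sorted(Path(root).rglob("*.wav")):
        txt_path = wav_path.with_suffix(transcript_suffix)
        if not txt_path.exists():
            continue
        text = txt_path.read_text(encoding="utf-8").strip()
        yield wav_path, text, wav_duration_seconds(wav_path)


def total_hours(root):
    """Sum the duration of all paired recordings under root, in hours."""
    seconds = sum(d for _, _, d in pair_audio_with_transcripts(root))
    return seconds / 3600.0
```

Calling `total_hours` on the unpacked English or Czech part should give a figure that can be compared with the hours quoted above, provided the naming assumption holds for the extracted files.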
