Total Size: 4.2G

Dataset Overview

Dataset Type

ASR speech corpus

Language

English and Czech

Speech Style

spontaneous conversation

Content

themed conversations

Audio Parameters

File Format

WAV (PCM), TXT (UTF-8)

Recording Equipment

Recording Environment

/

English and Czech telephone conversation data from Vystadial

The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions.

This open-source dataset consists of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts.
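Since the audio is distributed as PCM WAV, the amount of speech in an unpacked copy can be checked with the Python standard library alone. The following is only a minimal sketch: the function name `total_hours` and the assumption that audio files carry a lowercase `.wav` extension somewhere under a single root directory are illustrative, not taken from the dataset documentation.

```python
import wave
from pathlib import Path


def total_hours(root: str) -> float:
    """Sum the duration of every readable PCM WAV file under root, in hours.

    Assumes (hypothetically) that audio files end in a lowercase ".wav".
    """
    seconds = 0.0
    for path in sorted(Path(root).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wav:
                # Duration of one file = number of frames / frames per second.
                seconds += wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            # Skip files the stdlib reader cannot parse (non-PCM or truncated).
            continue
    return seconds / 3600.0
```

Calling `total_hours` on the directory holding the English part, and again on the Czech part, gives figures that can be compared against the hours listed for each language.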
