MagicData

Total Size: 21G

Dataset Overview

Dataset Type: ASR speech corpus

Language: English

Speech Style: scripted monologue

Content: daily use sentences

Audio Parameters: 16 kHz

File Format:

Recording Equipment:

Recording Environment:

License: Creative Commons BY-NC-ND 3.0

ASR-TEDLIUM: An English Speech Corpus from TED-LIUM

About this resource:

The TED-LIUM corpus (mirrored here) consists of English-language TED talks with transcriptions, sampled at 16 kHz. It contains about 118 hours of speech.

The original page requests that you cite the following paper if you make use of this corpus:

A. Rousseau, P. Deléglise, and Y. Estève, "TED-LIUM: an automatic speech recognition dedicated corpus",
in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), May 2012.
