Total Size: 21G

Dataset Overview

Dataset Type

ASR speech corpus

Language

English

Speech Style

scripted monologue

Content

daily use sentences

Audio Parameters

16 kHz

File Format

Recording Equipment

Recording Environment

English Speech Corpus from TED-LIUM

The TED-LIUM corpus consists of English-language TED talks with transcriptions, sampled at 16 kHz. It contains about 118 hours of speech.

The dataset is provided on an “As Is” basis, and no warranty, either expressed or implied, is given. Your use of the dataset is at your sole risk. You expressly understand and agree that MagicHub and/or Beijing Magic Data Technology Co., Ltd. shall not be liable for any direct, indirect, incidental, special or consequential damages, including but not limited to damages for loss of profits, goodwill, use, data or other intangible losses related to the datasets.

Copyright © 2021 Beijing Magic Data Technology Co., Ltd. All rights reserved.
