MagicData

Dataset Overview

Dataset Type

ASR speech corpus

Language

en-CN

Speech Style

scripted

Content

daily-use sentences, digits, phrases, vocabulary, and letter-by-letter spelling

Audio Parameters

16 kHz, 16-bit, mono

File Format

WAV (PCM)
TXT (UTF-8)

Recording Equipment

mobile

Recording Environment

indoor environment

Proprietary | ASR Corpus | 1,172 hours

ASR-BigSCEDuSC: A Scripted Chinese English Daily-use Speech Corpus

MDT-ASR-A004 | MDT-ASR-E042 | 1,172 hours of transcribed Chinese English scripted speech

This dataset portfolio contains 1,172 hours of transcribed, scripted Chinese English speech from 3,430 speakers, covering daily-use sentences, digits, phrases, vocabulary, and letter-by-letter spelling.
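The listed audio parameters (16 kHz, 16-bit, mono, WAV PCM) and the UTF-8 TXT transcripts can be sanity-checked with Python's standard library once files are in hand. The sketch below is illustrative only: the function names are hypothetical, and the listing does not describe the corpus directory layout or transcript structure, so each function takes a single file path rather than assuming one.

```python
import wave

# Expected values taken from the "Audio Parameters" field of the listing.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def matches_listing(wav_path):
    """Return True if the WAV header reports 16 kHz, 16-bit, mono PCM."""
    with wave.open(wav_path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # uncompressed PCM
        )


def read_transcript(txt_path):
    """Read a transcript file as UTF-8 text (internal structure is not assumed)."""
    with open(txt_path, encoding="utf-8") as handle:
        return handle.read().strip()


def raw_pcm_bytes(hours):
    """Raw audio payload size in bytes, ignoring WAV headers and transcripts."""
    bytes_per_second = (
        EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    )
    return int(hours * 3600 * bytes_per_second)
```

As a rough planning figure, `raw_pcm_bytes(1172)` gives about 135 GB of raw PCM audio for the full 1,172 hours, and 1,172 hours spread over 3,430 speakers averages roughly 20.5 minutes per speaker. Actual download size depends on packaging, which the listing does not specify.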

Contact business@magicdatatech.com to learn more.

