MagicData

Dataset Overview

Dataset Type

ASR speech corpus

Language

zh-CN

Speech Style

Scripted

Content

Daily-Use Sentence

Audio Parameters

16 kHz, 16 bits, 3 channels

File Format

WAV (PCM)
TXT (UTF-8)

Recording Equipment

microphone, mobile & Bluetooth headset

Recording Environment

indoor, outdoor, in-vehicle, public place
Proprietary
ASR Corpus
506 hours

ASR-SCDuSC: A Scripted Chinese Daily-use Speech Corpus

MDT-ASR-F001 | 506 hours of transcribed Mandarin Chinese scripted speech on daily use sentences

This dataset contains 506 hours of transcribed, scripted Mandarin Chinese speech from 768 speakers, focusing on daily-use sentences.
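The listed audio parameters (16 kHz, 16 bits, WAV PCM) can be checked against a delivered file with a short script. The following is a minimal sketch using only Python's standard `wave` module. The function names are hypothetical and not part of any MagicData tooling. The corpus directory layout is not described here, so the demo writes a silent placeholder file instead of reading corpus audio. Because the listing does not say whether "3 channels" means interleaved channels in one file or separate recordings per device, the channel count is reported but not checked.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 16000    # "16 kHz" from the overview
EXPECTED_SAMPLE_WIDTH = 2   # "16 bits" = 2 bytes per sample


def describe_wav(path):
    """Return basic header parameters of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "rate_hz": rate,
            "sample_width_bytes": wf.getsampwidth(),
            "channels": wf.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_listing(info):
    """True if sample rate and bit depth agree with the listed parameters."""
    return (
        info["rate_hz"] == EXPECTED_RATE_HZ
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH
    )


def write_silence(path, seconds=1.0, channels=1):
    """Write a silent 16 kHz, 16-bit WAV (placeholder, not corpus data)."""
    n_frames = int(EXPECTED_RATE_HZ * seconds)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(EXPECTED_SAMPLE_WIDTH)
        wf.setframerate(EXPECTED_RATE_HZ)
        wf.writeframes(b"\x00" * EXPECTED_SAMPLE_WIDTH * channels * n_frames)


def demo():
    """Create a placeholder file, inspect it, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        write_silence(path)
        return describe_wav(path)
    finally:
        os.remove(path)
```

To inspect a real file, call `describe_wav` with its path and pass the result to `matches_listing`. The accompanying TXT transcripts are listed as UTF-8, so opening them with `encoding="utf-8"` should be the natural choice.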

Contact business@magicdatatech.com to learn more.

