MagicData

Total Size: 590 MB

Dataset Overview

Dataset Type

ASR speech corpus

Language

cmn-Sichuan,
Mandarin Chinese (Sichuan, China)

Speech Style

scripted monologue

Content

daily use sentences

Audio Parameters

16 kHz, 16 bits, mono

File Format

WAV (PCM)
TXT (UTF-8)

Recording Equipment

mobile

Recording Environment

indoor environment

License

Magic Data
open-source license

Open Source
ASR Corpus

ASR-SCSichDiaDuSC: A Scripted Chinese Sichuan Dialect Daily-use Speech Corpus

6.4 hours of transcribed Sichuan dialect scripted speech on daily use sentences

This open-source dataset consists of 6.4 hours of transcribed, scripted Sichuan dialect speech focusing on daily use sentences. It contains 6,428 utterances contributed by ten speakers.
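
The audio parameters listed in the overview (16 kHz, 16 bits, mono, PCM WAV) can be checked on a downloaded file before training. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav` and the constant names are illustrative and not part of the dataset's tooling.

```python
import wave

# Expected values taken from the dataset overview.
EXPECTED_RATE = 16000        # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2    # 16 bits = 2 bytes per sample
EXPECTED_CHANNELS = 1        # mono


def check_wav(path):
    """Return (matches_spec, duration_seconds) for one PCM WAV file."""
    with wave.open(path, "rb") as wav:
        matches_spec = (
            wav.getframerate() == EXPECTED_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
        # Duration is the frame count divided by the file's own sample rate.
        duration = wav.getnframes() / wav.getframerate()
    return matches_spec, duration
```

Summing the returned durations over all files gives a quick way to compare a local copy against the stated 6.4 hours.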

Sample:

"北京哪里有批发精仿运动“嚡”【鞋】的地方啊?"

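In the sample above, a character in curly quotes is followed by another character in 【】 brackets. Judging from this single example, the quoted character appears to be the dialect form and the bracketed one its standard Mandarin equivalent; this reading is an assumption, not a documented convention of the corpus. Under that assumption, a transcript line could be reduced to either form as sketched below. The names `ANNOTATION` and `normalize` are hypothetical.

```python
import re

# Assumed pattern: “dialect form”【standard form】, inferred from the one sample.
ANNOTATION = re.compile(r"“([^”]+)”【([^】]+)】")


def normalize(text, prefer="standard"):
    """Replace each annotated pair with one of its two forms.

    prefer="standard" keeps the bracketed form; any other value
    keeps the quoted form.
    """
    group = 2 if prefer == "standard" else 1
    return ANNOTATION.sub(lambda m: m.group(group), text)
```

Text without any annotated pair passes through unchanged, so the function can be applied to every transcript line.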
