MagicData

Total Size: 714 MB

Dataset Overview

Dataset Type

ASR speech corpus

Language

zh-CN, Mandarin Chinese (China)

Speech Style

scripted monologue

Content

multi-field sentences in vehicle-related scenes

Audio Parameters

16 kHz, 16 bits, mono

File Format

WAV (PCM)
TXT (UTF-8)

Recording Equipment

mobile

Recording Environment

indoor environment

License

Magic Data open-source license

Open Source
ASR dataset
8 hours

ASR-SCCabSC: A Scripted Chinese Cabin Speech Corpus

8 hours of transcribed Mandarin Chinese scripted speech focusing on multi-field sentences in vehicle-related scenes

This open-source dataset consists of 8 hours of transcribed Mandarin Chinese scripted speech focusing on multi-field sentences in vehicle-related scenes. It contains 8,480 utterances contributed by 38 speakers.

Sample:

"我想拨一下马奎新的幺五七开头的号码"
(English: "I'd like to dial Ma Kuixin's number, the one starting with one-five-seven.")
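
The audio parameters listed in the overview (16 kHz, 16 bits, mono, WAV PCM) can be checked on a downloaded recording with a short script. The following is a minimal sketch using Python's standard `wave` module. The function name `check_wav_params` and the file path in the usage note are hypothetical, since this page does not describe the corpus's directory layout.

```python
import wave

# Expected values taken from the "Audio Parameters" entry: 16 kHz, 16 bits, mono.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def check_wav_params(path):
    """Return (matches, duration_seconds) for a PCM WAV file.

    `matches` is True only if the sample rate, sample width, and channel
    count all agree with the values stated on the dataset page.
    """
    with wave.open(path, "rb") as wav:
        matches = (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
        # Duration is the frame count divided by the frames-per-second rate.
        duration = wav.getnframes() / wav.getframerate()
    return matches, duration
```

Calling `check_wav_params("some_utterance.wav")` (a placeholder path) on each file and summing the returned durations is one way to compare a local copy against the stated 8 hours.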
