Total Size: 3.09 GB


Dataset Overview

Dataset Type: ASR speech corpus
Language: zh-CN, Mandarin Chinese (China)
Speech Style: scripted monologue
Content: commands and queries in vehicle-related scenes
Audio Parameters: 44.1 kHz, 16 bits, dual
File Format: WAV (PCM), TXT (UTF-8)
Recording Equipment: microphone
Recording Environment: in-vehicle environment

Open Source
ASR Corpus
6.13 hours

Mandarin Chinese Scripted Speech Corpus – in-Vehicle Scene

6.13 hours of transcribed Mandarin Chinese scripted speech
on commands and queries in vehicle-related scenes


This open-source dataset contains 6.13 hours of transcribed Mandarin Chinese scripted speech, focusing on commands and queries in vehicle-related scenes. It comprises 5,948 utterances contributed by ten speakers.
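For a rough sense of scale, the figures above can be turned into averages. The short sketch below assumes that the 6.13 hours is the summed duration of all utterances and that no other audio is counted; the page does not say how the total was measured, and speakers may not have contributed equally.

```python
# Figures taken from the dataset description.
TOTAL_HOURS = 6.13
UTTERANCES = 5948
SPEAKERS = 10

# Averages only: the page gives no per-speaker or per-utterance breakdown.
mean_utterance_seconds = TOTAL_HOURS * 3600 / UTTERANCES
mean_utterances_per_speaker = UTTERANCES / SPEAKERS

print(f"{mean_utterance_seconds:.2f} s per utterance on average")
print(f"{mean_utterances_per_speaker:.1f} utterances per speaker on average")
```

Under those assumptions an utterance lasts about 3.7 seconds on average, which is consistent with short spoken commands and queries.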

A noteworthy feature is that two microphones were used during recording: one at the sun visor and another near the speaker’s mouth, with the speaker in the front passenger seat. As a result, each utterance was captured synchronously by both microphones.
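If "dual" in the audio parameters means that each WAV file holds two interleaved channels, one per microphone, the channels can be separated with Python's standard library alone. This is a minimal sketch under that assumption, which the page does not confirm; the function name `split_channels` is hypothetical, and the page does not say which channel corresponds to which microphone.

```python
import array
import sys
import wave


def split_channels(path):
    """Return (sample_rate, [samples_ch0, samples_ch1, ...]) for a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        n_channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    if sample_width != 2:
        raise ValueError(f"expected 16-bit PCM, got {sample_width}-byte samples")

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Frames are interleaved (ch0, ch1, ch0, ch1, ...), so a strided
    # slice recovers each channel. The channel-to-microphone mapping is
    # not documented on the page and has to be checked by listening.
    channels = [samples[i::n_channels] for i in range(n_channels)]
    return sample_rate, channels
```

Calling `split_channels("utterance.wav")` on a two-channel file would return the sample rate and two equal-length sample arrays. If the dataset instead stores each microphone in a separate mono file, the function simply returns a single channel.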

Sample:

“去珠江发展中心的最快路线”

The dataset is provided on an “As Is” basis, and no warranty, either expressed or implied, is given. Your use of the dataset is at your sole risk. You expressly understand and agree that MagicHub and/or Beijing Magic Data Technology Co., Ltd. shall not be liable for any direct, indirect, incidental, special or consequential damages, including but not limited to damages for loss of profits, goodwill, use, data or other intangible losses related to the dataset.

Copyright © 2021 Beijing Magic Data Technology Co., Ltd. All rights reserved.

