Total Size: 1.13 GB

Dataset Overview

Dataset Type

ASR speech corpus

Language

yue-Guangdong, Yue Chinese (Guangdong, China)

Speech Style

scripted monologue

Content

digits, commands and queries

Audio Parameters

16 kHz, 16-bit, dual-channel

File Format

WAV (PCM)

Recording Equipment

microphone

Recording Environment

in the vehicle

Guangzhou Cantonese Scripted Speech Corpus – in the Vehicle

5 hours of transcribed Guangzhou Cantonese scripted speech in the vehicle

This open-source dataset consists of 5 hours of transcribed Guangzhou Cantonese scripted speech recorded in a vehicle. It focuses on digits, commands and queries, and contains 6,219 utterances contributed by ten speakers.

Sample:

“世纪大道塞唔塞车啊” (English: “Is there a traffic jam on Century Avenue?”)
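
After downloading, it can be useful to confirm that a file matches the audio parameters listed above (16 kHz, 16 bits, dual, WAV PCM). The following is a minimal sketch using Python's standard `wave` module. It assumes that "dual" means two channels, and because the corpus directory layout is not described here, it runs on a synthetic silent file. The helper names (`read_wav_params`, `find_mismatches`, `write_silent_wav`) are hypothetical and not part of any MagicHub tooling.

```python
import os
import tempfile
import wave

# Values taken from the listing: 16 kHz, 16 bits, dual (assumed: two channels).
EXPECTED = {"sample_rate_hz": 16000, "sample_width_bytes": 2, "channels": 2}


def read_wav_params(path):
    """Read the header fields of a PCM WAV file using the standard library."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def find_mismatches(params, expected=EXPECTED):
    """Return {field: (actual, expected)} for every field that differs."""
    return {
        key: (params[key], want)
        for key, want in expected.items()
        if params[key] != want
    }


def write_silent_wav(path, seconds=1.0, rate=16000, width=2, channels=2):
    """Write a silent PCM WAV file; used only to make this sketch runnable."""
    n_frames = int(seconds * rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (n_frames * width * channels))


# Demo on a synthetic file, since the corpus layout is not described here.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    write_silent_wav(demo_path)
    demo_params = read_wav_params(demo_path)
    demo_mismatches = find_mismatches(demo_params)
```

To check a real recording, pass its path to `read_wav_params` and inspect the result of `find_mismatches`. An empty dictionary means the header agrees with the listed parameters.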

The dataset is provided on an “As Is” basis, and no warranty, either expressed or implied, is given. Your use of the dataset is at your sole risk. You expressly understand and agree that MagicHub and/or Beijing Magic Data Technology Co., Ltd. shall not be liable for any direct, indirect, incidental, special or consequential damages, including but not limited to damages for loss of profits, goodwill, use, data or other intangible losses related to the datasets.

Copyright © 2021 Beijing Magic Data Technology Co., Ltd. All rights reserved.
