Dataset Overview

Dataset Type

ASR speech corpus

Language

zh & en-CN

Speech Style

Scripted

Content

Daily-Use Sentence (Chinese-English Code-Mixing)

Audio Parameters

16 kHz, 16 bits, mono

File Format

WAV (PCM)
TXT (UTF-8)
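
The listed audio parameters can be checked against a delivered file before it enters a training pipeline. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` and the constant names are illustrative assumptions, not part of the dataset's tooling.

```python
import wave

# Expected values taken from the Audio Parameters listing above.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV header and the listed parameters.

    An empty list means the header matches. Hypothetical helper for illustration.
    """
    problems = []
    # wave.open raises wave.Error if the file is not uncompressed PCM WAV.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("utterance.wav")` on a conforming file should return an empty list; the file name here is a placeholder, since the corpus's actual directory layout is not described on this page.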

Recording Equipment

mobile

Recording Environment

indoor environment

Proprietary
ASR Corpus
1,650 hours

Chinese-English Code-Mixing Scripted Speech Corpus – Daily-Use Sentence

MDT-ASR-D028 | 1,650 hours of transcribed, scripted Chinese-English code-mixed speech on daily-use sentences

License

This dataset consists of 1,650 hours of transcribed, scripted Chinese-English code-mixed speech covering daily-use sentences, recorded by 2,134 speakers.

Contact business@magicdatatech.com to learn more.
