MagicData

Total Size: 694 MB

Dataset Overview

Dataset Type

ASR speech corpus

Language

zh-CN, Mandarin Chinese (China)

Speech Style

scripted monologue

Content

scripted speech for keyword spotting

Audio Parameters

48 kHz, 16 bits, mono

File Format

WAV (PCM)
TXT (UTF-8)

Recording Equipment

microphone

Recording Environment

indoor environment

License

Magic Data open-source license

Open Source
ASR Corpus
3.23 hours

ASR-SCKwsptSC: A Scripted Chinese Keyword Spotting Speech Corpus

3.23 hours of transcribed, scripted Mandarin Chinese speech for keyword spotting at fast, normal, and slow speaking speeds

This open-source dataset contains 3.23 hours of transcribed, scripted Mandarin Chinese speech for keyword spotting, recorded at fast, normal, and slow speaking speeds. It comprises 4,546 utterances contributed by 102 speakers.
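
As a quick sanity check on a downloaded copy, the stated audio parameters (48 kHz, 16 bits, mono, PCM WAV) can be compared against each file's header. The sketch below uses only Python's standard `wave` module. The function names are illustrative and not part of any Magic Data tooling, and the corpus directory layout is not described on this page, so the caller supplies the file path. The last lines derive the mean utterance length implied by the stated totals (3.23 hours over 4,546 utterances), which comes to roughly 2.56 seconds.

```python
import wave

# Audio parameters as stated on the dataset page.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
EXPECTED_CHANNELS = 1            # mono


def matches_stated_format(path):
    """Return True if the WAV header at `path` matches the stated parameters."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def duration_seconds(path):
    """Return the duration of the WAV file at `path` in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Mean utterance length implied by the stated totals.
TOTAL_HOURS = 3.23
UTTERANCES = 4546
mean_utterance_seconds = TOTAL_HOURS * 3600 / UTTERANCES
```

Summing `duration_seconds` over every WAV file in a local copy and comparing the result with 3.23 hours is one way to confirm that a download is complete.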
