Datasets
English

English Scripted Speech Corpus
Category : ASR Corpus
Datasets Source : MagicData
Language : en
Content : commands and queries
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Chinese English Scripted Speech Corpus - Daily-Use Sentence, Digits, Phrase, Vocabulary, Letter-by-Letter Spelling
Category : ASR Corpus
Datasets Source : MagicData
Language : en-CN
Content : daily-use sentence, digits, phrase, vocabulary, and letter-by-letter spelling
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Chinese English Scripted Speech Corpus for Evaluation
Category : ASR Corpus
Datasets Source : MagicData
Language : en-CN
Content : daily-use sentence (rated)
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Chinese English Scripted Speech Corpus - Daily-Use Sentence
Category : ASR Corpus
Datasets Source : MagicData
Language : en-CN
Content : daily-use sentence
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Hong Kong English Scripted Speech Corpus - Command and Query
Category : ASR Corpus
Datasets Source : MagicData
Language : en-HK
Content : commands and queries
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Hong Kong English Scripted Speech Corpus - Keyword Spotting
Category : ASR Corpus
Datasets Source : MagicData
Language : en-HK
Content : keyword spotting
Tags : English
Size : Not Described.
File Format : WAV (PCM), TXT (UTF-8)
License : Not Described.
Australian English Scripted Speech Corpus
Category : ASR Corpus
Datasets Source : MagicData
Language : en-AU
Content : daily-use sentence
Tags : English
Size : Not Described.
File Format : WAV, TXT
License : Not Described.
Pakistani English Scripted Speech Corpus - Daily Use Sentence
Category : ASR Corpus
Datasets Source : MagicData
Language : en-PK, English (Pakistan)
Content : daily-use sentence
Tags : English
Size : 307 MB
File Format : WAV (PCM), TXT (UTF-8)
License : Magic Data open-source license
GigaSpeech
Category : ASR Corpus
Datasets Source : Tsinghua University
Language : English
Content : Various topics
Tags : English
Size : 435 GB
File Format : OPUS
License : TERMS OF ACCESS
English Speech Corpus from TED-LIUM
Category : ASR Corpus
Datasets Source : Not Described.
Language : English
Content : TED talks
Tags : English
Size : 21 GB
File Format : Not Described.
License : Creative Commons BY-NC-ND 3.0