Total Size: 282M

Dataset Overview

Dataset Type: ASR Corpus

Language: English

Speech Style: N/A

Content: N/A

Audio Parameters: 16 kHz, 16 bits

File Format: WAV (PCM)

Recording Equipment: mobile

Recording Environment: mobile

License: MAGIC DATA OPEN-SOURCE LICENSE

Open Source

ASR Corpus

5 hours

Multi-stream Spontaneous Conversation Training Datasets_English

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The multi-stream conversation dataset developed by MagicData captures each speaker's audio track and labels each speaker separately, preserving the interruptions, interactions, and other dynamics that occur naturally in conversation. Isolating each speaker's audio yields clearer and more accurate training data, so that models can understand and respond to natural conversational exchanges more effectively. To make the dataset easier to evaluate and access, we have released a 5-hour sample as part of our open-source initiative: "Multi-stream Spontaneous Conversation Training Datasets_English".
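As a quick sanity check after downloading, the audio parameters listed in the overview (16 kHz, 16 bits, WAV PCM) can be compared against each file's header. The sketch below uses only Python's standard `wave` module; the function name `inspect_wav` and its default arguments are illustrative choices, not part of any MagicData tooling, and no assumption is made about the dataset's directory layout or channel count.

```python
import wave


def inspect_wav(path, expected_rate=16000, expected_bits=16):
    """Read a WAV header and compare it with the expected audio parameters.

    Hypothetical helper: the defaults mirror the "16 kHz, 16 bits" entry
    in the dataset overview.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # sample width is reported in bytes
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate": rate,
        "bits_per_sample": bits,
        "channels": channels,  # reported only; the overview does not state it
        "duration_seconds": frames / rate if rate else 0.0,
        "matches_expected": rate == expected_rate and bits == expected_bits,
    }
```

Calling `inspect_wav` on every WAV file and summing `duration_seconds` gives a rough way to compare the download against the stated 5 hours.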

For more commercial datasets, please contact business@magicdatatech.com.

ICP Filing No.: 京ICP备18008050号-6号

Beijing Public Network Security Filing No.: 京公网安备 11010802035822号
