MagicData

Total Size: 1914.71M

Dataset Overview

Dataset Type: Conversation
Language: Japanese
Speech Style: Conversational
Content: N/A
Audio Parameters: 16 kHz, 16 bits
File Format: WAV
Recording Equipment: mobile
Recording Environment: indoor

Open Source
Conversation
10.35 hours

Japanese Duplex Conversation Training Dataset

This dataset contains Japanese conversational speech recorded in real-world settings. Its conversation-based design captures the interactive, complex nature of everyday communication and is intended to improve model performance in authentic conversational environments. Recordings were made indoors on mobile devices, which closely mirrors actual usage scenarios. With a total duration of about 10.35 hours, the dataset offers a diverse and realistic collection of conversational speech samples.
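
Because the listing states 16 kHz, 16-bit WAV audio, it can be useful to confirm that a downloaded file matches those parameters before training. The following is a minimal sketch using Python's standard `wave` module; the function name `describe_wav` and the example file path are hypothetical and not part of the dataset, and the sketch assumes the files are uncompressed PCM WAV.

```python
import wave

# Values taken from the listing: 16 kHz sample rate, 16 bits (2 bytes) per sample.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2


def describe_wav(path):
    """Read a PCM WAV header and compare it with the listed audio parameters."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return {
        "sample_rate_hz": rate,
        "sample_width_bytes": width,
        "channels": channels,
        "duration_s": frames / rate,
        "matches_listing": (
            rate == EXPECTED_RATE_HZ and width == EXPECTED_SAMPLE_WIDTH_BYTES
        ),
    }


if __name__ == "__main__":
    # Hypothetical file name; replace with a real path from the download.
    print(describe_wav("example_conversation.wav"))
```

Summing `duration_s` over all files gives a quick check against the listed total duration.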

Sample: two-speaker conversation with separate tracks.

This open-source dataset may be used for academic research only and is not licensed for commercial use. Any use must cite the source.

Citation Format: Japanese Duplex Conversation Training Dataset. 2025. https://magichub.com/datasets/japanese-duplex-conversation-training-dataset/. Beijing Magic Data Technology Co., Ltd.

For more commercial datasets, please contact business@magicdatatech.com.
