This dataset contains Japanese conversational speech recorded in real-world settings. It is built from conversations rather than read prompts, so it captures the interactive and complex nature of everyday communication and is intended to improve model performance in authentic conversational environments. Recordings were made on mobile devices, which closely mirrors actual usage scenarios. The dataset totals 10 hours of diverse, realistic conversational speech.
Sample:
Two-speaker conversation with separate tracks: