ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD)

Datasets

The MagicData-RAMC corpus contains 180 hours of conversational speech recorded from native speakers of Mandarin Chinese over mobile phones at a sampling rate of 16 kHz. The dialogs are classified into 15 diverse domains and tagged with topic labels, ranging from science and technology to everyday life. Accurate transcriptions and precise speaker voice activity timestamps are manually labeled for each sample, and detailed speaker information is also provided. As a high-quality, richly annotated Mandarin speech dataset designed for dialog scenarios, MagicData-RAMC adds to the data diversity available to the Mandarin speech community and supports research on a range of speech-related tasks, including automatic speech recognition, speaker diarization, topic detection, keyword search, and text-to-speech. Please refer to MagicData RAMC.

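Development and Training Sets

The organizers have released the following datasets for the track "Conversational Short-phrase Speaker Diarization (SD) Accuracy":
1. MagicData-RAMC comprises 351 multi-turn Mandarin conversations totaling 180 hours. The annotations for each conversation include the transcript, voice activity timestamps, speaker information, recording information, and topic information. Speaker information covers gender, age, and region; recording information covers environment and device. Participants should check their email for instructions on downloading the dataset.

2. The evaluation set (Test) will be released on September 8.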

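All participants must abide by the following rules:

1. DATA: Only MagicData RAMC (openslr 123), VoxCeleb Data (openslr 49), and CN-Celeb Corpus (openslr 82) may be used. Two noise datasets, MUSAN (openslr 17) and RIRNoise (openslr 28), may be used for data augmentation.

2. Using the test set in any form is strictly prohibited, including but not limited to fine-tuning or training models on the test dataset.

3. Multi-system fusion is allowed. However, fusing systems that share the same structure is discouraged.

4. All models must be trained on the permitted datasets. Specifically, pre-trained models must not use any other datasets (including unlabeled data).

5. The organizers reserve the right of final interpretation.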
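As an illustration of rule 1, the following is a minimal sketch of how a team might check its own data list against the permitted corpora before training. The function name `check_manifest` and the convention of listing corpora by OpenSLR resource number are assumptions made for this example, not an official submission format or tool.

```python
# Corpora named in rule 1, keyed by OpenSLR resource number.
ALLOWED_TRAINING = {123: "MagicData-RAMC", 49: "VoxCeleb", 82: "CN-Celeb"}
# Noise corpora that rule 1 permits for data augmentation only.
ALLOWED_AUGMENTATION = {17: "MUSAN", 28: "RIRNoise"}


def check_manifest(training_ids, augmentation_ids=()):
    """Return a list of rule-1 violations for a hypothetical data manifest.

    Both arguments are iterables of OpenSLR resource numbers. This is an
    assumed convention for the sketch, not something the organizers define.
    """
    problems = []
    for slr in training_ids:
        if slr not in ALLOWED_TRAINING:
            problems.append(f"openslr {slr} is not an allowed training corpus")
    for slr in augmentation_ids:
        if slr not in ALLOWED_AUGMENTATION:
            problems.append(f"openslr {slr} is not an allowed augmentation corpus")
    return problems


if __name__ == "__main__":
    # A compliant setup: RAMC plus VoxCeleb for training, MUSAN for augmentation.
    print(check_manifest([123, 49], [17]))
    # A non-compliant setup: an extra corpus that rule 1 does not list.
    print(check_manifest([123, 33]))
```

An empty list means the manifest names only corpora that rule 1 permits. The check says nothing about rules 2 to 4, which concern the test set, fusion, and pre-trained models.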

Teams (Submitted)

Team | Corr | Sub | Del | Ins | WER | Updated | Result

No Team Data.
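The page does not define the leaderboard columns. Assuming Corr, Sub, Del, and Ins carry their conventional scoring meaning (correct, substituted, deleted, and inserted units in an alignment against the reference), the WER column would relate to them as in the sketch below. The function name is hypothetical, and nothing here is confirmed to be the scoring procedure used by the challenge.

```python
def word_error_rate(corr, sub, deletions, ins):
    """Conventional error rate from alignment counts.

    Assumes the usual definition: errors (Sub + Del + Ins) divided by the
    reference length (Corr + Sub + Del). Whether this leaderboard computes
    its WER column this way is an assumption.
    """
    ref_len = corr + sub + deletions
    if ref_len == 0:
        raise ValueError("reference length is zero; error rate is undefined")
    return (sub + deletions + ins) / ref_len


if __name__ == "__main__":
    # 100 reference units: 90 correct, 5 substituted, 5 deleted, plus 2 insertions.
    print(word_error_rate(90, 5, 5, 2))
```

Because insertions count as errors but do not add to the reference length, this ratio can exceed 1.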

Teams (Registered)

Team | Corr | Sub | Del | Ins | WER | Updated | Result

云鸷 | 0
little children | 0
Playaudio | 0
imm | 0
1234 | 0
BigB | 0
Salted fish | 0
语科倾听小队 | 0
Team SMILE | 0
TSUP | 0
Small Shrimp | 0
3322 | 0
3322 | 0
4&4 | 0
funspeech | 0
CN171-11 | 0
wantt | 0
kekexili | 0
speechlake | 0
zju00 | 0
Zju00 | 0
AI实验室 | 0
swen | 0
UJS417 | 0
ximalaya-cssd | 0
ximalaya | 0
voicecomm | 0
HFXAISpeech | 0
hahaha | 0
怀宁路1639号 | 0
AINJ | 0
SpeakerX | 0
Lovely_speech | 0
BigSmart | 0
XINFU | 0
AMS | 0
SSP | 0
SD321 | 0
USTC-NERCSLIP | 0
Bigsur | 0
XJUSpeech | 0
VWM | 0
xlteam | 0
trip.com | 0
Eagle | 0
Team_Bhasha | 0
BigYellow | 0
team | 0
Team Fraunhofer | 0
YUN | 0
try yi try | 0
Qmhn | 0
x-spk | 0
asdf | 0
shuishou | 0
HFXAISpeech | 0
yigedaxigua | 0