This open-source dataset contains 6.13 hours of transcribed, scripted Mandarin Chinese speech, consisting of commands and queries for in-vehicle scenarios. It comprises 5,948 utterances contributed by ten speakers.
A noteworthy feature is that two microphones were used during recording at the front passenger seat: one at the sun visor and another near the speaker's mouth. As a result, the speech was captured as two synchronized recordings.
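To give a rough sense of scale, the headline figures can be turned into per-utterance and per-speaker averages. The sketch below uses only the numbers stated in this description. It assumes the 6.13 hours counts each utterance once rather than once per microphone, which is not confirmed here, and the function and key names are illustrative only.

```python
# Figures quoted in the dataset description.
TOTAL_HOURS = 6.13
NUM_UTTERANCES = 5948
NUM_SPEAKERS = 10


def dataset_averages(total_hours, num_utterances, num_speakers):
    """Return simple averages derived from the headline dataset figures.

    Assumption: total_hours counts each utterance once, not once per
    microphone channel.
    """
    total_seconds = total_hours * 3600
    return {
        "seconds_per_utterance": total_seconds / num_utterances,
        "utterances_per_speaker": num_utterances / num_speakers,
        "minutes_per_speaker": total_hours * 60 / num_speakers,
    }


averages = dataset_averages(TOTAL_HOURS, NUM_UTTERANCES, NUM_SPEAKERS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption, an average utterance lasts roughly 3.7 seconds, which is consistent with short spoken commands and queries. The per-speaker values are averages only; the actual split across speakers is not stated.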
Sample:
"去珠江发展中心的最快路线" (English: "The fastest route to Zhujiang Development Center")