MagicData

Dataset Overview

Dataset Type

text corpus for NLP

Language

yue-Guangdong

Speech Style

N/A

Content

daily-use sentence

Audio Parameters

N/A

File Format

TXT (UTF8)

Recording Equipment

N/A

Recording Environment

N/A
Proprietary
NLP Corpus
828,114 sentences

NLP-MCCantParaC: A Mandarin Chinese-Cantonese (Canton) Parallel Corpus

MDT-NLP-F017 | 828,114 daily-use sentences in Guangzhou Cantonese

This dataset consists of 828,114 daily-use sentences in Guangzhou Cantonese, each paired with its Mandarin Chinese equivalent.

Contact business@magicdatatech.com to learn more.

Sample:

Chinese | Guangzhou Cantonese
你漫画看多了吧 | 你漫画睇多咗啊
没问道怎么说 | 冇问到哦点讲啊
写了我发现我好有文采 | 写咗嘞我发现我好有文采啊
睡觉了吗今天怎么了 | 瞓咗觉未啊今日点啊
是你的号码吗 | 系你嘅号码咩
你会等我吗 | 你会唔会等我噶
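
For readers planning to load the corpus, the following is a minimal sketch of reading sentence pairs from a UTF-8 text file. It assumes one pair per line with the Mandarin and Cantonese sentences separated by a tab. The actual delimiter and file layout are not documented on this page, and the function name `read_pairs` is hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_pairs(handle: TextIO, sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Yield (mandarin, cantonese) pairs from a text stream.

    Assumption: one pair per line, split on the first occurrence of `sep`.
    Blank lines and lines without the separator are skipped.
    """
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip() or sep not in line:
            continue
        mandarin, cantonese = line.split(sep, 1)
        yield mandarin.strip(), cantonese.strip()


# Two pairs taken from the sample above, joined with the assumed tab separator.
demo = io.StringIO("是你的号码吗\t系你嘅号码咩\n你会等我吗\t你会唔会等我噶\n")
pairs = list(read_pairs(demo))
```

With a real file, the stream would be opened with `open(path, encoding="utf-8")` to match the listed TXT (UTF8) format. If the delivered files use a different separator, pass it through the `sep` argument.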

License
