Text Datasets

Links to text and corpus data.

Quran Translation

Dhivehi Quran translation source files.

Leipzig University Corpora Collection

The Leipzig Corpora Collection provides various tools and data for download, all of which are protected by copyright. For details, refer to the collection's terms of usage.

Sofwath's Sentence Dataset

Speech text data for Common Voice.