Text Datasets
Links to text and corpus data
Quran Translation
Dhivehi Quran translation source files.
Leipzig University Corpora Collection
The Leipzig Corpora Collection provides various tools and data for download, all of which are protected by copyright. For more details, refer to the collection's terms of usage.
Sofwath's sentence data set
Speech text data for Common Voice.