Speech Datasets

Links to speech data

Mozilla Common Voice

Common Voice is Mozilla’s initiative to help teach machines how real people speak. The project is currently active and collecting data from contributors.

Presidential Speeches

Dhivehi speech data collected from PO MV (~1 GB). Extracted and processed by Sofwath as part of a collection of Dhivehi datasets found here.