This is a tutorial on installing and using MFA 2.0 on Windows. The official MFA website does not provide instructions for Windows installation. The process is fairly involved because it requires running Linux on Windows. You need administrator access to install everything.
CharsiuG2P is a transformer-based tool for grapheme-to-phoneme conversion in 100 languages. Given an orthographic word, CharsiuG2P predicts its pronunciation with a neural G2P model.
The rhythm.metrics package is designed for calculating and visualising speech rhythm metrics. It calculates Delta C / Delta V, VarcoC / VarcoV, %V, rPVI_C, and nPVI_V.
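For readers unfamiliar with these metrics, here is a minimal Python sketch of how they are commonly defined in the speech rhythm literature. It is not the rhythm.metrics API: the function names are hypothetical, the inputs are assumed to be plain lists of consonantal or vocalic interval durations in seconds, and the standard deviation is assumed to be the population form (the package itself may use the sample form or different scaling).

```python
from statistics import mean, pstdev


def delta(durations):
    """Delta C or Delta V: standard deviation of interval durations.

    Assumption: population standard deviation.
    """
    return pstdev(durations)


def varco(durations):
    """VarcoC or VarcoV: Delta normalised by the mean duration, times 100."""
    return 100 * pstdev(durations) / mean(durations)


def percent_v(vowel_durations, consonant_durations):
    """%V: share of total duration taken up by vocalic intervals."""
    total = sum(vowel_durations) + sum(consonant_durations)
    return 100 * sum(vowel_durations) / total


def rpvi(durations):
    """Raw PVI: mean absolute difference between successive intervals."""
    pairs = zip(durations, durations[1:])
    return mean(abs(a - b) for a, b in pairs)


def npvi(durations):
    """Normalised PVI: each difference is divided by the pair's mean duration."""
    pairs = zip(durations, durations[1:])
    return 100 * mean(abs(a - b) / ((a + b) / 2) for a, b in pairs)


# Hypothetical interval durations (seconds) for one utterance.
vowels = [0.10, 0.20, 0.10]
consonants = [0.05, 0.15]

print("Delta C:", delta(consonants))
print("VarcoC:", varco(consonants))
print("%V:", percent_v(vowels, consonants))
print("rPVI_C:", rpvi(consonants))
print("nPVI_V:", npvi(vowels))
```

The raw PVI is conventionally applied to consonantal intervals and the normalised PVI to vocalic intervals, which is why the metric names carry the _C and _V suffixes.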
dataset
Phoneme and word level forced aligned data: Common Voice - English (860,000 utterances)
Word and phone alignments for 2,000 hours of English from Common Voice (https://github.com/lingjzhu/charsiu/blob/main/misc/data.md#alignments-for-english-datasets). Some of the data come with demographic annotations. Useful for studying speech styles, accents, and variation.
dataset
Phoneme and word level forced aligned data: multiple datasets - Mandarin (over 1 million utterances)
Charsiu is a phonetic alignment tool that can (1) force-align speech audio with a given text transcription at the phone level, and/or (2) automatically recognise the text in speech audio without the need for any transcription. It is currently available for both Mandarin Chinese and English (mainly American English).
tutorial
Compiling REAPER (Robust Epoch And Pitch EstimatoR)