Grapheme-to-phoneme conversion tool in 100 languages
CharsiuG2P is a transformer-based tool for grapheme-to-phoneme (G2P) conversion in 100 languages. Given an orthographic word, CharsiuG2P predicts its pronunciation with a neural G2P model.
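A minimal usage sketch, assuming the pretrained checkpoints are queried through the Hugging Face transformers library: the checkpoint name, the ByT5 tokenizer, and the "<lang>: word" prefix convention follow the conventions described in the project repository and may differ for other checkpoints.

```python
# Hedged usage sketch (not the official API): load a CharsiuG2P checkpoint and
# predict pronunciations for a batch of words. Checkpoint name and the language
# prefix format are assumptions taken from the project repository.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "charsiu/g2p_multilingual_byT5_small_100"          # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")  # byte-level ByT5 tokenizer
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Each word is prefixed with a language tag, e.g. "<eng-us>: "
words = ["<eng-us>: charsiu", "<zho-s>: 叉烧"]
inputs = tokenizer(words, padding=True, return_tensors="pt")

# Generate the phoneme (IPA) sequence for each input word
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```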
Github repo:
Related articles:
Interspeech
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu*, Cong Zhang*, and David Jurgens [* equal contribution]
In this study, we tackle massively multilingual grapheme-to-phoneme conversion by implementing G2P models based on ByT5. We have curated a G2P dataset from various sources that covers around 100 languages and trained large-scale multilingual G2P models based on ByT5. We found that ByT5, operating on byte-level inputs, significantly outperformed the token-based mT5 model in multilingual G2P. Pairwise comparison with monolingual models in these languages suggests that multilingual ByT5 models generally lower the phone error rate by jointly learning from a variety of languages. The pretrained model can further benefit low-resource G2P through zero-shot prediction on unseen languages or provide pretrained weights for finetuning, which helps the model converge to a lower phone error rate than randomly initialized weights. To facilitate future research on multilingual G2P, we make available our code and pretrained multilingual G2P models at: https://github.com/lingjzhu/CharsiuG2P.
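To make the byte-level versus token-level distinction concrete, the sketch below contrasts how the public ByT5 and mT5 tokenizers segment the same word; it is an illustration of the input representation, not code from the paper.

```python
# Illustration (assumption: public Google checkpoints on the Hugging Face Hub).
# ByT5 tokenizes a word into raw UTF-8 bytes, so every script is covered by the
# same small vocabulary; mT5 relies on a learned SentencePiece subword vocabulary.
from transformers import AutoTokenizer

byt5_tok = AutoTokenizer.from_pretrained("google/byt5-small")
mt5_tok = AutoTokenizer.from_pretrained("google/mt5-small")

word = "叉烧"  # "char siu" written in Chinese characters
print(byt5_tok(word)["input_ids"])   # one id per UTF-8 byte (plus end-of-sequence)
print(mt5_tok.tokenize(word))        # subword pieces from the fixed mT5 vocabulary
```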
Related talks:
speech tech
ByT5 model for massively multilingual grapheme-to-phoneme conversion
Jian Zhu, Cong Zhang, and David Jurgens
Interspeech 2022, Incheon, Korea [online], 18-22 Sep 2022