ERIC Number: EJ1480443
Record Type: Journal
Publication Date: 2025
Pages: 19
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1092-4388
EISSN: EISSN-1558-9102
A Tunable Forced Alignment System Based on Deep Learning: Applications to Child Speech
Journal of Speech, Language, and Hearing Research, v68 n7 spec iss p3583-3601 2025
Purpose: Phonetic forced alignment has many applications in the automated analysis of speech, particularly in studying nonstandard speech such as children's speech. Manual alignment is tedious but serves as the gold standard for clinical-grade alignment. Current tools do not support direct training on manual alignments. Thus, a trainable, speaker-adaptive phonetic forced alignment system, Wav2TextGrid, was developed for children's speech. The source code for the method is publicly available, along with a graphical user interface, at https://github.com/pkadambi/Wav2TextGrid. Method: We propose a trainable, speaker-adaptive, neural forced aligner developed using a corpus of 42 neurotypical children from 3 to 6 years of age. The aligner was evaluated on both child speech and the TIMIT corpus to demonstrate its performance across age and dialectal variations. Results: The trainable alignment tool markedly improved accuracy over baseline on several alignment quality metrics, for all phoneme categories. Accuracy for plosives and affricates in children's speech improved by more than 40% over baseline. Performance matched existing methods using approximately 13 min of labeled data, while approximately 45-60 min of labeled alignments yielded significant improvement. Conclusion: The Wav2TextGrid tool allows alternate alignment workflows in which the forced alignments, via training, are directly tailored to match clinical-grade, manually provided alignments.
Descriptors: Phonetics, Speech, Young Children, Phonemes, Automation, Adults, Accuracy, Pronunciation, Artificial Intelligence
American Speech-Language-Hearing Association. 2200 Research Blvd #250, Rockville, MD 20850. Tel: 301-296-5700; Fax: 301-296-8580; e-mail: slhr@asha.org; Web site: http://jslhr.pubs.asha.org
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Authoring Institution: N/A
Grant or Contract Numbers: R01DC01964503
Author Affiliations: N/A

Peer reviewed
