TORRES Humberto Maximiliano
Conferences and scientific meetings
Phonological Phrase Segmentation Based On Acoustic Information
Vietri sul Mare
Workshop; TIMELY Workshop on Dynamical systems for psychological timing and timing in speech processing; 2012
Organizing institution:
ISCH Action TD0904 Time In MEntaL activitY: theoretical, behavioral, bioimaging and clinical perspectives (TIMELY), COST
One of the functions attributed to prosody is to serve as a temporal organizer of fluent speech. In this role, prosody either joins or separates speech units, which leads to a hierarchical organization that listeners exploit during speech recognition. One component of this prosodic hierarchy is the phonological phrase [1, 2]. The automatic detection of phonological phrases from acoustic information can be applied in automatic speech recognition, for example to filter out hypotheses incompatible with the observed segmentation, or in the training and use of language models. This work investigates the segmentation of fluent speech into phonological phrases using acoustic information, in particular vowel duration contrasts. The proposed system consists of two modules: one for the automatic detection of vowels, and the other for the segmentation of phonological phrases. For vowel detection we compare two alternatives: the first is based on conventional hidden Markov models, and the second uses recurrent neural networks to find vocalic nuclei [3], with a second phase in which the candidates are post-processed to handle special vowel contexts such as diphthongs and epenthetic vowels. That post-processing is based on a clustering procedure applied to the frames of the vowel nuclei, parameterized by continuous multiresolution entropy [4, 5]. For the segmentation of phonological phrases we also analyze two strategies: one based primarily on local differences in duration between consecutive vowels, and the other which additionally uses tonal and energy information from the analyzed segments. This presentation shows work in progress, together with a description of the proposed techniques.
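The first segmentation strategy, based on local duration differences between consecutive vowels, can be illustrated with a minimal sketch. The abstract does not specify the decision rule, so everything below is an assumption made for illustration: the function name `phrase_boundaries`, the representation of vowels as (start, end) times in seconds, the use of a duration ratio, the threshold of 1.5, and the premise that a vowel markedly longer than the one that follows it signals a phrase boundary.

```python
from typing import List, Tuple

# Hypothetical representation: each detected vowel is (start_s, end_s).
Vowel = Tuple[float, float]


def phrase_boundaries(vowels: List[Vowel],
                      ratio_threshold: float = 1.5) -> List[float]:
    """Hypothesize phrase boundaries from consecutive vowel durations.

    Assumed rule (not taken from the abstract): a boundary is placed at
    the end of a vowel whose duration is at least `ratio_threshold`
    times the duration of the next vowel.
    """
    boundaries = []
    for (s0, e0), (s1, e1) in zip(vowels, vowels[1:]):
        d0, d1 = e0 - s0, e1 - s1
        # Skip degenerate segments to avoid division by zero.
        if d1 > 0 and d0 / d1 >= ratio_threshold:
            boundaries.append(e0)
    return boundaries


if __name__ == "__main__":
    # Invented vowel intervals; the third vowel is clearly lengthened.
    detected = [(0.00, 0.06), (0.15, 0.21), (0.30, 0.45),
                (0.60, 0.66), (0.75, 0.80)]
    print(phrase_boundaries(detected))
```

A ratio is used here rather than an absolute difference so that the rule is less sensitive to overall speaking rate. The second strategy described above would add tonal and energy cues to this decision, which this sketch does not attempt.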