Title:
The 13th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research
Author(s):
JUAN MANUEL MIRAMONT; JUAN FELIPE RESTREPO; J. CODINO; GASTÓN SCHLOTTHAUER; MARÍA CRISTINA JACKSON MENALDI
Location:
Montreal
Meeting:
Congress; The 13th International Conference on Advances in Quantitative Laryngology, Voice and Speech Research; 2021
Abstract:
Objectives
The aim of this research is to evaluate the use of classic and nonlinear dynamics features as objective measures for automatic voice classification into the three types proposed by Titze [1], where type 1 voices are nearly periodic, type 2 voices have strong modulating and subharmonic frequencies, and type 3 voices lack an apparent periodic structure.
Introduction
Perturbation measures are ubiquitous in clinical voice evaluation, but they fail to assess signals that suffer from heavy fluctuations. To determine the suitability of voice signals for perturbation analysis, a classification scheme was proposed by Titze [1]. Nevertheless, distinguishing among voice types is still rather subjective. As a solution, we propose an automatic algorithm for signal typing based on quantitative descriptors.
Methods
Correlation dimension, correlation entropy, and noise level were estimated using the recently proposed U-Correlation Integral [2]. In contrast to previous works [3], this method for estimating the attractor's invariants is automatic and user-independent. Additionally, Shimmer, Jitter, Harmonic-to-Noise Ratio (HNR), First Rahmonic (R1), and a novel feature called Principal Component Normalized Variance (PCNV), which measures the variance explained by the principal component of the set of the signal's periods, were included. Pathological voices from the MEEI [4] database were labeled by experts as type 1, 2, or 3 (207, 313, and 137 voices, respectively). First, a linear Support Vector Machine (SVM) was trained to separate type 1 and type 2 voices from type 3. Second, another SVM was trained to separate type 1 from type 2 voices. The rationale behind this is that some descriptors cannot be reliably measured for type 3 voices. 80% of the data were used to train and validate the model, while the remaining 20% were held out for testing. Validation measures were estimated by 10-fold cross-validation.
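The abstract does not give an exact formula for PCNV. A minimal sketch under one plausible reading — the signal's periods are extracted pitch-synchronously, resampled to equal length, stacked as rows of a matrix, and PCNV is the fraction of total energy captured by the first principal component — might look like this. The function name `pcnv` and the toy sine data are illustrative, not the authors' code:

```python
import numpy as np

def pcnv(periods):
    """Fraction of total energy explained by the first principal
    component of the period matrix (rows = equal-length periods).
    For a nearly periodic (type 1) voice the periods are almost
    identical, so the first component dominates and PCNV is near 1."""
    X = np.asarray(periods, dtype=float)
    s = np.linalg.svd(X, compute_uv=False)  # singular values
    return (s[0] ** 2) / np.sum(s ** 2)

# Toy example: 20 noisy copies of one cycle of a sine wave.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
rng = np.random.default_rng(0)
periods = np.sin(t) + 0.05 * rng.standard_normal((20, 64))
print(round(pcnv(periods), 3))  # close to 1 for a near-periodic signal
```

A heavily aperiodic (type 3) signal would spread its energy over many components, pushing PCNV toward 1/min(rows, columns).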
A subset of the extracted features was selected by forward feature selection.
Results
For the classification of types 1 and 2 vs. type 3, R1, HNR, and noise level were used. The accuracy obtained was 93.18 ± 1.6% (mean and standard deviation), where 91.06 ± 2.0% of type 1 and 2 voices and 93.82 ± 1.73% of type 3 voices were correctly classified. The accuracy on the test set was 90.25%. For type 1 vs. type 2 classification, PCNV, R1, noise level, and correlation entropy were selected. The accuracy obtained was 83.64 ± 1.8%, where 85.46 ± 2.0% of type 1 voices and 81.27 ± 2.02% of type 2 voices were correctly classified. The accuracy on the test set was 82.69%.
Conclusions
The nonlinear dynamics features used were estimated with a user-independent method, which is a further step towards a fully automatic tool for objective voice type classification. Our results showed that the proposed features can be used as objective measures to distinguish between voice types. Further research will include a statistical evaluation of inter-rater agreement to assess the generalizability of the proposed approach.
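The two-stage cascade described in the Methods (stage 1 separates types 1 and 2 from type 3; stage 2 separates type 1 from type 2, since some descriptors are unreliable for type 3 voices) can be sketched as follows. The features here are synthetic stand-ins for the real descriptors, the class counts mirror the abstract (207/313/137), and scikit-learn's `SVC` is assumed as the linear SVM implementation — the abstract does not name a toolkit:

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in features for 657 voices labeled 1, 2, or 3.
rng = np.random.default_rng(0)
counts = {1: 207, 2: 313, 3: 137}
X = np.vstack([rng.normal(loc=k, scale=0.8, size=(n, 3))
               for k, n in counts.items()])
y = np.concatenate([np.full(n, k) for k, n in counts.items()])

# 80% train/validation, 20% held-out test, as in the abstract.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Stage 1: types 1 and 2 vs. type 3 (binary target: is it type 3?).
stage1 = make_pipeline(StandardScaler(), SVC(kernel="linear"))
y1_tr = (y_tr == 3).astype(int)
print("stage 1 CV accuracy:",
      cross_val_score(stage1, X_tr, y1_tr, cv=10).mean())
stage1.fit(X_tr, y1_tr)

# Stage 2: type 1 vs. type 2, trained only on non-type-3 voices.
mask = y_tr != 3
stage2 = make_pipeline(StandardScaler(), SVC(kernel="linear"))
print("stage 2 CV accuracy:",
      cross_val_score(stage2, X_tr[mask], y_tr[mask], cv=10).mean())
stage2.fit(X_tr[mask], y_tr[mask])

# Cascade prediction on the held-out 20%.
pred = np.where(stage1.predict(X_te) == 1, 3, stage2.predict(X_te))
print("test accuracy:", (pred == y_te).mean())
```

The cascade mirrors the rationale in the abstract: a voice is routed to the type 1 vs. type 2 classifier only if stage 1 judges it is not type 3. Forward feature selection, omitted here, would wrap each stage's `cross_val_score` in a loop that greedily adds the feature giving the largest validation-accuracy gain.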