CONICET | Institutes and Human Resources Search

RESEARCHERS

SAD Gonzalo Daniel

articles

Title:

Complementary Gaussian Mixture Models for Multimodal Speech Recognition

Author(s):

GONZALO D. SAD; LUCAS D. TERISSI; JUAN CARLOS GÓMEZ

Journal:

LECTURE NOTES IN COMPUTER SCIENCE

Publisher:

Springer

References:

Year: 2015

ISSN:

0302-9743

Abstract:

In speech recognition systems, each word/phoneme in the vocabulary is typically represented by a model trained with samples of that particular class. Recognition is then performed by computing which model best represents the input word/phoneme to be classified. In this paper, a novel classification strategy based on complementary class models is presented. A complementary model for a particular class j is a model trained with instances of all the considered classes except those associated with class j. This work describes new multi-classifier schemes for isolated word speech recognition based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs). In particular, two different conditions are considered. When the data is represented by single feature vectors, a cascade classification scheme using HMMs and CGMMs is proposed. When the data is represented by multiple feature vectors, a classification scheme based on a voting strategy that combines scores from individual HMMs and CGMMs is proposed. The proposed classification schemes are evaluated on two audio-visual speech databases under noisy acoustic conditions. Experimental results show that the proposed classification methodologies improve recognition rates over a wide range of signal-to-noise ratios.
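
The complementary-model idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. To stay dependency-light, a single diagonal-covariance Gaussian stands in for each Gaussian mixture, and the decision rule (choose the class whose complementary model explains the input worst) is an assumption, since the abstract does not state it. The function names and the toy data are hypothetical.

```python
import numpy as np

def fit_diag_gaussian(X):
    """Fit a diagonal-covariance Gaussian; a stand-in for a full GMM."""
    mean = X.mean(axis=0)
    var = X.var(axis=0) + 1e-6  # small floor keeps the variance positive
    return mean, var

def log_likelihood(x, model):
    """Log-density of x under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var))

def fit_complementary_models(X, y):
    """For each class j, fit a model on every sample NOT labelled j."""
    return {int(j): fit_diag_gaussian(X[y != j]) for j in np.unique(y)}

def classify(x, comp_models):
    # Assumed decision rule: the input most likely belongs to the class
    # whose complementary model gives it the lowest log-likelihood.
    return min(comp_models, key=lambda j: log_likelihood(x, comp_models[j]))

# Toy data (hypothetical): three well-separated 2-D classes, 50 samples each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
X = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
y = np.repeat(np.arange(3), 50)

comp_models = fit_complementary_models(X, y)
print(classify(np.array([5.0, 0.0]), comp_models))
```

In the schemes the abstract describes, complementary scores of this kind are combined with scores from standard HMMs, either in a cascade or through voting; that combination step is not sketched here because the abstract gives no details of it.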
