RESEARCHERS
FERRER Luciana
conferences and scientific meetings
Title:
Effective Use of DCTs for Contextualizing Features for Speaker Recognition
Author(s):
MITCH MCLAREN; NICOLAS SCHEFFER; LUCIANA FERRER; YUN LEI
Location:
Florence
Meeting:
Conference; IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP); 2014
Organizing institution:
IEEE
Abstract:
This article proposes a new approach for contextualizing features for speaker recognition through the discrete cosine transform (DCT). Specifically, we apply a 2D-DCT to the Mel filterbank outputs to replace the common Mel frequency cepstral coefficients (MFCCs) appended with deltas and double deltas. A thorough comparison of algorithms for delta computation and DCT-based contextualization for speaker recognition is provided, and the effect of varying the size of the analysis window in each case is considered. Selection of 2D-DCT coefficients using a zig-zag approach permits definition of an arbitrary feature dimension using the most energized coefficients. We show that 60 coefficients computed using our approach outperform the standard MFCCs appended with double deltas by up to 25% relative on the NIST 2012 speaker recognition evaluation (SRE) corpus in both Cprimary and equal error rate (EER), while additional coefficients increase system robustness to noise.
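The abstract's core idea, a 2D-DCT over a window of Mel filterbank outputs followed by zig-zag coefficient selection, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the window is a list of frames (rows), each a list of log Mel filterbank energies (columns), it uses an unnormalized DCT-II, and it uses a JPEG-style zig-zag order. The function names `dct2_1d`, `dct2_2d`, `zigzag_indices`, and `contextualized_feature` are hypothetical and chosen only for illustration.

```python
import math


def dct2_1d(x):
    """Unnormalized DCT-II of a sequence (direct O(n^2) formula)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def dct2_2d(block):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct2_1d(r) for r in block]
    # cols[j][i] holds the coefficient at (row i, column j).
    cols = [dct2_1d(list(c)) for c in zip(*rows)]
    # Transpose back so the result is indexed as result[i][j].
    return [list(r) for r in zip(*cols)]


def zigzag_indices(n_rows, n_cols):
    """(row, col) pairs ordered by anti-diagonal, alternating direction."""
    order = []
    for s in range(n_rows + n_cols - 1):
        diag = [(i, s - i) for i in range(n_rows) if 0 <= s - i < n_cols]
        order.extend(diag if s % 2 else diag[::-1])
    return order


def contextualized_feature(window, n_coeffs):
    """Keep the first n_coeffs 2D-DCT coefficients in zig-zag order.

    Assumption: window[t][m] is the log Mel filterbank energy of
    channel m at frame t within the analysis window.
    """
    coeffs = dct2_2d(window)
    idx = zigzag_indices(len(window), len(window[0]))[:n_coeffs]
    return [coeffs[i][j] for i, j in idx]


if __name__ == "__main__":
    # Toy window: 3 frames by 4 filterbank channels (made-up values).
    toy_window = [[float(t + m) for m in range(4)] for t in range(3)]
    print(contextualized_feature(toy_window, 5))
```

Because the zig-zag order visits low-order coefficients first, truncating it at any length gives a feature vector of that dimension, which is how the abstract describes an arbitrary feature dimension being defined. The window length, number of filterbank channels, and any coefficients skipped before selection would need to follow the paper's actual configuration.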