ICC   25427
INSTITUTO DE INVESTIGACION EN CIENCIAS DE LA COMPUTACION
Executing Unit - UE
Conferences and scientific meetings
Title:
Calibration Approaches for Language Detection
Author(s):
LUCIANA FERRER; AARON LAWSON; MITCHELL MCLAREN; DIEGO CASTÁN
Location:
Stockholm
Meeting:
Conference; Interspeech 2017; 2017
Organizing institution:
ISCA
Abstract:
To date, automatic spoken language detection research has largely been based on a closed-set paradigm, in which the languages to be detected are known prior to system application. In actual practice, such systems may face previously unseen, out-of-set (OOS) languages that should be rejected, a common problem that has received limited attention from the research community. In this paper, we focus on situations in which either (1) the system-modeled languages are not observed during use or (2) the test data contains OOS languages that are unseen during modeling or calibration. In these situations, the common multi-class objective function for calibration of language-detection scores is problematic. We describe how the assumptions of multi-class calibration are not always fulfilled in a practical sense and explore applying global and language-dependent binary objective functions to relax system constraints. We contrast the benefits and sensitivities of the calibration approaches on practical scenarios by presenting results using both LRE09 data and 14 languages from the BABEL dataset. We show that the global binary approach is less sensitive to the characteristics of the training data and that OOS modeling with individual detectors is the best option when OOS test languages are not known to the system.
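The contrast between the multi-class and binary calibration objectives mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the unweighted averaging, and the convention of marking an OOS sample with the label `None` are all assumptions made for illustration. It only shows why a multi-class cross-entropy needs every sample to belong to one modeled language, while a pooled binary cross-entropy can treat an OOS sample as a non-target trial for every detector.

```python
import math

def log_softmax(scores):
    # Numerically stable log-softmax over one sample's per-language scores.
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def multiclass_loss(scores, labels):
    # Mean multi-class cross-entropy. Assumption: every sample belongs to
    # exactly one modeled language, so each label is a valid class index.
    total = 0.0
    for s, y in zip(scores, labels):
        total -= log_softmax(s)[y]
    return total / len(scores)

def global_binary_loss(scores, labels):
    # Mean binary cross-entropy pooled over all (sample, language) trials.
    # Assumption: a label of None marks an OOS sample, which is a
    # non-target trial for every language detector.
    total, n = 0.0, 0
    for s, y in zip(scores, labels):
        for k, score in enumerate(s):
            total -= log_sigmoid(score) if k == y else log_sigmoid(-score)
            n += 1
    return total / n

# Hypothetical scores for three modeled languages.
in_set = [[2.0, -1.0, -3.0], [-2.0, 1.5, -1.0]]
in_set_labels = [0, 1]
oos = [[-2.5, -1.0, -3.0]]  # sample from a language the system does not model

mc = multiclass_loss(in_set, in_set_labels)
gb = global_binary_loss(in_set + oos, in_set_labels + [None])
```

In this sketch the OOS sample can only enter the binary objective; passing it to `multiclass_loss` would require adding an explicit OOS class. A language-dependent binary variant would compute the same binary loss separately for each language column rather than pooling all trials.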