
Research has shown that trust is an essential aspect of human-computer interaction, directly determining the degree to which a person is willing to use a system. An automatic prediction of the level of trust that a user has in a given system could be used to correct potential distrust by having the system take relevant actions such as apologizing or explaining its decisions. With this goal in mind, in this work we explore the feasibility of automatically detecting the technical competence or ability of a virtual assistant (VA) from the user's speech, a simple proxy for the task we truly care about: detecting whether the user trusts the ability of the VA. Since, to our knowledge, no public databases were available to perform such a study, we developed a novel protocol for collecting speech data from subjects interacting with VAs of different skill levels. The protocol consists of an interactive session in which the subject is asked to respond to a series of factual questions with the help of a VA. At the beginning of each session, subjects are informed that the VA they are going to use has been previously rated by other users as either competent or incompetent. During the session, the VA answers the subject's questions consistently with its alleged ability. All interactions are speech-based, with subjects and VAs communicating verbally, which allows the recording of speech produced under different conditions. The goal of the protocol was to induce subjects to either trust or distrust the VA's skills, assuming that this would, in turn, affect their speech patterns in ways that could be automatically detected. Using this protocol, we collected a speech corpus in Argentine Spanish, which is publicly available for research use. We show clear evidence that the protocol succeeded in inducing in subjects the desired mental state of either trusting or distrusting the agent's skills.
Using the collected data, we developed a system to detect the ability of the VA with which a subject interacted during a session, based on the subject's speech patterns. We found that it was possible to detect whether the VA was competent or incompetent with an accuracy of up to 76%, compared to a random baseline of 50%. Our analysis suggests that these results are possible because subjects change the way they speak to the VA depending on whether they perceive it as more or less competent; that is, depending on whether they trust its ability or not.
