RESEARCHERS
ARNEODO Ezequiel Matias
Conferences and scientific meetings
Title:
A brain-machine-interface to generate vocal communications
Author(s):
S. CHEN; E. M. ARNEODO; D. E. BROWN, JR; V. GILJA; T. GENTNER
Location:
San Diego
Meeting:
Conference; SfN 2018; 2018
Organizing institution:
Society for Neuroscience
Abstract:
Brain-machine interfaces (BMIs) can restore impaired motor function and have been used to probe the mapping between neural activity and motor control. State-of-the-art BMIs fall short, however, when it comes to decoding complex, high-dimensional behaviors such as vocal communication. Using birdsong as a model of complex behavior analogous to human speech, we previously created a birdsong BMI in which a simple feedforward neural network fits spiking activity in the sensorimotor region HVC (used as a proper noun) to the parameters of a low-dimensional model of zebra finch syringeal dynamics that generates natural song. The dimensionality reduction provided by the syringeal model is crucial to the feedforward network's performance. Here we propose a new method that incorporates advances in machine learning, specifically a Long Short-Term Memory (LSTM) network capable of temporal sequence mapping, to produce a BMI that translates HVC spiking activity directly into a frequency-domain representation (mel spectrogram) of the bird's own song. The LSTM-based BMI yields synthetic bird's-own songs that sound similar to natural songs using as little as 20% of a 70-song repertoire for training. Acoustic variability in the BMI-synthesized songs (computed as the RMSE between spectrograms) falls within the range of natural variation in the bird's own songs, and is significantly lower than the variability between songs from different conspecific birds. The LSTM-based BMI can also reconstruct novel vocalizations not presented to the network during training. For birdsong researchers, these results provide a platform where song output (and thus auditory feedback) can be modulated far more precisely than with previous methods.
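The variability metric described above can be sketched as follows. This is a minimal, hypothetical illustration of RMSE between equal-sized spectrograms with toy numbers, not the authors' analysis pipeline; the function name and the toy 3-frame, 2-bin spectrograms are assumptions for demonstration only.

```python
import math

def spectrogram_rmse(spec_a, spec_b):
    """Root-mean-square error between two equal-sized spectrograms,
    each given as a list of frames (lists of mel-bin magnitudes)."""
    total, n = 0.0, 0
    for frame_a, frame_b in zip(spec_a, spec_b):
        for a, b in zip(frame_a, frame_b):
            total += (a - b) ** 2
            n += 1
    return math.sqrt(total / n)

# Toy spectrograms: 3 frames x 2 mel bins (illustrative values only).
song = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]          # one rendition of the bird's own song
rendition = [[1.1, 2.1], [2.9, 4.2], [5.0, 5.8]]     # another rendition, small deviations
conspecific = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]   # a different bird's song

within = spectrogram_rmse(song, rendition)       # within-bird variability
across = spectrogram_rmse(song, conspecific)     # across-bird variability
```

In the abstract's terms, a synthesized song whose RMSE to the bird's own renditions falls in the `within` range, well below the `across` range, is indistinguishable from natural variation by this measure.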
The BMI also provides a framework for a deeper quantitative investigation of the general intuition that transforming HVC spiking patterns into a high-dimensional vocal motor behavior involves substantial non-linearities, which are captured by the recurrent architecture of an LSTM. Comparing the capacities of various network architectures to generate song from the neural activity of RA and other well-studied song system nuclei can help isolate the source of different non-linear mappings and more specifically define processing functions throughout the song system. We suggest that once fully optimized, such a system will substantially advance our understanding of the physiological mechanisms behind vocal communication, and will benefit fully automated assistive technologies for restoring a much wider range of lost motor function than is currently possible.
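The recurrence that lets an LSTM capture temporal non-linearities can be illustrated with a single-unit cell stepping through binned spike counts. This is a toy sketch of the standard LSTM update equations, not the trained network from the study; the arbitrary weights and the toy spike counts are assumptions for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, p):
    """One step of a single-unit LSTM cell.
    p maps each gate name to (input weight, recurrent weight, bias)."""
    i = sigmoid(p["i"][0] * x + p["i"][1] * h + p["i"][2])    # input gate
    f = sigmoid(p["f"][0] * x + p["f"][1] * h + p["f"][2])    # forget gate
    g = math.tanh(p["g"][0] * x + p["g"][1] * h + p["g"][2])  # candidate value
    o = sigmoid(p["o"][0] * x + p["o"][1] * h + p["o"][2])    # output gate
    c_new = f * c + i * g            # cell state carries context across time steps
    h_new = o * math.tanh(c_new)     # non-linear output at this time step
    return h_new, c_new

# Arbitrary fixed weights (a real network would learn these).
params = {gate: (0.5, 0.5, 0.0) for gate in "ifgo"}

h = c = 0.0
for spike_count in [0, 3, 1, 0, 2]:   # toy binned HVC spike counts
    h, c = lstm_step(spike_count, h, c, params)
```

Because `c` is carried forward and mixed non-linearly with each new input, the output at any time step depends on the whole preceding spike sequence, which is exactly the property a feedforward network lacks.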