SINC(I)   25518
INSTITUTO DE INVESTIGACION EN SEÑALES, SISTEMAS E INTELIGENCIA COMPUTACIONAL
Executing Unit - UE
Conferences and scientific meetings
Title:
A deep learning approach for type-of-relation identification in biomedical publications
Author(s):
YONES, C.; MILONE, D. H.; BERTINETTI, J.; STEGMAYER, G.; BUGNON, L. A.; RAMIREZ, D.
Location:
online
Meeting:
Conference; XI-CAB2C; 2021
Organizing institution:
A2B2C
Abstract:
Background: The volume of biomedical literature is growing at an exponential rate. Such growth is a challenge for information retrieval beyond keyword search, such as discovering the type of relation between keywords. Machine learning can help biologists and health professionals cope with this overload by speeding up the identification of cutting-edge treatments or drug responses for a novel virus or disease. In this context, when the whole paper is analysed, it is particularly difficult to find the exact type of relation that links two keywords of interest.

Results: In this work we present a new deep neural network model that can classify relations between key terms in a full text. The model is a residual convolutional neural network which analyses hierarchical relations between each word in the text. Several embedding methods for converting text into useful inputs for the network were compared. To evaluate this approach, a dataset of 4,000 publications was built from PubMed. The dataset contains different drug-gene variant interactions tagged in 8 specific categories of relations, including, for example, "antagonist", "cofactor", "inhibitor" and "no interaction". A cross-validation scheme was used to find the best network architectures and to compare text embeddings.

Conclusions: We found promising results in the task of relation identification in full biomedical publications with the proposed model. Results show that keywords with "no interaction" are detected with a recall of 99%, which can be very useful for filtering out irrelevant texts. Among the specific interactions, "agonist", "cofactor" and "inhibitor" are detected with a recall in the range of 75-80%. Moreover, the proposed approach is flexible and can be applied to texts from related fields.
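
The per-category recall figures quoted in the conclusions can be read as the fraction of publications truly belonging to a category that the classifier assigns to that category. The following is a minimal sketch of that computation in Python; it is not the authors' evaluation code. The function name `per_class_recall` is hypothetical, the category names are taken from the abstract, and the toy labels are invented purely for illustration and do not reproduce any reported result.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = TP / (TP + FN) per true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Number of examples that truly belong to each category (TP + FN).
    support = Counter(y_true)
    # Number of examples of each category that were predicted correctly (TP).
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / n for label, n in support.items()}


# Toy labels, invented for illustration only (not data from the study).
y_true = ["no interaction"] * 4 + ["inhibitor"] * 4 + ["agonist"] * 2
y_pred = (
    ["no interaction"] * 4           # all four recovered
    + ["inhibitor"] * 3 + ["agonist"]  # one inhibitor missed
    + ["agonist", "cofactor"]        # one agonist missed
)

recall = per_class_recall(y_true, y_pred)
for label, value in recall.items():
    print(f"{label}: {value:.2f}")
```

A high recall for "no interaction" means that few truly irrelevant texts are mislabelled as carrying a specific relation, which is the property the abstract highlights for filtering.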