RESEARCHERS
GOMEZ Federico Jose Vicente
Articles
Title:
Predicting the formation of NADES using a transformer‑based model
Author(s):
LUCAS B. AYRES; FEDERICO J. V. GOMEZ; MARIA FERNANDA SILVA; JEB R. LINTON; CARLOS GARCÍA
Journal:
Scientific Reports
Publisher:
Springer Nature
References:
Year: 2023
ISSN:
2045-2322
Abstract:
The application of natural deep eutectic solvents (NADES) in the pharmaceutical, agricultural, and food industries represents one of the fastest growing fields of green chemistry, as these mixtures can potentially replace traditional organic solvents. These advances are, however, limited by the development of new NADES which is today, almost exclusively empirically driven and often derivative from known mixtures. To overcome this limitation, we propose the use of a transformer-based machine learning approach. Here, the transformer-based neural network model was first pre-trained to recognize chemical patterns from SMILES representations (unlabeled general chemical data) and then fine-tuned to recognize the patterns in strings that lead to the formation of either stable NADES or simple mixtures of compounds not leading to the formation of stable NADES (binary classification). Because this strategy was adapted from language learning, it allows the use of relatively small datasets and relatively low computational resources. The resulting algorithm is capable of predicting the formation of multiple new stable eutectic mixtures (n = 337) from a general database of natural compounds. More importantly, the system is also able to predict the components and molar ratios needed to render NADES with new molecules (not present in the training database), an aspect that was validated using previously reported NADES as well as by developing multiple novel solvents containing ibuprofen. We believe this strategy has the potential to transform the screening process for NADES as well as the pharmaceutical industry, streamlining the use of bioactive compounds as functional components of liquid formulations, rather than simple solutes.