RESEARCHERS
NOGUERA Martin Ezequiel
conferences and scientific meetings
Title:
Decoding relationships between phenotypes and protein sequences
Author(s):
HERNÁNDEZ BERTHET, AYELÉN S.; APTEKMANN, ARIEL A.; SANCHEZ, IGNACIO E.; NOGUERA, MARTIN E.; ROMAN, ERNESTO A.
Meeting:
Conference; LI Reunión Anual de la Sociedad Argentina de Biofísica; 2023
Abstract:
Organisms live in diverse environments, so they carry molecular adaptations that allow them to grow under those conditions. One of the clearest examples of adaptation is the difference in optimal growth temperature between organisms. For an organism to survive, its essential metabolic pathways must be functional; the proteins that take part in them must therefore be active at the temperature at which the organism lives. Temperature thus acts as a conditioning factor that exerts selective pressure on protein sequences to fold and work properly. In this work we studied the relationship between protein sequences and associated phenotypes, such as temperature, by aligning a set of sequences from a protein family and comparing them pairwise, position by position. We defined the sequence divergence at each position by assigning a 1 if the amino acid differs and a 0 if it is the same, and studied the correlation between the sequence divergence and the difference in the phenotypes associated with each sequence. We then identified the group of positions that improves the correlation relative to the individual positions and used this group to predict the phenotype associated with protein sequences that were not in the training set. We tested this method on different protein families and phenotypes, such as adenylate kinase and optimal growth temperature, and HIV-1 protease and resistance to different inhibitors. We showed that strong correlations exist at single positions, and that an improvement is achieved when the most correlated positions are analyzed jointly and then used to predict the phenotype. The diversity of the systems explored makes this method valuable for finding sequence determinants of biological activity modulation and for predicting various functional features of uncharacterized members of a protein family.
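The pairwise comparison described in the abstract could be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not state: the phenotype difference is taken as an absolute difference, the correlation is Pearson's, and the group of positions is built with a simple greedy search. The function names, the toy alignment, and the temperatures are all hypothetical, and the prediction step is not covered.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def pairwise_divergence(seqs, positions):
    """For each pair of aligned sequences, sum the 0/1 divergence
    (1 if the residues differ, 0 if they match) over the given positions."""
    return [sum(a[p] != b[p] for p in positions)
            for a, b in combinations(seqs, 2)]


def pairwise_phenotype_difference(phenotypes):
    """Absolute phenotype difference per pair (an assumption of this sketch)."""
    return [abs(x - y) for x, y in combinations(phenotypes, 2)]


def greedy_position_group(seqs, phenotypes):
    """Rank positions by their individual correlation, then add a position
    to the group only if it strictly improves the joint correlation.
    The greedy strategy is an assumption, not the authors' stated method."""
    diffs = pairwise_phenotype_difference(phenotypes)

    def score(positions):
        return pearson(pairwise_divergence(seqs, positions), diffs)

    ranked = sorted(range(len(seqs[0])), key=lambda p: score([p]), reverse=True)
    group = [ranked[0]]
    best = score(group)
    for p in ranked[1:]:
        candidate = score(group + [p])
        if candidate > best:
            group.append(p)
            best = candidate
    return group, best


# Hypothetical toy alignment (equal-length sequences) and phenotype values.
toy_seqs = ["ALV", "ALI", "GLV", "GLI"]
toy_temps = [30.0, 35.0, 70.0, 75.0]

group, correlation = greedy_position_group(toy_seqs, toy_temps)
print(group, round(correlation, 3))
```

In this toy alignment only the first column tracks the phenotype, so the sketch mainly shows how the 0/1 divergence and the pairwise phenotype differences are lined up before a correlation is computed.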