IQUIBICEN   23947
INSTITUTO DE QUIMICA BIOLOGICA DE LA FACULTAD DE CIENCIAS EXACTAS Y NATURALES
Executing Unit - UE
Conferences and scientific meetings
Title:
DNA linear motif classes, how many? how different?
Author(s):
ALEJANDRO NADRA; APTEKMANN, ARIEL ALEJANDRO; IGNACIO SANCHEZ
Meeting:
Conference; 9CA2B2C; 2018
Abstract:
Sequence motifs are relatively short, recurring patterns. When found in DNA they are presumed to have a biological function. Some of them indicate sequence-specific binding sites for proteins, such as nucleases or transcription factors (TF). Conservation of a sequence frequently implies selective pressure, which in turn suggests a function, although some motifs have no apparent functionality. In this work we study sequence motif databases by modelling sequence motifs as regular expressions, which specify the length of the motif and which bases are allowed at each motif position. We develop a method for building a regular expression from position-specific scoring matrices. Using this representation we tackle three questions: How many linear motif classes remain to be discovered in nature? How many classes coexist on a genome? How different are the motifs on a genome? As a measure of motif specificity for a pair of linear motif classes, we quantify how many motif-discriminating positions prevent a subsequence from being an instance of the two classes at once. Naturally occurring pairs of DNA linear motif classes most often present one motif-discriminating position, which maximizes the potential number of coexisting linear motif classes. Increasing the size of the alphabet by means of modifications increases the potential number of coexisting linear motif classes. We calculate the fraction of all possible protein subsequences that would belong to a linear motif class if the potential number of coexisting linear motif classes came into actual existence. This number is highest if the specificity requirement is no motif-discriminating positions. We propose that naturally occurring DNA linear motif classes operate under mild specificity requirements that maximize the potential number of coexisting linear motif classes.
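The two central ideas of the abstract, turning a position-specific scoring matrix into a regular expression and counting motif-discriminating positions between two motif classes, can be illustrated with a minimal Python sketch. The abstract does not say how the regular expression is derived, so the rule used here (a base is allowed at a position when its frequency reaches a fixed `threshold`) is an assumption, as are the function names, the toy matrices `PSSM_A` and `PSSM_B`, and the restriction to equal-length motifs compared without any offset.

```python
import re

BASES = "ACGT"


def allowed_bases(pssm, threshold=0.25):
    """Return, for each motif position, the set of allowed bases.

    `pssm` is a list of dicts mapping base -> frequency, one per position.
    ASSUMPTION: a base is allowed when its frequency is >= `threshold`;
    the abstract does not specify the actual rule.
    """
    allowed = []
    for column in pssm:
        bases = {b for b in BASES if column.get(b, 0.0) >= threshold}
        if not bases:
            raise ValueError("no base reaches the threshold at some position")
        allowed.append(bases)
    return allowed


def to_regex(allowed):
    """Write the allowed-base sets as a regular expression string."""
    parts = []
    for bases in allowed:
        ordered = "".join(b for b in BASES if b in bases)
        # A single allowed base is written bare, several as a character class.
        parts.append(ordered if len(ordered) == 1 else "[" + ordered + "]")
    return "".join(parts)


def discriminating_positions(allowed_a, allowed_b):
    """Count positions where the two motifs share no allowed base.

    ASSUMPTION: both motifs have the same length and are compared
    position by position, without sliding one against the other.
    """
    if len(allowed_a) != len(allowed_b):
        raise ValueError("this sketch only compares motifs of equal length")
    return sum(1 for a, b in zip(allowed_a, allowed_b) if a.isdisjoint(b))


# Hypothetical toy matrices, not taken from any motif database.
PSSM_A = [
    {"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02},
    {"A": 0.50, "G": 0.50},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
]
PSSM_B = [
    {"T": 1.0},
    {"G": 0.6, "T": 0.4},
    {"C": 1.0},
]

motif_a = allowed_bases(PSSM_A)
motif_b = allowed_bases(PSSM_B)
regex_a = re.compile(to_regex(motif_a))
regex_b = re.compile(to_regex(motif_b))
n_discriminating = discriminating_positions(motif_a, motif_b)
```

In this toy case the two motifs share allowed bases at the second and third positions and differ only at the first, so a single motif-discriminating position is enough to prevent any three-base subsequence from matching both `regex_a` and `regex_b` at once.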