IQUIBICEN   23947
INSTITUTO DE QUIMICA BIOLOGICA DE LA FACULTAD DE CIENCIAS EXACTAS Y NATURALES
Executing Unit - UE
Conferences and scientific meetings
Title:
Natural DNA motif exploration of sequence space
Author(s):
IGNACIO SANCHEZ; APTEKMANN, ARIEL A.; ALEJANDRO NADRA
Place:
Buenos Aires
Meeting:
Conference; Highlights of the 1st Argentine Symposium of Young Bioinformatics Researchers (1SAJIB); 2017
Organizing institution:
RSG-Argentina
Abstract:
There are many ways to represent a DNA binding site, each with its own advantages. Regular expressions (regex) are sequences of characters that form a search pattern, and they are commonly used to find patterns in character strings. It is possible to build a regular expression that represents a set of sequences that bind to a recognizer (for example, the set of binding sites for a transcription factor). Regular expressions make it possible to search through a large database, with the disadvantage that the search has a worse ROC curve than a search based on a position weight matrix (PWM) or on more complex models. In this work we generated regular expressions from DNA motif databases. We used these regular expressions, applying theorems originally developed for protein linear motifs, to address (among others) the following fundamental questions: How many linear motifs can coexist in the same genome? How different do they need to be? How occupied is the universe of possible motifs? What is the effect of alphabet size on occupancy? Are motifs in the same genome more different from each other than motifs in different genomes?
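As an illustration of how a regular expression can stand in for a set of binding sites, the following Python sketch collapses a few aligned sites into one character class per position and scans a sequence with the result. It is not the authors' procedure: the function names `sites_to_regex` and `find_matches`, the example sites, and the scanned sequence are all hypothetical, and the sketch assumes ungapped sites of equal length over the alphabet A, C, G, T.

```python
import re


def sites_to_regex(sites):
    """Collapse aligned, equal-length sites into one regex.

    Hypothetical helper: each alignment column becomes a single letter
    (if conserved) or a character class listing every letter observed.
    """
    if not sites:
        raise ValueError("need at least one site")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must be aligned and of equal length")
    parts = []
    for column in zip(*sites):
        letters = sorted(set(column))
        if len(letters) == 1:
            parts.append(letters[0])
        else:
            parts.append("[" + "".join(letters) + "]")
    return "".join(parts)


def find_matches(pattern, sequence):
    """Return (start, matched_text) pairs, including overlapping hits."""
    # A lookahead with a capture group reports overlapping occurrences.
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Made-up example sites, used only to exercise the functions.
example_sites = ["TATAAT", "TATGAT", "TACAAT"]
example_pattern = sites_to_regex(example_sites)
example_hits = find_matches(example_pattern, "GGTACGATCCTATAATGG")
print(example_pattern)
print(example_hits)
```

In this toy run the pattern also matches `TACGAT`, a sequence that is absent from the input sites. Because each position is treated independently, the regex accepts every combination of observed letters, which is one way to see why such a search can perform worse than a PWM-based one.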