RESEARCHERS
DI CONZA Jose Alejandro
Conferences and scientific meetings
Title:
SpaTFinder “a new bioinformatic tool for improving Staphylococcus aureus spa typing”
Author(s):
VIELMA VALLENILLA JESÚS; NUSKE EZEQUIEL; HAIM MARÍA SOL; RAGO LUCÍA; STAPHNET-SA CONSORTIUM; DI CONZA JOSÉ; MOLLERACH MARTA; DI GREGORIO SABRINA
Location:
Hinxton
Meeting:
Conference; 9th Applied Bioinformatics and Public Health Microbiology; 2023
Organizing institution:
Sanger
Abstract:
Introduction: Spa typing is one of the techniques used for the molecular epidemiology of Staphylococcus aureus and was traditionally carried out by PCR followed by Sanger sequencing. With the global spread of whole genome sequencing (WGS), this approach has changed, because spa types can be determined directly from genomic assemblies. Currently, spa type determination can be performed from annotated genomes with tools available online, such as the Fortinbras Spatyper and the CGE Spatyper, but the process is laborious and time-consuming when analysing a large number of genomes.
Objective: To address this issue, we aimed to develop a fast and scalable bioinformatic tool for spa type determination from assemblies. SpaTFinder is written in Python and applies data mining strategies.
Methods: SpaTFinder uses the BioPython and Pandas packages. First, annotated sequences in FASTA format (.ffn) are read and searched against a dictionary containing the sequence of each spa repeat reported to date (https://spa.ridom.de/repeats.shtml). Second, the detected spa repeats are ordered by their genome position. If a new or unrecognised repeat is found between two recognised repeats, it is stored as a possible new repeat. Then, the most similar pattern in the dictionary (according to the Needleman-Wunsch algorithm) is searched for. The combination of repeats generated for each genome sequence is compared against the database available in the Ridom SpaServer (https://spa.ridom.de/). If the pattern exists, the matching spa type is returned. If not, the result is reported as a novel spa type. To validate the program, 404 annotated S. aureus genomes were analysed with both SpaTFinder and the Fortinbras Spatyper, and the results were compared.
Results: SpaTFinder assigned a spa type to 97.8% of genomes; the remaining 2.2% were non-typeable. The Fortinbras Spatyper assigned a spa type to 92.3% of genomes; 7.0% harboured novel spa types and 0.7% were non-typeable. Of the 28 genomes that the Fortinbras Spatyper identified as novel, SpaTFinder assigned a known spa type to 19, and the remaining 9 were non-typeable. The total run time (404 genomes) on a personal computer was 1 h 42 min. Both methods assigned the same spa type to 91.8% of genomes.
Conclusion: SpaTFinder is a scalable, time-efficient tool, capable of analysing a large number of genomes with reliable results. Further development will make this tool available as an online platform for non-bioinformaticians.
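Illustrative sketch: the steps described in the Methods (repeat lookup, ordering by position, Needleman-Wunsch comparison of an unrecognised stretch, and pattern lookup) might look roughly like the Python below. This is not SpaTFinder's code. It uses only the standard library rather than BioPython and Pandas, and every name in it is hypothetical: the repeat sequences, repeat IDs, type labels, scoring values and function names are toy values, not Ridom SpaServer entries. The rule that a sequence with no recognised repeat is reported as non-typeable is also an assumption.

```python
# Toy data, invented for illustration only (NOT real Ridom SpaServer entries).
TOY_REPEATS = {"rA": "AAAGAAGAC", "rB": "GGCAACAAA", "rC": "CCTGGTAAA"}
TOY_TYPES = {("rA", "rB", "rC"): "type_1"}


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (score only, no traceback)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def find_repeats(seq, repeats):
    """Return (position, repeat_id) for every exact hit, ordered by position."""
    hits = []
    for rid, rseq in repeats.items():
        start = seq.find(rseq)
        while start != -1:
            hits.append((start, rid))
            start = seq.find(rseq, start + 1)
    return sorted(hits)


def unrecognised_gaps(seq, hits, repeats):
    """Return the stretches lying between consecutive recognised repeats."""
    gaps = []
    for (p1, r1), (p2, _) in zip(hits, hits[1:]):
        end1 = p1 + len(repeats[r1])
        if p2 > end1:
            gaps.append(seq[end1:p2])
    return gaps


def closest_repeat(candidate, repeats):
    """Return the ID of the known repeat with the highest alignment score."""
    return max(repeats, key=lambda rid: nw_score(candidate, repeats[rid]))


def assign_type(seq, repeats, types):
    """Look up the ordered repeat pattern in the type table."""
    hits = find_repeats(seq, repeats)
    if not hits:
        return "non-typeable"  # assumption: no recognised repeat at all
    pattern = tuple(rid for _, rid in hits)
    return types.get(pattern, "novel")


if __name__ == "__main__":
    known = "TTT" + "AAAGAAGAC" + "GGCAACAAA" + "CCTGGTAAA" + "GG"
    print(assign_type(known, TOY_REPEATS, TOY_TYPES))  # type_1

    # Middle repeat carries one substitution, so it is not an exact hit.
    variant = "AAAGAAGAC" + "GGCAACTAA" + "CCTGGTAAA"
    hits = find_repeats(variant, TOY_REPEATS)
    for gap_seq in unrecognised_gaps(variant, hits, TOY_REPEATS):
        print(gap_seq, "is closest to", closest_repeat(gap_seq, TOY_REPEATS))
```

Keeping the exact-match lookup separate from the alignment step means the slower Needleman-Wunsch comparison runs only on the short stretches that failed the dictionary lookup, which is one plausible way to keep run time manageable across hundreds of genomes.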