RESEARCHERS
MOLLERACH Marta Eugenia
Conferences and scientific meetings
Title:
SpaTFinder “a new bioinformatic tool for improving Staphylococcus aureus spa typing”
Author(s):
VIELMA VALLENILLA J; NUSKE E; HAIM MS; RAGO L; MOLLERACH M; DI GREGORIO S
Location:
Cambridge
Meeting:
Conference; Applied Bioinformatics & Public Health Microbiology 2023; 2023
Organizing institution:
Wellcome Connecting Science Conferences
Abstract:
Introduction: Spa typing is one of the techniques used for the molecular epidemiology of Staphylococcus aureus and was traditionally carried out by PCR followed by Sanger sequencing. With the globalisation of whole genome sequencing (WGS) this approach has changed, because spa types can be determined directly from genomic assemblies. Currently, spa type determination can be performed from annotated genomes with tools available online, such as the fortinbras Spatyper and the CGE Spatyper, but the process is laborious and time-consuming when analysing a large number of genomes.
Objective: To address this issue, we aimed to develop a fast and scalable bioinformatic tool for spa type determination from assemblies. SpaTFinder is written in Python and applies data mining strategies.
Methods: SpaTFinder uses the BioPython and Pandas packages. First, annotated sequences in FASTA format (.ffn) are read and searched against a dictionary that contains the sequence of each spa repeat reported to date (https://spa.ridom.de/repeats.shtml). Secondly, the detected spa repeats are ordered by their genome position. If a new or unrecognised repeat is found in a position between two recognised repeats, it is stored as a possible new repeat. Then, the most similar pattern in the dictionary (according to the Needleman-Wunsch algorithm) is searched. The combination of repeats generated for each genome sequence is compared against the database available in the Ridom SpaServer (https://spa.ridom.de/). If the pattern exists, the compatible spa type is returned as the result. If not, the result is a novel spa type. To validate the program, 404 annotated S. aureus genomes were analysed with both SpaTFinder and the fortinbras Spatyper, and their results were compared.
Results: SpaTFinder assigned a spa type to 97.8% of genomes; the remaining 2.2% were non-typable. Fortinbras assigned a spa type to 92.3% of genomes, 7.0% harboured novel spa types and 0.7% were non-typable.
Of the 28 genomes that the Fortinbras Spatyper identified as novel, SpaTFinder was able to assign a known spa type to 19, and the remaining 9 were non-typable. The total run time (404 genomes) on a personal computer was 1 h 42 min. In comparison, both methods assigned the same spa type to 91.8% of genomes.
Conclusion: SpaTFinder is a scalable, time-efficient tool capable of analysing a large number of genomes with reliable results. Further developments will make this tool available as an online platform for non-bioinformaticians.
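Illustrative sketch (not the SpaTFinder source code): the steps described in the Methods can be outlined in a few lines of Python. This is a minimal sketch under stated assumptions. The repeat sequences, repeat IDs, type names and function names below are hypothetical toy values, not Ridom SpaServer entries; the standard library is used in place of BioPython and Pandas to keep the example self-contained; and the Needleman-Wunsch step is read here as scoring an unrecognised repeat against every known repeat, which is one possible interpretation of the abstract.

```python
# Toy dictionaries: hypothetical sequences and IDs, NOT real Ridom data.
TOY_REPEATS = {
    "AAAGAAGAC": "rA",
    "GGCAACAAA": "rB",
    "CCTGGTAAA": "rC",
}
TOY_TYPES = {("rA", "rB", "rC"): "type-1", ("rA", "rC"): "type-2"}


def find_repeats(sequence, repeats):
    """Return (position, repeat_id) hits ordered by position in the sequence."""
    hits = []
    for repeat_seq, repeat_id in repeats.items():
        start = sequence.find(repeat_seq)
        while start != -1:
            hits.append((start, repeat_id))
            start = sequence.find(repeat_seq, start + 1)
    return sorted(hits)


def assign_type(sequence, repeats, types):
    """Build the ordered repeat profile and look it up in the type table."""
    profile = tuple(repeat_id for _, repeat_id in find_repeats(sequence, repeats))
    if not profile:
        return "non-typable"
    return types.get(profile, "novel")


def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score (scoring values are assumptions)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, 1):
        curr = [i * gap]
        for j, char_b in enumerate(b, 1):
            diagonal = prev[j - 1] + (match if char_a == char_b else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def closest_repeat(unknown, repeats):
    """Return the ID and score of the known repeat most similar to `unknown`."""
    best_seq = max(repeats, key=lambda known: nw_score(unknown, known))
    return repeats[best_seq], nw_score(unknown, best_seq)


if __name__ == "__main__":
    toy_gene = "TTT" + "AAAGAAGAC" + "GGCAACAAA" + "CCTGGTAAA" + "GG"
    print(find_repeats(toy_gene, TOY_REPEATS))
    print(assign_type(toy_gene, TOY_REPEATS, TOY_TYPES))
    print(closest_repeat("AAAGATGAC", TOY_REPEATS))
```

In a real pipeline the two dictionaries would presumably be loaded from the SpaServer repeat and type lists, and the sequences read from the .ffn files; those loading steps are omitted here because the abstract does not describe them.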