FELLOWSHIPS
CACCHIARELLI Paolo
Conferences and scientific meetings
Title:
de novo assembly of tomato genomes with emphasis in chromosome 6 sHSPs
Author(s):
CACCHIARELLI, PAOLO; KRSTICEVIC, FLAVIA; EZPELETA, JOAQUÍN; TAPIA, ELIZABETH; PRATTA, GUILLERMO R.
Location:
Mar del Plata
Meeting:
Conference; IX Argentinian Congress of Bioinformatics and Computational Biology 9CAB2C; 2018
Organizing institution:
Asociación Argentina de Bioinformática y Biología Computacional - A2B2C
Abstract:
Evidence is accumulating worldwide that a single reference genome is inadequate for capturing the extensive genetic diversity present in tomato (Solanum lycopersicum) and in other crops such as soybean and maize. Consequently, pan-genome projects are now under way in various species to cover a wider range of biological variation. The small heat shock proteins (sHSPs) are a very diverse multigene family, expressed under different types of environmental stress and strongly induced and highly abundant during tomato ripening. In addition, several analyses identified 4 tandemly duplicated, intronless sHSP genes mapping within a ~17.9 kbp region of chromosome 6 in S. lycopersicum cv. Heinz 1706. This repetitive region poses a computational challenge for assembly; consequently, conventional alignment methods based on a reference genome are insufficient to accurately identify the real gene copy number, i.e., the number of chromosome loci occupied by these genes. To contribute alternatives for improving sHSP identification, we evaluated a de novo assembly strategy to establish more precisely the gene copy number in this chromosome 6 region in six wild tomato relatives. Genomes of these Solanaceae species were obtained from project ERA282888, downloaded from the DNA Data Bank of Japan, and assembled de novo with the SPAdes assembler (v3.10.1). Whereas previous reference-guided genome assemblies always showed 4 sHSP copies, likely because of reference bias, the de novo assemblies showed 5 copies in S. arcanum LA2172, 3 in S. habrochaites LA1777, 4 in S. habrochaites LA0407, 5 in S. pennellii LA0716, 5 in S. neorickii LA2133, and 4 in S. huaylasense LA1364. De novo assemblies are a better approach for detecting genetic variants that are missed when relying on a reference genome. Development of pan-genome resources will further promote evolutionary and functional studies in the tomato clade.
Although a pan-genome is not yet available for Solanum spp., we have started with an alternative strategy to address this challenge; in the short term, incorporating long-read sequencing data could provide more accurate information for establishing a robust workflow to better characterize tomato biodiversity.
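
The copy numbers reported in the abstract can be compared against the reference count of 4 with a few lines of Python. This is only an illustrative sketch and is not part of the authors' workflow: the names `REFERENCE_COPIES`, `DE_NOVO_COPIES`, `copy_number_differences`, and `accessions_differing` are hypothetical, and the counts are simply those stated in the abstract.

```python
# Illustrative only: tabulate the sHSP copy numbers stated in the abstract.
# All names here are hypothetical; they are not from the authors' pipeline.

# Copies found in the chromosome 6 region of S. lycopersicum cv. Heinz 1706,
# and in every reference-guided assembly according to the abstract.
REFERENCE_COPIES = 4

# Copies reported for each de novo assembly.
DE_NOVO_COPIES = {
    "S. arcanum LA2172": 5,
    "S. habrochaites LA1777": 3,
    "S. habrochaites LA0407": 4,
    "S. pennellii LA0716": 5,
    "S. neorickii LA2133": 5,
    "S. huaylasense LA1364": 4,
}


def copy_number_differences(de_novo, reference):
    """Return {accession: de novo count minus reference count}."""
    return {acc: n - reference for acc, n in de_novo.items()}


def accessions_differing(de_novo, reference):
    """Return the sorted accessions whose count differs from the reference."""
    return sorted(acc for acc, n in de_novo.items() if n != reference)


if __name__ == "__main__":
    deltas = copy_number_differences(DE_NOVO_COPIES, REFERENCE_COPIES)
    for acc, delta in deltas.items():
        print(f"{acc}: {DE_NOVO_COPIES[acc]} copies ({delta:+d} vs. reference)")
```

Under these figures, four of the six accessions depart from the 4 copies that reference-guided assemblies reported, which is the discrepancy the abstract attributes to reference bias.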