RESEARCHERS
STEGMAYER Georgina Silvia
Conferences and scientific meetings
Title:
A novel approach for highly-diverse multi-omics data fusion applied to tomato germplasm selection
Author(s):
M. PIVIDORI, G. STEGMAYER, A. CERNADAS, M. CONTE, F. CARRARI, D.H. MILONE
Meeting:
Conference; 4th International Society for Computational Biology Latin America Bioinformatics Conference (ISCB-LA); 2016
Organizing institution:
International Society for Computational Biology (ISCB)
Abstract:
Tomato (Solanum lycopersicum) is one of the major vegetable crops consumed worldwide and a valuable source of vitamins and antioxidants in the human diet. Because breeding programs are constrained in variability, the phenotypic and genetic diversity of heirloom varieties emerges as a key resource for rescuing desired agronomic traits for crop improvement. A germplasm collection of Andean tomato landraces, originally cultivated by family farmers in the Cuyo region (Mendoza, Argentina), was characterized based on the morpho-agronomic and biochemical traits of their mature fruits. Over several growing seasons, highly-diverse kinds of quantitative and qualitative measurements were obtained: GC-MS, NMR and HPLC to quantify soluble and volatile fruit metabolites; transcriptomics to assess gene expression; and tasting panels to evaluate consumer preferences. Applying a classical clustering approach to integrate these heterogeneous variables and find hidden relations would require very complex, manual and time-consuming preprocessing to normalize each data source. This would have to be done one by one for each variable and technique, and would also depend heavily on the assumptions of the chosen clustering method. To the best of our knowledge, there are currently no methods available for integrating such highly-diverse, complex data (i.e., metabolomics, transcriptomics, agronomics, tasting panels and categorical/quality assessment data) into a single integrative analysis.
This novel approach for highly-diverse multi-modal data fusion was shown to have several advantages: i) it requires no preprocessing of any of the input data to perform heterogeneous data fusion, thus enabling simple integration of categorical data as well as different types of numerical data; ii) it does not demand a particular number of replicates for each type of measure, and that number may differ between measures; and iii) it is especially suited to cases where highly-diverse kinds of variables (measures) have to be compared or clustered, in particular when they are not available for all the biological material under analysis. Furthermore, the new method of cluster generation and analysis, based on accession diversity and data collected over several seasons, could readily help infer the traits most likely to be stably inherited, for germplasm selection.