IABIMO   27858
INSTITUTO DE AGROBIOTECNOLOGIA Y BIOLOGIA MOLECULAR
Executing Unit - UE
Conferences and scientific meetings
Title:
A new feature selection approach for genomic prediction methods
Author(s):
GARCÍA, MARTÍN NAHUEL; RIVAS, JUAN GABRIEL; HARRAND, LEONEL; HOPP, HORACIO ESTEBAN; VILLALBA, PAMELA VICTORIA; ACUÑA, CINTIA VANESA; AGUIRRE, NATALIA CRISTINA; MARCÓ, MARTÍN; MARTÍNEZ, MARÍA CAROLINA; OBERSCHELP, JAVIER; MARCUCCI POLTRI, SUSANA NOEMÍ
Location:
Virtual Congress
Meeting:
Congress; First Latin American Congress of Women in Bioinformatics and Data Science (Virtual Edition 2020); 2020
Abstract:
Genomic selection (GS) is based on the simultaneous estimation of the effects of all available markers along the genome in order to predict individual breeding values. In GS, genetic markers covering the whole genome are used so that every quantitative trait locus (QTL) is in linkage disequilibrium with at least one marker. However, it is reasonable to assume that not all markers contribute to the trait of interest and that eliminating irrelevant and redundant markers will yield more accurate models. Reducing dimensionality also makes it possible to keep only those markers linked to QTLs that are directly or indirectly involved in the performance of the trait, as well as in dominance or epistatic effects. In addition, more compact models generalize better. We propose a new feature selection methodology based on a meta-analysis of the marker effects estimated by three intrinsically different GS methodologies: a linear regression (Ridge regression BLUP), a Bayesian linear regression (Bayes LASSO) and a non-parametric method (Random Forest). We evaluated the performance of this feature selection method in two plant species: a simulated Zea mays L. dataset comprising 1250 doubled haploid (DH) lines genotyped for 1117 SNPs, with one quantitative trait (Wimmer et al., 2012), and a real Eucalyptus grandis dataset comprising 131 full-sib individuals genotyped for 2378 DArT and 74 SSR markers, with three quantitative traits (García 2013). Compared with standard GS methodologies, this approach achieved higher accuracy, measured as the Pearson correlation between predicted and observed phenotypes. The gain in accuracy was most evident for low-heritability traits, which matters particularly for traits that are difficult or expensive to measure. Therefore, this strategy could improve GS in plants.
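The abstract does not state how the effects from the three methods are combined, so the following Python sketch is only one plausible reading and not the authors' procedure. It assumes that per-marker effects (or importances) have already been estimated by each method, combines them by averaging ranks (a simple stand-in for the meta-analysis), and keeps the top-ranked markers. The function names, the `keep` parameter and the toy values are hypothetical. The `pearson` helper illustrates the accuracy measure named in the abstract.

```python
from math import sqrt


def ranks(scores):
    """Rank markers by absolute score; 1 = largest (ties broken by index)."""
    order = sorted(range(len(scores)), key=lambda i: -abs(scores[i]))
    result = [0] * len(scores)
    for position, marker in enumerate(order, start=1):
        result[marker] = position
    return result


def select_markers(effects_by_method, keep):
    """Average the per-method ranks and return the indices of the `keep`
    best-ranked markers. Rank averaging is an assumption of this sketch."""
    per_method = [ranks(scores) for scores in effects_by_method.values()]
    n_markers = len(per_method[0])
    mean_rank = [
        sum(r[i] for r in per_method) / len(per_method)
        for i in range(n_markers)
    ]
    best = sorted(range(n_markers), key=lambda i: mean_rank[i])[:keep]
    return sorted(best)


def pearson(x, y):
    """Pearson correlation between predicted and observed phenotypes."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical toy effects for five markers (not data from the study).
effects = {
    "rr_blup": [0.40, -0.02, 0.31, 0.01, -0.25],
    "bayes_lasso": [0.55, 0.00, 0.20, 0.00, -0.30],
    "random_forest": [0.30, 0.05, 0.35, 0.02, 0.10],
}
selected = select_markers(effects, keep=3)
print(selected)
```

In a full workflow the selected markers would then be used to refit a GS model, and `pearson` would be computed on held-out individuals. Rank averaging is used here because the three methods report effects on different scales.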

