SUPPORT STAFF
HOLLEY REGUILO Juan Alfredo
Conferences and scientific meetings
Title:
Effects of non-randomly distributed missing data in support values of Bayesian and parsimony analysis
Author(s):
JUAN ALFREDO HOLLEY; POL DIEGO
Location:
Denver
Meeting:
Symposium; Annual Meeting of the Geological Society of America; 2016
Organizing institution:
The Geological Society of America.
Abstract:
Paleontological datasets are characterized by copious amounts of missing data, and its problematic effects in phylogenetic analyses have long been noted. In terms of parsimony analyses, recent advances in numerical methods and their efficient implementation in phylogenetic software currently allow incorporating numerous characters or taxa with large amounts of missing entries without creating problems related to large numbers of equally parsimonious trees. The effects that missing data have on support values, however, are much less understood. Regarding Bayesian analyses, recent studies using both empirical and simulated data matrices have shown that missing data also affect the performance of this method, especially when the missing data are non-randomly distributed. Non-random distribution of missing data in paleontological data matrices is quite common, as missing entries are usually concentrated in highly incompletely scored taxa and highly incompletely scored characters. As in parsimony, the effects of the amount of missing data (and of its different patterns of distribution) on posterior clade probabilities are poorly understood. Here we present a study of the effects that randomly and non-randomly distributed missing entries have on support values for both Bayesian and parsimony analyses, using a set of empirical data matrices of morphological characters. Different regimes of missing entries were artificially added to these datasets, and the support/credibility values obtained for the modified datasets were compared with those of the original matrices (without missing data). The results of these analyses show that support/credibility values are highly sensitive to the presence of non-randomly distributed missing entries, in particular in the case of highly incompletely scored taxa. A major difference between the results of the two methods is found in the frequency of high credibility values obtained for erroneous groups in the case of Bayesian analyses.
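The idea of artificially adding missing entries under different regimes can be illustrated with a minimal Python sketch. The abstract does not specify the actual regimes, proportions, or software used, so everything here is an assumption made for illustration: the toy matrix, the function names (`add_random_missing`, `add_taxon_concentrated_missing`, `count_missing`), and the 25% proportion are all hypothetical. The sketch contrasts a random regime, where missing cells are spread uniformly over the matrix, with a non-random regime, where the same number of missing cells is confined to a few taxa.

```python
import random

MISSING = "?"  # conventional symbol for a missing entry in morphological matrices


def add_random_missing(matrix, fraction, rng):
    """Replace a fraction of all cells, chosen uniformly at random, with '?'."""
    result = {taxon: list(states) for taxon, states in matrix.items()}
    cells = [(t, i) for t, states in result.items() for i in range(len(states))]
    n_missing = round(fraction * len(cells))
    for taxon, i in rng.sample(cells, n_missing):
        result[taxon][i] = MISSING
    return result


def add_taxon_concentrated_missing(matrix, fraction, n_taxa, rng):
    """Place the same total number of '?' entries, but only within n_taxa taxa."""
    result = {taxon: list(states) for taxon, states in matrix.items()}
    total_cells = sum(len(states) for states in result.values())
    targets = rng.sample(sorted(result), n_taxa)
    cells = [(t, i) for t in targets for i in range(len(result[t]))]
    # Cap the count so it never exceeds the cells available in the target taxa.
    n_missing = min(round(fraction * total_cells), len(cells))
    for taxon, i in rng.sample(cells, n_missing):
        result[taxon][i] = MISSING
    return result


def count_missing(matrix):
    """Count '?' entries across the whole matrix."""
    return sum(list(states).count(MISSING) for states in matrix.values())


# Hypothetical toy matrix: 4 taxa scored for 6 binary characters, fully scored.
toy = {
    "taxon_A": "010110",
    "taxon_B": "011100",
    "taxon_C": "110010",
    "taxon_D": "100011",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
random_version = add_random_missing(toy, 0.25, rng)
concentrated_version = add_taxon_concentrated_missing(toy, 0.25, 2, rng)
```

Both modified matrices carry the same amount of missing data and differ only in how it is distributed, which is the contrast the study examines. In an actual analysis, each modified matrix would then be run through parsimony and Bayesian software and its support/credibility values compared with those of the original matrix; that step is outside this sketch.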