RESEARCHERS
POL Diego
Conferences and scientific meetings
Title:
Effects of non-randomly distributed missing data in parsimony and Bayesian analysis
Author(s):
POL, D.; XU, X.
Location:
Berlin
Meeting:
Conference; 74th Annual Meeting of the Society of Vertebrate Paleontology; 2014
Abstract:
The use of Bayesian analyses of paleontological data matrices has increased in recent years and the potential advantages of this
approach, such as the statistical properties of its estimates and its natural integration with Bayesian molecular clock estimates, have been advocated in the literature. Example cases have been discussed because they yielded topologies that differ markedly from those of parsimony analyses, such as the recently debated phylogenetic position of Archaeopteryx and its affinities with basal avialans. All these applications of Bayesian phylogenetic analysis to morphological data rest on the assumption that all characters evolve under a homogeneous Markov model, the Mk model, which generalizes the simplest model used for nucleotide substitutions (the Jukes-Cantor model). Despite the adequacy of this model for treating morphological data, paleontological datasets are characterized by abundant missing data. The distribution of missing data in paleontological data matrices is non-random, and is usually concentrated in highly incompletely scored taxa and highly incompletely scored characters. Recent studies using both empirical and simulated data matrices have shown that probability-based methods (including Bayesian analysis) can be affected by the presence of abundant missing entries. However, the impact of these problems on paleontological matrices has not yet been thoroughly studied. Here I present a study of the effect that non-randomly distributed missing entries have on a set of empirical data matrices of morphological characters, and assess the impact of the type and quantity of missing data on Bayesian analysis in comparison with parsimony analysis. The sensitivity of both methods is compared in terms of the topological results obtained under different regimes of quantity and distribution of missing entries, as well as in terms of their support measures (posterior probabilities in Bayesian analysis and bootstrap frequencies in parsimony analysis).
The results of these analyses show that both methods can be highly sensitive to the presence of non-randomly distributed missing entries, particularly in the case of highly incompletely scored taxa. However, a major difference between the two methods is found in the support measures obtained, which indicate that Bayesian analyses overestimate credibility measures for the position of highly incomplete taxa.
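
As an illustration of the Mk model mentioned in the abstract, the following minimal sketch computes Mk transition probabilities and shows the standard way a likelihood calculation treats a missing entry ("?"): as compatible with every state. It is not taken from the authors' analyses. The function names, the two-taxon tree, the uniform root prior, and the branch lengths are all assumptions made for the example.

```python
import math


def mk_transition_prob(k, t, same):
    """Mk probability of observing the same state (same=True) or one
    specific different state (same=False) after a branch of length t,
    measured in expected changes per character. k is the number of states."""
    decay = math.exp(-k * t / (k - 1))
    if same:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


def tip_partial(state, k):
    """Conditional likelihood vector at a tip. A missing entry '?' gets a
    1.0 for every state, so it does not favor any particular state."""
    if state == "?":
        return [1.0] * k
    return [1.0 if i == int(state) else 0.0 for i in range(k)]


def cherry_likelihood(state_a, state_b, t_a, t_b, k=2):
    """Likelihood of one character on a hypothetical two-taxon tree,
    summing over root states with a uniform prior of 1/k each."""
    la, lb = tip_partial(state_a, k), tip_partial(state_b, k)
    total = 0.0
    for root in range(k):
        pa = sum(mk_transition_prob(k, t_a, root == s) * la[s] for s in range(k))
        pb = sum(mk_transition_prob(k, t_b, root == s) * lb[s] for s in range(k))
        total += pa * pb / k
    return total


if __name__ == "__main__":
    # With k = 4 the Mk model reduces to the Jukes-Cantor form.
    print(mk_transition_prob(4, 0.1, True))
    # A scored taxon paired with a fully missing one.
    print(cherry_likelihood("0", "?", 0.1, 0.5))
```

In this sketch, a character scored in only one of the two taxa has a likelihood of 1/k whatever the branch lengths, so that character carries no information about where the incompletely scored taxon belongs.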