RESEARCHERS
POL Diego
conferences and scientific meetings
Title:
Effects of non-randomly distributed missing data in parsimony and Bayesian analysis
Author(s):
POL, D.; XU, X.
Location:
Berlin
Meeting:
Conference; 74th Annual Meeting of the Society of Vertebrate Paleontology; 2014
Abstract:
The use of Bayesian analyses of paleontological data matrices has increased in
recent years, and the potential advantages of this approach have been advocated
in the literature, such as the statistical properties of the estimates and their
natural integration with Bayesian molecular clock estimates. Sample cases have
been discussed because they yielded topologies that differ markedly from those
of parsimony analyses, such as the recently debated phylogenetic position of
Archaeopteryx and its affinities with basal avialans. All these applications of
Bayesian phylogenetic analysis to morphological data assume that all characters
evolve under a homogeneous Markov model, the Mk model, which is a generalization
of the simplest model used for nucleotide substitutions (the Jukes-Cantor model).
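As a minimal sketch of the Mk model mentioned above, the transition probabilities for a character with k equally likely states can be written in closed form. The function name and the convention that branch length t is measured in expected changes per character are assumptions made for this illustration; they are not taken from the abstract. With k = 4 the expressions reduce to the Jukes-Cantor probabilities.

```python
import math


def mk_transition_probs(k, t):
    """Transition probabilities under the Mk model (illustrative sketch).

    k: number of character states (k >= 2).
    t: branch length in expected changes per character (assumed convention).
    Returns (p_same, p_change), where p_change is the probability of ending
    in one specific state different from the starting state.
    """
    if k < 2:
        raise ValueError("the Mk model needs at least two states")
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # All changes happen at the same rate, so a single decay term suffices.
    decay = math.exp(-k * t / (k - 1))
    p_same = 1.0 / k + (k - 1) / k * decay
    p_change = 1.0 / k - decay / k
    return p_same, p_change


# Binary character on a short branch, then the four-state (Jukes-Cantor) case.
p_same_2, p_change_2 = mk_transition_probs(2, 0.1)
p_same_4, p_change_4 = mk_transition_probs(4, 0.1)
```

Because every state change has the same rate, a single parameter (the branch length) governs all characters, which is the homogeneity assumption the abstract refers to.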
Despite the adequacy of this model for treating morphological data,
paleontological datasets are characterized by abundant missing data. The
distribution of missing data in paleontological data matrices is non-random and
is usually concentrated in highly incompletely scored taxa and highly
incompletely scored characters. Recent studies using both empirical and
simulated data matrices have shown that probability-based methods (including
Bayesian analysis) can be affected by the presence of abundant missing entries.
However, the impact of these problems on paleontological matrices has not yet
been thoroughly studied.
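One simple way to see whether missing entries are concentrated in particular taxa or characters is to tabulate the fraction of missing scores per row and per column of the matrix. The sketch below assumes a matrix stored as a dictionary of taxon name to a string of states, with "?" marking a missing entry; the function name and the toy matrix are invented for illustration and are not data from the study.

```python
def missing_fractions(matrix, missing="?"):
    """Fraction of missing entries per taxon and per character.

    matrix: dict mapping taxon name -> string of character states, all the
            same length (hypothetical storage format for this sketch).
    Returns (per_taxon, per_character): a dict and a list of fractions.
    """
    rows = list(matrix.values())
    if not rows:
        raise ValueError("matrix is empty")
    n_chars = len(rows[0])
    if n_chars == 0 or any(len(row) != n_chars for row in rows):
        raise ValueError("all taxa need the same non-zero number of characters")
    per_taxon = {
        name: row.count(missing) / n_chars for name, row in matrix.items()
    }
    per_character = [
        sum(row[i] == missing for row in rows) / len(rows)
        for i in range(n_chars)
    ]
    return per_taxon, per_character


# Toy matrix, invented for illustration: taxon_C is highly incomplete.
toy_matrix = {
    "taxon_A": "0101",
    "taxon_B": "01?1",
    "taxon_C": "1???",
}
per_taxon, per_character = missing_fractions(toy_matrix)
```

In a matrix with randomly distributed missing data the two summaries would be roughly flat; a few taxa or characters with fractions close to 1 indicate the non-random pattern described above.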
Here I present a study of the effect that non-randomly distributed missing
entries have on a set of empirical data matrices of morphological characters,
and assess the impact of the type and quantity of missing data on Bayesian
analysis in comparison with parsimony analysis. The sensitivity of the two
methods is compared in terms of the topological results obtained under different
regimes of quantity and distribution of missing entries, as well as in terms of
their support measures (posterior probabilities in Bayesian analysis and
bootstrap frequencies in parsimony analysis). The results of these analyses show
that both methods can be highly sensitive to the presence of non-randomly
distributed missing entries, in particular in the case of highly incompletely
scored taxa. However, a major difference between the results of the two methods
is found in the support measures obtained, which indicate an overestimation of
credibility measures for the position of highly incomplete taxa in Bayesian
analyses.