Title:
A novel distance that reduces information loss in continuous characters with few observations
Author(s):
LO VALVO, GERARDO A.; LEHMANN, OSCAR E. R.; BALSEIRO, DIEGO
Journal:
PALAEONTOLOGIA ELECTRONICA
Publisher:
COQUINA PRESS
Reference:
Year: 2023, vol. 26
ISSN:
1094-8074
Abstract:
The calculation of pairwise distances is a fundamental step in many statistical analyses in biology and paleontology. The most commonly used distances work with a single observation per object and character, but there are scenarios where multiple observations are available per object. In these situations, the information for the character spans an interval, and pairs of objects can have overlapping intervals, which further complicates the distance calculation. Some coefficients can deal with this wealth of information but are either too coarse to provide detailed results or too computationally demanding for even moderately large data sets. Here, we present the Distance Between Intervals (DBI) as a novel semi-metric distance that can accommodate both single and multiple observations per object by analyzing them as intervals. The DBI ranges from 0 to 1 when there is an overlap between the objects and from 1 to infinity when there is no overlap between them. It is easy to calculate and can be applied to a wide variety of data types. Both simulated and empirical test cases show that the DBI correctly ranks pairs of objects by their level of overlap and non-overlap, whereas other distances struggle to do so. Therefore, the DBI can provide a finer level of definition than other available distances for empirical data sets, while generally agreeing with the broad results they provide. An implementation of DBI is provided for the R programming language.
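
The abstract describes treating the observations of a character as an interval and distinguishing overlapping from non-overlapping pairs of objects. The sketch below illustrates only that setup, not the DBI itself, since the abstract does not give the DBI formula. It assumes that an object's interval runs from its smallest to its largest observation. The helper names `to_interval`, `overlap_length`, and `gap_length` are hypothetical and do not come from the authors' R implementation.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def to_interval(observations: Sequence[float]) -> Interval:
    """Summarize one object's observations as (min, max).

    Assumption: the interval spans the observed range. A single
    observation yields a zero-width interval (x, x).
    """
    if len(observations) == 0:
        raise ValueError("at least one observation is required")
    return (min(observations), max(observations))


def overlap_length(a: Interval, b: Interval) -> float:
    """Length of the region shared by two intervals (0.0 if none)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def gap_length(a: Interval, b: Interval) -> float:
    """Length of the gap separating two intervals (0.0 if they overlap or touch)."""
    return max(0.0, max(a[0], b[0]) - min(a[1], b[1]))


# Hypothetical measurements of one continuous character in three objects.
object_a = to_interval([2.0, 3.5, 3.0])  # several observations
object_b = to_interval([3.0, 5.0])       # overlaps object_a
object_c = to_interval([7.0])            # single observation, no overlap

print(overlap_length(object_a, object_b), gap_length(object_a, object_b))
print(overlap_length(object_a, object_c), gap_length(object_a, object_c))
```

These two quantities are the raw ingredients that any interval-based distance has to combine. How the DBI maps them onto its 0 to 1 range (overlap) and its 1 to infinity range (no overlap) is defined in the paper and is not reproduced here.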