RESEARCHERS
BRIGNOLE Nelida Beatriz
Articles
Title:
A Structural Analysis of Topic Ontologies
Author(s):
XAMENA E.; BRIGNOLE N.B.; MAGUITMAN A.G.
Journal:
INFORMATION SCIENCES
Publisher:
ELSEVIER SCIENCE INC
References:
Place: Amsterdam; Year: 2017; pp. 15-29
ISSN:
0020-0255
Abstract:
DMOZ is the largest human-edited topic ontology available on the Web. Understanding how topics interrelate in this ontology can provide useful insights into the nature of topic connectivity, topic relevance and topic similarity, among other useful concepts. This article studies the structural properties of the DMOZ graph as well as those of several models that result from augmenting the original graph by applying different relevance-propagation operations. A number of global and local properties of the original and augmented graphs are examined by means of metrics commonly used in complex network analysis. In particular, we investigate the presence of various features that characterize small-world networks, such as short diameter, high clustering coefficient and the existence of hubs. This analysis is complemented by examining other interesting characteristics of the graphs, such as connectivity measures, centrality measures and sizes of the strongly connected components. The connectivity and centrality patterns are further studied by means of visualizations of the graphs' k-core decomposition, the in- and out-degree distributions and the size distribution of the strongly connected components. This analysis provides a general picture of this large human-edited topic ontology by helping to recognize several non-trivial regularities that are also encountered in other artificial and natural complex networks. In addition, it makes it possible to identify some interesting artifacts that arise from existing constraints on the ontology structure, such as its underlying taxonomic hierarchy. This analysis is of major interest as it allows a better understanding of notions such as topic importance and relevance among topics, while helping to place meaningful constraints on models of relevance propagation and semantic similarity.
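As an illustration of two of the measurements named in the abstract (the in- and out-degree distributions and the size distribution of the strongly connected components), the following is a minimal sketch on a toy directed topic graph. It is not the authors' code: the topic names, the edge list `TOY_EDGES`, and the helper functions `degree_distributions` and `scc_sizes` are all hypothetical, and Kosaraju's algorithm is used here only as one standard way to find strongly connected components.

```python
from collections import Counter, defaultdict

# Hypothetical toy graph: taxonomic links plus one "related" link
# (Music -> Arts) that closes a cycle. Not taken from DMOZ.
TOY_EDGES = [
    ("Top", "Arts"),
    ("Top", "Science"),
    ("Arts", "Music"),
    ("Science", "Music"),
    ("Music", "Arts"),
]


def degree_distributions(edges):
    """Return (in_dist, out_dist): Counters mapping a degree to the number of nodes with it."""
    nodes = {n for edge in edges for n in edge}
    indeg = Counter(v for _, v in edges)
    outdeg = Counter(u for u, _ in edges)
    in_dist = Counter(indeg[n] for n in nodes)
    out_dist = Counter(outdeg[n] for n in nodes)
    return in_dist, out_dist


def scc_sizes(edges):
    """Return the sizes of the strongly connected components, largest first (Kosaraju)."""
    nodes = {n for edge in edges for n in edge}
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: iterative DFS on the forward graph, recording finish order.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish order;
    # each tree found is one strongly connected component.
    seen, sizes = set(), []
    for start in reversed(order):
        if start in seen:
            continue
        seen.add(start)
        size, stack = 0, [start]
        while stack:
            node = stack.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    in_dist, out_dist = degree_distributions(TOY_EDGES)
    print("in-degree distribution:", dict(in_dist))
    print("out-degree distribution:", dict(out_dist))
    print("SCC sizes:", scc_sizes(TOY_EDGES))
```

In this toy graph the single non-taxonomic link is what merges Arts and Music into one component of size 2; a purely hierarchical graph would contain only singleton components. For a graph at the scale of DMOZ, a dedicated graph library would likely be a better fit than this plain-Python sketch.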