IC   26529
INSTITUTO DE CALCULO REBECA CHEREP DE GUBER
Executing Unit - UE
Conferences and scientific meetings
Title:
The R packages network: Structure, temporal evolution and assistance for users
Author(s):
INÉS CARIDI; ARIEL SALGADO; ANDRÉS FARALL
Location:
Ciudad Autónoma de Buenos Aires
Meeting:
Conference; StatPhys 27 - International Conference on Statistical Physics; 2019
Abstract:
The R-Packages network: Structure and temporal evolution
R is a free, open-source, multi-platform programming language [1]. It is a project sustained by a team of people from different places and disciplines, including statistics, finance, genetics, network analysis and others. Today there are more than 12,000 R packages in official repositories, and the project keeps growing both in number of packages and in areas of knowledge.
In this work we study the relationships between R packages through their dependency relations. Considering the packages as nodes, we build a directed network in which the connections between packages arise from the dependency relations between them (one package depends on another if it needs it in order to operate). We observe the evolution of this network from its beginning in 1999 until 2017. The necessary information is available in an open database, the CRAN website, which contains the publication information of each package along with its relations to other packages. Using this information, we build the R-packages network at each instant of time, taking into account the different versions of the packages.
We analyze the evolution of the global connectivity, based on quantities such as the mean number of connections, the number of isolated packages and the size of the largest connected component. We compare the evolution and structure of the networks built from each type of relationship, and the flow of connections between them. We study the local structure of the networks by searching for three-node motifs, finding similarities with neural networks and the World Wide Web, depending on which relationship between packages is considered.
We compare the different snapshots of the R-packages network with random networks, finding strong similarities with scale-free networks. Based on the temporal information, we identify the main ingredients needed to build a model of the network growth, starting from preferential attachment between packages.
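To make the global connectivity quantities concrete, the following minimal Python sketch builds a directed dependency network and computes the mean number of connections, the isolated packages and the size of the largest connected component. It is an illustration only, not the authors' code: the package names and the `dependencies` mapping are hypothetical toy data rather than CRAN records, the mean is taken as edges per node, and components are computed ignoring edge direction (weakly connected), both of which are assumptions.

```python
from collections import defaultdict

# Hypothetical toy data: package -> packages it depends on (not real CRAN records).
dependencies = {
    "pkgA": ["pkgB", "pkgC"],
    "pkgB": ["pkgC"],
    "pkgC": [],
    "pkgD": [],
    "pkgE": ["pkgF"],
    "pkgF": [],
}


def build_edges(deps):
    """Return the set of directed edges (package, dependency)."""
    return {(pkg, dep) for pkg, targets in deps.items() for dep in targets}


def connectivity_summary(deps):
    """Compute simple global connectivity measures of the dependency network."""
    nodes = set(deps) | {dep for targets in deps.values() for dep in targets}
    edges = build_edges(deps)

    # Undirected adjacency, used for isolated nodes and weak components.
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    # Packages with no incoming or outgoing dependency relation.
    isolated = sorted(node for node in nodes if not neighbours[node])

    # Size of the largest weakly connected component, via depth-first search.
    seen = set()
    largest = 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for other in neighbours[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        largest = max(largest, size)

    return {
        "mean_connections": len(edges) / len(nodes),  # assumed: edges per node
        "isolated": isolated,
        "largest_component": largest,
    }


summary = connectivity_summary(dependencies)
print(summary)
```

Repeating this computation on the set of packages published up to each date would give one value per snapshot, which is the kind of time series the abstract refers to.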
As an application, we present a package recommendation system for R users, which is based on the structure of the global network of packages and on the network induced by the particular user's packages.
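The abstract does not specify how the recommendation system scores packages. As a purely illustrative sketch, one simple heuristic is to rank packages the user does not have by how many dependency links connect them to the user's packages. The scoring rule, the function name `recommend` and the toy `edges` list below are all assumptions made for this example, not the authors' method.

```python
from collections import Counter

# Hypothetical toy edges (package, dependency); not real CRAN data.
edges = [
    ("pkgA", "pkgB"),
    ("pkgA", "pkgC"),
    ("pkgD", "pkgB"),
    ("pkgD", "pkgC"),
    ("pkgE", "pkgC"),
    ("pkgF", "pkgG"),
]


def recommend(edges, user_packages, top_n=3):
    """Rank packages outside the user's set by their links to that set.

    Assumed heuristic: each dependency edge joining a user package to a
    non-user package, in either direction, adds one point to the latter.
    """
    user = set(user_packages)
    scores = Counter()
    for src, dst in edges:
        if src in user and dst not in user:
            scores[dst] += 1
        elif dst in user and src not in user:
            scores[src] += 1
    # Highest score first; ties broken alphabetically for reproducibility.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [pkg for pkg, _ in ranked[:top_n]]


recommendations = recommend(edges, ["pkgB", "pkgC"])
print(recommendations)
```

A fuller system would presumably also use global structure, for example by weighting candidates by their overall connectivity, but that goes beyond what the abstract states.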