RESEARCHERS
BEIRO Mariano Gaston
conferences and scientific meetings
Title:
Weighted Graph Convolutional Networks for Twitter users’ geolocation
Author(s):
FEDERICO M. FUNES; J IGNACIO ALVAREZ-HAMELIN; MARIANO G. BEIRO
Location:
Shanghai
Meeting:
Conference; NetSci 2022; 2022
Organizing institution:
Network Science Society
Abstract:
Predicting the geographical location of users of social media like Twitter has several applications in health surveillance, emergency monitoring, content personalization, and social studies in general. Thus, recent works have explored the use of deep learning techniques such as transformers and embeddings for the user geolocation prediction task. In this work we process a large collection of 900M tweets collected in Argentina in 2019, from which we prepare and make available a labelled dataset composed of 140k geolocated users, 9M geolocated tweets and 124 tweets in total. The dataset is available for hydration. We contribute to the research in this area by designing and evaluating new methods based on weighted multigraphs combined with state-of-the-art deep learning techniques. The structure of these graphs combines, in different layers, “extended” mention and follower networks, which we define so as to take into account the connections (mentioning or following) that users have through paths that pass through external users. The features associated with each user come from a logistic regressor that combines embeddings of the content of the user’s tweets with the usage of local indicative words (LIW). We train on these graphs with different information processing strategies, e.g., information diffusion through transductive and inductive algorithms (R-GCNs and GraphSAGE, respectively) and node embeddings with Node2vec+. We assess the performance of each method in terms of Acc@100 (accuracy at 100 miles) and execution time, comparing them to baseline models on both the public Twitter-US dataset and our dataset from Argentina. In particular, our weighted R-GCN model reaches a performance of 0.67 (Acc@100) on Twitter-US and 0.83 on Argentina.
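As an illustration of the evaluation metric, the sketch below computes Acc@100 as the fraction of users whose predicted location lies within 100 miles of the true one. The abstract does not say how distances are measured, so the haversine (great-circle) distance is an assumption here, and the function names `haversine_miles` and `acc_at` are hypothetical, not taken from the authors' code.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, in miles


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def acc_at(predicted, actual, threshold_miles=100.0):
    """Fraction of users predicted within `threshold_miles` of their true location.

    `predicted` and `actual` are equal-length sequences of (lat, lon) pairs,
    one pair per user. Using haversine distance is an assumption.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one user is required")
    hits = sum(
        1
        for (plat, plon), (alat, alon) in zip(predicted, actual)
        if haversine_miles(plat, plon, alat, alon) <= threshold_miles
    )
    return hits / len(predicted)


# Illustrative coordinates only (approximate city centres, not dataset values).
buenos_aires = (-34.6037, -58.3816)
la_plata = (-34.9205, -57.9536)
cordoba = (-31.4201, -64.1888)

# Both users are predicted in Buenos Aires; one truly lives in La Plata
# (a hit at 100 miles) and the other in Cordoba (a miss).
score = acc_at([buenos_aires, buenos_aires], [la_plata, cordoba])
```

Under this definition, a reported Acc@100 of 0.83 would mean that 83% of the evaluated users were placed within 100 miles of their ground-truth location.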