ZUNINO SUAREZ Alejandro Octavio
Discovering web services in social web service repositories using deep variational autoencoders
LIZARRALDE, IGNACIO; MATEOS, CRISTIAN; ZUNINO, ALEJANDRO; MAJCHRZAK, TIM A.; GRØNLI, TOR-MORTEN
INFORMATION PROCESSING & MANAGEMENT
PERGAMON-ELSEVIER SCIENCE LTD
Lugar: Amsterdam; Año: 2020 vol. 57
Web Service registries have progressively evolved to social networks-like software repositories. Users cooperate to produce an ever-growing, rich source of Web APIs upon which new value-added Web applications can be built. Such users often interact in order to follow, comment on, consume and compose services published by other users. In this context, Web Service discovery is a core functionality of modern registries as needed Web Services must be discovered before being consumed or composed. Many efforts to provide effective keyword-based service discovery mechanisms are based on Information Retrieval techniques as services are described using structured or unstructured textdocuments that specify the provided functionality. However, traditional techniques suffer from term-mismatch, which means that only the terms that are contained in both user queries and descriptions are exploited to perform service retrieval. Early feature learning techniques such as LSA or LDA tried to solve this problem by finding hidden or latent features in text documents. Recently, alternative feature learning based techniques such as Word Embeddings achieved state of the art results for Web Service discovery. In this paper, we propose to learn features from service descriptions by using Variational Autoencoders, a special kind of autoencoder which restricts the encoded representation to model latent variables. Autoencoders in turn are deep neural networks used for unsupervised learning of efficient codings. We train our autoencoder using a real 17 113-service dataset extracted from the ProgrammableWeb.com API social repository. We measure discovery efficacy by using both Recall and Precision metrics, achieving significant gains compared to both Word Embeddings and classic latent features modelling techniques. Also, performance-oriented experiments show that the proposed approach can be readily exploited in practice.