INVESTIGADORES
ZABALA Paula Lorena
congresos y reuniones científicas
Título:
Complexity and Algorithms for Composite Retrieval
Autor/es:
SIHEM AMER-YAHIA; FRANCESCO BONCHI; CARLOS CASTILLO; ESTEBAN FEURTEIN; ISABEL MÉNDEZ-DÍAZ; PAULA ZABALA
Lugar:
Rio de Janeiro
Reunión:
Conferencia; 22nd International World Wide Web Conference; 2013
Resumen:
Users are often faced with the problem of finding complementaryitems that together achieve a single common goal (e.g., astarter kit for a novice astronomer, a collection of question/answersrelated to low-carb nutrition, a set of places to visit on holidays).In this article, we argue that for some application scenariosreturning item bundles is more appropriate than ranked lists. Thuswe define composite retrieval as the problem of finding k bundlesof complementary items. Beyond complementarity of items, thebundles must be valid w.r.t. a given budget, and the answer set ofk bundles must exhibit diversity.We formally define the problem and characterize its complexity.We prove that the problem in its general form is NP-hard and thatalso the special cases in which each bundle is formed by only oneitem, or only one bundle is sought, are hard. Our characterizationhowever suggests us how to adopt a two-phase approach (Produceand-Choose, or PAC) in which we first produce many valid bundles,and then we choose k among them. For the first phase we devise twoad-hoc clustering algorithms, while for the second phase we adaptheuristics with approximation guarantees.We also devise another approach which is based on first findinga k-clustering and then selecting a valid bundle from each of theproduced clusters (Cluster-and-Pick, or CAP).We compare experimentally the proposed methods on a largereal-world database of user-generated restaurant reviews from Yahoo!Local, exploring their performance under a variety of settings.Our experiments show that when diversity is highly important, CAPis the best option, while when diversity is less important, a PACapproach constructing bundles around randomly chosen pivots, isbetter.