ISISTAN   23985
INSTITUTO SUPERIOR DE INGENIERIA DEL SOFTWARE
Executing Unit - UE
Conferences and scientific meetings
Title:
TeXTracT: a Web-based Tool for Building NLP-enabled Applications
Author(s):
RAGO ALEJANDRO; DIAZ PACE JORGE ANDRÉS; FACUNDO RAMOS; MARCOS CLAUDIA; VELEZ JUAN IGNACIO
Place:
Capital Federal
Meeting:
Symposium; Argentine Symposium on Software Engineering - ASSE'2016; 2016
Organizing institution:
SADIO
Abstract:
Over the last few years, the software industry has shown an increasing interest in applications with Natural Language Processing (NLP) capabilities. Several cloud-based solutions have emerged with the purpose of simplifying and streamlining the integration of NLP techniques via Web services. These NLP techniques cover tasks such as language detection, entity recognition, sentiment analysis, and classification, among others. However, the services provided are not always as extensible and configurable as a developer may want, which prevents their use in industry-grade developments and limits their adoption in specialized domains (e.g., for analyzing technical documentation). In this context, we have developed a tool called TeXTracT that is designed to be composable, extensible, configurable and accessible. In our tool, NLP techniques can be accessed independently or orchestrated in a pipeline via RESTful Web services. Moreover, the architecture supports the setup and deployment of NLP techniques on demand. The NLP infrastructure is built upon the UIMA framework, which defines communication protocols and uniform service interfaces for text analysis modules. TeXTracT has been evaluated in two case studies to assess its pros and cons.
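The idea of techniques that can be called independently or chained in a pipeline can be illustrated with a minimal sketch. It is not TeXTracT's API: the annotator functions, the `REGISTRY` names, the request layout produced by `build_request`, and the toy heuristics are all hypothetical, chosen only to show how steps that share a uniform interface compose. The shared document dictionary is loosely analogous to the common analysis structure that UIMA passes between analysis modules.

```python
import json
from typing import Callable, Dict, List

# Hypothetical uniform interface: every annotator receives the shared
# document state and returns it with its own annotations added.
Annotator = Callable[[Dict], Dict]


def detect_language(doc: Dict) -> Dict:
    # Toy heuristic standing in for a real language-detection technique.
    spanish_markers = {"el", "la", "de", "que", "y"}
    tokens = doc["text"].lower().split()
    hits = sum(1 for token in tokens if token in spanish_markers)
    doc["annotations"]["language"] = "es" if hits >= 2 else "en"
    return doc


def tokenize(doc: Dict) -> Dict:
    # Whitespace tokenization standing in for a real tokenizer.
    doc["annotations"]["tokens"] = doc["text"].split()
    return doc


# Hypothetical registry: technique name -> annotator. New techniques are
# added by registering another function with the same signature.
REGISTRY: Dict[str, Annotator] = {
    "language": detect_language,
    "tokens": tokenize,
}


def run_pipeline(text: str, steps: List[str]) -> Dict:
    # Apply the requested techniques in order over one shared document.
    doc: Dict = {"text": text, "annotations": {}}
    for name in steps:
        doc = REGISTRY[name](doc)
    return doc


def build_request(text: str, steps: List[str]) -> str:
    # Assumed JSON body a client might send to a pipeline endpoint;
    # the field names are illustrative, not taken from the tool.
    return json.dumps({"text": text, "pipeline": steps})
```

A single technique is invoked with a one-element list, e.g. `run_pipeline(text, ["tokens"])`, while `run_pipeline(text, ["language", "tokens"])` chains two. Because every step has the same signature, extending the pipeline only requires a new registry entry.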