June 30, 2016

SoCVis Talk, July 1

The Social Computing and Visualization Group of the Pontificia Universidad Católica de Chile (SoCVis) and the Department of Computer Science of the UC School of Engineering (DCC UC) invite you to the talk “Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis” this Friday, July 1, at 15:30, given by Felipe Bravo, a researcher in the Machine Learning Group at the University of Waikato.

Venue: Sala Álvaro Campos, San Joaquín campus, San Agustín building, 4th floor (south side).


“Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis”


Felipe José Bravo Márquez, researcher, Machine Learning Group, University of Waikato (New Zealand).

Abstract

This talk addresses the label sparsity problem for Twitter polarity classification by automatically building two types of resources that can be exploited when labelled data is scarce: opinion lexicons, which are lists of words labelled by sentiment, and synthetically labelled tweets. We build Twitter-specific opinion lexicons by training word-level classifiers using representations that exploit different sources of information, such as (a) the morphological information conveyed by part-of-speech (POS) tags, (b) associations between words and the sentiment expressed in the tweets that contain them, and (c) distributional representations calculated from unlabelled tweets. Experimental results show that the generated lexicons produce significant improvements over existing manually annotated lexicons.
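
As a rough illustration of source (b), the association between a word and the sentiment of the tweets that contain it can be scored with a smoothed log-odds ratio. This is a minimal sketch under stated assumptions, not the method presented in the talk: the function name `build_lexicon`, the `min_count` threshold, the add-one smoothing, and the toy tweets are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def build_lexicon(labelled_tweets, min_count=2):
    """Score words by how strongly they co-occur with positive vs. negative tweets.

    labelled_tweets: iterable of (text, label) pairs, label in {"positive", "negative"}.
    Returns a dict mapping word -> score (> 0 leans positive, < 0 leans negative).
    Hypothetical helper for illustration only.
    """
    pos_counts, neg_counts = Counter(), Counter()
    n_pos = n_neg = 0
    for text, label in labelled_tweets:
        words = set(text.lower().split())  # count each word once per tweet
        if label == "positive":
            pos_counts.update(words)
            n_pos += 1
        else:
            neg_counts.update(words)
            n_neg += 1

    lexicon = {}
    for word in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[word], neg_counts[word]
        if p + n < min_count:
            continue  # too rare to score reliably (assumed threshold)
        # Add-one smoothed log-odds of appearing in a positive vs. a negative tweet.
        score = math.log((p + 1) / (n_pos + 2)) - math.log((n + 1) / (n_neg + 2))
        lexicon[word] = score
    return lexicon


# Toy data, invented for this example.
example_tweets = [
    ("love this sunny day", "positive"),
    ("love my new phone", "positive"),
    ("hate this traffic", "negative"),
    ("hate waiting in line", "negative"),
]
lexicon = build_lexicon(example_tweets)
```

On this toy data, "love" receives a positive score, "hate" a negative one, and "this", which appears once in each class, scores zero. A real lexicon would need far more tweets and, as the abstract notes, could combine this signal with POS tags and distributional representations.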

In the second part, we develop distant supervision methods for generating synthetic training data for Twitter polarity classification by exploiting unlabelled tweets and prior lexical knowledge. We study different mechanisms for selecting the candidate tweets to be averaged. Our experimental results show that the training data generated by the proposed models produce classifiers that perform significantly better than classifiers trained from tweets annotated with emoticons, a popular distant supervision approach for Twitter sentiment analysis.
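
The emoticon baseline mentioned above can be sketched as follows: a tweet is labelled by the emoticons it contains, and the emoticons are then removed so that a classifier cannot simply memorise them. This is a minimal sketch of the general idea, not the speaker's implementation; the emoticon sets and the function name `label_by_emoticon` are assumptions made for the example.

```python
# Small, assumed emoticon sets; real systems typically use longer lists.
POSITIVE_EMOTICONS = {":)", ":-)", ":D", "=)"}
NEGATIVE_EMOTICONS = {":(", ":-(", "=("}
ALL_EMOTICONS = POSITIVE_EMOTICONS | NEGATIVE_EMOTICONS


def label_by_emoticon(tweet):
    """Return (text_without_emoticons, label), or None if no clear label exists.

    Hypothetical helper: tweets with no emoticon, or with both positive and
    negative emoticons, are discarded rather than labelled.
    """
    tokens = tweet.split()
    has_pos = any(t in POSITIVE_EMOTICONS for t in tokens)
    has_neg = any(t in NEGATIVE_EMOTICONS for t in tokens)
    if has_pos == has_neg:
        return None  # neither kind present, or conflicting signals
    label = "positive" if has_pos else "negative"
    # Strip the emoticons so the label is not leaked into the features.
    text = " ".join(t for t in tokens if t not in ALL_EMOTICONS)
    return text, label
```

For instance, `label_by_emoticon("great game tonight :)")` yields the text "great game tonight" with the label "positive", while a tweet without emoticons is skipped. The methods described in the abstract instead build training data from unlabelled tweets and prior lexical knowledge, which avoids depending on emoticons being present.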

About the speaker

Felipe Bravo-Marquez is currently pursuing his PhD in the Machine Learning Group at the University of Waikato, New Zealand. He received two engineering degrees, in computer science and industrial engineering, and a master’s degree in computer science, all from the University of Chile. He worked for three years as a research engineer at Yahoo! Labs Latin America. His main areas of interest are data mining, natural language processing, information retrieval, and sentiment analysis.

The talk “Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis” is sponsored by the Fondecyt de Iniciación project 11150783, “Dealing with information overload using intelligent recommender system interfaces”.