I hold an MSc in Computer Science from São Paulo University, Brazil. I am interested in Natural Language Processing and Natural Language Generation. I am a research engineer at Inria, France. Previously, I worked on other NLP research projects. I received my bachelor's degree from San Agustin National University, Peru.
MSc @ São Paulo University - Brazil
Master of Science in Computer Science
Monograph title: Automatic Aspect-Based Opinion Summarization Methods
This master's project investigates the generation of extractive and abstractive opinion summaries using an aspect-based approach. Besides using known methods in the area, it proposes two new methods for the Portuguese language, which achieved the best performance in the experiments.
BSc @ San Agustin National University - Peru
Bachelor of Science in Computer Science
This monograph presents a method to classify medical documents that improves on the results of the Naive Bayes and Rocchio algorithms. In addition to considering statistical information, the method takes into account the semantic relatedness between the keywords of medical documents.
Research Engineer - Inria
from February 2017 to present
I am working on extracting symbolic knowledge about objects and their characteristics from unstructured text (relying on natural language processing and machine reading techniques), as well as from available ontologies and knowledge on the Semantic Web.
CEO, Co-Founder - SimiLabs
from March 2016 to present
Peruvian NLP startup dedicated to creating tools and resources to analyze opinions on the Internet about companies, products and/or services.
Researcher, Software Engineer - Elabora
from July 2015 to January 2017
I researched information retrieval methods and word embedding representations to create hierarchies of terms in collections of patents on different topics. In this project, I used Spark, HBase and Python.
Software Engineer - Dicionário Criativo
from October 2013 to December 2013
I implemented a crawler for Twitter in Python. This crawler collects public comments about music, books and films. I also developed a module to normalize these comments (spell checker, etc.). This data, together with other resources, was used to expand the database.
Researcher, Developer - Lindexa
from January 2012 to December 2012
Lindexa - Wayra Telefónica
I researched and developed automatic Opinion Mining and Sentiment Analysis techniques for the Spanish language. In addition, I implemented a crawler for Facebook and Twitter in Python. This crawler collects all public comments written on these social networks. Lindexa is a winning startup of the first edition of Wayra Peru.
Technical Researcher, Developer - Concytec
from January 2011 to December 2011
We researched techniques to generate extractive summaries from multiple texts. These summaries capture the most important information in the texts. I also researched and implemented Text Classification and Clustering algorithms.
Journal Articles (1)
-  Roque López and Thiago Pardo. Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches. Expert Systems with Applications (ESWA), volume 78. [pdf] [demo]
Conference Papers (13)
-  Lucia Castro, Roque López, Gabriel Cavalcante and Luiz Lapolla. Towards Automatic Building of Term Hierarchies from Large Patent Datasets. In AINL: V Artificial Intelligence and Natural Language Conference. Saint-Petersburg, Russia. [pdf]
-  Roque López and Thiago Pardo. Experiments on Sentence Boundary Detection in User-Generated Web Content. In CICLING: XVI International Conference on Intelligent Text Processing and Computational Linguistics. Cairo, Egypt. [preprint version]
-  Verônica Agostini, Roque López and Thiago Pardo. Automatic Alignment of News Texts and their Multi-document Summaries: Comparison among Methods. In PROPOR: XI International Conference on Computational Processing of Portuguese. São Carlos, Brazil. [preprint version] [code]
-  Roque López, Javier Tejada and Mikhail Alexandrov. Medical Texts Classification based on Keywords using Semantic Information. In INFOS: VII International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf] [code] Young Scientific Best Paper
-  Roque López, Javier Tejada and Mike Thelwall. Spanish Sentistrength as a Tool for Opinion Mining Peruvian Facebook and Twitter. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
-  Angels Catena, Mikhail Alexandrov and Roque López. Parameterization of comments from Peruvian Facebook and Twitter: Lexical Resources and Algorithm. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
-  Alessandro Bokan and Roque López. Método No Supervisado para la sugerencia de Tags utilizando información semántica basada en conocimiento. In COMTEL: VI Congreso Internacional de Computación y Telecomunicaciones. Lima, Peru. [pdf]
-  Roque López, Dennis Barreda and Javier Tejada. MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In MICAI: X Mexican International Conference on Artificial Intelligence. Lecture Notes in Artificial Intelligence. Springer. Puebla, Mexico. [preprint version]
-  Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Método supervisado orientado a la clasificación automática de Historias Clínicas. In JCC: Jornadas Chilenas de Computación. XXIII Encuentro Chileno de Computación. Curicó, Chile. [pdf]
-  Roque López, Mikhail Alexandrov, Dennis Barreda and Javier Tejada. LexisTerm - The program for term selection by the criterion of specificity. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf] [program]
-  Ales Bourek, Mikhail Alexandrov and Roque López. Folksonomy - supplementing RICHE expert based taxonomy by terms from online documents. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf]
-  Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Clasificación automática de Historias Clínicas basada en Prototipos utilizando técnicas de Procesamiento de Lenguaje Natural. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
-  Dennis Barreda, Roque López, Javier Tejada and Luis Alfaro. Un algoritmo genético para la agrupación de documentos aplicado en corpus con características diferentes. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
Workshop Papers (2)
-  Roque López, Lucas Avanço, Pedro Balage, Alessandro Bokan, Paula Cardoso, Márcio Dias, Fernando Nóbrega, Marco Sobrevilla, Jackson Souza, Andressa Zacarias, Ariani Di Felippo, Eloize Seno and Thiago Pardo. A Qualitative Analysis of a Corpus of Opinion Summaries based on Aspects. In LAW: IX Linguistic Annotation Workshop. Colorado, USA. [pdf] [corpus]
-  Márcio Dias, Alessandro Bokan, Carla Chuman, Cláudia Barros, Erick Maziero, Fernando Nobrega, Jackson Souza, Marco Sobrevilla, Marina Delege, Lucía Castro, Naira Silva, Paula Cardoso, Pedro Balage, Roque López, Vanessa Marcasso, Ariani Felippo, Maria Graças and Thiago Pardo. Enriquecendo o Corpus CSTNews - a Criação de Novos Sumários Multidocumento. In ToRPorEsp: I Workshop on Tools and Resources for Automatically Processing Portuguese and Spanish. São Carlos, Brazil. [pdf]
Theses (2)
-  Roque López. Sumarização Automática de Opiniões Baseada em Aspectos. Master's Thesis (Portuguese), Universidade de São Paulo. [pdf] Finalist for the Best MSc Dissertation in IEEE LA-CCI.
-  Roque López. Método de Clasificación Automática de Textos basado en Palabras Claves utilizando Información Semántica: Aplicación a Historias Clínicas. Undergraduate Thesis (Spanish), Universidad Nacional de San Agustin. [pdf] Top 5 undergraduate thesis in SPIA.