Roque López



Researcher - Natural Language Processing

MSc in Computer Science from São Paulo University, Brazil. I am interested in Natural Language Processing and Natural Language Generation. I am a research engineer at Inria, France. Previously, I worked on other NLP research projects. I received my bachelor's degree from San Agustin National University, Peru.

Personal Information

Name
Roque Enrique López Condori
Nationality
Peruvian
Address
Nice, France

Education

2013 - 2015

MSc @ São Paulo University - Brazil

Master of Science in Computer Science

Monograph title: Automatic Aspect-Based Opinion Summarization Methods

This master's project investigates the generation of extractive and abstractive opinion summaries using an aspect-based approach. Besides using known methods in the area, it proposes two new methods for the Portuguese language, which achieved the best performance in the experiments.

2006 - 2010

BSc @ San Agustin National University - Peru

Bachelor of Science in Computer Science

Monograph title: Medical Documents Classification based on Keywords using Semantic Information

This monograph presents a method to classify medical documents that improves on the results of the Naive Bayes and Rocchio algorithms. In addition to considering statistical information, the method takes into account the semantic relatedness between the keywords of medical documents.

Employment

Current

Research Engineer - Inria

from February 2017 to present

Inria

I am working on extracting symbolic knowledge about objects and their characteristics from unstructured text (relying on natural language processing and machine reading techniques), as well as from available ontologies and knowledge on the Semantic Web.

Current

CEO, Co-Founder - SimiLabs

from March 2016 to present

SimiLabs

Peruvian NLP startup dedicated to creating tools and resources for analyzing opinions on the Internet about companies, products and/or services.

2015

Researcher, Software Engineer - Elabora

from July 2015 to January 2017

Elabora Consultoria

I researched information retrieval methods and word embedding representations to create hierarchies of terms in collections of patents on different topics. In this project, I used Spark, HBase and Python.

2013

Software Engineer - Dicionário Criativo

from October 2013 to December 2013

Dicionário Criativo

I implemented a crawler for Twitter in Python. This crawler collects public comments about music, books and films. I also developed a module to normalize these comments (spell checker, etc.). This data, together with other resources, was used to expand the database.

2012

Researcher, Developer - Lindexa

from January 2012 to December 2012

Lindexa - Wayra Telefónica

I researched and developed automatic Opinion Mining and Sentiment Analysis techniques for the Spanish language. In addition, I implemented a crawler for Facebook and Twitter in Python. This crawler collects all public comments written on these social networks. Lindexa was a winning startup in the first edition of Wayra Peru.

2011

Technical Researcher, Developer - Concytec

from January 2011 to December 2011

Cátedra CONCYTEC

We researched techniques to generate extractive summaries of multiple texts. These summaries capture the most important information. I also researched and implemented Text Classification and Clustering algorithms.

Research Interests

NLP
Opinion Summarization, Sentiment Analysis, Natural Language Generation
Others
Machine Learning, Data Science, Software Development

Publications

Journal Articles (1)

  • [2017] Roque López and Thiago Pardo. Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches. Expert Systems with Applications (ESWA), volume 78. [pdf] [demo]

Conference Papers (13)

  • [2016] Lucia Castro, Roque López, Gabriel Cavalcante and Luiz Lapolla. Towards Automatic Building of Term Hierarchies from Large Patent Datasets. In AINL: V Artificial Intelligence and Natural Language Conference. Saint-Petersburg, Russia. [pdf]
  • [2015] Roque López and Thiago Pardo. Experiments on Sentence Boundary Detection in User-Generated Web Content. In CICLING: XVI International Conference on Intelligent Text Processing and Computational Linguistics. Cairo, Egypt. [preprint version]
  • [2014] Verônica Agostini, Roque López and Thiago Pardo. Automatic Alignment of News Texts and their Multi-document Summaries: Comparison among Methods. In PROPOR: XI International Conference on Computational Processing of Portuguese. São Carlos, Brazil. [preprint version] [code]
  • [2014] Roque López, Javier Tejada and Mikhail Alexandrov. Medical Texts Classification based on Keywords using Semantic Information. In INFOS: VII International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf] [code] Young Scientific Best Paper
  • [2012] Roque López, Javier Tejada and Mike Thelwall. Spanish Sentistrength as a Tool for Opinion Mining Peruvian Facebook and Twitter. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
  • [2012] Angels Catena, Mikhail Alexandrov and Roque López. Parameterization of comments from Peruvian Facebook and Twitter: Lexical Resources and Algorithm. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
  • [2012] Alessandro Bokan and Roque López. Método No Supervisado para la sugerencia de Tags utilizando información semántica basada en conocimiento. In COMTEL: VI Congreso Internacional de Computación y Telecomunicaciones. Lima, Peru. [pdf]
  • [2011] Roque López, Dennis Barreda and Javier Tejada. MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In MICAI: X Mexican International Conference on Artificial Intelligence. Lecture Notes in Artificial Intelligence. Springer. Puebla, Mexico. [preprint version]
  • [2011] Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Método supervisado orientado a la clasificación automática de Historias Clínicas. In JCC: Jornadas Chilenas de Computación. XXIII Encuentro Chileno de Computación. Curicó, Chile. [pdf]
  • [2011] Roque López, Mikhail Alexandrov, Dennis Barreda and Javier Tejada. LexisTerm - The program for term selection by the criterion of specificity. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf] [program]
  • [2011] Ales Bourek, Mikhail Alexandrov and Roque López. Folksonomy - supplementing RICHE expert based taxonomy by terms from online documents. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf]
  • [2011] Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Clasificación automática de Historias Clínicas basada en Prototipos utilizando técnicas de Procesamiento de Lenguaje Natural. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
  • [2011] Dennis Barreda, Roque López, Javier Tejada and Luis Alfaro. Un algoritmo genético para la agrupación de documentos aplicado en corpus con características diferentes. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]

Workshop Papers (2)

  • [2015] Roque López, Lucas Avanço, Pedro Balage, Alessandro Bokan, Paula Cardoso, Márcio Dias, Fernando Nóbrega, Marco Sobrevilla, Jackson Souza, Andressa Zacarias, Ariani Di Felippo, Eloize Seno and Thiago Pardo. A Qualitative Analysis of a Corpus of Opinion Summaries based on Aspects. In LAW: IX Linguistic Annotation Workshop. Colorado, USA. [pdf] [corpus]
  • [2014] Márcio Dias, Alessandro Bokan, Carla Chuman, Cláudia Barros, Erick Maziero, Fernando Nobrega, Jackson Souza, Marco Sobrevilla, Marina Delege, Lucía Castro, Naira Silva, Paula Cardoso, Pedro Balage, Roque López, Vanessa Marcasso, Ariani Felippo, Maria Graças and Thiago Pardo. Enriquecendo o Corpus CSTNews - a Criação de Novos Sumários Multidocumento. In ToRPorEsp: I Workshop on Tools and Resources for Automatically Processing Portuguese and Spanish. São Carlos, Brazil. [pdf]

Monographs (2)

  • [2015] Roque López. Sumarização Automática de Opiniões Baseada em Aspectos. Master's Thesis (Portuguese), Universidade de São Paulo. [pdf] Finalist for the Best MSc Dissertation in IEEE LA-CCI.
  • [2014] Roque López. Método de Clasificación Automática de Textos basado en Palabras Claves utilizando Información Semántica: Aplicación a Historias Clínicas. Undergraduate Thesis (Spanish), Universidad Nacional de San Agustin. [pdf] Top 5 undergraduate thesis in SPIA.

Skills

Programming Languages
Python, Java, C++
Big Data
Hadoop, Spark, HBase
Databases
MySQL, PostgreSQL, SQL Server
Others
Git, NLTK, Scikit-learn, Linux

Blog

Porter Algorithm for Spanish

FEB 16

The frequency of a word in a text can be useful for many tasks...

Startups in Peru

JAN 16

Currently, the conditions for developing a startup in our country have changed...

Hobbies

Movies, Soccer, Running, Traveling