Roque López




Research Software Engineer

I hold an MSc in Computer Science and am interested in Machine Learning and Natural Language Processing. I am a research associate at New York University (NYU). Previously, I worked on other ML and NLP research projects.

Personal Information

Name
Roque Enrique López Condori
Nationality
Peruvian
Address
Brooklyn, New York, U.S.A.

Education

2013 - 2015

MSc @ São Paulo University - Brazil

Master of Science in Computer Science

Monograph title: Automatic Aspect-Based Opinion Summarization Methods

This master’s project investigates the generation of extractive and abstractive opinion summaries using an aspect-based approach. Besides using known methods in the area, it proposes two new methods for the Portuguese language, which achieved the best performance in the experiments.

2006 - 2010

BSc @ San Agustin National University - Peru

Bachelor of Science in Computer Science

Monograph title: Medical Documents Classification based on Keywords using Semantic Information

This monograph presents a method for classifying medical documents that improves on the results of the Naive Bayes and Rocchio algorithms. In addition to considering statistical information, the method takes into account the semantic relatedness between the keywords of medical documents.

Employment

Current

Research Associate - NYU

from December 2018 to present

NYU

Using Deep Learning and Reinforcement Learning methods, I am building an automated machine learning (AutoML) system that searches for models and derives end-to-end pipelines that read and pre-process the data and train the model.

2017

Research Engineer - Inria

from February 2017 to August 2018

Inria

I worked on extracting symbolic knowledge about objects and their characteristics from unstructured text (relying on natural language processing, machine learning and machine reading techniques), as well as from available ontologies and knowledge on the Semantic Web.

Current

CTO - DoctorCV

from October 2016 to present

DoctorCV

A Peruvian startup that evaluates your CV and proposes improvements using NLP techniques. It is like having an expert virtual recruiter who tells you what to change in your resume to increase your chances of being called for a job interview.


2015

Researcher, Software Engineer - Elabora

from July 2015 to January 2017

Elabora Consultoria

I researched information retrieval techniques, combined with machine learning methods and word embedding representations, to build hierarchies of terms from collections of patents on different topics. In this project, I used Spark, HBase and Python.

2013

Software Engineer - Dicionário Criativo

from October 2013 to December 2013

Dicionário Criativo

I implemented a crawler for Twitter in Python. This crawler collected public comments about music, books and films. I also developed a module to normalize these comments (spell checker, etc.). With this data and other resources, the database was expanded.

2012

Researcher, Developer - Lindexa

from January 2012 to December 2012

Lindexa - Wayra Telefónica

I researched and developed automatic techniques for opinion mining and sentiment analysis for the Spanish language. In addition, I implemented a crawler for Facebook and Twitter in Python. This crawler collected all public comments written on these social networks. Lindexa was a winning startup of the first edition of Wayra Peru.

2011

Technical Researcher, Developer - Concytec

from January 2011 to December 2011

Cátedra CONCYTEC

We researched techniques for generating extractive summaries from multiple texts. These summaries selected the most important information in terms of coverage and representativeness. I also researched and implemented text classification and clustering algorithms.

Publications

Journal Articles (1)

  • [2017] Roque López and Thiago Pardo. Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches. Expert Systems with Applications (ESWA), volume 78. [pdf] [code] [demo]

Conference Papers (13)

  • [2020] Jorge Piazentin, Sonia Castelo, Roque López, Enrico Bertini, Juliana Freire and Claudio Silva. PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines. In IEEE Visualization and Computer Graphics [pdf]
  • [2016] Lucia Castro, Roque López, Gabriel Cavalcante and Luiz Lapolla. Towards Automatic Building of Term Hierarchies from Large Patent Datasets. In AINL: V Artificial Intelligence and Natural Language Conference. Saint-Petersburg, Russia. [pdf]
  • [2015] Roque López and Thiago Pardo. Experiments on Sentence Boundary Detection in User-Generated Web Content. In CICLING: XVI International Conference on Intelligent Text Processing and Computational Linguistics. Cairo, Egypt. [preprint version]
  • [2014] Verônica Agostini, Roque López and Thiago Pardo. Automatic Alignment of News Texts and their Multi-document Summaries: Comparison among Methods. In PROPOR: XI International Conference on Computational Processing of Portuguese. São Carlos, Brazil. [preprint version] [code]
  • [2014] Roque López, Javier Tejada and Mikhail Alexandrov. Medical Texts Classification based on Keywords using Semantic Information. In INFOS: VII International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf] [code] Young Scientific Best Paper
  • [2012] Roque López, Javier Tejada and Mike Thelwall. Spanish Sentistrength as a Tool for Opinion Mining Peruvian Facebook and Twitter. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
  • [2012] Angels Catena, Mikhail Alexandrov and Roque López. Parameterization of comments from Peruvian Facebook and Twitter: Lexical Resources and Algorithm. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
  • [2012] Alessandro Bokan and Roque López. Método No Supervisado para la sugerencia de Tags utilizando información semántica basada en conocimiento. In COMTEL: VI Congreso Internacional de Computación y Telecomunicaciones. Lima, Peru. [pdf]
  • [2011] Roque López, Dennis Barreda and Javier Tejada. MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In MICAI: X Mexican International Conference on Artificial Intelligence. Lecture Notes in Artificial Intelligence. Springer. Puebla, Mexico. [preprint version]
  • [2011] Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Método Supervisado orientado a la clasificación automática de documentos. Caso Historias Clínicas. In JCC: Jornadas Chilenas de Computación. XXIII Encuentro Chileno de Computación. Curicó, Chile. [pdf]
  • [2011] Roque López, Mikhail Alexandrov, Dennis Barreda and Javier Tejada. LexisTerm - The program for term selection by the criterion of specificity. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf] [program]
  • [2011] Ales Bourek, Mikhail Alexandrov and Roque López. Folksonomy - supplementing RICHE expert based taxonomy by terms from online documents. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf]
  • [2011] Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Clasificación automática de Historias Clínicas basada en Prototipos utilizando técnicas de Procesamiento de Lenguaje Natural. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
  • [2011] Dennis Barreda, Roque López, Javier Tejada and Luis Alfaro. Un algoritmo genético para la agrupación de documentos aplicado en corpus con características diferentes. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]

Workshop Papers (3)

  • [2018] Valerio Basile, Roque López and Elena Cabrio. Measuring Frame Instance Relatedness. In *SEM: Joint Conference on Lexical and Computational Semantics. New Orleans, USA. [pdf]
  • [2015] Roque López, Lucas Avanço, Pedro Balage, Alessandro Bokan, Paula Cardoso, Márcio Dias, Fernando Nóbrega, Marco Sobrevilla, Jackson Souza, Andressa Zacarias, Ariani Di Felippo, Eloize Seno and Thiago Pardo. A Qualitative Analysis of a Corpus of Opinion Summaries based on Aspects. In LAW: IX Linguistic Annotation Workshop. Colorado, USA. [pdf] [corpus]
  • [2014] Márcio Dias, Alessandro Bokan, Carla Chuman, Cláudia Barros, Erick Maziero, Fernando Nobrega, Jackson Souza, Marco Sobrevilla, Marina Delege, Lucía Castro, Naira Silva, Paula Cardoso, Pedro Balage, Roque López, Vanessa Marcasso, Ariani Felippo, Maria Graças and Thiago Pardo. Enriquecendo o Corpus CSTNews - a Criação de Novos Sumários Multidocumento. In ToRPorEsp: I Workshop on Tools and Resources for Automatically Processing Portuguese and Spanish. São Carlos, Brazil. [pdf]

Monographs (2)

  • [2015] Roque López. Sumarização Automática de Opiniões Baseada em Aspectos. Master's Thesis (Portuguese), Universidade de São Paulo. [pdf] Finalist for the Best MSc Dissertation in IEEE LA-CCI.
  • [2014] Roque López. Método de Clasificación Automática de Textos basado en Palabras Claves utilizando Información Semántica: Aplicación a Historias Clínicas. Undergraduate Thesis (Spanish), Universidad Nacional de San Agustin. [pdf] Top 5 undergraduate thesis in SPIA.

Projects

AlphaD3M

This project aims to develop automated model discovery systems that enable users with subject matter...

ALOOF

The goal of ALOOF is to equip autonomous systems with the ability to learn the meaning of objects...

PROS@

Its objective is to advance the state-of-the-art for semantic processing of texts written in Portuguese...

Research Interests

NLP
Opinion Summarization, Sentiment Analysis, Keyword Extraction
ML
AutoML, Reinforcement Learning
Others
Data Science, Knowledge Engineering