I hold an MSc in Computer Science and am interested in Machine Learning and Natural Language Processing. I am a research associate at New York University (NYU). Previously, I worked on other ML and NLP research projects.
MSc @ São Paulo University - Brazil
Master of Science in Computer Science
Monograph title: Automatic Aspect-Based Opinion Summarization Methods
This master’s project investigates the generation of extractive and abstractive opinion summaries using an aspect-based approach. Besides applying known methods in the area, it proposes two new methods for the Portuguese language, which achieved the best performance in the experiments.
BSc @ San Agustin National University - Peru
Bachelor of Science in Computer Science
This monograph presents a method for classifying medical documents that improves on the results of the Naive Bayes and Rocchio algorithms. In addition to considering statistical information, the method takes into account the semantic relatedness between the keywords of medical documents.
Research Associate - NYU
from December 2018 to present
Using Deep Learning and Reinforcement Learning methods, I am working on building an Auto Machine Learning system that automatically searches for models and derives end-to-end pipelines that read and pre-process the data and train the model.
Research Engineer - Inria
from February 2017 to August 2018
I worked on extracting symbolic knowledge about objects and their characteristics from unstructured text, relying on natural language processing, machine learning and machine reading techniques, as well as on available ontologies and knowledge on the Semantic Web.
CTO - DoctorCV
from October 2016 to present
A Peruvian startup that evaluates your CV and proposes improvements using NLP techniques. It is like having a virtual expert recruiter who tells you what to change in your resume to increase your chances of being called for a job interview.
Researcher, Software Engineer - Elabora
from July 2015 to January 2017
I researched information retrieval techniques, combined with machine learning methods and word embedding representations, to create hierarchies of terms in collections of patents on different topics. In this project, I used Spark, HBase and Python.
Software Engineer - Dicionário Criativo
from October 2013 to December 2013
I implemented a crawler for Twitter in Python. This crawler collected public comments about music, books and films. I also developed a module to normalize these comments (spell checker, etc.). This data, together with other resources, was used to expand the database.
Researcher, Developer - Lindexa
from January 2012 to December 2012
Lindexa - Wayra Telefónica
I researched and developed automatic opinion mining and sentiment analysis techniques for the Spanish language. In addition, I implemented a crawler for Facebook and Twitter in Python. This crawler collected public comments written on these social networks. Lindexa was a winning startup in the first edition of Wayra Peru.
Technical Researcher, Developer - Concytec
from January 2011 to December 2011
We researched techniques for generating extractive summaries of multiple texts. These summaries select the most important information in terms of coverage and representativeness. I also researched and implemented text classification and clustering algorithms.
Journal Articles (1)
-  Roque López and Thiago Pardo. Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches. Expert Systems with Applications (ESWA), volume 78. [pdf] [code] [demo]
Conference Papers (13)
-  Jorge Piazentin, Sonia Castelo, Roque López, Enrico Bertini, Juliana Freire and Claudio Silva. PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines. In IEEE Visualization and Computer Graphics [pdf]
-  Lucia Castro, Roque López, Gabriel Cavalcante and Luiz Lapolla. Towards Automatic Building of Term Hierarchies from Large Patent Datasets. In AINL: V Artificial Intelligence and Natural Language Conference. Saint-Petersburg, Russia. [pdf]
-  Roque López and Thiago Pardo. Experiments on Sentence Boundary Detection in User-Generated Web Content. In CICLING: XVI International Conference on Intelligent Text Processing and Computational Linguistics. Cairo, Egypt. [preprint version]
-  Verônica Agostini, Roque López and Thiago Pardo. Automatic Alignment of News Texts and their Multi-document Summaries: Comparison among Methods. In PROPOR: XI International Conference on Computational Processing of Portuguese. São Carlos, Brazil. [preprint version] [code]
-  Roque López, Javier Tejada and Mikhail Alexandrov. Medical Texts Classification based on Keywords using Semantic Information. In INFOS: VII International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf] [code] Young Scientific Best Paper
-  Roque López, Javier Tejada and Mike Thelwall. Spanish Sentistrength as a Tool for Opinion Mining Peruvian Facebook and Twitter. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
-  Angels Catena, Mikhail Alexandrov and Roque López. Parameterization of comments from Peruvian Facebook and Twitter: Lexical Resources and Algorithm. In INFOS: V International Conference on Intelligent Information and Engineering Systems. Krynica, Poland. [pdf]
-  Alessandro Bokan and Roque López. Método No Supervisado para la sugerencia de Tags utilizando información semántica basada en conocimiento. In COMTEL: VI Congreso Internacional de Computación y Telecomunicaciones. Lima, Peru. [pdf]
-  Roque López, Dennis Barreda and Javier Tejada. MFSRank: An Unsupervised Method to Extract Keyphrases Using Semantic Information. In MICAI: X Mexican International Conference on Artificial Intelligence. Lecture Notes in Artificial Intelligence. Springer. Puebla, Mexico. [preprint version]
-  Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Método Supervisado orientado a la clasificación automática de documentos. Caso Historias Clínicas. In JCC: Jornadas Chilenas de Computación. XXIII Encuentro Chileno de Computación. Curicó, Chile. [pdf]
-  Roque López, Mikhail Alexandrov, Dennis Barreda and Javier Tejada. LexisTerm - The program for term selection by the criterion of specificity. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf] [program]
-  Ales Bourek, Mikhail Alexandrov and Roque López. Folksonomy - supplementing RICHE expert based taxonomy by terms from online documents. In INFOS: IV International Conference on Intelligent Information and Engineering Systems. Polanczyk, Poland. [pdf]
-  Roque López, Dennis Barreda, Javier Tejada and Luis Alfaro. Clasificación automática de Historias Clínicas basada en Prototipos utilizando técnicas de Procesamiento de Lenguaje Natural. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
-  Dennis Barreda, Roque López, Javier Tejada and Luis Alfaro. Un algoritmo genético para la agrupación de documentos aplicado en corpus con características diferentes. In JPC: X Jornadas Peruanas de Computación. Pucallpa, Peru. [pdf]
Workshop Papers (3)
-  Valerio Basile, Roque López and Elena Cabrio. Measuring Frame Instance Relatedness. In *SEM: Joint Conference on Lexical and Computational Semantics. New Orleans, USA. [pdf]
-  Roque López, Lucas Avanço, Pedro Balage, Alessandro Bokan, Paula Cardoso, Márcio Dias, Fernando Nóbrega, Marco Sobrevilla, Jackson Souza, Andressa Zacarias, Ariani Di Felippo, Eloize Seno and Thiago Pardo. A Qualitative Analysis of a Corpus of Opinion Summaries based on Aspects. In LAW: IX Linguistic Annotation Workshop. Colorado, USA. [pdf] [corpus]
-  Márcio Dias, Alessandro Bokan, Carla Chuman, Cláudia Barros, Erick Maziero, Fernando Nobrega, Jackson Souza, Marco Sobrevilla, Marina Delege, Lucía Castro, Naira Silva, Paula Cardoso, Pedro Balage, Roque López, Vanessa Marcasso, Ariani Felippo, Maria Graças and Thiago Pardo. Enriquecendo o Corpus CSTNews - a Criação de Novos Sumários Multidocumento. In ToRPorEsp: I Workshop on Tools and Resources for Automatically Processing Portuguese and Spanish. São Carlos, Brazil. [pdf]
Theses (2)
-  Roque López. Sumarização Automática de Opiniões Baseada em Aspectos. Master's Thesis (Portuguese), Universidade de São Paulo. [pdf] Finalist for the Best MSc Dissertation in IEEE LA-CCI.
-  Roque López. Método de Clasificación Automática de Textos basado en Palabras Claves utilizando Información Semántica: Aplicación a Historias Clínicas. Undergraduate Thesis (Spanish), Universidad Nacional de San Agustin. [pdf] Top 5 undergraduate thesis in SPIA.