About Me
Hi, I’m Roque!
I am a research engineer at New York University (NYU), where I have been involved in various research and development projects.
My background and experience are primarily in data integration, applied machine learning, and natural language processing. I also have a strong interest in reinforcement learning.
Research Software
BDI-Kit is a toolkit for data integration and harmonization that offers a comprehensive suite of methods for schema and value matching. It can be used programmatically through a Python API or interactively via an AI agent using natural language queries.
This work is part of the BDF project of ARPA-H.
AlphaD3M is an AutoML library implemented in Python that automatically synthesizes end-to-end ML pipelines for different machine learning tasks and data types. Through an API, it allows users to explore the input data and the derived pipelines, as well as customize the pipelines.
This tool is part of NYU’s implementation of DARPA’s Data-Driven Discovery of Models (D3M) program.
Publications
Selected publications. For the complete list, please visit my Google Scholar profile.
Roque Lopez, Aécio Santos, Christos Koutras, Juliana Freire
Patterns, 2026
Roque Lopez, Raoni Lourenco, Remi Rampin, Sonia Castelo, Aécio Santos, Jorge Ono, Claudio Silva, Juliana Freire
AutoML Conference, 2023
Jorge Ono, Sonia Castelo, Roque Lopez, Enrico Bertini, Juliana Freire, Claudio Silva
IEEE Visualization Conference, 2020
Education
São Paulo University
MSc in Computer Science
Monograph title: Automatic Aspect-Based Opinion Summarization Methods
This master’s thesis investigates extractive and abstractive opinion summarization using an aspect-based approach. In addition to applying existing methods, two new approaches for Portuguese were proposed, achieving the best performance in the experiments.
San Agustin National University
BSc in Computer Science
Monograph title: Medical Documents Classification based on Keywords using Semantic Information
This thesis presents a method for classifying medical documents that improves over existing approaches. The method combines statistical information with semantic relatedness between document keywords.
CV
You can find my complete CV, including additional details about my professional and research experience, by clicking here.