Roque Lopez

About Me

Hi, I’m Roque!

I am a research engineer at New York University (NYU). I have been involved in various research and development projects throughout my career.

My background and experience are primarily in data integration, applied machine learning, and natural language processing. I also have a strong interest in reinforcement learning.

Research Software

BDI-Kit

BDI-Kit is a toolkit for data integration and harmonization that offers a comprehensive suite of methods for schema and value matching. It can be used programmatically through a Python API or interactively via an AI agent using natural language queries.

This work is part of the BDF project of ARPA-H.

AlphaD3M

AlphaD3M is an AutoML library implemented in Python that automatically synthesizes end-to-end ML pipelines for different machine learning tasks and data types. Through an API, it allows users to explore the input data and the derived pipelines, as well as customize the pipelines.

This tool is part of NYU’s implementation of DARPA’s Data Driven Discovery (D3M) project.

Publications

Selected publications. For the complete list, please visit my Google Scholar profile.

BDI-Kit: An AI-Powered Toolkit for Biomedical Data Harmonization

Roque Lopez, Aécio Santos, Christos Koutras, Juliana Freire

Patterns, 2026

AlphaD3M: An Open-Source AutoML Library for Multiple ML Tasks

Roque Lopez, Raoni Lourenco, Remi Rampin, Sonia Castelo, Aécio Santos, Jorge Ono, Claudio Silva, Juliana Freire

AutoML Conference, 2023

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

Jorge Ono, Sonia Castelo, Roque Lopez, Enrico Bertini, Juliana Freire, Claudio Silva

IEEE Visualization Conference, 2020

Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches

Roque Lopez, Thiago Pardo

Expert Systems with Applications Journal, 2017

Education

São Paulo University

MSc in Computer Science

Monograph title: Automatic Aspect-Based Opinion Summarization Methods

This master’s thesis investigates extractive and abstractive opinion summarization using an aspect-based approach. In addition to applying existing methods, two new approaches for Portuguese were proposed, achieving the best performance in the experiments.

San Agustin National University

BSc in Computer Science

Monograph title: Medical Documents Classification based on Keywords using Semantic Information

This thesis presents a method for classifying medical documents that improves over existing approaches. The method combines statistical information with semantic relatedness between document keywords.

CV

You can find my complete CV, including additional information about my professional and research experience, by clicking here.