Roque Lopez

Research Engineer

About Me

Hi, I’m Roque!

I am a research engineer at New York University (NYU). Throughout my career, I have been involved in various research and development projects.

My background and experience are primarily in data integration, applied machine learning, and natural language processing. I also have a strong interest in reinforcement learning.

Research Software

BDI-Kit

BDI-Kit is a toolkit for data integration and harmonization that offers a comprehensive suite of methods for schema and value matching. It can be used programmatically through a Python API or interactively via an AI agent using natural language queries.

This work is part of the BDF project of ARPA-H.

AlphaD3M

AlphaD3M is an AutoML library implemented in Python that automatically synthesizes end-to-end ML pipelines for different machine learning tasks and data types. Through its API, users can explore the input data and the derived pipelines, as well as customize the pipelines.

This tool is part of NYU’s implementation of DARPA’s Data Driven Discovery (D3M) project.

Publications

Selected publications. For the complete list, please visit my Google Scholar profile.

BDI-Kit: An AI-Powered Toolkit for Biomedical Data Harmonization

Roque Lopez, Aécio Santos, Christos Koutras, Juliana Freire

Patterns, 2026

AlphaD3M: An Open-Source AutoML Library for Multiple ML Tasks

Roque Lopez, Raoni Lourenco, Remi Rampin, Sonia Castelo, Aécio Santos, Jorge Ono, Claudio Silva, Juliana Freire

AutoML Conference, 2023

PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines

Jorge Ono, Sonia Castelo, Roque Lopez, Enrico Bertini, Juliana Freire, Claudio Silva

IEEE Visualization Conference, 2020

Opinion Summarization Methods: Comparing and Extending Extractive and Abstractive Approaches

Roque Lopez, Thiago Pardo

Expert Systems with Applications Journal, 2017

Education

São Paulo University

MSc in Computer Science

2013 - 2015

Monograph title: Automatic Aspect-Based Opinion Summarization Methods

This master’s project investigates the generation of extractive and abstractive opinion summaries using an aspect-based approach. In addition to applying known methods in the area, it proposes two new methods for Portuguese, which achieved the best performance in the experiments.

San Agustin National University

BSc in Computer Science

2006 - 2010

Monograph title: Medical Documents Classification based on Keywords using Semantic Information

This monograph presents a method for classifying medical documents that improves on the results of the Naive Bayes and Rocchio algorithms. In addition to statistical information, the method takes into account the semantic relatedness between the keywords of medical documents.

CV

You can find my complete CV, including additional information about my professional and research experience, by clicking here.