Portfolio

NLP Pipelines for Corporate and Policy Data

During my internship at Goods Unite Us, I developed NLP-driven data pipelines to process large-scale corporate and policy datasets, including SEC filings and political contribution records. The project focused on extracting, normalizing, and integrating unstructured data into structured insights that support transparency, identify reliable patterns, and feed into the mobile app.

Design and Implementation of a Self-Paced Listening Experiment with Integrated Bilingual Language Profiling (Quechua-Spanish)

End-to-end psycholinguistic experiment built in PsychoPy, combining self-paced listening with bilingual profiling and a reproducible Python data pipeline.

Pupillometry Experiment on Spanish Subjunctive Processing: EyeLink Implementation and Psycholinguistic Analysis [Ongoing Project]

A pupillometry-based experiment using EyeLink to investigate real-time processing of Spanish subjunctive morphology in heritage speakers and L2 learners.

EEG Signal Processing for Emotion Recognition and Biomarker Extraction

Research project on EEG signal preprocessing and artifact removal using EEGLAB and MATLAB, applying filtering, Independent Component Analysis (ICA), and noise reduction to support emotion recognition and biomarker discovery.

Building a Reproducible Transcript Cleaning Pipeline in R [Ongoing Project]

Designed and refactored an R-based pipeline to clean, normalize, and restructure WebVTT transcripts for the digitization and preprocessing of a large-scale sociolinguistic corpus of bilingual speech on the U.S.–Mexico border, with a focus on transcript normalization and timestamp reconstruction.