Arnav Gupta

About

I am an electrical & computer engineering and computer science student at Duke University with a strong interest in applied machine learning. I have experience working with EEG data, deep learning models like EEGNet, and large-scale research datasets. I enjoy building projects that combine technical depth with real-world impact. Currently, I am seeking opportunities to apply my skills in data science and AI-driven research. I would love to connect on LinkedIn or through email.

Experience

Machine Learning Research Assistant

May 2025 - Present

Duke University Applied Machine Learning Lab

  • Working on EEG-based brain-computer interface (BCI) systems, specifically P300 spellers designed for assistive communication in ALS patients.
  • Research focuses on generalization across users, analyzing the trade-off between data quality and data quantity in training datasets.
  • Implemented deep learning and classical ML approaches including EEGNet and SWLDA.
  • Built full preprocessing pipelines for EEG data: signal filtering, epoch extraction, and noise and artifact handling.
  • Generated feature-weight and model interpretability visualizations to better understand neural signal contributions.
  • Contributing to a research paper pending publication (NeurIPS dataset track).

Software Engineer

Jul 2024 - Present

Big Fin Scientific

  • Built DCS LinkStream 2.0, an Android-based system for ecological and environmental data collection.
  • Integrated multiple hardware sensors into a unified data ingestion pipeline.
  • Designed real-time syncing architecture for low-connectivity and offline-first environments.
  • Improved reliability of field data collection across distributed teams and reduced manual data processing time by 75%.
  • Supported 30+ global clients, including government and research organizations such as USGS and NOAA.
  • Focused on data pipeline design, mobile systems engineering, and real-world deployment constraints.

Machine Learning Intern

Mar 2023 - Jul 2024

4th Vector Technologies

  • Developed computer vision pipelines for industrial defect detection in manufacturing.
  • Used Python and OpenCV to preprocess image datasets, extract features, and train classification models.
  • Built models across multiple defect types with high variability and achieved about 90% defect reduction.
  • Delivered production-ready solutions used in real manufacturing environments.
  • Contributed to $100K+ annual cost savings by preventing defects and recalls.
  • Worked with clients such as Bridgestone and GSK.

Earlier Experience

Data Science Research Assistant

Jun 2022 - Jun 2023

UNC Wilmington

  • Built machine learning models for detecting atrial fibrillation (AFIB) from ECG signals.
  • Processed physiological time-series data and extracted signal features using Pandas and NumPy.
  • Trained models with scikit-learn, achieving 94% accuracy.
  • Developed a Flask-based web app for real-time integration with wearable devices.
  • Presented research at multiple conferences and received the 1st Place Catalyst Award.

Data Center Intern

Jul 2023 - Aug 2023

UNH InterOperability Lab

  • Automated SSD testing pipelines for large-scale hardware validation.
  • Built Python scripts to parse JSON logs and structure test outputs.
  • Developed a Flask dashboard for real-time monitoring and anomaly detection.
  • Reduced testing time by 80%.
  • Worked with vendors including Samsung and Google.

Research

EEG-Based Brain-Computer Interfaces (Duke)

Ongoing
  • Focused on P300-based speller systems for assistive communication.
  • Key research problem: improving cross-subject generalization in EEG-based ML models.
  • Compared deep learning (EEGNet) and classical ML (SWLDA).
  • Investigated the trade-off between dataset size and diversity, and the impact of signal preprocessing.
  • Built end-to-end ML pipelines for neural decoding.
  • Generated interpretability visualizations for model understanding.
  • Contributing to bigP3BCI dataset research (NeurIPS submission).
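
As an illustration of the epoch-extraction step mentioned above, here is a minimal single-channel sketch in plain Python. It is not the lab's pipeline: the function names, the 0 to 0.8 s window, and the toy signal are assumptions chosen for the example. It cuts a fixed window after each stimulus onset and averages the windows, which is the usual way a P300-style response is made visible.

```python
from typing import List, Sequence


def extract_epochs(
    signal: Sequence[float],
    onsets: Sequence[int],
    sfreq: float,
    tmin: float = 0.0,
    tmax: float = 0.8,
) -> List[List[float]]:
    """Cut a fixed-length window around each stimulus onset.

    `onsets` are sample indices; `tmin`/`tmax` are seconds relative to onset.
    Windows that would run past either end of the recording are skipped.
    """
    start_off = int(round(tmin * sfreq))
    stop_off = int(round(tmax * sfreq))
    epochs = []
    for onset in onsets:
        start, stop = onset + start_off, onset + stop_off
        if start < 0 or stop > len(signal):
            continue  # incomplete window, drop it
        epochs.append(list(signal[start:stop]))
    return epochs


def average_epochs(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Average epochs sample by sample (all epochs have equal length)."""
    if not epochs:
        return []
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


# Toy data (hypothetical): a ramp sampled at 10 Hz with three stimulus onsets.
toy_signal = [float(i) for i in range(100)]
toy_epochs = extract_epochs(toy_signal, onsets=[0, 50, 95], sfreq=10.0)
toy_average = average_epochs(toy_epochs)
```

A real pipeline would handle multiple channels, band-pass filtering, and artifact rejection before this step; the sketch only shows the windowing and averaging logic.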

AFIB Detection from ECG Signals (UNCW)

Completed
  • Built ML models for cardiac arrhythmia detection using ECG data.
  • Focused on feature engineering for time-series signals and robust preprocessing.
  • Achieved 94% classification accuracy.
  • Deployed model through a Flask-based application for real-time use.
  • Presented at conferences and received the 1st Place Catalyst Award.

Projects

Computer Vision Defect Detection System

Industrial ML
  • Designed an end-to-end defect detection pipeline.
  • Used OpenCV and ML models for classification.
  • Handled multi-class defect scenarios.
  • Reduced defects by about 90% in production systems.

DCS LinkStream 2.0

Mobile Data Systems
  • Built an Android-based data collection and syncing system.
  • Integrated heterogeneous sensor inputs.
  • Designed an offline-first architecture for field reliability.
  • Used in real-world environmental deployments.

SSD Analytics Dashboard

Data Platform
  • Built a system to process and visualize large-scale SSD testing logs.
  • Parsed 1,000+ JSON files into structured datasets.
  • Developed a Flask dashboard for real-time analytics.
  • Implemented anomaly detection for hardware testing workflows.
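
The log-parsing and anomaly-detection steps above can be sketched roughly as follows. This is a simplified illustration, not the deployed system: the `drive_id` and `latency_ms` field names, the z-score rule, and the threshold of 2.0 are all assumptions made for the example.

```python
import json
from statistics import mean, pstdev
from typing import Dict, Iterable, List


def load_records(json_texts: Iterable[str]) -> List[Dict[str, object]]:
    """Parse one JSON document per test run into a flat row.

    The field names are hypothetical; real test logs will differ.
    """
    rows = []
    for text in json_texts:
        doc = json.loads(text)
        rows.append(
            {
                "drive_id": str(doc["drive_id"]),
                "latency_ms": float(doc["latency_ms"]),
            }
        )
    return rows


def flag_anomalies(
    rows: List[Dict[str, object]], threshold: float = 2.0
) -> List[Dict[str, object]]:
    """Return rows whose latency lies more than `threshold` standard
    deviations from the mean (a simple z-score rule)."""
    values = [row["latency_ms"] for row in rows]
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [row for row in rows if abs(row["latency_ms"] - mu) / sigma > threshold]


# Toy input (hypothetical): nine normal runs and one slow run.
toy_logs = [
    json.dumps({"drive_id": f"d{i}", "latency_ms": 10}) for i in range(9)
] + [json.dumps({"drive_id": "d9", "latency_ms": 100})]
toy_rows = load_records(toy_logs)
toy_flagged = flag_anomalies(toy_rows)
```

A z-score rule is only one simple choice; with skewed or heavy-tailed metrics, a median-based rule may behave better.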

Academic

Duke University

B.S. Electrical & Computer Engineering and Computer Science

Focus Areas

  • Machine Learning
  • Signal Processing
  • Data Systems
  • Computer Vision

Relevant Coursework

  • Machine Learning
  • Data Structures & Algorithms
  • Signal Processing
  • Computer Systems

Academic Interests

  • Brain-Computer Interfaces (BCI)
  • Applied Machine Learning
  • Physiological Signal Processing
  • Real-world ML deployment

Additional

  • Contributor to bigP3BCI dataset (NeurIPS submission)
  • Active research in EEG-based neural decoding

Contact

Open to internships, collaborations, and software opportunities.