MLB Pitch Quality (Stuff+) Evaluation System
In progress
Modeling pure stuff quality — independent of location — across 700K+ Statcast pitches.
Independently replicated and extended the FanGraphs Stuff+ methodology across two end-to-end
pipelines trained on 700K+ Statcast pitches (2015–2025): a cascaded ensemble
(Random Forest → contact classifier → LightGBM regressor) and a PyTorch MLP
(256→128→64, GELU, BatchNorm) that isolates pure stuff quality from 12 physical features
with no location confounders.
Applied per-pitch-type normalization (average = 100 per type), GroupKFold cross-validation
grouped by pitcher to prevent leakage, and a gradient-based stability regularizer
penalizing year-over-year inconsistency. Validated via Spearman correlation and quartile
persistence metrics. Deployed on W&M's HPC cluster (SciClone) via SLURM with
automated multi-year data ingestion, model serialization, and leaderboard generation.
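The normalization and leakage-safe splitting described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the project's code: the function names, the `points_per_sd` spread of 10, and the convention that a higher raw score is better are all hypothetical, and `grouped_folds` is a simplified round-robin stand-in for scikit-learn's `GroupKFold`, which additionally balances fold sizes.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def normalize_per_pitch_type(rows, points_per_sd=10.0):
    """Rescale raw scores so that every pitch type averages exactly 100.

    rows: list of (pitch_type, raw_score) tuples; higher raw_score = better.
    points_per_sd is an assumed spread, not a value taken from the project.
    """
    by_type = defaultdict(list)
    for pitch_type, raw in rows:
        by_type[pitch_type].append(raw)

    # Per-type mean and standard deviation; fall back to 1.0 when a type
    # has no spread so the division below is always defined.
    stats = {}
    for pitch_type, values in by_type.items():
        sd = pstdev(values)
        stats[pitch_type] = (fmean(values), sd if sd > 0 else 1.0)

    return [
        (t, 100.0 + points_per_sd * (raw - stats[t][0]) / stats[t][1])
        for t, raw in rows
    ]


def grouped_folds(pitcher_ids, n_splits=5):
    """Assign each pitch a fold index so that no pitcher spans two folds."""
    unique_pitchers = sorted(set(pitcher_ids))
    fold_of = {pid: i % n_splits for i, pid in enumerate(unique_pitchers)}
    return [fold_of[pid] for pid in pitcher_ids]
```

Because every pitch thrown by a given pitcher lands in the same fold, a model is never scored on a pitcher it saw during training, which is the leakage the grouping is meant to prevent.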
- Python
- PyTorch
- LightGBM
- scikit-learn
- pybaseball
- Statcast
- SLURM
GitHub →
Repository under refinement
Computational Modeling of T₂ Relaxation via Single-Sided NMR
Manuscript in prep
Simulating molecular T₂ relaxation for binary mixtures from first principles.
A research project in the Meldrum Spin Lab that predicts T₂ relaxation times of pure substances and
molecular mixtures from first principles. Molecular dynamics simulations model molecular behavior in the
field of a single-sided NMR magnet. Manuscript submission is targeted for
August 2026.
Built an automated simulation pipeline combining OpenMM, MDAnalysis, SLURM, MPI, and tmux on
W&M's HPC cluster, with a validation suite comparing simulated vs. experimental T₂
values across a panel of test mixtures.
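A simulated-versus-experimental comparison of this kind might reduce to something like the sketch below. The function name, the dictionary inputs keyed by mixture label, and the choice of percent error as the metric are assumptions made for illustration; the actual validation suite may use different metrics.

```python
def compare_t2(simulated, experimental):
    """Compare simulated and experimental T2 values mixture by mixture.

    Both arguments map a mixture label to a T2 value in the same units.
    Returns (signed percent error per mixture, mean absolute percent error).
    """
    shared = sorted(simulated.keys() & experimental.keys())
    if not shared:
        raise ValueError("no mixtures in common between the two panels")

    errors = {
        name: 100.0 * (simulated[name] - experimental[name]) / experimental[name]
        for name in shared
    }
    mean_abs_pct_error = sum(abs(e) for e in errors.values()) / len(errors)
    return errors, mean_abs_pct_error
```

Only mixtures present in both panels are compared, so a simulation that has not finished for one mixture does not block the report for the rest.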
- OpenMM
- MDAnalysis
- SLURM
- MPI
- Bash
- Python
- SciPy
Repository private — available on request
Automated Google Business Profile Pipeline
Shipped · Client work
Replacing daily manual social posting with a fully automated content pipeline for a Northern Virginia realtor.
Designed and deployed an end-to-end content automation system for
Andrew Capuano,
a realtor and certified appraiser serving the Gainesville and Bristow area. The pipeline replaces
roughly 20 minutes of daily manual work with a fully hands-off system that publishes professional,
localized Google Business posts every morning.
On a daily schedule, the workflow draws a randomized hook–topic pair from a curated content library,
selects an image at random from a pool of 20 pre-sized assets, generates the post copy via a ChatGPT
step keyed to the day's inputs, and publishes the formatted post directly to the Google Business
Profile API through Zapier. The output is consistent, on-brand, and indistinguishable from manual posting.
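The daily draw runs inside Zapier, but its logic can be illustrated in plain Python. This sketch assumes the hook and topic are drawn independently and that images are addressed by index; the function names and prompt wording are hypothetical, and the ChatGPT and Google Business Profile calls are deliberately left out.

```python
import random


def draw_daily_inputs(hooks, topics, image_count=20, rng=None):
    """Pick the day's hook, topic, and image index at random."""
    rng = rng or random.Random()
    return {
        "hook": rng.choice(hooks),
        "topic": rng.choice(topics),
        "image_index": rng.randrange(image_count),  # 0 .. image_count - 1
    }


def build_prompt(inputs, area="Gainesville and Bristow"):
    """Assemble the text prompt that the copy-generation step would receive."""
    return (
        f"Write a short Google Business post for a realtor serving {area}. "
        f"Open with this hook: {inputs['hook']} "
        f"Cover this topic: {inputs['topic']}"
    )
```

Passing a seeded `random.Random` as `rng` makes a given day's draw reproducible, which helps when debugging a post that came out wrong.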
- Zapier
- ChatGPT API
- Google Business Profile
- Workflow Automation
- Content Systems
Workflow template →