Projects

PhotoPrism MLOps: Semantic Photo Search with Continuous Learning

Spring 2026
MLOps Systems Project
  • Built a self-hosted AI photo search engine on Chameleon Cloud Kubernetes combining CLIP ViT-B/32 retrieval, Qdrant HNSW indexing, and a Qwen2-VL-2B multimodal reranker that continuously fine-tunes from implicit user click feedback
  • Designed and published Flickr30K-CFQ, a training dataset of 31,783 images with 5 query types per image (raw sentence, paraphrase, fragment, phrase, tag) addressing the caption-to-keyword distribution mismatch in standard retrieval benchmarks
  • Adapted the POLAR (CVPR 2025) paradigm to the reranking stage: frozen CLIP handles first-stage ANN retrieval while LoRA adapters (r=8, ~0.2% trainable params) on Qwen2-VL-2B rescore candidates; retraining auto-triggers at 100 click events and redeploys in minutes via Docker
  • Engineered an async ingest pipeline using FOR UPDATE SKIP LOCKED for parallel feature-worker scaling and a 5-stage semantic search stack with graceful fallback, keeping end-to-end latency under 400 ms with GPU reranking active
Stack: Python, PyTorch, CLIP, Qwen2-VL-2B, LoRA (PEFT), Qdrant, Kubernetes, Chameleon Cloud, MLflow, FastAPI, Prometheus
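The retrieve-then-rerank flow with graceful fallback described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: a brute-force cosine search stands in for Qdrant HNSW retrieval, and `rerank` is a hypothetical callable standing in for the Qwen2-VL-2B reranker. All function and variable names here are assumptions.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: Sequence[float], index: Dict[str, Sequence[float]], k: int) -> List[str]:
    """First stage: brute-force top-k by cosine similarity (stand-in for ANN search)."""
    ranked = sorted(index, key=lambda pid: cosine(query, index[pid]), reverse=True)
    return ranked[:k]


def search(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    rerank: Callable[[List[str]], List[str]],
    k: int = 3,
) -> List[str]:
    """Second stage: try the reranker, fall back to first-stage order if it fails."""
    candidates = retrieve(query, index, k)
    try:
        return rerank(candidates)
    except Exception:
        # Graceful fallback: a reranker failure degrades quality, not availability.
        return candidates


def broken_reranker(candidates: List[str]) -> List[str]:
    """Hypothetical reranker that is unavailable (e.g. GPU worker down)."""
    raise RuntimeError("reranker unavailable")


index = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
reranked = search([1.0, 0.0], index, rerank=lambda c: list(reversed(c)), k=2)
fallback = search([1.0, 0.0], index, rerank=broken_reranker, k=2)
```

Here `reranked` reflects the reranker's order, while `fallback` keeps the first-stage order because the reranker raised. A production version would likely also enforce a timeout on the reranking call.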

Low-Field to High-Field MRI Super-Resolution with Task-Adaptive Transformers

Spring 2026
Independent Neuroinformatics Project
  • Independently designed a full medical imaging pipeline to enhance 64 mT MRI scans into 3 T-like images using a transformer-based AMIR architecture
  • Built preprocessing and augmentation pipelines, including slice-to-volume reconstruction, spatial resampling, and synthetic low-field generation from external IXI datasets
  • Trained a 22M-parameter transformer model on ~200 paired subjects, reaching a mean test PSNR of 18.64 dB (max 43.21 dB) and an SSIM of 0.544, demonstrating improved image quality and structural similarity
Stack: Python, PyTorch, Nibabel, NumPy, SciPy, CUDA
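For context on the PSNR figures above, the metric is conventionally defined as 10 · log10(MAX² / MSE). The sketch below computes it for flat lists of intensities. It is illustrative only: the function name and the `data_range` default of 1.0 (intensities normalized to [0, 1]) are assumptions, not details taken from the project.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float], data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equal-length intensity arrays."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the reference and the estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(data_range ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] intensity scale gives an MSE of 0.01.
example_db = psnr([0.0, 0.0], [0.1, 0.1])
```

With an MSE of 0.01 and a data range of 1.0, the example works out to about 20 dB.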

Chinchilla-Optimal Transformer Pre-training for Music

Fall 2025
Independent ML Systems Project
  • Independently designed and trained decoder-only Transformer models (NanoGPT) on the Lakh MIDI Dataset
  • Optimized training throughput on NVIDIA H100 GPUs using BFloat16 mixed precision, Flash Attention, and torch.compile
  • Achieved a test perplexity of 2.20 with 100% syntactically valid output
Stack: Python, PyTorch, CUDA, Flash Attention
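As a reminder of how a perplexity figure like the one above is usually derived, perplexity is the exponential of the mean per-token negative log-likelihood. The sketch below assumes losses measured in nats (natural log), which is the common convention but is not stated in the source. The function name is hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Perplexity from per-token negative log-likelihoods given in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    mean_nll = sum(token_nlls) / len(token_nlls)
    return math.exp(mean_nll)


# A model that assigns probability 1/2 to every token has perplexity 2.
coin_flip = perplexity([math.log(2)] * 4)
```

Under the same assumption, a perplexity of 2.20 corresponds to a mean cross-entropy of roughly 0.79 nats per token.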

Toy Load Balancer with Consistent Hashing

2024
Systems Engineering Project
  • Built a custom load balancer implementing consistent hashing to distribute traffic across dynamic server nodes
  • Containerized the entire architecture (API Gateway, Nodes, Analytics) using Docker Compose for easy deployment
  • Implemented a management API to dynamically add/remove servers and visualize request rebalancing in real time
Stack: Docker, Python, Node.js, Shell
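The core idea behind the load balancer above, consistent hashing over a dynamic set of nodes, can be sketched as a hash ring with virtual nodes. This is a minimal illustration under assumptions: the class name, the use of SHA-256, and the default of 100 virtual nodes per server are illustrative choices, not details of the project.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        """Drop every point belonging to this node."""
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        # (h,) sorts before any (h, node), so this finds the first point >= h.
        i = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[i % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["s1", "s2", "s3"])
keys = [f"req-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("s4")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Adding `s4` only reassigns keys that now land on one of its points, so every key in `moved` maps to `s4` and the rest stay where they were. That limited reshuffling is the rebalancing behavior a management API for adding and removing servers would expose.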