svah-x
Waterloo · Math · ML Systems

Kelvin Peng

Math (Combinatorics & Optimization + Statistics) @ University of Waterloo. Currently focused on world-model reinforcement learning, topology-guided optimization, and efficient LLM fine-tuning.

Research Focus: World Models · TDA · LLMs
Stack: PyTorch · JAX · DeepSpeed

Technical Skills

Languages

Python · C/C++ · Racket · SQL · Bash · LaTeX

ML Frameworks

PyTorch · JAX · DeepSpeed · HuggingFace · TensorBoard · Gradio

RL & Simulation

Isaac Sim · Isaac Lab · Gymnasium · MuJoCo · DMControl

Mathematics

Combinatorial Optimization · Graph Theory · TDA · Cryptography

Deep Learning

CNNs · Vision Transformers · Transfer Learning · QLoRA

RL Methods

DQN · PPO · World Models · MPC · CEM

Research Tools

GUDHI · Ripser · WandB · RunPod

About

I work at the intersection of mathematical theory and ML engineering. My research focuses on problems where rigorous foundations meet practical implementation: training stability, loss-landscape geometry, and reproducible research prototypes.

I am currently exploring how topological data analysis can inform optimizer behavior, and how world models can learn latent dynamics in high-frequency environments. I care about evaluation discipline and about writing research code that others can actually run.


Education

University of Waterloo
Bachelor of Mathematics — Combinatorics & Optimization
Sept 2023 — Expected 2027
Relevant Coursework: Graph Theory · Convex Optimization · Applied Cryptography · Quantum Information Processing · Number Theory · Linear Algebra

Awards

Euclid Mathematics Contest
2021 — 2022
School Champion (2×), Honour Roll, ranked first in British Columbia
Canadian Senior Mathematics Contest
2022
School Champion, Honour Roll

Open-Source Courses

Educational resources for deep learning and reinforcement learning.

Deep Learning · PyTorch
GitHub →

PyTorch Deep Learning Course

10-chapter hands-on course from tensors to deployment. Covers CNNs, transfer learning, Vision Transformers (ViT), experiment tracking with TensorBoard, and model deployment with Gradio.

PyTorch · CNNs · Vision Transformers · Transfer Learning · TensorBoard
RL · World Models
GitHub →

Reinforcement Learning & World Models Course

4-phase course from RL fundamentals to robotics-scale world models. Covers DQN, PPO, model-based planning (MPC, CEM), and Isaac Lab integration for sim-to-real robotics.

DQN / PPO · World Models · MuJoCo · Isaac Lab · Sim-to-Real
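
To give a flavour of the model-based planning covered in the course, here is a minimal cross-entropy method (CEM) sketch in plain Python. It is an illustration only, not code from the course repository: the toy dynamics in `rollout_cost`, the function names, and every hyperparameter are assumptions chosen for brevity.

```python
import random
import statistics


def cem_minimize(cost, dim, iters=30, pop=64, elite=8, seed=0):
    """Cross-entropy method: repeatedly refit a diagonal Gaussian to the lowest-cost samples."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(pop)]
        # Keep the `elite` cheapest candidates.
        samples.sort(key=cost)
        elites = samples[:elite]
        # Refit mean and standard deviation per action dimension.
        mean = [statistics.fmean(col) for col in zip(*elites)]
        std = [statistics.pstdev(col) + 1e-6 for col in zip(*elites)]
    return mean


def rollout_cost(actions, start=0.0, goal=1.0):
    """Hypothetical toy dynamics: position += action; penalize distance to goal and effort."""
    pos = start
    total = 0.0
    for a in actions:
        pos += a
        total += (pos - goal) ** 2 + 0.01 * a * a
    return total


# Plan a 5-step action sequence for the toy problem.
plan = cem_minimize(rollout_cost, dim=5)
print(plan, rollout_cost(plan))
```

In an MPC loop, only the first action `plan[0]` would be executed before replanning from the new state; a learned world model would take the place of the hand-written `rollout_cost`.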

Research Projects

Selected work in reinforcement learning, optimization, and LLM systems.

World Models · Reinforcement Learning
Research prototype
GitHub →

Geometry Dash World-Model Agent (DreamerV3-style)

A DreamerV3-style agent for a 60 Hz physics-driven game environment with tight failure constraints, built with a custom Gymnasium stack, Windows↔WSL synchronization, and high-frequency logging for reproducible evaluation.

JAX · DreamerV3-style · Gymnasium · Windows↔WSL bridge · High-frequency logger
Environment: Custom Gymnasium env + reproducible evaluation harness.
Systems: Windows↔WSL bridge to sync observations/state and actions.
Debugging: High-frequency trajectories for offline analysis and sanity checks.
Optimization · TDA
GitHub →

TopoAdamW: TDA-Guided Meta-Optimizer

A PyTorch optimizer that uses GUDHI-based TDA features to probe local loss-landscape geometry (sharp vs. flat regions) and adapt update behavior with stability safeguards.

Method: Topological feature extraction
Benchmark: CIFAR-10 vs AdamW
PyTorch · GUDHI · Loss landscape
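
As a rough illustration of the kind of topological signal that can separate shallow basins from deep ones, the sketch below computes 0-dimensional sublevel-set persistence for a 1D slice of loss values. It is not TopoAdamW's actual feature extractor: it uses a small union-find in plain Python rather than GUDHI, and the function name and the sample `loss_slice` values are made up for the example.

```python
def sublevel_persistence_1d(values):
    """0-dimensional sublevel-set persistence of a 1D sequence.

    Returns (birth, death) pairs for local minima that merge into a deeper basin.
    The global minimum never dies and is not reported.
    """
    n = len(values)
    parent = [None] * n  # None means "not yet added to the filtration"
    birth = {}           # component root -> value at which the component was born

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Add points in order of increasing value (the sublevel filtration).
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the higher minimum dies here.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    return pairs


# Hypothetical loss values sampled along one direction in parameter space.
loss_slice = [3.0, 1.0, 2.5, 0.2, 2.0, 1.8, 4.0]
pairs = sublevel_persistence_1d(loss_slice)
print(pairs)  # [(1.8, 2.0), (1.0, 2.5)]
```

Here the minimum at 1.8 has persistence 0.2 (a shallow dip), while the one at 1.0 has persistence 1.5 (a deep basin). An optimizer could, in principle, use such gaps as one input when deciding how cautiously to step.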
LLM Systems

Efficient LLM Fine-Tuning

Memory-efficient fine-tuning pipelines for Dream-7B and GPT-OSS-20B using QLoRA, gradient checkpointing, and DeepSpeed optimizations.

Math Reasoning: +20%
VRAM Reduction: 60%
DeepSpeed · QLoRA · 4-bit · RunPod
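
For intuition on why 4-bit quantization matters, the back-of-the-envelope sketch below estimates weight-only memory at 16-bit and 4-bit precision. The parameter count of 7e9 is an assumption based on the "7B" in the model name, and the helper `weight_memory_gib` is hypothetical. The estimate deliberately ignores activations, LoRA adapters, optimizer state, and quantization constants, so it does not reproduce the measured figures above.

```python
def weight_memory_gib(n_params, bits_per_param):
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


N_PARAMS = 7e9  # assumed nominal size of a "7B" model

fp16_gib = weight_memory_gib(N_PARAMS, 16)
four_bit_gib = weight_memory_gib(N_PARAMS, 4)
weight_saving = 1 - four_bit_gib / fp16_gib

print(f"16-bit: {fp16_gib:.1f} GiB, 4-bit: {four_bit_gib:.1f} GiB, saving: {weight_saving:.0%}")
```

The weights alone shrink by 75%, but the parts of the footprint that stay in higher precision are not reduced, which is one reason an end-to-end VRAM figure can be lower than the weight-only figure.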

Get in Touch

Open to research collaborations, internship opportunities, and interesting projects. Feel free to reach out if you'd like to discuss ML research or mathematics.

[email protected]
Waterloo, Ontario, Canada