Matt McManus

Quantitative Researcher

Side Projects

Torch-Velocity

Jan 2025

An implementation of speculative decoding with adaptive lookahead mechanisms for LLM inference optimization, achieving 1.5-2.5x speedups over standard autoregressive decoding on transformer-based models. A minimal sketch of the core idea follows.
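The sketch below illustrates greedy speculative decoding with an adaptive draft length, not the project's actual API: a small draft model proposes k tokens, the target model verifies them in a single forward pass, and k grows or shrinks with the acceptance rate. It assumes an HF-style model(input_ids).logits interface; all function and parameter names (speculative_generate, k_init, k_max, etc.) are illustrative placeholders.

```python
# Illustrative sketch only -- assumes models return an object with a .logits
# tensor of shape (batch, seq_len, vocab), as Hugging Face causal LMs do.
import torch

@torch.no_grad()
def speculative_generate(target_model, draft_model, input_ids,
                         max_new_tokens=64, k_init=4, k_min=1, k_max=8):
    """Greedy speculative decoding with an adaptive lookahead window k."""
    ids = input_ids
    k = k_init
    generated = 0
    while generated < max_new_tokens:
        prefix_len = ids.shape[1]

        # 1. Draft k tokens autoregressively with the cheap model.
        draft_ids = ids
        for _ in range(k):
            logits = draft_model(draft_ids).logits[:, -1, :]
            next_tok = logits.argmax(dim=-1, keepdim=True)
            draft_ids = torch.cat([draft_ids, next_tok], dim=-1)
        proposed = draft_ids[:, prefix_len:]                      # (1, k)

        # 2. Verify all k proposals with a single target-model forward pass.
        target_logits = target_model(draft_ids).logits
        # Target's greedy prediction at each position preceding a proposed token.
        target_pred = target_logits[:, prefix_len - 1:-1, :].argmax(dim=-1)  # (1, k)

        # 3. Accept the longest prefix on which draft and target agree.
        matches = (target_pred == proposed).squeeze(0)
        n_accept = int(matches.cumprod(dim=0).sum().item())

        # 4. Append accepted tokens plus one token from the target model:
        #    its choice at the first disagreement, or a bonus token if all matched.
        accepted = proposed[:, :n_accept]
        correction = target_logits[:, prefix_len - 1 + n_accept, :].argmax(
            dim=-1, keepdim=True)
        ids = torch.cat([ids, accepted, correction], dim=-1)
        generated += n_accept + 1

        # 5. Adaptive lookahead: grow k after a full acceptance, shrink otherwise.
        k = min(k + 1, k_max) if n_accept == k else max(k - 1, k_min)
    return ids
```

Because verification is a single batched forward pass over the drafted tokens, each loop iteration costs roughly one target-model call while emitting n_accept + 1 tokens, which is where the speedup comes from when the draft model agrees with the target often.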