Torch-Velocity
Jan 2025
An implementation of speculative decoding with adaptive lookahead mechanisms for LLM inference optimization, achieving 1.5-2.5x speedups on transformer-based models.
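The core idea can be sketched as follows. This is a minimal illustration, not the Torch-Velocity implementation: it uses toy deterministic stand-ins for the draft and target models, and the adaptive-lookahead rule (grow the window on full acceptance, shrink it on rejection) is a hypothetical simplification of whatever adaptation the project actually uses. A draft model proposes a block of tokens, the target model verifies them, and the longest agreeing prefix is accepted, so the output always matches the target model's greedy decode while amortizing target calls over accepted blocks.

```python
def target_next(ctx):
    # Toy stand-in for the large target model's greedy next token.
    return sum(ctx) % 7

def draft_next(ctx):
    # Toy stand-in for the cheap draft model; agrees with the target
    # only sometimes, so rejections actually occur.
    return target_next(ctx) if ctx[-1] % 2 else (target_next(ctx) + 1) % 7

def speculative_decode(prompt, n_tokens, k=4):
    """Speculative decoding with a simple adaptive lookahead window k."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1. Draft model proposes k tokens greedily.
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. Target model verifies; accept the longest agreeing prefix.
        ctx, accepted = list(out), 0
        for t in proposal:
            if target_next(ctx) != t:
                break
            ctx.append(t)
            accepted += 1
        out = ctx
        # 3. Emit one guaranteed-correct token from the target
        #    (the correction on mismatch, or a bonus token on full accept).
        out.append(target_next(out))
        # 4. Hypothetical adaptive lookahead: widen the window when the
        #    draft is being accepted, narrow it when it is rejected early.
        k = min(k + 1, 8) if accepted == k else max(k - 1, 1)
    return out[len(prompt):][:n_tokens]
```

Because every accepted token is checked against `target_next`, the output is token-for-token identical to decoding with the target model alone; the speedup comes from verifying a whole proposed block per target pass instead of one token at a time.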
Quantitative Researcher