projects

LLM infrastructure, CUDA kernels, and security research.

LLM Infrastructure

tiny-trtllm

Minimal C++ implementation of TensorRT-LLM's core architecture (~3,000 lines)

FP4 Attention Kernel

Porting SageAttention3's FP4 attention from SM120 (RTX 5090) to SM100 (B200)

LLM Production Deployment

End-to-end LLM serving infrastructure with Kubernetes and Kong Gateway

AI Security

MCPSec

Formal verification and security analysis of the Model Context Protocol