LLM Infrastructure

  tiny-trtllm
  Minimal C++ implementation of TensorRT-LLM's core architecture (~3,000 lines)

  FP4 Attention Kernel
  Porting SageAttention3's FP4 attention from SM120 (B300) to SM100 (B200)

  LLM Production Deployment
  End-to-end LLM serving infrastructure with Kubernetes and Kong Gateway

AI Security

  MCPSec
  Formal verification and security analysis of the Model Context Protocol