6. Training & Inference
Overview
Notes on efficient training and inference: batching, KV cache, quantization, distillation, serving patterns.