🤗 Hugging Face 6d ago
KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs
KV Packet enables context-independent KV cache reuse without recomputation by wrapping cached documents in trainable soft-token adapters. Unlike CacheBlend or SAM-KV, which still require selective recomputation, KV Packet treats caches as immutable packets and uses self-supervised distillation to bridge context discontinuities with zero FLOPs of overhead.
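
To make the "immutable packet" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the `KVPacket` class, the `header`/`trailer` split of the adapter entries, and the `assemble_context` helper are hypothetical names chosen for illustration, and the vectors are toy stand-ins for per-layer key/value tensors. The sketch only shows that packets can be concatenated in any order without touching the cached document entries.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One cached position: a (key, value) pair of float vectors (toy stand-in for tensors).
KVEntry = Tuple[Tuple[float, ...], Tuple[float, ...]]


@dataclass(frozen=True)
class KVPacket:
    """Hypothetical container: document KV wrapped in soft-token adapter KV."""
    header: Tuple[KVEntry, ...]   # assumed: adapter entries placed before the document
    body: Tuple[KVEntry, ...]     # document KV, computed once in isolation, never modified
    trailer: Tuple[KVEntry, ...]  # assumed: adapter entries placed after the document

    def entries(self) -> Tuple[KVEntry, ...]:
        # Tuple concatenation builds a new tuple but reuses the same entry objects.
        return self.header + self.body + self.trailer


def assemble_context(packets: List[KVPacket]) -> Tuple[KVEntry, ...]:
    """Concatenate packets in the requested order; no entry is recomputed."""
    cache: Tuple[KVEntry, ...] = ()
    for packet in packets:
        cache += packet.entries()
    return cache
```

In this sketch, the same two packets can be assembled as `[a, b]` or `[b, a]`, and the document entries in the resulting cache are the very objects stored in each packet. In the method described above, the adapter entries would come from soft tokens trained by self-supervised distillation; that training step is outside the scope of this example.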