\section{Conclusion}
\label{sec:conclusion}

We presented HubKV, a redundancy-aware score refinement layer for KV cache compression. HubKV takes an ideal submodular local coverage objective as its theoretical guide but deploys only a hardware-parallel proxy: it detects local score hubs, softly discounts neighboring non-hubs, calibrates scores by bounded head selectivity, and gates the correction by compression ratio. On Qwen3-8B at 95\% prefill compression, HubKV improves FastKVZip by 3.23 points and KVZip by 6.53 points on average; averaged across budgets, the gains are 1.71 and 1.87 points, respectively. Decoding-stage experiments show consistent improvements on AIME25 and MATH.

\section{Limitations}
\label{sec:limitations}

HubKV is most useful under aggressive KV compression, where retaining locally redundant tokens is particularly costly. Its benefit depends on the quality of the base importance scorer and on the assumption that nearby high-scoring tokens often carry overlapping information. When adjacent tokens instead encode complementary evidence, as in formulas, code fragments, or tightly coupled reasoning chains, local discounting may occasionally down-weight tokens that should be retained. HubKV should therefore be viewed as a bounded, parallel ranking proxy rather than a submodular optimizer with global guarantees.
