ShadowKV: A High-Throughput Inference System for Long-Context LLMs
Large language models (LLMs) have scaled to handle increasingly long contexts. As they see large-scale deployment, demand has grown for efficient…