Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
8 points by sarkory on April 19, 2025

Show HN: KTransformers:671B DeepSeek-R1 on a Single Machine-286 tokens/s Prefill (github.com/kvcache-ai)
14 points by sssummer on Feb 10, 2025

Show HN: KTransformers–236B Model and 1M Context LLM Inference on Local Machines (github.com/kvcache-ai)
20 points by sssummer on Aug 29, 2024 | 3 comments

Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
13 points by zinccat on June 29, 2024