Experience Enhancements

Platform for AI (PAI) - KV Store Global Context Cache Release

When you deploy an LLM service in Elastic Algorithm Service (EAS), you can configure the KV Store global context cache to improve LLM inference throughput.
Content

Optimization: When you deploy an LLM service, EAS supports configuring the KV Store global context cache. Lookups traverse multiple storage tiers (GPU → host memory → Redis KV metadata), which improves the KV-cache hit rate and LLM inference throughput.
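The multi-tier lookup described above can be sketched as follows. This is an illustrative sketch only, not the EAS implementation: the class names and the `RedisStub` placeholder are assumptions, and a real deployment would query an actual Redis instance (e.g. via a Redis client library) for the shared KV metadata.

```python
class RedisStub:
    """Stands in for a Redis client holding shared KV-cache metadata.
    (Assumption: a real deployment would use an actual Redis connection.)"""
    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class TieredKVCache:
    """Checks tiers from fastest to slowest and promotes hits upward."""
    def __init__(self, redis_client):
        self.gpu = {}              # fastest tier: KV blocks resident on the GPU
        self.host = {}             # middle tier: KV blocks in host memory
        self.redis = redis_client  # slowest tier: shared metadata store

    def get(self, prefix_hash):
        if prefix_hash in self.gpu:
            return self.gpu[prefix_hash]
        if prefix_hash in self.host:
            block = self.host[prefix_hash]
            self.gpu[prefix_hash] = block   # promote to the GPU tier
            return block
        block = self.redis.get(prefix_hash)
        if block is not None:
            self.host[prefix_hash] = block  # promote through both tiers
            self.gpu[prefix_hash] = block
        return block                         # None on a miss in all tiers

    def put(self, prefix_hash, block):
        # Write through all tiers so other service replicas can reuse the prefix.
        self.gpu[prefix_hash] = block
        self.host[prefix_hash] = block
        self.redis.set(prefix_hash, block)


cache = TieredKVCache(RedisStub())
cache.put("prompt-prefix-1", b"kv-block-bytes")
print(cache.get("prompt-prefix-1"))  # hit in the GPU tier
print(cache.get("unknown-prefix"))   # miss in all tiers -> None
```

The promotion step is why the shared Redis tier raises the overall hit rate: a prefix cached by one replica becomes reusable by every other replica, which then pulls it up into its own faster tiers.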

Help Document

https://alibabacloud.com/help/doc-detail/110985.htm

7th Gen ECS Is Now Available

Instance computing power is increased by up to 40%, and all instances are equipped with TPM chips.
Powered by Third-generation Intel® Xeon® Scalable processors (Ice Lake).

  • Sales Support

    1-on-1 presales consultation

  • After-Sales Support

    24/7 technical support, 6 free tickets per quarter, and faster response times

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.