Service Upgrade

【Model Studio】Qwen 3-Coder-Plus explicit caching feature goes live

Affected Time

2025-09-11 00:00:00 (UTC+08) (subject to actual release time)

Changes

Qwen 3-Coder-Plus introduces more deterministic explicit caching capabilities based on implicit caching, meeting the need to save computing resources in deterministic long-context high-frequency scenarios.

Functional logic differences:

●Implicit caching: No user operation is required. The system automatically determines whether the cache is hit, and tokens that hit the cache enjoy a discounted price.

● Explicit caching: Requires user operation. Users must create cache through an API field during model calls and pay for the created cache tokens. Users specify whether to use the caching feature in subsequent model calls. The system automatically determines whether the cache is hit, and tokens that hit the cache enjoy a discounted price.

● Implicit caching and explicit caching do not share within the same model call, and the discount prices for these two types of caching are different.

Impact

After the update:

The billing model for the Qwen 3-Coder-Plus model has changed.

  • For the Beijing region, this period's newly added explicit cache token types and prices for Qwen 3-Coder-Plus are as follows:

p

  • For the Singapore region, this period's newly added explicit cache token types and prices for Qwen 3-Coder-Plus are as follows:

p

Log in to Model Studio to activate your trial experience. If you have any feedback or suggestions during the trial, please feel free to submit them via a service ticket.