Platform for AI (PAI) - EAS releases distributed inference
Feb 07 2025
Content
Target customers: Customers who use AI inference, model services, or AIGC.
New features/specifications: Ultra-large-scale MoE models such as Qwen-Max and DeepSeek have parameter counts too large for a single device to hold. EAS now offers a multi-machine distributed inference solution that spreads a model across several machines, overcoming single-machine hardware limits so that models with large parameter counts can be deployed and run efficiently. EAS supports multiple parallelism methods, including pipeline parallelism, tensor parallelism, and data parallelism. It is also compatible with high-performance inference engines such as BladeLLM, vLLM, and SGLang.
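To illustrate how the three parallelism methods combine when sizing a deployment, here is a minimal Python sketch. It is not EAS code or an EAS API. The function names, the 600-billion-parameter model, the 80 GiB GPU, and the 90% usable-memory fraction are all hypothetical values chosen for illustration, and the estimate counts model weights only (it ignores KV cache and activations).

```python
import math

# Hypothetical helpers for illustration only; none of these names come from EAS.

def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the model weights alone, in GiB (2 bytes = FP16/BF16)."""
    return num_params * bytes_per_param / 2**30

def gpus_per_replica(tensor_parallel: int, pipeline_parallel: int) -> int:
    """One model replica is split across tensor-parallel x pipeline-parallel GPUs."""
    return tensor_parallel * pipeline_parallel

def total_gpus(tensor_parallel: int, pipeline_parallel: int, data_parallel: int) -> int:
    """Data parallelism adds full replicas, so it multiplies the GPU count."""
    return gpus_per_replica(tensor_parallel, pipeline_parallel) * data_parallel

def min_gpus_for_weights(num_params: float, gpu_memory_gib: float,
                         bytes_per_param: int = 2,
                         usable_fraction: float = 0.9) -> int:
    """Lower bound on GPUs per replica, counting weights only (assumption)."""
    needed = weight_memory_gib(num_params, bytes_per_param)
    return math.ceil(needed / (gpu_memory_gib * usable_fraction))

if __name__ == "__main__":
    # Assumed example: 600e9 parameters in 16-bit precision on 80 GiB GPUs.
    print(min_gpus_for_weights(600e9, gpu_memory_gib=80))
    # One possible layout: tensor parallelism across 8 GPUs inside a machine,
    # pipeline parallelism across 2 machines, and 2 replicas for throughput.
    print(total_gpus(tensor_parallel=8, pipeline_parallel=2, data_parallel=2))
```

In this sketch, tensor and pipeline parallelism determine how many GPUs one copy of the model occupies, while data parallelism only adds copies to serve more requests. A real deployment would also need to budget memory for the KV cache and follow the engine-specific settings described in the help document.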
Help Document
https://www.alibabacloud.com/help/pai/user-guide/multi-machine-distributed-inference