DSW provides a multi-instance interconnection feature that lets you perform distributed development and training across multiple machines and GPUs.
Prerequisites
You must have multiple instances that are created from a general computing resource group or a Lingjun resource group and are located in the same VPC.
The Internet access gateway for the resource group that contains the instances must be set to Dedicated Gateway.
The instances must be in the same cluster. For example, you cannot interconnect a Lingjun instance with a general computing resource instance.
Only some instance types support Remote Direct Memory Access (RDMA) or enhanced RDMA (eRDMA). For more information, see Default variables (pre-configured by the platform) and Limits.
DSW and DLC provide the same features for RDMA/eRDMA. For more information, see the DLC documentation.
Features
DSW provides pre-configured, high-performance network environment variables that are optimized for different resources and network architectures.
For DSW instances created from Lingjun resources, see the pre-configured environment variables described in Default variables (pre-configured by the platform).
For DSW instances created from general computing resources, see the pre-configured environment variables described in Platform-preconfigured environment variables.
On nodes that support RDMA, you can use RDMA/eRDMA for interconnection.
You can interconnect instances using their instance IDs as DNS domain names.

These features allow you to develop and debug distributed tasks across multiple machines and GPUs.
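As an illustration of DNS-based interconnection, the following Python sketch resolves peer instance IDs to IP addresses with the standard library. It assumes that each instance ID is resolvable as a hostname from inside another instance, as described above. The function name `resolve_peers` and its `resolver` parameter are hypothetical helpers introduced here and are not part of DSW.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve_peers(
    instance_ids: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, Optional[str]]:
    """Map each instance ID to its resolved IPv4 address, or None on failure.

    `resolver` defaults to the system DNS lookup; it is a parameter only so
    that the lookup can be replaced (an assumption made for testability).
    """
    resolved: Dict[str, Optional[str]] = {}
    for instance_id in instance_ids:
        try:
            resolved[instance_id] = resolver(instance_id)
        except OSError:
            # socket.gaierror (name resolution failure) is a subclass of OSError.
            resolved[instance_id] = None
    return resolved
```

Calling `resolve_peers([...])` with the instance IDs of your own instances returns a dictionary in which any entry set to `None` marks a peer that could not be resolved. That usually means the prerequisites (same VPC, same cluster, Dedicated Gateway) should be rechecked.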
Procedure
Use the DSW instance cloning feature to start the required number of instances with identical environments.
(Optional) Install the RDMA/eRDMA library on the instances.
For Lingjun resources, use an image that contains RDMA. For more information, see Configure an image.
For general computing resources, follow the instructions in Install the eRDMA library.
From one instance, run the ping command on the instance ID of another instance to test network connectivity. For example: ping dsw-l28wnjdlyzj*********.
Configure and debug the distributed task using your chosen distributed framework.
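For the final configuration step, the sketch below shows one way to assemble a launch command if the chosen framework is PyTorch and the job is started with `torchrun`. This is an assumption: the procedure does not prescribe a framework. The helper `build_torchrun_command`, the script name, and the instance ID used in the usage note are hypothetical. The `torchrun` flags themselves (`--nnodes`, `--node_rank`, `--nproc_per_node`, `--master_addr`, `--master_port`) are standard. The instance ID of the rank-0 instance is passed as the master address, relying on the instance-ID DNS names described earlier.

```python
from typing import List


def build_torchrun_command(
    master_instance_id: str,
    node_rank: int,
    num_nodes: int,
    gpus_per_node: int,
    script: str,
    master_port: int = 29500,
) -> List[str]:
    """Build the torchrun argument list to run on one DSW instance.

    master_instance_id: instance ID of the rank-0 instance, used as a DNS name.
    node_rank: index of the current instance, from 0 to num_nodes - 1.
    """
    if num_nodes < 1 or gpus_per_node < 1:
        raise ValueError("num_nodes and gpus_per_node must be positive")
    if not 0 <= node_rank < num_nodes:
        raise ValueError("node_rank must be in the range [0, num_nodes)")
    return [
        "torchrun",
        f"--nnodes={num_nodes}",
        f"--node_rank={node_rank}",
        f"--nproc_per_node={gpus_per_node}",
        f"--master_addr={master_instance_id}",
        f"--master_port={master_port}",
        script,
    ]
```

Run the resulting command on every instance, changing only `node_rank`. For example, `" ".join(build_torchrun_command("dsw-master-example", 0, 2, 8, "train.py"))` gives the command line for the first of two 8-GPU instances, where `dsw-master-example` stands in for your own rank-0 instance ID.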