DSW sub-containers use Docker-in-Docker (DinD) to create and manage multiple secondary containers within a single DSW instance, which provides environment isolation and resource management. The console provides a visual interface to create, start, stop, and delete sub-containers, and you can access them remotely via SSH.
Use cases
-
Isolate different development environments within the same DSW instance to compare and validate training results across different images.
-
When GPU resources are scarce, switch runtime environments by creating sub-containers with different images instead of rebuilding the instance and queuing for resources again.
-
Let multiple users share a single DSW instance while keeping their environments and resources isolated from each other.
Prerequisites
-
The sub-container feature is supported only on DSW instances created from a Lingjun Resource Group or a version 1.0 general-purpose resource group.
-
The instance administrator (the instance creator or Owner) must have permission to create and manage DSW instances.
-
The DSW instance must have the Enable Multi-Container Isolation (DinD) feature enabled during creation.
-
If you need to access sub-containers via SSH, the SSH feature must also be enabled for the DSW instance.
Roles and permissions
Sub-container management involves two roles:
-
Instance administrator (Owner): The creator of the DSW instance. This role manages instance-level configurations such as enabling DinD, configuring access restrictions, setting the maximum number of sub-containers, and defining data directory isolation rules.
-
Sub-container developer: A user with a corresponding workspace role (such as Algorithm Development) or RAM permissions. This role can view the sub-container list, create sub-containers, and log in to them via SSH. Developers can start, stop, or delete only the sub-containers they created, but can view information about all sub-containers.
Two methods are available for permission control:
-
Workspace role-based control: The instance administrator can restrict specific roles, such as Algorithm Development, from accessing the main container. Users with restricted roles can only perform development work in sub-containers. This method is straightforward and suitable for most scenarios.
-
RAM policy-based control: By configuring a RAM policy, you can achieve finer-grained permission control, down to a specific sub-container of a specific instance. This method is suitable for enterprise customers with strict security requirements.
Administrator operations
Enable multi-container isolation
-
Log on to the PAI console, navigate to the DSW page, and click Create Instance.
-
In the instance configuration settings, find and enable the Enable Multi-Container Isolation (DinD) switch.
-
After you enable this option, the following settings appear:
-
Restrict roles that can access the main container: From the drop-down list, select the workspace roles to restrict. Users with a restricted role cannot directly open or operate the DSW main container and can only perform development work by logging in to a sub-container via SSH.
For example, if you select Algorithm Development, users with this role cannot access the main container. However, they can still create and use their own sub-containers on the Sub-container Management tab of the instance details page.
If you do not select any roles, all authorized users can access the main container as usual, which is the same behavior as without role restrictions.
-
Maximum Number of Sub-containers: Set the maximum number of sub-containers that can be created for the instance. You can enter an integer from 1 to 16. The default value is 10.
-
Data Catalog Fencing Configuration: Specify the path of the data directory isolation configuration file within the main container. The default path is /etc/docker/dockerboard/mount_access.json. This file controls which data directories users can mount when they create sub-containers. You can click Configuration Template Download to download the template file. For configuration steps, see Configure data directory isolation.
-
Complete the remaining instance configurations and click Create Instance.
Configure data directory isolation
Data directory isolation controls which directories developers can mount when creating sub-containers, preventing access to storage data beyond their authorized scope.
Procedure
-
On the instance creation or editing page, click Configuration Template Download under Data Catalog Fencing Configuration to obtain the JSON template file.
-
Follow the template format to map each Alibaba Cloud sub-account UID to the directories within the DSW main container that the user is permitted to access.
-
Place the edited mount_access.json file in the location specified by the configuration path in the DSW main container. The default location is /etc/docker/dockerboard/mount_access.json.
Configuration file format
The configuration file is in JSON format and uses the Alibaba Cloud sub-account UID as the key. For each user, you can configure Allow (allowed mount directories) and Deny (denied mount directories). An example is as follows:
{
  "123124432xxx": {
    "Allow": [
      "/mnt/workspace",
      "/mnt/data1"
    ],
    "Deny": [
      "/mnt/data2"
    ]
  },
  "224321234xxx": {
    "Allow": [
      "*"
    ]
  }
}
The following table describes the fields.
| Field | Description |
| --- | --- |
| Key (for example, 123124432xxx) | The UID of the Alibaba Cloud sub-account. |
| Allow | A list of main container directories that the user is allowed to mount. Set this to ["*"], as in the second entry of the example, to allow all directories. |
| Deny | A list of main container directories that the user is forbidden to mount. Deny takes precedence over Allow. If a directory appears in both lists, the user cannot mount it. This field can be omitted. |
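Before you copy the file into the main container, it can help to check its structure locally. The following Python sketch is one possible way to do that. It is not an official tool: the function name validate_mount_access is hypothetical, and it assumes that an entry may contain only the Allow and Deny fields and that a missing Allow is treated as an empty list.

```python
def validate_mount_access(config):
    """Return a list of problems found in a mount_access.json-style mapping."""
    if not isinstance(config, dict):
        return ["top level must be a JSON object keyed by UID"]
    problems = []
    for uid, rules in config.items():
        if not isinstance(rules, dict):
            problems.append(f"{uid}: value must be an object")
            continue
        for field in ("Allow", "Deny"):
            # Assumption: a missing field is treated as an empty list.
            entries = rules.get(field, [])
            is_string_list = isinstance(entries, list) and all(
                isinstance(entry, str) for entry in entries
            )
            if not is_string_list:
                problems.append(f"{uid}: {field} must be a list of strings")
        # Assumption: fields other than Allow and Deny are reported as mistakes.
        unknown = set(rules) - {"Allow", "Deny"}
        if unknown:
            problems.append(f"{uid}: unexpected fields {sorted(unknown)}")
    return problems


# The placeholder UIDs and directories from the example in this topic.
example = {
    "123124432xxx": {
        "Allow": ["/mnt/workspace", "/mnt/data1"],
        "Deny": ["/mnt/data2"],
    },
    "224321234xxx": {"Allow": ["*"]},
}
```

To check a real file, load it with json.load and pass the result to validate_mount_access. An empty list means that no structural problems were found.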
Configuration application rules
-
If the configuration file exists and is valid, the system checks whether the user has permission to mount the selected directories when creating a sub-container.
-
If no record for the user is found in the configuration file, the user cannot mount any data directories when creating a sub-container. The page displays the message: "Your account does not have any available DSW paths. If you need to mount a path, contact the instance administrator to add a configuration for you."
-
If the configuration file path is empty or the file does not exist, the system does not perform directory isolation checks, and all users can mount any directory from the main container.
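The rules above can be modeled in a few lines of Python, which is useful for previewing the effect of a configuration before you deploy it. This is a sketch of the documented rules, not the system's actual implementation. The function names are hypothetical, and the matching logic assumes that a listed directory also covers its subdirectories and that "*" matches every directory, neither of which is stated in this topic.

```python
import json
from pathlib import Path, PurePosixPath

DEFAULT_CONFIG = "/etc/docker/dockerboard/mount_access.json"


def _covers(rule, directory):
    """Assumption: a rule matches the directory itself and anything below it."""
    if rule == "*":
        return True
    rule_path = PurePosixPath(rule)
    dir_path = PurePosixPath(directory)
    return dir_path == rule_path or rule_path in dir_path.parents


def can_mount(config, uid, directory):
    """Apply the documented rules to a single mount request."""
    if config is None:
        # Empty path or missing file: isolation checks are skipped.
        return True
    rules = config.get(uid)
    if rules is None:
        # No record for this user: no directory can be mounted.
        return False
    if any(_covers(rule, directory) for rule in rules.get("Deny", [])):
        # Deny takes precedence over Allow.
        return False
    return any(_covers(rule, directory) for rule in rules.get("Allow", []))


def load_config(path=DEFAULT_CONFIG):
    """Return the parsed file, or None when the path is empty or missing."""
    if not path or not Path(path).is_file():
        return None
    # Invalid JSON raises an error; this topic does not describe that case.
    return json.loads(Path(path).read_text())
```

For example, with the sample configuration in this topic, can_mount returns False for user 123124432xxx and /mnt/data2 because the directory is listed under Deny.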
Developer operations
Access the management page
-
Log on to the PAI console and go to the DSW page.
-
In the instance list, click the name of the target instance with DinD enabled to go to the instance details page.
-
On the instance details page, click the Sub-container Management tab.
The Sub-container Management tab is visible only when the DSW main container is in the Running state.
Create a sub-container
On the Sub-container Management page, click Create Sub-container and complete the following parameters in the creation form.
Basic information
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| Container name | Yes | Specify an easy-to-recognize name for the sub-container. | |
| Hostname | No | Specify a custom hostname for the sub-container. | |
| Restart policy | No | Set the automatic restart policy to apply when the container exits. | Do not restart automatically |
Environment information
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| Image configuration | Yes | Select the image for the sub-container. Two methods are supported. | |
| SSH public key | Yes | Enter your local client's SSH public key. You will use this key to log in to the sub-container via SSH. | If you have not generated an SSH key pair, generate one first (for example, with ssh-keygen). |
| Main container mounts | No | For instances with a persistent system disk, a sub-container can mount any path from the main container. For instances without a persistent system disk, a sub-container can only mount storage paths and datasets already mounted by the main container. | |
| Dataset mounts | No | You can only mount datasets that are already mounted by the main container. | |
| Environment variables | No | Set custom environment variables for the sub-container in key-value format. | |
| Startup command | No | Specify a custom command to execute when the container starts. | |
| Entrypoint | No | Specify a custom entrypoint for the container. | |
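If you do not yet have a key pair for the SSH public key parameter, the following Python sketch shows one way to create it with the standard OpenSSH ssh-keygen tool and print the public key to paste into the form. The helper names, the RSA 4096-bit key type, and the ~/.ssh/id_rsa location are assumptions chosen for illustration. The console may accept other key types.

```python
import subprocess
from pathlib import Path


def keygen_command(key_path, comment=""):
    """Build an ssh-keygen invocation for a 4096-bit RSA key pair."""
    return ["ssh-keygen", "-t", "rsa", "-b", "4096",
            "-f", str(key_path), "-C", comment]


def ensure_public_key(key_path="~/.ssh/id_rsa"):
    """Generate the key pair if it is missing, then return the public key."""
    key_path = Path(key_path).expanduser()
    pub_path = key_path.with_name(key_path.name + ".pub")
    if not pub_path.exists():
        key_path.parent.mkdir(parents=True, exist_ok=True)
        # ssh-keygen prompts interactively for a passphrase.
        subprocess.run(keygen_command(key_path), check=True)
    return pub_path.read_text().strip()


if __name__ == "__main__":
    print(ensure_public_key())
```

An existing key pair is never overwritten: if the .pub file is already present, the function only reads it.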
Resource information
| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| GPU device mounts | No | If the DSW instance is equipped with GPU resources, you can select which GPU devices to mount when creating a sub-container. The system lists all available GPU cards and their device numbers. You can select multiple devices. The selected GPU devices are mounted into the sub-container for use. | |
| CPU limit | No | Set the maximum CPU usage for the sub-container. If left blank, no limit is set, and the sub-container may consume other idle resources in the instance. | 8 cores |
| Memory limit | No | Set the maximum memory usage for the sub-container. If left blank, no limit is set, and the sub-container may consume other idle resources in the instance. | 48 GiB |
After you complete the configuration, click Confirm to create the sub-container. You can view the creation progress by clicking Child Container Creation Progress on the Sub-container Management page.
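A common cause of failed logins later on is a malformed SSH public key in the form. As a rough pre-flight check, the following Python sketch verifies the three required parameters and tests whether the key is in the OpenSSH public key format, in which the Base64 data begins with a length-prefixed copy of the key type. The field names container_name, image, and ssh_public_key are hypothetical and only model the form. They are not an API of the console.

```python
import base64
import binascii

# Hypothetical names for the three required form parameters.
REQUIRED = ("container_name", "image", "ssh_public_key")


def looks_like_openssh_public_key(text):
    """Check for the '<type> <base64> [comment]' OpenSSH public key layout."""
    parts = text.strip().split()
    if len(parts) < 2:
        return False
    key_type, blob = parts[0], parts[1]
    try:
        raw = base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        return False
    # The decoded data starts with a 4-byte length followed by the key type.
    name_len = int.from_bytes(raw[:4], "big")
    return raw[4:4 + name_len] == key_type.encode()


def check_form(form):
    """Return a list of problems in a dict that models the creation form."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED
        if not form.get(name)
    ]
    key = form.get("ssh_public_key")
    if key and not looks_like_openssh_public_key(key):
        problems.append("ssh_public_key is not in OpenSSH public key format")
    return problems
```

The check only inspects the key's layout. It does not prove that the key matches a private key on your machine.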
Manage sub-containers
You can manage only the sub-containers that you created.
In the sub-container list, the actions column provides the following operations:
-
Start: Starts a sub-container that is in the stopped state.
-
Stop: Stops a running sub-container. Processes inside the container are terminated, but the container configuration and mounted storage data are preserved.
-
Restart: Restarts a running sub-container. This is equivalent to stopping and then starting it.
-
Log: View the sub-container's running logs for troubleshooting.
-
SSH Connection Information: View the SSH connection details for the sub-container, including the connection username and command.
-
Delete: Deletes the sub-container. Data not persisted through a storage mount is permanently lost. Proceed with caution.
Sub-container list
After you go to the Sub-container Management page, a list of all sub-containers on the current instance is displayed, showing the following information:
| Field | Description |
| --- | --- |
| Container name | The name of the sub-container. |
| User name | The username of the user who created the sub-container. |
| Status | The current status, including: Creating (pulling image), Created, Running, Paused, Restarting, Stopped, Starting, Stopping, and Updating. |
| Video memory (GiB) | The GPU device number and video memory usage. |
| CPU resources | The CPU utilization and core count. |
| Memory (GiB) | The memory usage. |
| Container image | The image used by the sub-container. |
| Creation time | The time when the sub-container was created. |
All users with read permissions on the DSW instance can view the complete sub-container list, but can only manage (start, stop, or delete) the sub-containers they created.
Creation task list
Creating a sub-container can be time-consuming, especially when pulling large images. The system provides a way to track creation tasks.
In the upper-right corner of the Sub-container Management page, click Child Container Creation Progress to view the status and logs of all container creation tasks. You can manually stop tasks that are in the "Creating" state. Task records are automatically deleted after 3 days.
Connect via SSH
After a sub-container is created and running, you can connect to it remotely via SSH.
Procedure
-
In the sub-container list, find the target sub-container and click SSH Connection Information in the actions column. A dialog opens, showing the connection username and the full connection command.
-
Run the connection command in your local terminal to log in to the sub-container. The command format is as follows:
ssh <connection_username>@<instance_public_address> -p 22 -i <local_private_key_path>
The following table describes the parameters.
| Parameter | Description |
| --- | --- |
| Connection username | The username is automatically generated and bound to a specific sub-container. It is displayed in the SSH Connection Information dialog. |
| Instance public address | The public IP address of the DSW main container. |
| Local private key path | The path to the local private key file that corresponds to the SSH public key you provided when creating the sub-container (for example, ~/.ssh/id_rsa). |
After a successful SSH login, you are connected directly to the sub-container's command-line environment. All sub-containers share port 22 of the DSW main container, but access is isolated through unique SSH keys and dedicated usernames.
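Because the generated usernames are long and specific to each sub-container, you may prefer to store the connection details in your local OpenSSH client configuration instead of retyping the command. The following Python sketch prints a configuration entry built from the values shown in the SSH Connection Information dialog. The alias, username, and address below are placeholders, and the function name is hypothetical.

```python
def ssh_config_entry(alias, username, address, private_key, port=22):
    """Return an OpenSSH client config entry for one sub-container."""
    lines = [
        f"Host {alias}",
        f"    HostName {address}",
        f"    User {username}",
        f"    Port {port}",
        f"    IdentityFile {private_key}",
        # Offer only this key, so other keys in an agent are not tried first.
        "    IdentitiesOnly yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: copy the real ones from the dialog.
    print(ssh_config_entry(
        alias="dsw-sub",
        username="<connection_username>",
        address="<instance_public_address>",
        private_key="~/.ssh/id_rsa",
    ))
```

After you append the output to ~/.ssh/config, running ssh dsw-sub is equivalent to the full connection command.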
Limitations and considerations
-
Dependency on main container status: All sub-container management capabilities depend on the running state of the DSW main container. If the main container is stopped or encounters an error, the Sub-container Management page becomes unavailable, and sub-containers cannot be accessed.
-
Storage mount scope: Sub-containers can only mount directories from within the main container. They cannot directly mount external storage.
-
Sub-container limit: The number of sub-containers per instance is limited by the administrator-configured maximum, which can be up to 16.
-
Feature scope: Sub-containers do not provide the full development capabilities of a DSW main container, such as built-in applications like JupyterLab, Terminal, or WebIDE. They are accessed via SSH.
-
Data persistence: When a sub-container is deleted, any data not saved to a mounted directory is permanently lost. Store important data in mounted directories.
-
Effect of instance deletion: When a DSW instance is deleted, all its sub-containers are also deleted. Back up your data before deleting an instance.
FAQ
Q: Why is the Sub-container Management tab missing?
A: Please check the following:
-
The DSW instance was created with the Enable Multi-Container Isolation (DinD) feature enabled.
-
The DSW main container is currently in the Running state. The Sub-container Management tab is not displayed when the main container is not running.
Q: Why can't I mount a data directory?
A: This may be because the instance administrator has configured a data directory isolation policy and your Alibaba Cloud account has not been assigned any available directories in the configuration file. Contact the instance administrator to add a configuration for your account in the mount_access.json file.
Q: Why is my SSH connection failing?
A: Please check the following in order:
-
Ensure the sub-container is in the Running state.
-
Ensure the local private key you are using matches the public key you provided when creating the sub-container.
-
Ensure the SSH feature is enabled for the DSW instance.
-
Ensure your network allows access to the instance's public IP address on port 22.
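For the last item, you can test reachability from your local machine before you debug keys or usernames. The following Python sketch attempts a plain TCP connection to port 22. It is a generic network check, not a DSW-specific tool, and the function name port_open is hypothetical.

```python
import socket


def port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

If port_open returns False for the instance's public IP address, the problem is at the network level (for example, a firewall or security rule) and not in your SSH key or username.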
Q: Why is sub-container creation slow?
A: The creation time primarily depends on image pull speed. Pulling a large image for the first time can take a while. You can check the progress and logs in the creation task list. Subsequent containers using the same image are created much faster because the image is cached locally.
Q: Can I manage another user's sub-containers?
A: By default, you can only manage sub-containers that you created, but you can view basic information for all sub-containers. To manage sub-containers created by others, an administrator must grant you the necessary permissions through a RAM policy.