Elastic GPU Service: Create a GPU-accelerated instance

Last Updated: Jul 08, 2024

GPU-accelerated instances provide strong computing and graphics processing capabilities in large-scale parallel computing and graphics rendering scenarios. You can use GPU-accelerated instances to improve computing performance for your business and meet requirements in professional graphics design scenarios. This topic describes how to create a GPU-accelerated instance.

Procedure

  1. Go to the instance buy page in the Elastic Compute Service (ECS) console.

  2. Click the Custom Launch tab.

  3. Configure parameters for the instance based on your business requirements. The parameters include Billing Method, Region, Network and Zone, Instance Type, and Image.

    For more information about the parameters, see Parameter settings.

  4. Before you place the order, confirm that the overall settings of the instance meet your business requirements and specify items such as the usage duration.

    The operations to configure the usage duration of an instance vary based on the billing method of the instance.

    • Pay-as-you-go or preemptible instance: Specify an automatic release time for the instance. You can also manually release the instance or specify an automatic release time for the instance after the instance is created. For more information, see Release instances.

    • Subscription instance: Specify the usage duration and whether to enable auto-renewal. You can also manually renew the instance or enable auto-renewal for the instance after the instance is created. For more information, see Renewal overview.

  5. Read ECS Terms of Service and Product Terms of Service. If you agree to the terms, select the check box to the left of the terms.

  6. Click Create Order.

  7. On the payment page, check and confirm the total fees of the instance. Then, follow the instructions to complete the payment.

Parameter settings

Billing Method

The billing method determines how an instance is charged. The rules that govern resource status changes also vary by billing method.

Billing Method

Description

References

Subscription

A billing method that requires you to pay for resources before you use them.

Subscription

Pay-as-you-go

A billing method that allows you to pay for resources after you use them. The billing cycles of pay-as-you-go instances are accurate to the second. You can purchase and release instances on demand.

Note

To reduce costs, we recommend that you use the pay-as-you-go billing method together with savings plans and reserved instances.

Preemptible Instance

A billing method that allows you to pay for resources after you use them. You bid for unused ECS capacity to create preemptible instances at a discount relative to pay-as-you-go instances. Preemptible instances may be automatically released due to fluctuations in market price or insufficient resources for specific instance types.

Preemptible Instance

Region

Regions are geographical locations where Alibaba Cloud data centers are deployed. Select a region that is close to your geographical location to reduce latency. After an instance is created, the region of the instance cannot be changed. For more information, see Regions and Zones.

Network and Zone

We recommend that you specify a virtual private cloud (VPC). VPCs are logically isolated from one another, ensure enhanced security, and support features such as Elastic IP Address (EIP), IPv6, and Elastic Network Interface (ENI).

A region consists of multiple isolated locations known as zones. A zone is a physical area with an independent network and power supply. Resources that are deployed within the same zone share the network and have minimal latency between one another, so services deployed within the same zone can communicate faster.

Network type

Description

References

VPC

A VPC is an isolated network dedicated for your use. You have full control over your VPC. For example, you can specify a private CIDR block and configure route tables and gateways for the VPC.

If you did not create a VPC in the selected region, skip this step. The system automatically creates a default VPC and vSwitch in the region.

Select an existing VPC and an existing vSwitch. Alternatively, click Create VPC and Create vSwitch to create a VPC and a vSwitch in the VPC console. After the VPC and vSwitch are created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of VPCs and vSwitches.

Note

If you want to assign an IPv6 address to the instance, select a VPC and a vSwitch that have an IPv6 CIDR block enabled.

Classic Network

Instances in the classic network are deployed in the public infrastructure of Alibaba Cloud, and are planned and managed by Alibaba Cloud.

Note

If you purchase your first ECS instance after 12:00 on June 16, 2016 (UTC+8), you cannot select the classic network.

Network types

Instances & Images

An instance type and an image determine the basic attributes of an instance, such as the vCPUs, memory, and OS.

Instance Type

Available instance types vary based on the selected region. You can go to the Instance Types Available for Each Region page to view the instance types available in each region.

You may have specific configuration requirements for the instance. For example, you may want the instance to have multiple ENIs bound, use Enterprise SSDs (ESSDs), or use local disks. In this case, make sure that the selected instance type meets your business requirements. For information about the features, supported scenarios, and specifications of instance types, see Overview of instance families.

If you set Billing Method to Preemptible Instance, configure the Instance Usage Duration and Highest Price per Instance parameters.

  • Instance Usage Duration specifies the protection period of a preemptible instance. After the protection period ends, the instance may be released due to insufficient resources or a lower bid than the market price.

    Option

    Description

    1 Hour

    After a preemptible instance is created, a 1-hour protection period starts. During the protection period, the instance cannot be automatically released.

    None

    A preemptible instance is created without a protection period. Preemptible instances without a protection period are more cost-effective than preemptible instances with a protection period.

  • Highest Price per Instance

    Option

    Description

    Use Automatic Bid

    The real-time market price of an instance type is automatically used. The price may vary but cannot exceed the pay-as-you-go price of the instance type. Automatic bidding can prevent the preemptible instance from being released due to lower bids than the market price, but cannot prevent the instance from being released due to insufficient resources.

    Set Maximum Price

    Specify a maximum price. If the real-time market price exceeds the maximum price or if resources are insufficient, the preemptible instance is released.
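The release conditions described for Instance Usage Duration and Highest Price per Instance can be sketched as a small decision function. This is only an illustrative model of the rules stated above, not the actual scheduling logic of the service. The `SpotState` class, its field names, and the `may_be_released` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpotState:
    """Illustrative snapshot of a preemptible instance (all names are hypothetical)."""
    in_protection_period: bool   # True while the 1-hour protection period is active
    resources_sufficient: bool   # False when capacity for the instance type runs out
    market_price: float          # current market price of the instance type
    max_price: Optional[float]   # None models the Use Automatic Bid option


def may_be_released(state: SpotState) -> bool:
    """Return True if the instance may be automatically released."""
    # During the protection period, the instance cannot be automatically released.
    if state.in_protection_period:
        return False
    # Insufficient resources can trigger a release under either bidding option.
    if not state.resources_sufficient:
        return True
    # Automatic bidding follows the market price, so price alone never triggers a release.
    if state.max_price is None:
        return False
    # With Set Maximum Price, a market price above the maximum triggers a release.
    return state.market_price > state.max_price
```

For example, `may_be_released(SpotState(False, True, 2.0, 1.0))` returns `True` because the market price exceeds the maximum price after the protection period has ended.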

After you configure the instance specifications, you can confirm them by checking the value of the Selected parameter.

Image

Images contain the information required to run instances. Alibaba Cloud provides a variety of image sources for easy access to images. The following table describes the image sources.

Image source

Description

Public image

Public images are fully licensed base images provided by Alibaba Cloud. The images include Windows Server OS images and mainstream Linux OS images.

Custom image

You can create a custom image or import an image as a custom image. Custom images contain initial system environments, application environments, and software configurations. This eliminates the need for repetitive manual configurations.

Shared image

Shared images are custom images shared by other Alibaba Cloud accounts. You can use the images shared with you to create instances.

Alibaba Cloud Marketplace image

Alibaba Cloud Marketplace images are strictly reviewed and cover a wide range of types. You can use these images to quickly deploy cloud servers for scenarios such as website construction and application development.

Community image

Community images are publicly available. Custom images can be published as community images and used by other users in the Alibaba Cloud community.

When you select an image, you can select Auto-install GPU Driver based on your business requirements to automatically install an NVIDIA Tesla driver. Alternatively, you can select a free image that has an NVIDIA GRID driver pre-installed so that the driver is loaded when the instance is created. For more information, see Automatically install or load a Tesla driver when you create a GPU-accelerated instance and Load a GRID driver by using a community image pre-installed with the driver.

Storage

Instances provide storage capabilities based on the system disks, data disks, and Apsara File Storage NAS (NAS) file systems that are attached to the instances. ECS provides cloud disks and local disks to meet the storage requirements of different scenarios.

  • Cloud disks include ESSDs, standard SSDs, and ultra disks and can be used as system disks or data disks. For more information, see Disks.

    Note

    The billing method of a cloud disk that is created along with an instance is the same as that of the instance.

  • Local disks can be used only as data disks. If an instance type, such as an instance type of an instance family with local SSDs or a big data instance family, is equipped with local disks, information about the local disks is displayed. For more information, see Local disks.

    Note

    You cannot manually attach local disks to instances.

System Disk

System disks are used to install operating systems. The default capacity of a system disk is 40 GiB. However, the actual minimum capacity varies based on the image type. The following table describes the capacity ranges of system disks for different types of images.

Image                                    System disk capacity (GiB)
Linux (excluding FreeBSD and Red Hat)    [max{20, Image size}, 2,048]
FreeBSD                                  [max{30, Image size}, 2,048]
Red Hat                                  [max{40, Image size}, 2,048]
Windows                                  [max{40, Image size}, 2,048]
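The capacity ranges in the preceding table can be expressed as a short helper. The following sketch only restates the table. The function names and the image family keys (`linux`, `freebsd`, `redhat`, `windows`) are hypothetical labels chosen for illustration.

```python
from typing import Tuple

# Minimum system disk size per image family, in GiB, taken from the table above.
MIN_SYSTEM_DISK_GIB = {
    "linux": 20,     # Linux, excluding FreeBSD and Red Hat
    "freebsd": 30,
    "redhat": 40,
    "windows": 40,
}
MAX_SYSTEM_DISK_GIB = 2048


def system_disk_range(image_family: str, image_size_gib: int) -> Tuple[int, int]:
    """Return the (minimum, maximum) system disk capacity in GiB."""
    lower = max(MIN_SYSTEM_DISK_GIB[image_family], image_size_gib)
    return lower, MAX_SYSTEM_DISK_GIB


def is_valid_system_disk(image_family: str, image_size_gib: int, requested_gib: int) -> bool:
    """Check whether a requested capacity falls within the allowed range."""
    lower, upper = system_disk_range(image_family, image_size_gib)
    return lower <= requested_gib <= upper
```

For example, a 60 GiB custom Windows image yields the range `(60, 2048)`, so the default capacity of 40 GiB would be too small for that image.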

(Optional) Data Disk

Data disks are used to store application data. When you add a data disk, you can encrypt the disk to meet data security and regulatory compliance requirements. For information about data encryption, see Overview.

Note

The number of data disks that can be attached to a single instance is limited. For more information, see the Block storage limits section in the "Limits" topic.

(Optional) Snapshot

A snapshot is a point-in-time backup of a disk. You can quickly import data by creating a disk from a snapshot. You can use automatic snapshot policies to periodically back up disks to prevent risks such as accidental data deletion.

Select an existing snapshot policy or click Create Automatic Snapshot Policy to create an automatic snapshot policy on the Snapshots page. For more information, see Create an automatic snapshot policy. After the automatic snapshot policy is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of automatic snapshot policies.

Important

You are charged for snapshots. For information about the billing of snapshots, see Snapshots.

(Optional) NAS File System

If you have a large amount of data to share among multiple instances, we recommend that you use NAS file systems to reduce costs in data copying and synchronization.

Select an existing NAS file system or click Create File System to create a NAS file system in the Apsara File Storage NAS console. For more information, see the Create a General-purpose NAS file system in the NAS console section in the "Create a file system" topic. After the NAS file system is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of NAS file systems. For information about how to mount a NAS file system to an instance, see Mount NAS file systems when you purchase an ECS instance.

Bandwidths & Security Groups

You can configure network bandwidth and security group settings to allow the instance to communicate with the Internet and other Alibaba Cloud resources and to secure its network communications.

(Conditionally required) Public IP Address

To allow the instance to access the Internet, you must assign a public IP address to the instance. You can select Assign Public IPv4 Address when you create an instance to have a public IP address automatically assigned to the instance. Alternatively, you can configure an EIP or a Network Address Translation (NAT) gateway after an instance is created to provide Internet access for the instance. You must separately purchase an EIP and a NAT gateway. For more information, see What is an EIP? and What is NAT Gateway?

Select Assign Public IPv4 Address and specify Bandwidth Billing and Bandwidth or Peak Bandwidth.

For information about the billing of the public bandwidth, see Public bandwidth.

Bandwidth billing method

Description

Pay-by-bandwidth

You are charged based on a specified bandwidth value. The actual outbound public bandwidth is capped at the specified bandwidth value.

  • Pay-by-bandwidth is suitable for scenarios that require stable bandwidth.

  • If your instance frequently communicates with external networks and requires long-term use of bandwidth or if the public bandwidth utilization of your instance exceeds 10%, we recommend that you select pay-by-bandwidth as the billing method for network usage.

Pay-by-traffic

You are charged based on the actual traffic volume. To prevent excessive fees that are caused by traffic bursts, you can specify a maximum bandwidth for outbound traffic.

  • Pay-by-traffic is suitable for scenarios in which bandwidth demands fluctuate.

  • If your instance has a public bandwidth utilization that does not exceed 10% and experiences occasional traffic spikes, we recommend that you select pay-by-traffic as the billing method for network usage.
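The 10% utilization guideline in the preceding table can be sketched as a simple rule. This is a minimal illustration of the recommendation only. The function name is hypothetical, and the sketch assumes that you have already measured utilization as a fraction between 0 and 1.

```python
def recommend_bandwidth_billing(utilization: float) -> str:
    """Suggest a bandwidth billing method from public bandwidth utilization.

    utilization is a fraction between 0.0 and 1.0 (for example, 0.25 for 25%).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    # Utilization above 10% favors a fixed bandwidth value.
    if utilization > 0.10:
        return "pay-by-bandwidth"
    # Utilization of 10% or less, with occasional spikes, favors billing by traffic.
    return "pay-by-traffic"
```

Other factors named in the table, such as whether bandwidth demand is stable or fluctuating, are not modeled here and should still inform your choice.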

(Optional) Select Upgrade to CDT for Data Transfer Billing. Cloud Data Transfer (CDT) provides an efficient and cost-effective method for managing public bandwidth expenses. It supports flexible billing, free data transfer quota, tiered pricing, and unified billing for multiple Alibaba Cloud services. Compared with the pay-by-traffic billing method, the CDT billing method provides specific discounts. For more information, see What is CDT?

Important

  • CDT activation is not subject to any restrictions.

  • CDT is an out-of-the-box service and does not involve resource creation. Therefore, you are not charged for the activation or use of the CDT service. The fees that are displayed in CDT bills are incurred when other Alibaba Cloud products consume data transfer resources.

  • After CDT is activated and used to bill data transfers for Alibaba Cloud products, all existing and new pay-by-data-transfer instances are automatically billed in CDT in a centralized manner, and pay-by-bandwidth instances continue to be billed in the Alibaba Cloud products. You can choose Expenses > User Center to go to the Expenses and Costs console and click Bill Details to view the billing data in CDT.

  • Check whether you have a pay-by-data-transfer instance. If you activate CDT but do not have a pay-by-data-transfer instance, you do not receive the data transfer benefits of CDT.

  • After CDT is activated, you receive a quota of 200 GB of free Internet data transfer per month. Of this quota, 20 GB can be used in all regions (including regions in the Chinese mainland), and the remaining 180 GB can be used only in regions outside the Chinese mainland.
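The split of the monthly free quota can be illustrated with a rough calculation. The quota values come from the list above, but the order in which the quotas are consumed is an assumption made for this sketch: traffic in the Chinese mainland draws on the 20 GB global quota first, and traffic outside the Chinese mainland uses the 180 GB quota and then any global quota that remains. The actual deduction order may differ, and the function name is hypothetical.

```python
from typing import Tuple

GLOBAL_FREE_GB = 20.0          # usable in all regions, per the list above
NON_MAINLAND_FREE_GB = 180.0   # usable only outside the Chinese mainland


def billable_transfer_gb(mainland_gb: float, non_mainland_gb: float) -> Tuple[float, float]:
    """Return (billable mainland GB, billable non-mainland GB) for one month.

    The deduction order is an assumption of this sketch, not documented behavior.
    """
    # Assumption: mainland traffic consumes the global quota first.
    global_used = min(mainland_gb, GLOBAL_FREE_GB)
    billable_mainland = mainland_gb - global_used
    global_left = GLOBAL_FREE_GB - global_used
    # Assumption: non-mainland traffic uses its own quota plus leftover global quota.
    free_non_mainland = min(non_mainland_gb, NON_MAINLAND_FREE_GB + global_left)
    billable_non_mainland = non_mainland_gb - free_non_mainland
    return billable_mainland, billable_non_mainland
```

Under these assumptions, 30 GB of mainland traffic and 100 GB of non-mainland traffic leave 10 GB of mainland traffic billable and no billable non-mainland traffic.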

Security Group

A security group is a virtual firewall that is used to control the inbound and outbound traffic of instances in the security group. For more information, see Overview.

If the selected VPC does not have a security group, the system automatically creates a default security group. The default security group allows inbound traffic over Secure Shell Protocol (SSH) port 22, Remote Desktop Protocol (RDP) port 3389, and Internet Control Message Protocol (ICMP). You can modify the security group configurations after the security group is created.
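The inbound rules of the default security group can be modeled as a small data structure. The following sketch lists only the three rules named above. The dictionary layout, the mapping of SSH and RDP to TCP, and the `is_inbound_allowed` function are illustrative assumptions and do not reflect the actual rule format of the service.

```python
from typing import Optional

# Default inbound rules described above. ICMP has no port, so its port is None.
DEFAULT_INBOUND_RULES = [
    {"protocol": "tcp", "port": 22},     # SSH
    {"protocol": "tcp", "port": 3389},   # RDP
    {"protocol": "icmp", "port": None},  # ICMP
]


def is_inbound_allowed(protocol: str, port: Optional[int] = None) -> bool:
    """Return True if the default rules allow the given inbound traffic."""
    for rule in DEFAULT_INBOUND_RULES:
        if rule["protocol"] != protocol.lower():
            continue
        # A rule without a port matches any traffic of that protocol.
        if rule["port"] is None or rule["port"] == port:
            return True
    return False
```

For example, `is_inbound_allowed("tcp", 443)` returns `False`, which shows why a web service on the instance needs an additional rule.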

You can also select an existing security group or click Create Security Group to create a security group based on your business requirements. When you create a security group, you must configure Security Group Name, Security Group Type, and Open IPv4 Ports.

Note

For more information about how to configure a security group, see Create a security group.

(Optional) ENI

ENIs include primary ENIs and secondary ENIs. Primary ENIs cannot be unbound from instances. They can be created and released only along with instances. Secondary ENIs can be bound to or unbound from instances to allow traffic to be switched between instances. To create a secondary ENI when you create an instance, click the add-nic icon and select a vSwitch to which the secondary ENI belongs.

Note

You can bind only one secondary ENI when you create an instance. You can also create secondary ENIs and bind them to an instance after the instance is created. For information about the number of ENIs that can be bound to an instance of each instance type, see Overview of instance families.

(Optional) IPv6

IPv6 provides a far larger address space than IPv4. Enabling IPv6 addresses the exhaustion of public IPv4 addresses and allows more devices to access the Internet.

Select Assign IPv6 Address Free of Charge. After you assign an IPv6 address, you must log on to the instance and configure an IPv6 address in the operating system to use the IPv6 address. For more information, see Configure an IPv6 address for an ECS instance.

Management

Management settings include Logon Credential, which controls remote connection to the instance, and Tag, which helps you search for and manage resources.

Logon Credential

Logon credentials are used to securely log on to an instance. For information about how to connect to the instance, see Connection method overview.

Logon credential

Description

Key Pair

Note

Key pairs can be used to log on only to Linux instances.

Select a username to use to log on to the instance. Then, select an existing key pair or click Create SSH Key Pair to create a key pair. After the key pair is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of key pairs. For more information, see Create an SSH key pair.

You can set Logon Username to root or ecs-user.

Warning

If you log on to an ECS instance as the root user, you have the highest permissions on the instance. However, security risks may arise. We recommend that you log on to the ECS instance as the ecs-user user.

Password Preset in Image

Note

Only Custom Image and Shared Image support this authentication method.

To use the password preset in the selected image to log on to the instance, select this authentication method. If you want to select this option, make sure that your selected image has a password preset.

Custom Password

Enter and confirm a password. Then, set Logon Username.

  • For Linux instances, set Logon Username to root or ecs-user.

    Warning

    If you log on to an ECS instance as the root user, you have the highest permissions on the instance. However, security risks may arise. We recommend that you log on to the ECS instance as the ecs-user user.

  • For Windows instances, Logon Username defaults to administrator.

Set Later

After the instance is created, bind a key pair or reset the instance password. For more information, see Bind an SSH key pair and Reset the logon password of an instance.

Tag

Each tag consists of a tag key and a tag value. You can add tags to resources that have identical characteristics, such as resources that belong to the same organization and resources that serve the same purpose. You can use tags to search for and manage resources in an efficient manner. For information about tags, see Tags.

Select or create tags.
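The way tag keys and tag values support resource lookup can be sketched as follows. This is an illustration of the key-value model only. The resource names, tag keys, and the `find_by_tags` function are hypothetical and are not part of any Alibaba Cloud SDK.

```python
from typing import Dict, List


def find_by_tags(resources: List[dict], required: Dict[str, str]) -> List[str]:
    """Return the names of resources that carry every required tag key and value."""
    return [
        res["name"]
        for res in resources
        if all(res.get("tags", {}).get(key) == value for key, value in required.items())
    ]


# Hypothetical resources, each tagged by owning team and environment.
resources = [
    {"name": "gpu-train-01", "tags": {"team": "ml", "env": "prod"}},
    {"name": "gpu-render-01", "tags": {"team": "design", "env": "prod"}},
]
```

For example, `find_by_tags(resources, {"team": "ml"})` returns `["gpu-train-01"]`.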

(Optional) Advanced Settings

Advanced settings include Hostname, Metadata Access Mode, and User Data. You can use advanced settings to customize how the instance is displayed and used in the console and the OS.

Parameter

Description

Instance Name, Description, Hostname, and Sequential Suffix

If you want to create multiple instances, you can configure sequential instance names and hostnames to facilitate management. For more information, see Batch configure sequential names or hostnames for multiple instances.

Instance RAM role

An ECS instance can assume an instance Resource Access Management (RAM) role to obtain the permissions of the role. Then, the instance can securely make API requests to specific Alibaba Cloud services and manage specific Alibaba Cloud resources based on the Security Token Service (STS) temporary credentials of the role.

Select an existing instance RAM role, or click Create Instance RAM Role to create an instance RAM role in the RAM console. After the instance RAM role is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of instance RAM roles. For more information, see Grant ECS access to resources of other Alibaba Cloud services by using instance RAM roles.

Metadata Access Mode

ECS instance metadata contains information about instances in Alibaba Cloud. You can view the metadata of running instances and configure or manage the instances based on their metadata. For more information, see Obtain instance metadata.

User Data

User data can be run as scripts on ECS instance startup to automate instance configurations or be passed to ECS instances as regular data. For more information, see Instance user data.

If you do not select Auto-install GPU Driver in the Image section and you want to install your NVIDIA Tesla driver by using an automatic installation script, you can enter the script in the field in the User Data section. For more information, see Install a driver by using an automatic installation script.

Note

Enter the user data that you have prepared. If the user data is encoded in Base64, select Enter Base64 Encoded Information.
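If you prefer to provide user data in Base64, you can encode a script with the Python standard library. The following is a minimal sketch. The script content is a placeholder and not a GPU driver installation script, and the `encode_user_data` function name is hypothetical.

```python
import base64


def encode_user_data(script: str) -> str:
    """Encode a user data script as a Base64 string."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


# Placeholder script used only to demonstrate the encoding.
script = "#!/bin/sh\necho 'user data ran' > /tmp/user_data_marker\n"
encoded = encode_user_data(script)
```

If you paste the value of `encoded` into the User Data field, select Enter Base64 Encoded Information as described in the preceding note.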

Resource Group

Resource groups allow you to manage resources across regions and services based on your business requirements. You can also manage the permissions of resource groups. For more information, see Resource groups.

Select an existing resource group, or click Create Resource Group to create a resource group in the Resource Management console. After the resource group is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of resource groups. For more information, see Create a resource group.

Deployment Set

Deployment sets support the high availability strategy. After you apply the high availability strategy to a deployment set, all instances in the deployment set are distributed across different physical servers to ensure the high availability of your business and implement underlying disaster recovery.

Select an existing deployment set, or click Manage Deployment Sets to create a deployment set. After the deployment set is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of deployment sets. For more information, see Create a deployment set.

Dedicated Host

A dedicated host is a cloud host whose physical resources are exclusively reserved for a single tenant. Dedicated hosts meet strict security compliance requirements and support Bring Your Own License (BYOL) when you migrate services to Alibaba Cloud.

Select an existing dedicated host, or click Create Dedicated Host to create a dedicated host. After the dedicated host is created, go back to the ECS instance buy page and click the refresh icon to obtain the most recent list of dedicated hosts. For more information, see Create a dedicated host.

Private Pool Type

After an elasticity assurance or a capacity reservation is created, the system automatically generates a private pool to reserve resources for a specific number of instances that have specific attributes. During the validity period of the elasticity assurance or capacity reservation, you can access the resources reserved in the private pool when you want to create instances. For more information, see Overview.

Note

Only pay-as-you-go instances can be created from the resources reserved by elasticity assurances or capacity reservations.

  • Open: Compared with the capacity in public pools, the system preferentially uses the capacity in open private pools. If no capacity is available in open private pools, the system attempts to use the capacity in public pools.

  • None: The system does not use the capacity in private pools.

  • Targeted: The system uses the capacity of a specified private pool or an open private pool to create instances. If no capacity is available in the specified private pool, the instances cannot be created.
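The three private pool options can be summarized as a capacity selection rule. The following sketch is a simplified model of the preceding descriptions. It treats the Targeted option as using only the specified private pool, and the function name and return values are hypothetical.

```python
from typing import Optional


def select_capacity_source(pool_type: str, private_available: bool,
                           public_available: bool) -> Optional[str]:
    """Return "private", "public", or None if the instance cannot be created."""
    if pool_type == "Open":
        # Open private pools are preferred. Public pools are the fallback.
        if private_available:
            return "private"
        return "public" if public_available else None
    if pool_type == "None":
        # Private pool capacity is never used.
        return "public" if public_available else None
    if pool_type == "Targeted":
        # No fallback: creation fails if the specified pool has no capacity.
        return "private" if private_available else None
    raise ValueError(f"unknown private pool type: {pool_type}")
```

For example, `select_capacity_source("Targeted", False, True)` returns `None` because the Targeted option does not fall back to public pools.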

What to do next

  • Connect to the instance.

    You can choose from a variety of tools such as Workbench, Virtual Network Computing (VNC), and third-party client tools to connect to the instance. For more information, see Connection method overview.

  • Install a driver on the instance.

    If a Tesla or GRID driver is not automatically installed when the GPU-accelerated instance is created, you must install a Tesla or GRID driver that matches the instance based on your business requirements. This way, the GPU-accelerated instance can provide high-performance features as expected. For more information, see Installation guideline for NVIDIA Tesla and GRID drivers.

References

  • You can create a GPU-accelerated instance by calling an API operation. For more information, see RunInstances or CreateInstance.

  • You can enable, hibernate, restart, release, or stop a GPU-accelerated instance. For more information, see Manage a GPU-accelerated instance.

  • After you deploy an NVIDIA GPU Cloud (NGC) environment on a GPU-accelerated instance, developers can quickly access the optimized deep learning framework, which reduces the time spent on product development and business deployment. For more information, see Deploy an NGC environment on a GPU-accelerated instance.

  • You can troubleshoot feature or operation issues that occur when you use a GPU-accelerated instance. For more information, see Elastic GPU Service FAQ.
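If you create instances by calling RunInstances instead of using the console, the console settings map to request parameters. The following sketch only builds a parameter dictionary and does not sign or send a request. The parameter names and values (`InstanceChargeType`, `SpotStrategy`, `SpotPriceLimit`, `SystemDisk.Size`, and so on) are given as commonly documented for the ECS API and should be verified against the RunInstances reference. The function name and the resource IDs in the usage note are hypothetical.

```python
from typing import Dict, Optional

# Assumed mapping from console billing methods to InstanceChargeType values.
BILLING_TO_CHARGE_TYPE = {
    "Subscription": "PrePaid",
    "Pay-as-you-go": "PostPaid",
    "Preemptible Instance": "PostPaid",
}


def build_run_instances_params(region_id: str, instance_type: str, image_id: str,
                               vswitch_id: str, security_group_id: str,
                               billing_method: str = "Pay-as-you-go",
                               max_spot_price: Optional[float] = None,
                               system_disk_gib: int = 40) -> Dict[str, str]:
    """Build a RunInstances parameter dictionary (sketch only; nothing is sent)."""
    params = {
        "Action": "RunInstances",
        "RegionId": region_id,
        "InstanceType": instance_type,
        "ImageId": image_id,
        "VSwitchId": vswitch_id,
        "SecurityGroupId": security_group_id,
        "InstanceChargeType": BILLING_TO_CHARGE_TYPE[billing_method],
        "SystemDisk.Size": str(system_disk_gib),
    }
    if billing_method == "Preemptible Instance":
        if max_spot_price is None:
            # Corresponds to Use Automatic Bid in the console.
            params["SpotStrategy"] = "SpotAsPriceGo"
        else:
            # Corresponds to Set Maximum Price in the console.
            params["SpotStrategy"] = "SpotWithPriceLimit"
            params["SpotPriceLimit"] = str(max_spot_price)
    return params
```

For example, `build_run_instances_params("cn-hangzhou", "ecs.gn6i-c4g1.xlarge", "<image-id>", "<vswitch-id>", "<security-group-id>", billing_method="Preemptible Instance")` produces a dictionary in which `SpotStrategy` is `SpotAsPriceGo`. The IDs in angle brackets are placeholders that you must replace with your own values.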