Elastic GPU Service: Terms

Last Updated: Dec 22, 2023

This topic describes terms that are related to Elastic GPU Service.

Elastic GPU Service terms

Term

Description

Graphics Processing Unit (GPU)

GPUs contain more computing units and data pipelines than CPUs and are suitable for scenarios such as large-scale parallel computing.

CUDA

CUDA is a general-purpose parallel computing platform and programming model developed by NVIDIA for running compute workloads on GPUs.

cuDNN

NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks.

DeepGPU

DeepGPU is a collection of GPU enhancement tools that are provided by Alibaba Cloud for Elastic GPU Service free of charge.

AIACC-Training

An AI accelerator developed by Alibaba Cloud for distributed training to significantly improve training performance.

AIACC-Inference

An AI accelerator developed by Alibaba Cloud for model inference to significantly improve inference performance.

AIACC-ACSpeed

A communication optimization library released by Alibaba Cloud for the distributed training of AI models. ACSpeed is designed to deliver better compatibility, applicability, and performance.

AIACC-AGSpeed

An optimizing compiler for AI training developed by Alibaba Cloud. AGSpeed transparently optimizes the computing performance of PyTorch models on Alibaba Cloud GPU-accelerated compute-optimized instances.

FastGPU

A set of tools provided by Alibaba Cloud for quickly deploying AI computing tasks. FastGPU provides convenient interfaces and automation tools that you can use to deploy AI training and inference tasks on infrastructure as a service (IaaS) resources of Alibaba Cloud.

cGPU

A kernel-based GPU virtualization and sharing service developed by Alibaba Cloud. cGPU is used to isolate GPU resources to allow multiple containers to share a single GPU.
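
As a rough illustration of the cGPU entry above, the following Python sketch models several containers sharing one GPU under fixed memory quotas. It is a toy bookkeeping model only: the `SharedGpu` class, its methods, the container names, and the memory sizes are hypothetical and are not part of cGPU, which enforces isolation in the kernel.

```python
from typing import Dict


class SharedGpu:
    """Toy model of one physical GPU whose memory is split among containers."""

    def __init__(self, total_mem_mib: int) -> None:
        self.total_mem_mib = total_mem_mib
        # Maps a (hypothetical) container name to its memory quota in MiB.
        self.quotas: Dict[str, int] = {}

    def free_mem_mib(self) -> int:
        """Memory that has not been assigned to any container yet."""
        return self.total_mem_mib - sum(self.quotas.values())

    def assign(self, container: str, mem_mib: int) -> None:
        """Reserve a fixed quota for a container, refusing to overcommit."""
        if mem_mib <= 0:
            raise ValueError("quota must be positive")
        if container in self.quotas:
            raise ValueError(f"{container} already has a quota")
        if mem_mib > self.free_mem_mib():
            raise ValueError("not enough free GPU memory for this quota")
        self.quotas[container] = mem_mib


# Assumed example values: a 16 GiB GPU shared by two containers.
gpu = SharedGpu(total_mem_mib=16384)
gpu.assign("training-job", 8192)
gpu.assign("inference-job", 4096)
print(gpu.free_mem_mib())
```

Each container sees only its own quota in this model, which mirrors the idea of isolating GPU resources so that one container cannot consume the share reserved for another.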

ECS terms

Term

Description

ECS instance

An ECS instance is a virtual server that includes basic components such as vCPUs, memory, an operating system (OS), network configurations, and disks.

ECS instance type

Instance types define the basic attributes such as computing capacity, storage capacity, and networking performance of ECS instances. Instance types must be used together with images, Elastic Block Storage (EBS) devices, and network resources to create ECS instances that serve different purposes.

image

Images contain information that is necessary to run ECS instances, such as OSs and initialization data of applications.

public image

Public images are base images provided by Alibaba Cloud. Public images are licensed and include Windows Server OS images and mainstream Linux OS images.

Alibaba Cloud Linux

Alibaba Cloud Linux 2 and 3 are OSs provided by Alibaba Cloud. They offer a safe, stable, and high-performance customized environment for applications on ECS instances and are optimized for the infrastructure of Alibaba Cloud.

custom image

You can create or import custom images. Custom images contain the initial system environment, application environment, and related software configurations. Creating instances from a custom image eliminates the need to repeat manual configuration.

Elastic Block Storage device

Elastic Block Storage (EBS) devices are block-level storage devices that offer high performance and low latency. You can partition and format EBS devices and create file systems on the devices to meet the data storage requirements of your business.

disk

Disks are block-level EBS devices that use a triplicate mechanism to ensure 99.9999999% data durability for ECS instances.

local disk

Local disks are located on the same physical server as the ECS instance to which the disks are attached. Local disks are cost-effective and provide high storage I/O. However, the durability of data stored on local disks depends on the reliability of the associated physical server, which exposes the data to single points of failure (SPOFs).

snapshot

A snapshot is a point-in-time backup of a disk. You can use a snapshot to back up disk data or to restore the disk to its state at that point in time.

security group

A security group is a virtual firewall that is used to control the inbound and outbound traffic of ECS instances in the security group.

SSH key pair

An SSH key pair is a secure and convenient authentication method provided by Alibaba Cloud for instance logons. An SSH key pair consists of a public key and a private key. SSH key pairs can be used to log on only to Linux instances.

Instance RAM role

Instance Resource Access Management (RAM) roles enable ECS instances to assume roles with specific access permissions. Based on a Security Token Service (STS) temporary credential, an instance can securely access the APIs of specified Alibaba Cloud services and manage specified Alibaba Cloud resources.

virtual private cloud (VPC)

A VPC is a private network established on Alibaba Cloud. VPCs are logically isolated from each other based on tunnels. You have full control over your VPCs. For example, you can specify CIDR blocks and configure route tables and gateways for your VPCs.

elastic network interface (ENI)

An ENI is an independent virtual network interface that can be bound to or unbound from an ECS instance to implement the flexible scaling and migration of services.

launch template

A launch template contains configuration information that you can use to create ECS instances and eliminates repeated manual configurations.

deployment set

A deployment set controls how ECS instances are distributed across physical servers. After you apply a high availability strategy to a deployment set, all instances in the deployment set are distributed across different physical servers at the underlying layer to ensure business availability and disaster recovery capabilities.

dedicated host

A dedicated host is a cloud host whose physical resources are exclusively reserved for a single tenant. Dedicated hosts meet strict security and compliance requirements and support Bring Your Own License (BYOL) when you migrate services to the cloud.

auto provisioning group

Auto provisioning groups support quick deployment of instance clusters across instance types and zones. Auto provisioning groups can create preemptible instances and pay-as-you-go instances by using a combination of provisioning policies to provide high stability at low cost.

tag

Each tag consists of a key and a value. You can add tags to resources that have identical characteristics, such as resources that belong to the same organization or serve the same purpose. You can then use tags to search for and manage resources efficiently.

resource group

Resource groups allow you to manage resources across services and regions based on your business requirements and manage the permissions of resource groups.

Cloud Assistant

Cloud Assistant is an automated O&M tool provided by Alibaba Cloud. Cloud Assistant allows you to perform operations such as running commands in ECS instances and sending files to ECS instances without logging on to the ECS instances.

system event

System events are scheduled or unexpected O&M events that affect the running status of ECS instances and require the restart, stop, or release of ECS instances. For system events, ECS sends you notifications that contain information such as solutions and event cycles so that you can back up data and make preparations in a timely manner.
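
To make the tag entry in the table above more concrete, the following Python sketch shows how key-value tags can be used to search for resources. It is a minimal illustration under stated assumptions: the inventory is an in-memory dictionary, and the resource IDs, tag keys, tag values, and the `find_by_tags` helper are all hypothetical. It does not call any Alibaba Cloud API.

```python
from typing import Dict, List

# Hypothetical inventory: resource ID -> tags (each tag is a key-value pair).
resources: Dict[str, Dict[str, str]] = {
    "i-example-001": {"team": "vision", "env": "prod"},
    "i-example-002": {"team": "vision", "env": "test"},
    "d-example-003": {"team": "nlp", "env": "prod"},
}


def find_by_tags(inventory: Dict[str, Dict[str, str]],
                 wanted: Dict[str, str]) -> List[str]:
    """Return the sorted IDs of resources that carry every tag in `wanted`."""
    return sorted(
        resource_id
        for resource_id, tags in inventory.items()
        if all(tags.get(key) == value for key, value in wanted.items())
    )


# Resources that belong to the same (assumed) team.
print(find_by_tags(resources, {"team": "vision"}))
# Narrow the search by combining two tags.
print(find_by_tags(resources, {"team": "vision", "env": "prod"}))
```

Because every tag is a plain key-value pair, combining several tags narrows a search to the resources that share all of the listed characteristics.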