Alibaba Cloud experts show you the way to the cloud for chip design
Introduction: On September 20, 2022, the "Alibaba Cloud EDA Cloud Solution" program was officially launched. Three Alibaba Cloud experts explained, from multiple perspectives, how Alibaba Cloud helps chip design get onto the "cloud highway". Shao Qi, a solutions architect for Alibaba Cloud's elastic computing products, was the first speaker. His talk was titled "Flexible, Secure, High Performance: Alibaba Cloud EDA Cloud Solutions". The following is a summary of his talk:
01 Industry Overview
1. Computing challenges of typical chip design
The computational challenges of typical chip design are mainly reflected in three aspects:
◾ Fast growth: as chip manufacturing processes advance, the demand for computing power rises sharply, and the computing resources consumed by EDA simulation tools rise with it;
◾ Inaccurate estimates: the computing power that commercial software will need is hard to estimate in advance, which leads to budget deviations;
◾ Urgent projects: the tape-out date cannot slip, yet development progress is hard to control and the project is affected by many factors.
The following are typical computing challenges that a chip may face during development:
◾ At the beginning of the chip design, the simulation budget is set at 10000 cores: 3000 cores for the front end and 7000 cores for the back end;
◾ Once the front end starts running, its demand turns out to reach 7000 cores, exceeding the original estimate, and it begins to draw on the budget reserved for the back end;
◾ When the back-end tasks start running, computing power is insufficient and the R&D team waits for simulation results;
◾ As the project falls behind plan, the project team requests additional budget for procurement;
◾ While the back-end tasks continue, some bugs may force front-end verification tasks to be rerun, straining computing power further;
◾ Once the chip tapes out, most of the computing power is released and the equipment sits idle.
As shown in the figure above, computing power demand has peaks and valleys across the whole chip R&D process, so chip simulation workloads have a clear need for elastic resources.
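The budget squeeze in this scenario comes down to simple arithmetic. The following is a minimal sketch, assuming the back end still needs its planned 7000 cores after front-end demand has grown to 7000. The function name `core_shortfall` is illustrative and not part of any Alibaba Cloud tool.

```python
def core_shortfall(total_budget, frontend_demand, backend_demand):
    """Return how many cores the back end lacks on a fixed core budget."""
    # Cores left for the back end after the front end takes its share.
    remaining_for_backend = max(total_budget - frontend_demand, 0)
    return max(backend_demand - remaining_for_backend, 0)


# Original plan: 10000 cores, 3000 for the front end and 7000 for the back end.
planned = core_shortfall(10000, 3000, 7000)

# Front-end demand grows to 7000 cores.
# Assumption: the back end still needs its planned 7000 cores.
actual = core_shortfall(10000, 7000, 7000)

print(planned, actual)  # prints: 0 4000
```

Under that assumption the fixed budget is 4000 cores short at the peak, while after tape-out most of the 10000 cores sit idle. This is the gap that elastic resources are meant to cover.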
2. Analysis of pain points in integrated circuit industry
(1) Time
◾ EDA verification takes a lot of time, and a lack of resources can keep the verification work from converging;
◾ Hardware procurement cycles are long, and deployment and setup take a lot of time;
(2) Cost
◾ Workloads have pronounced peaks, and holding a large amount of hardware over the long term is expensive;
◾ Project costs are hard to calculate accurately, especially the costs attributable to IT resource usage;
◾ Startups need to spend more of their money on license and IP procurement;
(3) Security
◾ Architecture designs are mostly written in Word documents, which are easy to leak;
◾ Data deliveries are complex and large, pass through many authorization and review steps, and leave loopholes in control;
(4) Multi-region collaboration
◾ Office collaboration across multiple regions;
◾ Home-office environments and their security control.
02 EDA Cloud Solution
1. EDA cloud value
(1) Improve productivity (accelerate TTM)
◾ On-demand capacity expansion and elastic scaling: elastic cloud resources cover peak workloads, so production is not affected when demand exceeds available resources;
◾ Resources delivered within minutes: a resource request does not have to wait for procurement, project approval, installation and deployment, and resources are ready to use without queuing for scheduling;
(2) Reduce the difficulty of IT operation and maintenance
◾ No basic operation and maintenance: IT O&M teams do not need to worry about physical facilities or the underlying infrastructure and can focus on supporting the business;
◾ Automated resource delivery: automated delivery tools provide end-to-end resource delivery to business teams;
◾ Centralized resource control: centralized O&M and control tools provide resource monitoring and support splitting bills by project;
(3) Cost optimization (improve ROI)
◾ Supply precisely matches demand: resources scale out and in linearly with demand, avoiding waste;
◾ Savings on supporting costs: no investment in machine rooms, air conditioning, hardware maintenance and similar supporting costs;
◾ Capital cost optimization: flexible payment models reduce the cost of tied-up capital;
(4) Improve the user experience
◾ Team collaboration: improve the efficiency of multi-site collaboration and reduce contention for resources among teams and tasks;
◾ Transparent, seamless user experience: a unified development environment gives business teams resources that are ready to use.
2. Cloud architecture for the whole EDA design process
Build a unified security control domain through Alibaba Cloud security:
◾ The front-end design department (the domestic design branch at the top right of the figure) meets its R&D office security requirements with Alibaba Cloud's Wuying cloud desktop. Wuying is deployed in a separate security domain and VPC, so front-end tasks are isolated from other departments and data never lands on local devices and cannot be taken away;
◾ The back-end simulation and verification cluster (the green part on the left of the figure) uses Alibaba Cloud E-HPC for managed computing together with high-performance distributed storage, forming a high-performance supercomputing environment;
◾ Offline IDC resources are connected through Alibaba Cloud Express Connect (lower left) to form a hybrid cloud: data on and off the cloud is linked, the data center extends into the cloud, and elastic computing power becomes available on the cloud;
◾ Alibaba Cloud's Cloud Enterprise Network (CEN) provides high-speed interconnection among an IC company's global branches, creating an enterprise private network for data exchange and office collaboration.
3. Build a comprehensive security environment from the infrastructure side to the data side
Alibaba Cloud provides comprehensive security protection, with security products that cover everything from the network side to the data side, including bastion hosts, data auditing and hardware encryption machines, to help customers build a solid barrier.
The storage products used by EDA on the cloud, CPFS and OSS, can encrypt data on the cloud, giving customers the highest level of data encryption assurance and a safe, stable cloud storage space.
(1) Network side
◾ Anti-DDoS (advanced protection IP), for scrubbing DDoS attack traffic;
◾ Cloud Firewall, for intrusion prevention and traffic control;
(2) Client side
◾ SASE, for endpoint security control and data loss prevention (DLP) management;
(3) Host side
◾ Security Center, for server host intrusion prevention, baseline checks and patch management;
(4) Account
◾ The IdP integrates AD domain accounts and unifies account management, making it easier to move between different systems;
(5) Audit
◾ ActionTrail;
◾ Database audit;
◾ Bastionhost, for screen recording of virtual machine operation and maintenance sessions and for log retention;
(6) Data security
◾ Data encryption: a hardware encryption machine performs hardware encryption and decryption of cloud data, so that sensitive data is encrypted when stored and used;
◾ KMS, for cloud key lifecycle management and transparent encryption and decryption on the cloud;
◾ Data Security Center, for data classification and data protection across OSS and databases.
4. Quickly build E-HPC clusters on the cloud
◾ E-HPC provides a complete supercomputing PaaS product, and clusters can be built quickly through a graphical interface;
◾ E-HPC provides login nodes, management nodes and graphical nodes, supporting domain control and post-job mapping;
◾ E-HPC natively supports a variety of open source schedulers. Where a commercial scheduler is required, E-HPC also provides commercial scheduler interfaces, so offline usage habits carry over;
◾ E-HPC can automatically build clusters across zones, pooling the resources of multiple data centers and improving elastic scheduling.
5. Elastic scaling based on E-HPC to automatically match peak business demand
E-HPC works with the scheduler to scale out automatically based on load and scheduler policy.
◾ Compute nodes added by scaling automatically mount shared storage and join the domain, and automatically accept jobs dispatched by the scheduler, which improves efficiency;
◾ After jobs complete, E-HPC releases on-demand resources according to preconfigured rules, which saves cost and gives customers an on-demand elastic scaling environment.
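The scale-out and release behavior described above can be illustrated with a small decision sketch. This is not the E-HPC API. The function names, parameters and node names below are hypothetical, and the rule (cover queued cores, release nodes idle past a threshold) is an assumed policy used only for illustration.

```python
import math


def nodes_to_add(pending_cores, idle_cores, cores_per_node, max_new_nodes):
    """Scale out: nodes needed to cover queued demand beyond idle capacity."""
    deficit = max(pending_cores - idle_cores, 0)
    # Round up to whole nodes, then cap at the configured limit.
    return min(math.ceil(deficit / cores_per_node), max_new_nodes)


def nodes_to_release(idle_minutes_by_node, idle_threshold_minutes):
    """Scale in: on-demand nodes that have been idle for at least the threshold."""
    return sorted(
        node
        for node, idle_minutes in idle_minutes_by_node.items()
        if idle_minutes >= idle_threshold_minutes
    )


# Hypothetical snapshot: 200 cores queued, 8 cores idle, 64-core nodes.
add = nodes_to_add(pending_cores=200, idle_cores=8, cores_per_node=64, max_new_nodes=10)

# Hypothetical idle times in minutes, with a 10-minute release rule.
release = nodes_to_release({"compute001": 2, "compute002": 15, "compute003": 30}, 10)

print(add, release)  # prints: 3 ['compute002', 'compute003']
```

In a real deployment the inputs would come from the scheduler's queue and node state, and the release rule would be whatever the preconfigured resource rules specify.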
6. Build a variety of hybrid cloud architectures
Thanks to its strong compatibility, E-HPC supports multiple hybrid cloud architectures.
Scheme 1: Offline control first, supplemented by elastic expansion on the cloud
Most semiconductor companies already have offline machine rooms and equipment, and have finished deploying and tuning their control systems. For them, Alibaba Cloud provides an EDA hybrid cloud solution in which control stays offline and the cloud supplies elastic resources. This solution is compatible with existing usage habits and has been deployed in the production environments of multiple industry customers, making it a best practice for EDA hybrid cloud.
◾ Application scenario: local infrastructure comes first, and the cloud absorbs bursts of unexpected demand;
◾ Cluster management: cloud based. When the queue load reaches the threshold, cloud resources are called. The cloud proxy manager synchronizes information about the expanded resources to the cloud manager and writes it to the local domain controller through the cloud script;
◾ Security boundary: the cloud side has no egress, and local security controls are primary;
◾ License deployment: the license server is deployed offline and grants licenses to cloud nodes;
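One way to read Scheme 1 is "local first, burst to the cloud once the queue load reaches a threshold". The sketch below illustrates that reading only. The function `place_jobs`, the job names and the numbers are hypothetical, and a real scheduler would apply its own queueing and bursting policy.

```python
def place_jobs(job_cores, local_free_cores, burst_threshold):
    """Place jobs locally first; send a job to the cloud once the cores
    waiting in the local queue would reach burst_threshold."""
    placement = {}
    queued_cores = 0
    for name, cores in job_cores:
        if cores <= local_free_cores:
            # Enough free local capacity: run on the offline cluster.
            local_free_cores -= cores
            placement[name] = "local"
        elif queued_cores + cores < burst_threshold:
            # Local cluster is busy but the queue is still short: wait locally.
            queued_cores += cores
            placement[name] = "local-queue"
        else:
            # Queue load reaches the threshold: call cloud resources.
            placement[name] = "cloud"
    return placement


# Hypothetical jobs as (name, cores), 100 free local cores, threshold of 100 queued cores.
jobs = [("sim_a", 64), ("sim_b", 64), ("sim_c", 32), ("sim_d", 128)]
result = place_jobs(jobs, local_free_cores=100, burst_threshold=100)
print(result)
```

With these numbers, `sim_a` and `sim_c` run locally, `sim_b` waits in the local queue, and `sim_d` bursts to the cloud.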
Scheme 2: Cloud management and control first, with offline resources brought under management
For some cloud-native semiconductor companies, the hybrid cloud architecture in Scheme 2 is recommended: control on the cloud is primary, and off-cloud resources are managed through the E-HPC control platform so that existing equipment is not wasted.
◾ Application scenario: the local machine room will no longer be expanded, and future build-out will be mainly on the cloud;
◾ Shared storage: cloud based;
◾ Security boundary: on-cloud security control is primary, and there is no local egress;
◾ Scheduler deployment: the scheduler is deployed on the cloud, and hybrid scheduling across on-cloud and off-cloud resources is achieved by installing agents;
◾ License deployment: the license server is deployed offline and grants licenses to cloud nodes.
03 EDA cloud recommended products
1. High-performance computing products on the cloud - introduction to elastic computing instances
Alibaba Cloud uses industry-leading hardware, adapted and optimized for the latest hardware generations. Alibaba Cloud also fully supports "one cloud, multiple chips" and offers Intel, AMD, ARM and other CPUs.
◾ For front-end workloads, Alibaba Cloud provides a variety of instance types with high clock frequency and large memory. These instance types are based on the DPCA architecture and provide high reliability and strong performance;
◾ For back-end workloads, Alibaba Cloud provides bare metal products with very large memory, especially for scenarios that need more than 2 TB, as well as instance products based on persistent memory. These instances significantly increase the memory capacity of a single machine, reduce procurement costs, and improve the concurrency of back-end jobs on a server.
2. High-performance storage on the cloud - introduction to CPFS
◾ Alibaba Cloud file storage CPFS is a large-scale parallel file system designed for high-performance computing, with a fully parallel architecture, million-level IOPS and OPS, and Tbps-level throughput;
◾ It supports Filesets, which enable multiple enterprise-level functions, including snapshots, quotas, data flows, lifecycle management and QoS;
◾ It supports ACLs, file auditing, encryption and other enterprise-level functions;
◾ It supports data flows, which make CPFS a high-performance accelerator for OSS data. Applications can access massive, low-cost data in OSS through the high-performance file interface of CPFS;
3. Secure multi-site R&D environment - introduction to Wuying products
Alibaba Cloud's Wuying products provide cloud desktop access for multi-site, home-office and business development scenarios.
(1) Code security
◾ Code never lands on local devices;
◾ Operation logs;
◾ Screen recording audit;
◾ Virus and vulnerability scanning;
(2) Improve development efficiency
◾ Quickly build the development environment;
◾ Pre-install and distribute development tools;
◾ Provide an efficient management console;
(3) Safe and efficient data transfer
◾ Desktops, development and production environments exchange data within the cloud, which is more efficient and secure;
◾ Integrated office environment across regions and countries;
◾ Safe and reliable home office.