The Data Science service is built on the latest E-MapReduce (EMR) software stack and integrates several artificial intelligence (AI) components provided by Alibaba Cloud Machine Learning Platform for Artificial Intelligence. These components include Alink (a machine learning algorithm platform), Faiss (a vector calculation engine), and TensorFlow or PyTorch (deep learning frameworks).

Intended users

The Data Science service is intended for the following users:
  • Users of open source big data systems
  • Users of intelligent recommendation and risk management solutions powered by the AI technology of Alibaba Cloud

Create a cluster

Log on to the EMR console and create a Data Science cluster. For more information, see Create a cluster.

  1. In the Software Settings step, select a zone, an EMR version, and optional services.
    • Zone: Only some zones are supported. You can view the supported zones of each region on the buy page.
    • EMR Version: Only the latest EMR version is supported.
    • Optional Services: We recommend that you select TensorFlow.
  2. In the Hardware Settings step, specify a VPC, a VSwitch, and a security group. To create a security group, go to the ECS console.

    You must open port 8443 in the security group to allow access to the web UIs of related components. For more information, see Grant access to a security group.

  3. In the Basic Settings step, add a Knox account, which is used to log on to the Knox service.

    The added Knox account is a RAM user under your Alibaba Cloud account. For more information about how to add a Knox account, see Manage user accounts.

View cluster details

After the cluster is created, you can view the running status of the cluster on the Cluster Management page.
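
Once the cluster is running, it can be useful to confirm that port 8443 is reachable from your machine before you try to open the web UIs. The following is a minimal sketch, not an official EMR tool: it only attempts a TCP connection by using the Python standard library. The function name `port_is_open` and the default timeout are illustrative assumptions, and you must supply the public IP address of your own cluster node.

```python
import socket


def port_is_open(host: str, port: int = 8443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks network reachability (for example, that the security
    group rule for the port is in effect). It does not verify that the
    Knox service or any web UI behind the port is healthy.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes the socket again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling `port_is_open("<public IP address of the master node>")` returns `False` if the security group does not yet allow inbound traffic on port 8443 from your address. A `True` result means only that the port accepts TCP connections. You still need the Knox account to log on.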