AnalyticDB for MySQL:Develop an interactive Jupyter job

Last Updated: Dec 29, 2023

AnalyticDB for MySQL Spark allows you to use a Docker image to start the interactive JupyterLab development environment. This environment helps you connect to AnalyticDB for MySQL Spark and perform interactive testing and computing based on elastic resources.

Prerequisites

Usage notes

  • AnalyticDB for MySQL Spark supports interactive Jupyter jobs only in Python 3.7 or Scala 2.12.

  • If an interactive Jupyter job remains idle for a time-to-live (TTL) period of 1,200 seconds after the last code snippet is executed, the job is automatically released. You can use the spark.adb.sessionTTLSeconds parameter to specify the TTL period for interactive Jupyter jobs, as shown in the sketch after this list.
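
  The following is a minimal sketch of how such a Spark configuration might be set from a notebook cell. It assumes that the sparkmagic %%configure magic enabled in the JupyterLab environment passes the conf map to the interactive session, and the value 3600 is only an example:

    %%configure -f
    {
      "conf": {
        "spark.adb.sessionTTLSeconds": "3600"
      }
    }

  The -f flag makes sparkmagic drop any existing session and re-create it with the new configuration, so run a cell like this before you start executing code snippets.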

Procedure

  1. Install and start Docker. For more information, see the Docker documentation.

  2. Pull the Jupyter image of AnalyticDB for MySQL. Sample command:

    docker pull registry.cn-hangzhou.aliyuncs.com/adb-public-image/adb-spark-public-image:livy.0.2.pre
  3. Start the interactive JupyterLab development environment.

    Command syntax:

    docker run -it -p {Host port}:8888 -v {Host file path}:{Docker file path} registry.cn-hangzhou.aliyuncs.com/adb-public-image/adb-spark-public-image:livy.0.2.pre -d {Cluster ID} -r {Resource group name} -e {Endpoint} -i {AccessKey ID} -k {AccessKey secret}

    The following list describes the parameters.

    • -p (optional): Maps a host port to a container port. Specify the parameter in the -p {Host port}:{Container port} format. You can specify a custom host port, but the container port must be 8888. Example: -p 8888:8888.

    • -v (optional): Mounts a host file path to the Docker container. If you do not mount a host path, the files that you edit may be lost when you stop the Docker container. When you stop the Docker container, it also attempts to terminate all interactive Spark jobs that are running. You can use one of the following methods to prevent file loss:

      • When you start the interactive JupyterLab development environment, mount a host path to the Docker container and store the job files in the corresponding file path. Specify the parameter in the -v {Host file path}:{Docker file path} format. You can specify a custom file path for the Docker container. Recommended value: /root/jupyter.

      • Before you stop the Docker container, make sure that all files are copied and stored elsewhere.

      Example: -v /home/admin/notebook:/root/jupyter. In this example, the host files that are stored in the /home/admin/notebook path are mounted to the /root/jupyter path of the Docker container.

      Note
      Save the notebook files that you edit to the mounted path of the Docker container, which is /root/jupyter in this example. After you stop the Docker container, you can view the corresponding files in the /home/admin/notebook path of the host. After you restart the Docker container, you can continue to modify and execute the files. For more information, see Volumes.

    • -d (required): The ID of the AnalyticDB for MySQL Data Lakehouse Edition (V3.0) cluster. You can log on to the AnalyticDB for MySQL console and go to the Clusters page to view cluster IDs.

    • -r (required): The name of the resource group in the AnalyticDB for MySQL cluster. You can log on to the AnalyticDB for MySQL console, choose Cluster Management > Resource Management in the left-side navigation pane, and then click the Resource Groups tab to view resource group names.

    • -e (required): The endpoint of the AnalyticDB for MySQL cluster. For more information, see Endpoints.

    • -i (required): The AccessKey ID of the Resource Access Management (RAM) user. For information about how to view the AccessKey ID, see Accounts and permissions.

    • -k (required): The AccessKey secret of the RAM user. For information about how to view the AccessKey secret, see Accounts and permissions.

    Example:

    docker run -it  -p 8888:8888 -v /home/admin/notebook:/root/jupyter registry.cn-hangzhou.aliyuncs.com/adb-public-image/adb-spark-public-image:livy.0.2.pre -d amv-bp164l3xt9y3**** -r test -e adb.aliyuncs.com -i LTAI55stlJn5GhpBDtN8**** -k DlClrgjoV5LmwBYBJHEZQOnRF7****

    After you start the interactive JupyterLab development environment, output similar to the following is returned. Copy the http://127.0.0.1:8888/lab?token=1e2caca216c1fd159da607c6360c82213b643605f11ef291 URL from the output into your browser to use JupyterLab to connect to AnalyticDB for MySQL Spark.

    [I 2023-11-24 09:55:09.852 ServerApp] nbclassic | extension was successfully loaded.
    [I 2023-11-24 09:55:09.852 ServerApp] sparkmagic extension enabled!
    [I 2023-11-24 09:55:09.853 ServerApp] sparkmagic | extension was successfully loaded.
    [I 2023-11-24 09:55:09.853 ServerApp] Serving notebooks from local directory: /root/jupyter
    [I 2023-11-24 09:55:09.853 ServerApp] Jupyter Server 1.24.0 is running at:
    [I 2023-11-24 09:55:09.853 ServerApp] http://419e63fc7821:8888/lab?token=1e2caca216c1fd159da607c6360c82213b643605f11ef291
    [I 2023-11-24 09:55:09.853 ServerApp]  or http://127.0.0.1:8888/lab?token=1e2caca216c1fd159da607c6360c82213b643605f11ef291
    [I 2023-11-24 09:55:09.853 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
    Note

    If an error message is returned when you start the interactive JupyterLab development environment, you can view the proxy_{timestamp}.log file for troubleshooting.
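
    After JupyterLab opens in your browser, you can run a quick sanity check in a notebook before you start development. The following is a minimal sketch, assuming that you create a notebook with the PySpark kernel provided by sparkmagic, which supplies a remote spark session that runs on AnalyticDB for MySQL Spark; the commented query uses placeholder database and table names.

    # The sparkmagic PySpark kernel creates the remote `spark` session for you;
    # do not create your own SparkSession or SparkContext.
    import sys
    print(sys.version)     # reports the Python version on the Spark side (3.7 per the usage notes)

    df = spark.range(100)  # builds a small test DataFrame on the cluster
    print(df.count())      # prints 100 if the session is healthy

    # Hypothetical query; replace your_db.your_table with a table that exists in your cluster.
    # spark.sql("SELECT * FROM your_db.your_table LIMIT 10").show()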