Fully managed Flink allows you to develop and run jobs that are packaged as JAR files. This topic describes how to activate fully managed Flink, create JAR streaming jobs and batch jobs, and publish and run these jobs.

Prerequisites

  • An Alibaba Cloud account is created and has a sufficient balance.
    • For more information about how to create an Alibaba Cloud account, go to the Sign up to Alibaba Cloud page.
    • The Alibaba Cloud account has a balance of at least USD 100 or a voucher or coupon of the equivalent value.
  • A role is assigned to your Alibaba Cloud account. For more information, see Assign a role to an Alibaba Cloud account.
  • The test JAR package and input data file are downloaded to your on-premises machine.
    • You can click FlinkQuickStart-1.0-SNAPSHOT.jar to download the test JAR package.
      Note This test JAR package counts the number of times each word appears. If you want to analyze the source code, click FlinkQuickStart.zip to download the package and compile the code.
    • Click Shakespeare to download the input data file Shakespeare.
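The test JAR implements a word count over the Shakespeare input file. As a rough illustration of the logic it performs, here is a plain-Java sketch; the actual FlinkQuickStart source uses the Flink DataStream and DataSet APIs, and the class name below is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical plain-Java sketch of the word-count logic that the
// test JAR performs. The real job expresses the same computation with
// Flink operators (flatMap to tokenize, keyBy + sum to count).
public class WordCountSketch {
    // Splits a line on non-word characters and tallies lower-cased words.
    public static Map<String, Integer> count(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A sample line in the style of the Shakespeare input file.
        System.out.println(count("To be, or not to be"));
        // prints {to=2, be=2, or=1, not=1}
    }
}
```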

Procedure

  1. Step 1: Create a fully managed Flink instance
    Create a fully managed Flink instance that uses the pay-as-you-go billing method in the China (Beijing) region.
  2. Step 2: Create a JAR job and configure the job information
    Fully managed Flink allows you to create JAR streaming jobs and batch jobs. You can create a job on the Draft Editor page based on your business requirements.
  3. Step 3: Start the job and view the computing result of the job
    After you start the created job on the Deployments page, you can view the computing result of the job.

Step 1: Create a fully managed Flink instance

  1. Log on to the Realtime Compute for Apache Flink console.
  2. In the Fully Managed Flink section, click Purchase.
  3. On the buy page, enter the configuration information.
    Section Parameter Example value Description
    Basic Configurations Billing Method Pay-as-you-go The Subscription and Pay-as-you-go billing methods are supported.
    Region Beijing Select a region in which fully managed Flink is available, such as China (Beijing) or Singapore (Singapore).
    Note We recommend that you select the same region as the upstream and downstream storage.
    Zone Zone F We recommend that you select the same zone as the upstream and downstream storage.
    Network SLB Service - The SLB service is selected and activated by default.
    VPC flink-test-vpc We recommend that you select the same VPC as the upstream and downstream storage.
    vSwitch flinktest-vsw-2ze4fyq366itq6xqp**** An IP address is assigned to each TaskManager instance and each JobManager instance of a Flink job. You can select one to five vSwitches to properly plan the Classless Inter-Domain Routing (CIDR) blocks based on the scale of Flink jobs.
    Workspace Workspace Name flink-test The name must be 1 to 60 characters in length and can contain letters, digits, and hyphens (-). It must start with a letter.
    Storage OSS Bucket flink-test-oss The OSS bucket is used to store job information, such as checkpoints, logs, and JAR packages. The fully managed Flink service creates the following directories in the bucket that you select to store different types of data:
    • artifacts: The uploaded JAR files are stored in this directory.
    • flink-jobs: The high availability (HA) information and checkpoints of Flink jobs are stored in this directory.
    • flink-savepoints: If you click Savepoint in the console of fully managed Flink, the savepoint operation is triggered and the final savepoint file is stored in this directory.
    • logs: If you set Log Template to OSS for your job, the logs of your job are stored in this directory.
    • sql-artifacts: Files on which user-defined functions (UDFs) and connectors depend are stored in this directory.
    • plan: In Expert mode, the configured resource information is stored in this directory.
    • flink-sessionclusters: The HA information and checkpoints of session clusters are stored in this directory.
    Note
    • After the fully managed Flink service is activated, OSS Bucket cannot be changed.
    • The OSS bucket must be in the same region as the fully managed Flink service.
    • For more information about how to select an OSS bucket, see Usage notes.
    Monitoring Monitoring Service - Prometheus Service is selected and activated by default.
  4. Click Confirm Order and complete the payment to activate the fully managed Flink service.
    Note After you complete the payment, click Console. On the Fully Managed Flink tab, you can view the workspace that is being created. In most cases, the workspace can be created in 5 to 10 minutes after you complete the payment.

Step 2: Create a JAR job and configure the job information

Streaming job

  1. Create a JAR streaming job.
    1. In the left-side navigation pane, click Draft Editor.
    2. Click New.
    3. In the New Draft dialog box, configure the parameters of the job. The following table describes the parameters.
      Parameter Example value Description
      Name flink-streaming-test-jar The name of the job.
      Note The job name must be unique in the current project.
      Type STREAM / JAR Streaming jobs and batch jobs support the following file types:
      • SQL
      • JAR
      • PYTHON
      Deployment Target vvp-workload The name of the Flink cluster to which you want to deploy the job. Fully managed Flink supports per-job clusters and session clusters. For more information about the differences between the two types of clusters, see Configure a development and test environment (session cluster).
      Locate Development The folder in which you want to store the code file of the job. By default, the code file of the job is stored in the Development folder.

      You can also click the Create Folder icon on the right side of an existing folder to create a subfolder.

    4. Click OK.
  2. On the Draft Editor page, configure basic settings.
    You can directly configure basic settings. You can also click YAML in the lower-right corner of the page to modify the existing settings. The following table describes the parameters.
    Parameter Example value Description
    Deployment Target vvp-workload You can change the cluster that you selected when you create the job.
    Jar URI oss://flink-test-oss/artifacts/namespaces/flink-test-default/FlinkQuickStart-1.0-SNAPSHOT.jar You can click FlinkQuickStart-1.0-SNAPSHOT.jar to download the test JAR package, and click the Upload icon on the right side of the Jar URI field to select the test JAR package and upload the package.
    Entrypoint Class org.example.WordCountStreaming The entry point class of the program. If the manifest of the JAR package does not specify a main class, enter the fully qualified name of the entry point class in this field.
    Note In this example, the test JAR package contains both streaming job code and batch job code. Therefore, you must configure this parameter to specify a program entry point for the streaming job.
    Entrypoint main args --input oss://flink-test-oss/artifacts/namespaces/flink-test-default/Shakespeare The OSS path of the input data file. The value is passed to the program as a main argument.
    Note
    • In this example, the input data file and the test JAR package are stored in a bucket named flink-test-oss in the OSS console.
    • You can click Shakespeare to download the input data file Shakespeare. You must also upload the input data file Shakespeare to the specified OSS bucket on the Artifacts page in the console of fully managed Flink. In this example, the uploaded file is stored in the oss://flink-test-oss/artifacts/namespaces/flink-test-default directory.
    Additional Dependencies OSS bucket in which the required dependency file is stored or URL of the dependency file You can enter the OSS bucket in which the required dependency file is stored or the URL of the dependency file.
    Parallelism 1 The parallelism of the job, which is the number of subtasks that run in parallel.
  3. Click Publish.
  4. Click Confirm.
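The value of Entrypoint main args is passed to the JAR as ordinary `main` arguments. Flink programs commonly read such flags with Flink's ParameterTool; the dependency-free helper below is a hypothetical stand-in that shows the same idea:

```java
// Hypothetical sketch of how a JAR entry point can read the "--input"
// value that fully managed Flink passes through "Entrypoint main args".
// The real FlinkQuickStart code may use ParameterTool.fromArgs(args)
// instead of this hand-rolled helper.
public class EntrypointArgs {
    // Returns the value following "--<key>", or the fallback if absent.
    public static String get(String[] args, String key, String fallback) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--" + key)) {
                return args[i + 1];
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        String input = get(args, "input", "no-input-set");
        System.out.println("Reading from: " + input);
    }
}
```

For the streaming job above, `get(args, "input", …)` would return the OSS path of the Shakespeare file.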

Batch job

  1. Create a JAR batch job.
    1. In the left-side navigation pane, click Draft Editor.
    2. Click New.
    3. In the New Draft dialog box, configure the parameters of the job. The following table describes the parameters.
      Parameter Example value Description
      Name flink-batch-test-jar The name of the job.
      Note The job name must be unique in the current project.
      Type BATCH / JAR Streaming jobs and batch jobs support the following file types:
      • SQL
      • JAR
      • PYTHON
      Deployment Target vvp-workload The name of the Flink cluster to which you want to deploy the job. Fully managed Flink supports per-job clusters and session clusters. For more information about the differences between the two types of clusters, see Configure a development and test environment (session cluster).
      Locate Development The folder in which you want to store the code file of the job. By default, the code file of the job is stored in the Development folder.

      You can also click the Create Folder icon on the right side of an existing folder to create a subfolder.

    4. Click OK.
  2. On the Draft Editor page, configure basic settings.
    You can directly configure the basic settings. You can also click YAML in the lower-right corner of the page to modify the existing settings. The following table describes the parameters.
    Parameter Example value Description
    Deployment Target vvp-workload You can change the cluster that you selected when you create the job.
    Jar URI oss://flink-test-oss/artifacts/namespaces/flink-test-default/FlinkQuickStart-1.0-SNAPSHOT.jar You can click FlinkQuickStart-1.0-SNAPSHOT.jar to download the test JAR package, and click the Upload icon on the right side of the Jar URI field to select the test JAR package and upload it.
    Entrypoint Class org.example.WordCountBatch The entry point class of the program. If the manifest of the JAR package does not specify a main class, enter the fully qualified name of the entry point class in this field.
    Note In this example, the test JAR package contains both streaming job code and batch job code. Therefore, you must configure this parameter to specify a program entry point for the batch job.
    Entrypoint main args --input oss://flink-test-oss/artifacts/namespaces/flink-test-default/Shakespeare --output oss://flink-test-oss/artifacts/namespaces/flink-test-default/batch-quickstart-test-output.txt The OSS paths of the input data file and the output data file. The values are passed to the program as main arguments.
    Note
    • In this example, the input data file, output file, and test JAR package are stored in a bucket named flink-test-oss in the OSS console.
    • This example shows how to configure this parameter to write the output data to the specified OSS bucket. You need only to specify the directory and name of the output data file. You do not need to create a directory in advance.
    • Click Shakespeare to download the input data file Shakespeare. You must also upload the input data file to the specified directory in OSS on the Artifacts page in the console of fully managed Flink. In this example, the uploaded file is stored in the oss://flink-test-oss/artifacts/namespaces/flink-test-default directory.
    Additional Dependencies OSS bucket in which the required dependency file is stored or URL of the dependency file You can enter the OSS bucket in which the required dependency file is stored or the URL of the dependency file.
    Parallelism 1 The parallelism of the job, which is the number of subtasks that run in parallel.
  3. Click Publish.
  4. Click Confirm.
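The batch entry point reads the --input file, counts words, and writes the result to the --output path. The hedged plain-Java sketch below mirrors that flow with a local file standing in for OSS; the class name and the "word count" output format are illustrative, not necessarily the actual job's code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the batch word-count flow: read an input file,
// count words, and write one "word count" line per word to the output
// path. The real job writes to OSS through a Flink sink instead.
public class BatchWordCountSketch {
    public static void run(Path input, Path output) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : Files.readAllLines(input)) {
            for (String token : line.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        List<String> lines = new ArrayList<>();
        counts.forEach((word, n) -> lines.add(word + " " + n));
        Files.write(output, lines);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("shakespeare", ".txt");
        Files.write(in, List.of("To be, or not to be"));
        Path out = Files.createTempFile("wordcount", ".txt");
        run(in, out);
        Files.readAllLines(out).forEach(System.out::println);
    }
}
```

Because no output directory exists in advance, the sink simply creates the target object, which matches the note above that you only specify the output path.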

Step 3: Start the job and view the computing result of the job

  1. In the left-side navigation pane, click Deployments.
  2. Find the job that you want to start and click Start in the Actions column.
  3. Click Confirm Running.
    After you click Start, you can view the transition process from the current state to the desired state, and the final result. When the state changes to RUNNING, the job is running properly.
    Notice If you want to start a batch job, you must change the job type from STREAM to BATCH in the upper-right corner of the Deployments page. By default, the system displays streaming jobs.
  4. View the computing result of the job.
    • Computing result of a streaming job: On the Deployments page, click the name of the job whose computing result you want to query. On the Task Manager tab, view the computing result in the logs.
    • Computing result of a batch job: Log on to the OSS console and view the result in the directory in which the output data file is stored.
      In this example, the output data file is stored at oss://flink-test-oss/artifacts/namespaces/flink-test-default/batch-quickstart-test-output.txt.

References