This topic describes how to publish a DataStream job.

Prerequisites

A Realtime Compute for Apache Flink project is created.
Notice
  • Flink DataStream is supported only by Realtime Compute for Apache Flink in exclusive mode that runs Blink 3.2.2 or later. We recommend that you use Blink 3.4.0 or later.
  • DataStream jobs do not support resource configuration optimization or the setting of a start offset. For Blink versions earlier than 3.4.0, we recommend that you use the default configurations when you publish and start a job.

Procedure

  1. Log on to the Realtime Compute for Apache Flink console. In the top navigation bar, click Development.
  2. On the Development page, click the Create File icon.
  3. In the Create File dialog box, configure job parameters.
    Parameter      Description
    File Name      The name of the custom job. The file name must be unique in the current project.
    File Type      The type of the job. Set this parameter to FLINK_STREAM / DATASTREAM.
                   Note You must set File Type to FLINK_STREAM / DATASTREAM for both DataStream jobs and Table API jobs. If this type is not displayed, submit a ticket.
    Storage Path   The path in which the job is stored.
  4. In the left-side navigation pane, click the Resources tab.
  5. On the Resources tab, click Create Resource. In the dialog box that appears, configure required parameters, upload the JAR package of the developed DataStream job, and then click OK.
    Note The maximum size of the JAR package that can be uploaded is 300 MB. If the JAR package exceeds 300 MB, you must upload it to the Object Storage Service (OSS) bucket that is bound to your cluster or use APIs to upload it.
  6. In the left-side navigation pane, find the uploaded package, click More in the Actions column, and click Reference.
  7. On the Development page for the job, configure required parameters.
    blink.main.class=<Complete main class name>
    -- The full name of the main class, for example, com.alibaba.realtimecompute.DemoTableAPI.
    blink.job.name=<Job name>
    -- The name of the job, for example, datastream_test.
    blink.main.jar=<Resource name of the JAR package that contains the main class>
    -- The resource name of the JAR package that contains the main class, for example, blink_datastream.jar.
    • blink.main.class and blink.job.name are required. Make sure that the value of blink.job.name is the same as the file name that you entered in Step 3. If they are different, the file name that you entered in Step 3 is used.
    • You must set the blink.main.jar parameter if you upload multiple JAR packages.
    • You can configure other custom parameters as required and then reference them in Realtime Compute for Apache Flink. For an illustrative sketch of how a main class can read custom parameter values from the code, see the example after this list.
    • Do not use spaces when you configure parameters.
    • In Blink 3.2.0 and later versions, you do not need to set the directory where the checkpoint file is stored. The system automatically generates the directory.
    • In Blink 3.4.0 and later versions, the parameter configurations in the code of the JAR package take precedence over the parameter configurations in Realtime Compute for Apache Flink. For example:
      • If the statebackend parameter is configured both in custom parameters and the code of the JAR package, the configuration of this parameter in the code of the JAR package is used.
      • If the statebackend parameter is configured neither in the custom parameters nor in the code of the JAR package, the default niagara statebackend parameter in the job template of the Realtime Compute for Apache Flink development platform is used.
        Note Exercise caution when you delete default parameters from the job template. If you delete the default parameters, checkpoint generation and fault tolerance for the job may fail. The blink.job.name parameter is an exception: the job name configured in env.execute("jobname") in the code is replaced with the job name that you configured when you created the job. This ensures that the value of the blink.job.name parameter is consistent with the job name that you configured. The job names in metrics, including custom metrics, must also be the same as the job names that you configured when you created the jobs.
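    The following sketch shows what a minimal main class in the uploaded JAR package might look like. It is illustrative only: the package, class, and parameter names are assumptions that are not defined in this topic, and it assumes that custom parameters are passed to the main method as program arguments.

      // Illustrative sketch only. The package, class, and parameter names are
      // assumptions; they are not taken from this topic.
      package com.alibaba.realtimecompute;

      import org.apache.flink.api.java.utils.ParameterTool;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

      public class DemoDataStream {
          public static void main(String[] args) throws Exception {
              StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

              // Assumption: custom parameters configured on the Development page are
              // passed to the main method as program arguments and can be read with
              // the standard Flink ParameterTool utility.
              ParameterTool params = ParameterTool.fromArgs(args);
              String greeting = params.get("greeting", "hello"); // hypothetical custom parameter

              // If a state backend is also set here in code, that code-level setting
              // takes precedence over the custom parameters in Blink 3.4.0 and later,
              // as described in the list above.
              env.fromElements("blink", "datastream")
                 .map(value -> greeting + " " + value)
                 .print();

              // The job name passed to env.execute() is replaced with the job name
              // that you configured when you created the job (see the Note above).
              env.execute("datastream_test");
          }
      }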
  8. Publish the job.
    • Blink versions earlier than 3.4.0
      1. Configure resources.
        Specify the resource configuration mode as required. We recommend that you use the default configuration the first time you publish and start the job.
        Note Realtime Compute for Apache Flink supports manual resource configuration. For more information, see Performance optimization by manual configuration.
      2. Check data.
        Check parameter settings and click Next.
      3. Publish the job.
        Click Publish.

    • Blink 3.4.0 and later
      1. Click Publish.
      2. Specify Resource Configuration Method.
        • Code Configuration: uses the resource configurations in the code. This method is consistent with the behavior of open source Flink. For an illustrative sketch of operator-level settings in code, see the example after this procedure.
        • Manual: uses the resource configurations that are manually adjusted on the Resource Configuration page.
          1. On the right side of the Development page, click the Configurations tab. Then, choose Configurations > Reacquire Configuration.
          2. Modify the configurations as required.
          3. Choose Configurations > Apply to save the configurations.
          Note During manual configuration, the resource configurations in the code take precedence over those displayed on the Realtime Compute for Apache Flink development platform. For example, if the resources of some operators are configured in the code, the configurations of these operators on the development platform become invalid, and the resource configurations in the code are used while the job runs. For resources that are not configured in the code, the configurations displayed on the development platform are used.
      3. Click Next to check the settings, or click Skip Check.
      4. Click Publish.
  9. On the Administration page, find the target job and click Start in the Actions column.
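
If you select Code Configuration in Step 8 for Blink 3.4.0 and later, the resource-related settings written in the JAR code are used as-is. The following sketch only illustrates the idea with operator parallelism, which is also available in open source Flink; it is an assumption-based example, not the complete set of resource settings that Blink supports.

    // Illustrative sketch only: operator parallelism set in code, which is used
    // as-is when the Code Configuration method is selected.
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ResourceInCodeDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(2); // default parallelism for the whole job

            env.fromElements(1, 2, 3, 4)
               .map(n -> n * n)
               .setParallelism(4)   // per-operator parallelism configured in code
               .print();

            env.execute("resource_in_code_demo");
        }
    }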