This guide shows you how to create, deploy, and start streaming and batch Flink JAR jobs to demonstrate the workflow in Realtime Compute for Apache Flink.
Prerequisites
-
If you use a RAM user or RAM role, ensure you have the required permissions for the Flink console. For more information, see Permission management.
-
You have created a Flink workspace. For more information, see Activate Realtime Compute for Apache Flink.
Step 1: Develop the JAR package
The Development Console of Realtime Compute for Apache Flink does not provide an integrated development environment (IDE) for JAR packages. You must develop, compile, and package your job locally. For more information about how to configure dependencies, use connectors, and read dependent files from Object Storage Service (OSS), see Develop a Flink JAR job.
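Ensure that the Flink version used for local development matches the engine version that you select in Step 3: Deploy the JAR job. Also check the scope of each dependency: Flink's own libraries are typically declared with the provided scope so that they are not bundled into the JAR package, whereas libraries that the cluster does not supply must be packaged with the job.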
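The version check described above can be sketched as a small local helper. This is only an illustration: it assumes that engine version labels follow the `vvr-<VVR version>-flink-<major.minor>` pattern seen in this guide's example (`vvr-8.0.9-flink-1.17`), and it treats "matches" as "same Flink major.minor version". The function names are hypothetical and not part of any Flink or console API.

```python
import re


def flink_minor_from_engine(engine_version: str) -> str:
    """Extract the Flink major.minor part from a label such as 'vvr-8.0.9-flink-1.17'.

    The label format is an assumption based on the example in this guide.
    """
    match = re.fullmatch(r"vvr-[\d.]+-flink-(\d+\.\d+)", engine_version)
    if match is None:
        raise ValueError(f"unrecognized engine version label: {engine_version!r}")
    return match.group(1)


def matches_engine(local_flink_version: str, engine_version: str) -> bool:
    """Return True if the local Flink version shares major.minor with the engine."""
    local_minor = ".".join(local_flink_version.split(".")[:2])
    return local_minor == flink_minor_from_engine(engine_version)


# A local build against Flink 1.17.2 lines up with the example engine version;
# a build against Flink 1.15.4 does not.
print(matches_engine("1.17.2", "vvr-8.0.9-flink-1.17"))
print(matches_engine("1.15.4", "vvr-8.0.9-flink-1.17"))
```

A check like this could run in a local build script before you upload the JAR package, so that a version mismatch is caught before deployment.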
To help you get started quickly with Flink JAR jobs, this guide provides a sample word-counting JAR package and a data file. You can download these files to complete the following steps.
-
Click FlinkQuickStart-1.0-SNAPSHOT.jar to download the test JAR package.
If you are interested in the source code, click FlinkQuickStart.zip to download and compile it.
-
Click Shakespeare to download the data file.
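To give a feel for what a word-counting job computes, here is a minimal plain-Python sketch. It is not the source code of the sample JAR package, and the tokenization rule (lowercase, then runs of letters and apostrophes) is an assumption made for illustration; the sample job may split words differently.

```python
import re
from collections import Counter


def count_words(text: str) -> Counter:
    """Count word occurrences, ignoring case and punctuation (assumed rule)."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Hypothetical input line; the real job reads the Shakespeare data file.
sample = "To be, or not to be: that is the question."
for word, count in sorted(count_words(sample).items()):
    print(word, count)
```

The Flink job performs the same kind of aggregation, but distributes the work across TaskManagers and reads its input from OSS.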
Step 2: Upload the JAR package and data file
-
Log on to the Realtime Compute console.
-
Find the target Flink workspace and click Console in the Actions column.
-
In the left-side navigation pane, click Artifacts.
-
Click Upload Artifact to upload the JAR package and data file.
In this tutorial, upload the FlinkQuickStart-1.0-SNAPSHOT.jar and Shakespeare files that you downloaded in Step 1. For more information about file storage paths, see Artifacts.
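The example paths used in this guide follow the layout `oss://<Your-OSS-Bucket-Name>/artifacts/namespaces/<Your-Namespace>/<file name>`. The helper below is a small sketch that assembles such a path; the function name and the bucket and namespace values are hypothetical, and the path shown on the Artifacts page remains the authoritative value.

```python
def artifact_uri(bucket: str, namespace: str, filename: str) -> str:
    """Build an artifact path using the layout shown in this guide's examples."""
    for label, value in (("bucket", bucket), ("namespace", namespace), ("filename", filename)):
        if not value or "/" in value:
            raise ValueError(f"invalid {label}: {value!r}")
    return f"oss://{bucket}/artifacts/namespaces/{namespace}/{filename}"


# 'my-bucket' and 'my-namespace' are placeholder values for illustration.
print(artifact_uri("my-bucket", "my-namespace", "Shakespeare"))
```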
Step 3: Deploy the JAR job
Streaming job
-
On the page, click Create Deployment and select JAR Deployment.
-
Configure the deployment parameters.
Parameter
Description
Example
Deployment mode
Select Stream Mode as the deployment mode.
Stream Mode
Deployment name
Enter a name for the JAR job.
flink-streaming-test-jar
Engine version
The Flink engine version for the job.
We recommend that you use a version with the RECOMMENDED or STABLE tag for better reliability and performance. For more information, see Release notes and Engine versions.
vvr-8.0.9-flink-1.17
JAR URI
Select the FlinkQuickStart-1.0-SNAPSHOT.jar file that you uploaded in Step 2, or click the upload icon to upload your own JAR package. If the file already exists on the Artifacts page, you can select it directly.
Note: Realtime Compute for Apache Flink engine VVR 8.0.6 and later can only access the bucket bound to the workspace.
-
Entry point class
The program's entry point. If the JAR package does not specify a main class, you must enter its fully qualified class name.
The test JAR package provided in this document contains code for both a streaming job and a batch job. Therefore, you must specify the entry point for the streaming job.
org.example.WordCountStreaming
Entry point main arguments
The arguments to pass to the main method.
For this tutorial, enter the storage path of the Shakespeare input data file. You can copy the full path of the Shakespeare file from the Artifacts page.
--input oss://<Your-OSS-Bucket-Name>/artifacts/namespaces/<Your-Namespace>/Shakespeare
Deployment target
From the drop-down list, select a target resource queue or session cluster. For more information, see Manage resource queues and Step 1: Create a session cluster.
Important: Jobs that are deployed to a session cluster do not support alert monitoring, alert configuration, or auto tuning. Session clusters are intended for development and testing; do not use them in production environments. For more information, see Debug a job.
default-queue
For more information about configuration parameters, see Deploy a job.
-
Click Deploy.
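The value of Entry point main arguments is passed to the program's main method as a list of tokens. The sketch below shows, in plain Python, one way a program might turn `--key value` pairs into a lookup table. It is not the sample JAR's actual code, and the bucket and namespace names are placeholders.

```python
import shlex
from typing import Dict, List


def parse_main_args(args: List[str]) -> Dict[str, str]:
    """Parse alternating '--key value' tokens into a dictionary."""
    params: Dict[str, str] = {}
    i = 0
    while i < len(args):
        token = args[i]
        if not token.startswith("--"):
            raise ValueError(f"expected an option starting with '--', got {token!r}")
        if i + 1 >= len(args) or args[i + 1].startswith("--"):
            raise ValueError(f"missing value for {token}")
        params[token[2:]] = args[i + 1]
        i += 2
    return params


# The argument string as it would be typed into the deployment form,
# with placeholder bucket and namespace names.
raw = "--input oss://my-bucket/artifacts/namespaces/my-namespace/Shakespeare"
print(parse_main_args(shlex.split(raw)))
```

A parser like this fails fast when an option has no value, which is easier to diagnose than a job that starts and then cannot find its input.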
Batch job
-
On the page, click Create Deployment and select JAR Deployment.
-
Configure the deployment parameters.
Parameter
Description
Example
Deployment mode
Select Batch Mode as the deployment mode.
Batch Mode
Deployment name
Enter a name for the JAR job.
flink-batch-test-jar
Engine version
The Flink engine version for the job.
We recommend that you use a version with the RECOMMENDED or STABLE tag for better reliability and performance. For more information, see Release notes and Engine versions.
vvr-8.0.9-flink-1.17
JAR URI
Select the FlinkQuickStart-1.0-SNAPSHOT.jar file that you uploaded in Step 2, or click the upload icon to upload your own JAR package.
-
Entry point class
The program's entry point. If the JAR package does not specify a main class, you must enter its fully qualified class name.
The test JAR package provided in this document contains code for both a streaming job and a batch job. Therefore, you must specify the entry point for the batch job.
org.example.WordCountBatch
Entry point main arguments
The arguments to pass to the main method.
For this tutorial, enter the storage paths of the Shakespeare input data file and the batch-quickstart-test-output.txt output data file. You can copy the full path of the Shakespeare file from the Artifacts page.
Note: Specify the full path for the output file. The system creates this file automatically. In this tutorial, the output file is placed in the same directory as the input file.
--input oss://<Your-OSS-Bucket-Name>/artifacts/namespaces/<Your-Namespace>/Shakespeare
--output oss://<Your-OSS-Bucket-Name>/artifacts/namespaces/<Your-Namespace>/batch-quickstart-test-output.txt
Deployment target
From the drop-down list, select a target resource queue or session cluster. For more information, see Manage resource queues and Step 1: Create a session cluster.
Important: Jobs that are deployed to a session cluster do not support alert monitoring, alert configuration, or auto tuning. Session clusters are intended for development and testing; do not use them in production environments. For more information, see Debug a job.
default-queue
For more information about configuration parameters, see Deploy a job.
-
Click Deploy.
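Because the batch job in this tutorial writes its output next to its input, the output path can be derived from the input path that you copy from the Artifacts page. The following sketch shows one way to do that; the helper names and the bucket and namespace values are hypothetical.

```python
def sibling_uri(input_uri: str, output_name: str) -> str:
    """Return a path for output_name in the same directory as input_uri."""
    if "/" not in input_uri.split("://", 1)[-1]:
        raise ValueError(f"input path has no directory part: {input_uri!r}")
    directory = input_uri.rsplit("/", 1)[0]
    return f"{directory}/{output_name}"


def batch_main_args(input_uri: str, output_name: str) -> str:
    """Assemble the main-arguments string for the batch deployment."""
    return f"--input {input_uri} --output {sibling_uri(input_uri, output_name)}"


# Placeholder bucket and namespace names.
source = "oss://my-bucket/artifacts/namespaces/my-namespace/Shakespeare"
print(batch_main_args(source, "batch-quickstart-test-output.txt"))
```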
Step 4: Start the job and view the results
Streaming job
-
On the page, find the target job and click Start in the Actions column.
-
Select Stateless Start and click Start. For more information about how to start a job, see Start a job.
-
After the job status changes to RUNNING, view the results of the streaming job.
In the TaskManager log file that ends with .out, search for shakespeare to view the Flink results.
To find the log file, go to the Logs tab, select a running TaskManager under Running Task Managers, and then click the Log List tab. Open the flink.out file and search for shakespeare to locate the results, such as (shakespeare,1).
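If you copy the contents of the .out file to your machine, the search described above can also be scripted. The sketch below assumes that results appear as `(word,count)` tuples, as in the `(shakespeare,1)` example; the surrounding sample lines are made up for illustration.

```python
import re
from typing import List

# Matches tuples such as (shakespeare,1).
TUPLE_PATTERN = re.compile(r"\((?P<word>[^,()]+),(?P<count>\d+)\)")


def find_counts(log_text: str, word: str) -> List[int]:
    """Return every count printed for the given word, in log order."""
    return [
        int(match.group("count"))
        for match in TUPLE_PATTERN.finditer(log_text)
        if match.group("word") == word
    ]


# Hypothetical log excerpt; only the tuple format comes from this guide.
sample_log = "(hamlet,1)\n(shakespeare,1)\n(king,1)\n"
print(find_counts(sample_log, "shakespeare"))
```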
Batch job
-
On the page, find the target job and click Start in the Actions column.
-
In the Start Job dialog box, click Start. For more information about how to start a job, see Start a job.
-
After the job status changes to FINISHED, view the results of the batch job.
Log on to the OSS console and view the results in the oss://<Your-OSS-Bucket-Name>/artifacts/namespaces/<Your-Namespace>/batch-quickstart-test-output.txt file.
The following is an example of the results from the batch-quickstart-test-output.txt file.
a 164
abhor 2
abide 2
able 1
about 1
above 4
absence 5
absent 4
abundance 4
abundant 1
abuse 3
abused 1
abuses 1
abysm 1
accents 1
acceptable 1
acceptance 1
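After you download the output file from OSS, you can load it for further checks. The sketch below assumes that the file consists of alternating word and count tokens separated by whitespace, as in the example results; the function name is hypothetical.

```python
from typing import Dict


def parse_counts(text: str) -> Dict[str, int]:
    """Parse whitespace-separated 'word count' pairs into a dictionary."""
    tokens = text.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating word and count tokens")
    return {word: int(count) for word, count in zip(tokens[0::2], tokens[1::2])}


# First few entries from the example results in this guide.
sample = "a 164\nabhor 2\nabide 2\nable 1\n"
counts = parse_counts(sample)
print(counts["a"], len(counts))
```

Because the parser pairs tokens rather than lines, it works whether each entry sits on its own line or several entries share a line.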
The streaming job and the batch job may show different numbers of result entries because the TaskManager .out log displays a maximum of 2,000 entries. For more information about this limitation, see Print.
(Optional) Step 5: Stop the job
To apply changes to a job (such as code modifications, WITH parameter updates, or version changes), you must redeploy the job, stop it, and then start it again. You must also stop and restart a job if you want to start it statelessly or apply configuration changes that do not take effect dynamically. For more information about stopping a job, see Stop a job.
Related documents
-
You can configure job resources before you start a job or modify the resources after the job is running. Basic (coarse-grained) and Expert (fine-grained) resource modes are supported. For more information, see Configure job resources.
-
Learn how to dynamically update job parameters and resources to apply changes faster and reduce interruptions from job restarts. For more information, see Dynamic scaling and parameter updates.
-
Configure log levels and specify separate outputs for different levels. For more information, see Configure job log output.
-
Follow a simple example to learn the complete development workflow for a Flink SQL job. For more information, see Flink SQL jobs.