This topic describes the background information, limits, job development and debugging methods, and connector usage for Python API jobs in fully managed Flink.
- Only Apache Flink 1.12 is supported.
- Python 3.7.9 is pre-installed in fully managed Flink clusters, and common Python libraries such as pandas, NumPy, and PyArrow are also included in this environment. Therefore, you must develop your code in Python 3.7.
- Java Development Kit (JDK) 1.8 is used in the running environment of fully managed Flink. If your Python job depends on a third-party JAR package, make sure that the JAR package is compatible with JDK 1.8.
- Only open source Scala 2.11 is supported. If your Python job depends on a third-party JAR package, make sure that the JAR package that corresponds to Scala 2.11 is used.
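Because jobs must target the pre-installed Python 3.7 runtime, it can help to check your local interpreter before you package a job. The sketch below is illustrative only; the helper name `is_supported_python` is not part of any Flink API.

```python
import sys


def is_supported_python(version_info=sys.version_info):
    """Return True if the interpreter is Python 3.7.x,
    matching the version pre-installed in fully managed Flink."""
    return version_info[0] == 3 and version_info[1] == 7


# Warn early during development instead of failing at deployment time.
if not is_supported_python():
    print("Warning: develop against Python 3.7 to match the cluster runtime.")
```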
Develop a job
Debug a job
You can call the Python logging module in your UDF code. After logs are generated, you can view them in the log file of the TaskManager.

```python
import logging

from pyflink.table import DataTypes
from pyflink.table.udf import udf


@udf(result_type=DataTypes.BIGINT())
def add(i, j):
    logging.info("hello world")
    return i + j
```
Use a connector
- Log on to the Realtime Compute for Apache Flink console.
- On the Fully Managed Flink tab, find the workspace that you want to manage, and click Console in the Actions column.
- In the left-side navigation pane, click Artifacts.
- Click Upload Artifact and select the JAR package of the connector that you want to use.
You can upload the JAR package of your self-managed connector or the JAR package of a connector provided by fully managed Flink. For the download links of the official JAR packages provided by fully managed Flink, see Connectors.
- In the Additional Dependencies section of the Drafts page, select the JAR package of the connector that you want to use.
- On the right side of the Drafts page, click the Advanced tab and enter the relevant configurations in the Additional Configuration section.
For example, if the job depends on two connector JAR packages named connector-1.jar and connector-2.jar, add the following configuration information:
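A configuration entry of the following shape is commonly used for this purpose. The `/flink/usrlib/` path below assumes the default directory to which fully managed Flink uploads additional dependencies; confirm the actual path for your workspace before relying on it.

```yaml
pipeline.classpaths: 'file:///flink/usrlib/connector-1.jar;file:///flink/usrlib/connector-2.jar'
```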