This topic outlines the workflow, environment specifications, requirements, and connector management for Flink Python API (PyFlink) jobs.
Overview
PyFlink jobs follow a specific lifecycle: you develop the business logic in your local environment and subsequently deploy and execute the job in the Flink console. For a comprehensive walkthrough of this process, see Quick Start for Flink Python jobs.
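As a point of reference, the following is a minimal sketch of a complete PyFlink Table API job. The table names are illustrative, and the datagen and print connectors used here ship with Flink:

from pyflink.table import EnvironmentSettings, TableEnvironment

# Build a Table API environment in streaming mode.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: the built-in datagen connector generates 10 rows of random data.
t_env.execute_sql("""
    CREATE TABLE source (id BIGINT) WITH (
        'connector' = 'datagen',
        'number-of-rows' = '10'
    )
""")

# Sink: the built-in print connector writes rows to the TaskManager logs.
t_env.execute_sql("""
    CREATE TABLE sink (id BIGINT) WITH ('connector' = 'print')
""")

# Submit the job and block until it finishes.
t_env.execute_sql("INSERT INTO sink SELECT id FROM source").wait()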
VVR-11
The following software packages are pre-installed in the Flink workspace environment.
Software package | Version |
apache-beam | 2.48.0 |
avro-python3 | 1.10.2 |
brotlipy | 0.7.0 |
certifi | 2022.12.7 |
cffi | 1.15.1 |
charset-normalizer | 2.0.4 |
cloudpickle | 2.2.1 |
conda | 22.11.1 |
conda-content-trust | 0.1.3 |
conda-package-handling | 1.9.0 |
crcmod | 1.7 |
cryptography | 38.0.1 |
Cython | 3.0.12 |
dill | 0.3.1.1 |
dnspython | 2.7.0 |
docopt | 0.6.2 |
exceptiongroup | 1.3.0 |
fastavro | 1.12.1 |
fasteners | 0.20 |
find_libpython | 0.5.0 |
grpcio | 1.56.2 |
grpcio-tools | 1.56.2 |
hdfs | 2.7.3 |
httplib2 | 0.22.0 |
idna | 3.4 |
importlib_metadata | 8.7.0 |
iniconfig | 2.1.0 |
isort | 6.1.0 |
numpy | 1.24.4 |
objsize | 0.6.1 |
orjson | 3.9.15 |
packaging | 25.0 |
pandas | 2.3.3 |
pemja | 0.5.5 |
pip | 22.3.1 |
pluggy | 1.0.0 |
proto-plus | 1.26.1 |
protobuf | 4.25.8 |
py-spy | 0.4.0 |
py4j | 0.10.9.7 |
pyarrow | 11.0.0 |
pyarrow-hotfix | 0.6 |
pycodestyle | 2.14.0 |
pycosat | 0.6.4 |
pycparser | 2.21 |
pydot | 1.4.2 |
pymongo | 4.15.4 |
pyOpenSSL | 22.0.0 |
pyparsing | 3.2.5 |
PySocks | 1.7.1 |
pytest | 7.4.4 |
python-dateutil | 2.9.0 |
pytz | 2025.2 |
regex | 2025.11.3 |
requests | 2.32.5 |
ruamel.yaml | 0.18.16 |
ruamel.yaml.clib | 0.2.14 |
setuptools | 70.0.0 |
six | 1.16.0 |
tomli | 2.3.0 |
toolz | 0.12.0 |
tqdm | 4.64.1 |
typing_extensions | 4.15.0 |
tzdata | 2025.2 |
urllib3 | 1.26.13 |
wheel | 0.38.4 |
zipp | 3.23.0 |
zstandard | 0.25.0 |
VVR-8
The following software packages are pre-installed in the Flink workspace environment.
Software package | Version |
apache-beam | 2.43.0 |
avro-python3 | 1.9.2.1 |
certifi | 2025.7.9 |
charset-normalizer | 3.4.2 |
cloudpickle | 2.2.0 |
crcmod | 1.7 |
Cython | 0.29.24 |
dill | 0.3.1.1 |
docopt | 0.6.2 |
fastavro | 1.4.7 |
fasteners | 0.19 |
find_libpython | 0.4.1 |
grpcio | 1.46.3 |
grpcio-tools | 1.46.3 |
hdfs | 2.7.3 |
httplib2 | 0.20.4 |
idna | 3.10 |
isort | 6.0.1 |
numpy | 1.21.6 |
objsize | 0.5.2 |
orjson | 3.10.18 |
pandas | 1.3.5 |
pemja | 0.3.2 |
pip | 22.3.1 |
proto-plus | 1.26.1 |
protobuf | 3.20.3 |
py4j | 0.10.9.7 |
pyarrow | 8.0.0 |
pycodestyle | 2.14.0 |
pydot | 1.4.2 |
pymongo | 3.13.0 |
pyparsing | 3.2.3 |
python-dateutil | 2.9.0 |
pytz | 2025.2 |
regex | 2024.11.6 |
requests | 2.32.4 |
setuptools | 58.1.0 |
six | 1.17.0 |
typing_extensions | 4.14.1 |
urllib3 | 2.5.0 |
wheel | 0.33.4 |
zstandard | 0.23.0 |
VVR-6
The following software packages are pre-installed in the Flink workspace environment.
Software package | Version |
apache-beam | 2.27.0 |
avro-python3 | 1.9.2.1 |
certifi | 2024.8.30 |
charset-normalizer | 3.3.2 |
cloudpickle | 1.2.2 |
crcmod | 1.7 |
Cython | 0.29.16 |
dill | 0.3.1.1 |
docopt | 0.6.2 |
fastavro | 0.23.6 |
future | 0.18.3 |
grpcio | 1.29.0 |
hdfs | 2.7.3 |
httplib2 | 0.17.4 |
idna | 3.8 |
importlib-metadata | 6.7.0 |
isort | 5.11.5 |
jsonpickle | 2.0.0 |
mock | 2.0.0 |
numpy | 1.19.5 |
oauth2client | 4.1.3 |
pandas | 1.1.5 |
pbr | 6.1.0 |
pemja | 0.1.4 |
pip | 20.1.1 |
protobuf | 3.17.3 |
py4j | 0.10.9.3 |
pyarrow | 2.0.0 |
pyasn1 | 0.5.1 |
pyasn1-modules | 0.3.0 |
pycodestyle | 2.10.0 |
pydot | 1.4.2 |
pymongo | 3.13.0 |
pyparsing | 3.1.4 |
python-dateutil | 2.8.0 |
pytz | 2024.1 |
requests | 2.31.0 |
rsa | 4.9 |
setuptools | 47.1.0 |
six | 1.16.0 |
typing-extensions | 3.7.4.3 |
urllib3 | 2.0.7 |
wheel | 0.42.0 |
zipp | 3.15.0 |
Usage notes
When you develop PyFlink jobs, be aware of the following requirements imposed by deployment and network environments:
Only Apache Flink 1.13 and later are supported.
The Flink workspace comes with a pre-installed Python environment containing common libraries, such as pandas, NumPy, and PyArrow. See the vectorized UDF sketch after this list.
Note: Ververica Runtime (VVR) versions earlier than 8.0.11 include Python 3.7.9. VVR 8.0.11 and later include Python 3.9.21. If you upgrade to VVR 8.0.11 or later, you must retest, redeploy, and restart any PyFlink jobs that were originally created with an earlier VVR version.
The Flink runtime environment supports only JDK 8 and JDK 11. If your PyFlink job uses third-party JARs, ensure that the JARs are compatible with the supported JDK versions.
VVR 4.x supports only Scala 2.11, while VVR 6.x and later support only Scala 2.12. If your Python job relies on third-party JARs, ensure the JAR dependencies match the appropriate Scala version.
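Because pandas, NumPy, and PyArrow are pre-installed, vectorized (pandas) UDFs work without adding extra dependencies. The following is a minimal sketch; the column types and the ratio logic are illustrative:

import numpy as np
import pandas as pd
from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(result_type=DataTypes.FLOAT(), func_type="pandas")
def ratio(x: pd.Series, y: pd.Series) -> pd.Series:
    # Called once per batch of rows; pandas and NumPy operate on whole Series.
    return (x / np.maximum(y, 1)).astype("float32")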
Develop a job
Development reference
Use the following resources to develop Flink business logic locally. Once development is complete, upload the code to the Flink console for deployment.
For information about how to develop business logic in Apache Flink 1.20, see Flink Python API Development Guide.
For common issues and solutions for Apache Flink coding, see FAQ.
Debug a job
When implementing a Python User-Defined Function (UDF), use the logging module to output log information for troubleshooting. The following code snippet demonstrates this:
import logging

from pyflink.table import DataTypes
from pyflink.table.udf import udf

@udf(result_type=DataTypes.BIGINT())
def add(i, j):
    # Log output from inside a UDF is written to the TaskManager logs.
    logging.info("hello world")
    return i + j

Once logged, you can view the output in the TaskManager log files.
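To try the UDF end to end, the following hedged sketch runs it over inline test data; the column names i and j are illustrative, and in a real deployment the input table would come from a connector:

from pyflink.table import EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Inline test rows, using the add UDF defined above.
t = t_env.from_elements([(1, 2), (3, 4)], ['i', 'j'])
t.select(add(col('i'), col('j'))).execute().print()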
Use connectors
For a list of connectors that Flink supports, see Supported connectors. To use a connector, follow these steps:
Log on to the Realtime Compute for Apache Flink console.
Click Console in the Actions column of the target workspace.
In the left navigation pane, click Artifacts.
Click Upload Artifact and select the Python package for the target connector.
You can upload a custom connector or one of the built-in connectors of Realtime Compute for Apache Flink. For the download addresses of the official Python packages of Flink connectors, see Connector list.
When you create the deployment, select the connector JAR in the Additional Dependencies field, configure the other parameters, and then deploy the job.
Click the name of the deployed job. On the Configuration tab, find the Parameters section and click Edit. In the Other Configuration field, enter the path to the Python connector package.
If your job depends on multiple connector packages (for example, connector-1.jar and connector-2.jar), configure them as follows:
pipeline.classpaths: 'file:///flink/usrlib/connector-1.jar;file:///flink/usrlib/connector-2.jar'
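After the classpath is configured, the job can declare tables backed by the uploaded connector in SQL DDL. The following is a hedged sketch that assumes the Kafka connector JAR was uploaded; the topic name, broker address, and consumer group are placeholders:

from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Equivalent to the pipeline.classpaths setting shown above.
t_env.get_config().set(
    "pipeline.classpaths",
    "file:///flink/usrlib/connector-1.jar;file:///flink/usrlib/connector-2.jar",
)

# Declare a source table served by the uploaded Kafka connector.
t_env.execute_sql("""
    CREATE TABLE kafka_source (
        user_id BIGINT,
        behavior STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'my_topic',
        'properties.bootstrap.servers' = 'broker-host:9092',
        'properties.group.id' = 'my_group',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")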
References
For a complete walkthrough of the PyFlink job development process, see Quick Start for Flink Python jobs.
For details on using custom Python virtual environments, third-party Python libraries, JARs, and data files in Flink Python jobs, see Use Python dependencies.
Realtime Compute for Apache Flink also supports SQL and DataStream jobs. For development guides on these job types, see Job development map and Develop a JAR job.