By default, Realtime Compute for Apache Flink cannot access the Internet. This topic provides answers to some frequently asked questions about service access over the Internet, access across virtual private clouds (VPCs), domain name resolution, and network connectivity testing.
How do I troubleshoot network issues?
A Realtime Compute for Apache Flink workspace is deployed in a VPC. You cannot change the VPC that you select when you purchase a Realtime Compute for Apache Flink workspace. If the source or the sink is not in the same VPC as the Realtime Compute for Apache Flink workspace, the source or the sink is disconnected from the Realtime Compute for Apache Flink workspace and data cannot be read from the source or written to the sink. If data cannot be read from the source or written to the sink, perform the following steps to check whether a network issue exists:
Check the network connectivity between the upstream and downstream storage services and the Realtime Compute for Apache Flink workspace. You can test the network connectivity in the development console of Realtime Compute for Apache Flink. For more information, see the How do I use the network detection feature? section in this topic.
By default, Realtime Compute for Apache Flink can access only services that are deployed in the same VPC and the same region as Realtime Compute for Apache Flink. If you want to access resources across VPCs or access the Internet from Realtime Compute for Apache Flink, use the following methods:
To access resources across VPCs, you can use one of the methods that are described in the How does Realtime Compute for Apache Flink access a service across VPCs? section of this topic.
To access the Internet from Realtime Compute for Apache Flink, you can use NAT gateways that are provided by Alibaba Cloud to enable communications between VPCs and the Internet. For more information, see How does Realtime Compute for Apache Flink access the Internet?
Check whether the CIDR block of the vSwitch to which the Realtime Compute for Apache Flink workspace belongs is added to the whitelists of the upstream and downstream storage services. For more information, see How do I configure a whitelist?
If a timeout error persists after you complete the preceding checks, the connection timeout period may be too short. In this case, increase the value of the connect.timeout parameter in the WITH clause. The default value of this parameter is 30 seconds.
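The connectivity check described above can be approximated from any machine in the same VPC. The following is a minimal Python sketch, not the network detection feature itself: the function names, the hypothetical endpoint, and the idea of running the check from a separate host are assumptions for illustration. The 30-second default only mirrors the default of connect.timeout mentioned above. The helper that splits an endpoint reflects the fact that the network detection dialog box expects the host and the port in separate fields.

```python
import socket


def split_endpoint(endpoint: str, default_port: int) -> tuple[str, int]:
    """Split 'host:port' into (host, port); use default_port if no port is given."""
    host, sep, port = endpoint.rpartition(":")
    if sep and port.isdigit():
        return host, int(port)
    return endpoint, default_port


def can_connect(host: str, port: int, timeout_s: float = 30.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


# Hypothetical endpoint, used only to show the split.
host, port = split_endpoint("hypothetical-host.example.com:3306", default_port=3306)
print(host, port)  # hypothetical-host.example.com 3306
```

A call such as can_connect(host, port) that returns False within the timeout suggests the same class of problem as a "connect timed out" result: a different VPC, a missing whitelist entry, or an Internet endpoint.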
How do I use the network detection feature?
Realtime Compute for Apache Flink supports the network detection feature. To use the network detection feature, perform the following steps in the development console of Realtime Compute for Apache Flink:
Log on to the management console of Realtime Compute for Apache Flink.
Find the workspace that you want to manage and click Console in the Actions column.
In the top navigation bar of the Overview page, click the Network detection icon.

In the Network detection dialog box, configure the Host parameter to specify an IP address or endpoint to check whether the running environment of a Realtime Compute for Apache Flink deployment is connected to the upstream and downstream storage services.
Important: If you specify an endpoint, remove :<port> from the end of the endpoint and enter <port> in the Port field in the Network detection dialog box.
If the error message "connect timed out" appears, check whether the endpoint that you access is an Internet endpoint or an endpoint in another VPC. By default, Realtime Compute for Apache Flink can access only services that are deployed in the same VPC as Realtime Compute for Apache Flink. If you want to access resources across VPCs or access the Internet from Realtime Compute for Apache Flink, see the How does Realtime Compute for Apache Flink access a service across VPCs? and How does Realtime Compute for Apache Flink access the Internet? sections in this topic.
How do I obtain the endpoint of a Hologres instance?
Log on to the Hologres console. In the left-side navigation pane, click Instances. On the Instances page, find the desired instance and click the name of the instance.
In the Network Information section of the Instance Details page, obtain the endpoint of the instance.
You can obtain the related endpoint based on the network type.
Network type: Specified VPC (recommended)
Use scenario: A private network that connects with the specified VPC.
Same VPC (recommended): If the Hologres instance and the Realtime Compute for Apache Flink workspace reside in the same VPC, a network connection is established between the Hologres instance and the workspace.
Different VPCs: If the Hologres instance and the Realtime Compute for Apache Flink workspace reside in different VPCs, you must configure network settings before you can use Realtime Compute for Apache Flink to access the Hologres instance across VPCs. For more information, see Connect to other VPCs.
Network type: Internet
Use scenario: The Internet. You can select this network type if you want to access a Hologres instance with no limits on network connections. The Internet may experience uncertain latency compared to internal networks.
You can use Network Address Translation (NAT) gateways of Alibaba Cloud to set up connections between VPCs and the Internet. For more information, see Configure an Internet connection.
(Optional) Test network connectivity in the development console of Realtime Compute for Apache Flink. For more information, see How do I use the network detection feature?
If the network connectivity test passes, the endpoint that you obtain is valid.
If the network connectivity test fails, check whether the Hologres instance and the Realtime Compute for Apache Flink workspace reside in different VPCs or whether Realtime Compute for Apache Flink needs to access the Hologres instance over the Internet. In either case, you must configure network settings before you can use Realtime Compute for Apache Flink to access the Hologres instance. For more information, see Connect to other VPCs and Configure an Internet connection.
How does Realtime Compute for Apache Flink access the Internet?
Network latency over the Internet is less predictable than over internal networks. If your business requires low network latency and high stability, we recommend that you access services over a VPC.
Alibaba Cloud provides Network Address Translation (NAT) gateways to enable communications between VPCs and the Internet. This way, Realtime Compute for Apache Flink can access the Internet.
How do I view the public bandwidth?
If the metric values of the deployment are normal and no backpressure exists in the deployment during data reading or writing over the Internet, you can view the public bandwidth to check whether a bottleneck issue occurs. To view the public bandwidth, perform the following steps:
Log on to the management console of Realtime Compute for Apache Flink. Find the desired workspace and choose More > Workspace Details in the Actions column. In the Workspace Details dialog box, view the VPC ID.
Log on to the VPC console. In the left-side navigation pane, click VPC. On the VPC page, find the desired VPC and click its ID.
On the Resource Management tab of the details page of the VPC, click the value of Internet NAT Gateway in the Access to Internet section.
Note: If the value of Internet NAT Gateway in the Access to Internet section is 0, you must create an Internet NAT gateway. For more information, see Internet NAT gateway.
On the Internet NAT Gateway page, click the ID of the Internet NAT gateway.
On the Associated EIP tab, click the ID of the EIP.
On the page that appears, click the Monitoring and O&M tab to view the public bandwidth.
How does Realtime Compute for Apache Flink access a service across VPCs?
If the service that you want Realtime Compute for Apache Flink to access is in the early planning stages or can be replaced, we recommend that you purchase an instance of the service in the same VPC as Realtime Compute for Apache Flink. You can also release the current Realtime Compute for Apache Flink workspace and then purchase another workspace that resides in the same VPC as the service.
You can also use an appropriate method to allow Realtime Compute for Apache Flink to access a service across VPCs.
How do I configure a whitelist?
In most cases, the upstream and downstream storage services that are supported by Realtime Compute for Apache Flink do not allow access from external systems. Therefore, you must perform the following steps to add the CIDR block of the vSwitch of Realtime Compute for Apache Flink to the whitelist of the storage service that Realtime Compute for Apache Flink needs to access.
Log on to the management console of Realtime Compute for Apache Flink.
Find the workspace that you want to manage and choose More > Workspace Details in the Actions column.
In the Workspace Details dialog box, view the CIDR block of the vSwitch to which the Realtime Compute for Apache Flink workspace belongs.
Add the CIDR block of the vSwitch of Realtime Compute for Apache Flink to the whitelist of the storage service that Realtime Compute for Apache Flink needs to access.
For example, to configure a whitelist for an ApsaraDB RDS for MySQL database, see Configure IP address whitelist.
Note: If you add a vSwitch later, you must also add the CIDR block of the new vSwitch to the whitelist of the storage service that Realtime Compute for Apache Flink needs to access.
If your vSwitch is not in the same zone as the upstream and downstream storage services, the network can be connected after you add the CIDR block of the vSwitch to the whitelist.
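Before you save a whitelist, it can help to confirm that it really covers the whole CIDR block of the vSwitch. The following is a minimal sketch that uses Python's standard ipaddress module. The function name and the sample CIDR blocks are illustrative assumptions, not values from any console, and the check only recognizes a vSwitch block that is fully contained in a single whitelist entry.

```python
import ipaddress


def covered_by_whitelist(vswitch_cidr: str, whitelist: list[str]) -> bool:
    """Return True if the whole vSwitch CIDR block lies inside one whitelist entry."""
    vswitch = ipaddress.ip_network(vswitch_cidr, strict=True)
    for entry in whitelist:
        # strict=False accepts single IP addresses and entries with host bits set.
        allowed = ipaddress.ip_network(entry, strict=False)
        if vswitch.version == allowed.version and vswitch.subnet_of(allowed):
            return True
    return False


# Illustrative values only.
print(covered_by_whitelist("192.168.1.0/24", ["10.0.0.0/8", "192.168.0.0/16"]))  # True
print(covered_by_whitelist("192.168.1.0/24", ["192.168.1.7"]))  # False
```

The second call returns False because a single whitelisted IP address does not cover the other addresses in the vSwitch CIDR block.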
How do I resolve the domain name of the service on which a Realtime Compute for Apache Flink deployment depends?
If your Realtime Compute for Apache Flink deployment depends on the domain name of a service, a domain name resolution failure may be reported when you migrate the service data to Realtime Compute for Apache Flink. To solve this issue, you can use one of the following methods based on the scenario:
You have a self-managed DNS. Flink can connect to the self-managed DNS service over a VPC, and the self-managed DNS can normally resolve domain names.
In this case, you can perform DNS resolution by using the deployment template of Realtime Compute for Apache Flink. For example, the IP address of your self-managed DNS is 192.168.0.1. Perform the following steps:
Log on to the management console of Realtime Compute for Apache Flink.
Find the workspace that you want to manage and click Console in the Actions column.
In the left-side navigation pane, click Configurations. On the Deployment Defaults tab, add the following code to the Other Configuration field:
env.java.opts: >-
  -Dsun.net.spi.nameservice.provider.1=default
  -Dsun.net.spi.nameservice.provider.2=dns,sun
  -Dsun.net.spi.nameservice.nameservers=192.168.0.1
Note: If your self-managed DNS has multiple IP addresses, we recommend that you separate the IP addresses with commas (,).
Click Save Changes.
Create a draft and run the deployment for the draft in the development console of Realtime Compute for Apache Flink.
If the UnknownHostException error persists, domain names cannot be resolved. In this case, contact Alibaba Cloud for technical support.
If the deployment frequently fails with the error message "JobManager heartbeat timeout" after self-managed DNS is configured, see What do I do if the error message "JobManager heartbeat timeout" appears?
You do not deploy self-managed DNS or Realtime Compute for Apache Flink cannot connect to self-managed DNS over a VPC.
In this case, you must use Alibaba Cloud DNS PrivateZone to resolve domain names. For example, the VPC in which Realtime Compute for Apache Flink resides is named vpc-flinkxxxxxxx, and the domain names that your Realtime Compute for Apache Flink deployment needs to access, with their IP addresses, are aaa.test.com (127.0.0.1), bbb.test.com (127.0.0.2), and ccc.test.com (127.0.0.3). To resolve the domain names, perform the following steps:
Activate Alibaba Cloud DNS PrivateZone. For more information, see Activate Alibaba Cloud DNS PrivateZone.
Add a zone and use the common suffix of the domain names that your Realtime Compute for Apache Flink deployment needs to access as the zone name. For more information, see Add a zone.
Associate the zone with the VPC in which Realtime Compute for Apache Flink resides. For more information, see Associate a zone with a VPC or disassociate a zone from a VPC.
Add DNS records to the zone. For more information, see Add DNS records.
In the development console of Realtime Compute for Apache Flink, create and run a deployment or stop and rerun an existing deployment.
If the UnknownHostException error persists, domain names cannot be resolved. In this case, contact Alibaba Cloud for technical support.
If Realtime Compute for Apache Flink is connected to Kafka but a timeout error is reported, see Why does the "timeout expired while fetching topic metadata" error message appear even if a network connection is established between Realtime Compute for Apache Flink and Kafka?
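Two parts of the procedures above lend themselves to a quick check: choosing the zone name from the common suffix of the domain names, and confirming that a name resolves. The following is a minimal Python sketch under stated assumptions: the function names are illustrative, and resolve() reports only what the machine that runs it can resolve, so it is meaningful only on a host that uses the same DNS settings as the VPC of the workspace.

```python
import socket


def common_suffix(hostnames: list[str]) -> str:
    """Return the longest dot-separated suffix shared by all hostnames ('' if none)."""
    reversed_labels = [h.lower().rstrip(".").split(".")[::-1] for h in hostnames]
    shared = []
    for labels in zip(*reversed_labels):
        if len(set(labels)) != 1:
            break
        shared.append(labels[0])
    return ".".join(reversed(shared))


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IP addresses for hostname, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Roughly the Python counterpart of Java's UnknownHostException.
        return []
    return sorted({info[4][0] for info in infos})


# Domain names from the example above.
print(common_suffix(["aaa.test.com", "bbb.test.com", "ccc.test.com"]))  # test.com
```

With the example domain names, the zone name would be test.com. After the DNS records are added, a call such as resolve("aaa.test.com") on a suitable host would be expected to return the configured IP address, and an empty list points to the same resolution failure that surfaces as UnknownHostException in the deployment.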
What do I do if the error message "JobManager heartbeat timeout" appears?
Problem description
After self-managed DNS is configured, the deployment frequently fails, and the error message "JobManager heartbeat timeout" appears.
Cause
The network latency to self-managed DNS is high.
Solution
Change the value of jobmanager.retrieve-taskmanager-hostname to false in the deployment code to disable DNS for the TaskManagers of the deployment. After the configuration is changed, the deployment can still be connected to external services by using the domain name. For more information about how to configure this parameter, see How do I configure custom parameters for deployment running?
Why does the "timeout expired while fetching topic metadata" error message appear even if a network connection is established between Realtime Compute for Apache Flink and Kafka?
Realtime Compute for Apache Flink may be unable to read data from Kafka even if a network connection is established between the two systems. To ensure that the services are connected and data can be read from Kafka, you must use the endpoint that is described in the cluster metadata returned by Kafka brokers during bootstrapping. For more information, visit Kafka network connection issues. To check the network connection, perform the following steps:
Use zkCli.sh or zookeeper-shell.sh to log on to the ZooKeeper service that is used by the Kafka cluster.
Run the ls /brokers/ids command to obtain the IDs of all Kafka brokers.
Run the get /brokers/ids/{your_broker_id} command to view the metadata information of Kafka brokers.
The endpoint is displayed in listener_security_protocol_map.
Check whether Realtime Compute for Apache Flink can connect to the endpoint.
If the endpoint contains a domain name, configure the DNS service for Realtime Compute for Apache Flink. For more information, see the How do I resolve the domain name of the service on which a Realtime Compute for Apache Flink deployment depends? section in this topic.
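The last two steps can be partly automated once you have copied the broker metadata out of ZooKeeper. The following is a minimal Python sketch, assuming that the metadata is the JSON document that Kafka brokers register in ZooKeeper and that the listener addresses appear in an endpoints array next to listener_security_protocol_map. The sample JSON, host names, and function names are made up for illustration; compare them with the actual output of the get command.

```python
import ipaddress
import json


def broker_endpoints(znode_json: str) -> list[tuple[str, str, int]]:
    """Extract (listener, host, port) tuples from a broker's metadata JSON."""
    meta = json.loads(znode_json)
    result = []
    for endpoint in meta.get("endpoints", []):
        # Each entry is assumed to look like "LISTENER://host:port".
        listener, _, address = endpoint.partition("://")
        host, _, port = address.rpartition(":")
        result.append((listener, host, int(port)))
    return result


def is_domain_name(host: str) -> bool:
    """Return True if host is not an IP address literal and therefore needs DNS."""
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        return True


# Made-up sample metadata for illustration.
sample = (
    '{"listener_security_protocol_map": {"PLAINTEXT": "PLAINTEXT"},'
    ' "endpoints": ["PLAINTEXT://kafka-1.example.internal:9092"]}'
)
for listener, host, port in broker_endpoints(sample):
    print(listener, host, port, is_domain_name(host))
# PLAINTEXT kafka-1.example.internal 9092 True
```

An endpoint for which is_domain_name() returns True is one that Realtime Compute for Apache Flink must be able to resolve, which is the case the DNS configuration step addresses.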