MaxCompute: DNS resolution fails due to high concurrency

Last Updated: Mar 26, 2026

This topic describes how to address DNS resolution failures that occur when numerous concurrent requests attempt to access a peer service's domain name.

When a User-Defined Function (UDF) or Spark job sends a large number of concurrent requests to a peer service's domain name, DNS resolution can fail. In Logview, the error appears as:

name or service not known

Root cause

Each request in the job's execution phase triggers a separate DNS lookup. Under high concurrency, the volume of simultaneous DNS queries can overwhelm the DNS resolver, causing lookups to fail. Resolving the domain name once during initialization and reusing the result eliminates this bottleneck.
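The effect of resolving once and reusing the result can be illustrated with a minimal sketch. The CachedResolver class, the fake_resolve function, and the service.example.com host name below are hypothetical and exist only for illustration. The resolver is injected so that the number of lookups can be counted without touching a real DNS server.

```python
import socket


class CachedResolver(object):
    """Resolves each host name at most once and reuses the cached IP address."""

    def __init__(self, resolve=socket.gethostbyname):
        # `resolve` is injectable so the sketch can run without a network
        self._resolve = resolve
        self._cache = {}

    def lookup(self, host):
        # Only the first call for a given host reaches the resolver
        if host not in self._cache:
            self._cache[host] = self._resolve(host)
        return self._cache[host]


# Hypothetical stand-in for a DNS resolver that records every query it receives
calls = []


def fake_resolve(host):
    calls.append(host)
    return "10.0.0.1"


resolver = CachedResolver(resolve=fake_resolve)
ips = [resolver.lookup("service.example.com") for _ in range(1000)]
print(len(calls))  # 1
```

In this sketch, 1,000 requests produce a single query to the resolver. In a real job, each UDF instance would still perform its own lookup during initialization, so the number of DNS queries scales with the number of instances rather than with the number of requests.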

Solution

Resolve the domain name to an IP address once in the UDF's __init__ method, before the execution phase begins. Store the result on the UDF object and use the IP address, instead of the domain name, for all subsequent HTTP requests.

# -*- coding:UTF-8 -*-
from odps.udf import annotate

@annotate("string->string")
class test_udf(object):
    __ip_address = ''

    def __init__(self):
        # Resolve the domain name once during initialization to avoid
        # concurrent DNS queries during job execution
        import socket
        retries = 3
        for i in range(retries):
            try:
                self.__ip_address = socket.gethostbyname("xxx-vpc.cn-shanghai.aliyuncs.com")
                print(self.__ip_address)
            except socket.gaierror as e:
                print('Failed to resolve domain: ' + str(e))
                # Retry unless this was the last attempt (i runs from 0 to retries - 1)
                if i < retries - 1:
                    print('resolve domain retry: ' + str(i + 1))
                    continue
                else:
                    raise
            break

    def evaluate(self, inputPath):
        import requests
        output = 'false'
        retries = 3
        print(self.__ip_address)
        url = f"http://{self.__ip_address}/{inputPath}"
        print(url)
        for i in range(retries):
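            try:
                response = requests.get(url)
                if response.status_code == 200:
                    output = 'true'
                else:
                    # Treat any non-200 response as a failure so that it is retried
                    raise RuntimeError('unexpected status code: ' + str(response.status_code))
            except Exception as e:
                # Retry unless this was the last attempt (i runs from 0 to retries - 1)
                if i < retries - 1:
                    print('connect retry: ' + str(i + 1))
                    print('error: ' + str(e))
                    continue
                else:
                    raise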
            break

        return output

Replace xxx-vpc.cn-shanghai.aliyuncs.com with your actual service domain name.
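The retry logic used for the DNS lookup can also be factored into a small helper, which makes it easier to test outside a job. The following is a minimal sketch, not part of the MaxCompute API: the resolve_with_retry function, its parameters, and the pause between attempts are assumptions made for illustration.

```python
import socket
import time


def resolve_with_retry(host, retries=3, delay=0.5, resolve=socket.gethostbyname):
    """Return the IP address for `host`, retrying when DNS resolution fails.

    `delay` (seconds between attempts) and the injectable `resolve` callable
    are illustrative assumptions, not settings defined by MaxCompute.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return resolve(host)
        except socket.gaierror:
            # Re-raise the original error once the last attempt has failed
            if attempt == retries - 1:
                raise
            # Pause briefly so the retry does not add to the query burst
            time.sleep(delay)
```

With such a helper, the body of __init__ in the UDF could reduce to a single assignment of the form self.__ip_address = resolve_with_retry("xxx-vpc.cn-shanghai.aliyuncs.com").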