This topic describes how to troubleshoot data reporting issues after you use an open source client to report data to Managed Service for OpenTelemetry. Use it if error logs appear in the Managed Service for OpenTelemetry console or in log files, or if no data appears in the console.
Possible causes:
The client cannot connect to the Managed Service for OpenTelemetry server.
The reporting feature is disabled in the Managed Service for OpenTelemetry console.
Data fails to be reported to a Simple Log Service data source.
Data fails to be reported over HTTP.
Data fails to be reported over gRPC.
The trace data in the console is not as expected.
Check the network connectivity
Check whether the endpoint over which you want to report data in the code is a private endpoint or a public endpoint. If you use a private endpoint, make sure that the endpoint belongs to the same virtual private cloud (VPC) as the server to which you want to report data. You cannot report data across regions.
In the reporting environment, run the curl or telnet command to check whether the endpoint and port over which you want to report data are available. If not, check the security group settings of the Elastic Compute Service (ECS) instance that you use.
The following example describes how to check the connectivity to a gRPC endpoint of OpenTelemetry in the China (Hangzhou) region.
Run the following command in the terminal window:
telnet [endpoint] [port number]
Example:
telnet tracing-analysis-dc-hz.aliyuncs.com 8090
If the output shows that the connection is established, the client can connect to the Managed Service for OpenTelemetry server.
If the system stays in the Trying [IP] phase or the Unable to connect to remote host error message appears, the client cannot connect to the Managed Service for OpenTelemetry server. In this case, check the security group settings and network settings. For more information, see Connect to Managed Service for OpenTelemetry and authenticate clients and Overview.
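The telnet check above can also be scripted, which is convenient on hosts where telnet is not installed. The following is a minimal sketch that uses only the Python standard library. The function name can_connect and the 5-second default timeout are illustrative choices, not part of the product.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host name and opens a TCP connection,
        # which is the same reachability test that telnet performs.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For the example endpoint above, call can_connect("tracing-analysis-dc-hz.aliyuncs.com", 8090). A False result corresponds to the Trying [IP] or Unable to connect to remote host cases and points to security group or network settings.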
Check whether the reporting feature is enabled
You can modify reporting settings for all applications globally or for a specific application in the Managed Service for OpenTelemetry console. Check whether the reporting feature is enabled for all applications globally or for a specific application.
Check whether the amount of reported data of an application reaches the configured quota. If the amount of reported data of the application reaches the configured quota, no more data can be reported.
Check whether the reporting feature is enabled for all applications globally
In the left-side navigation pane of the Managed Service for OpenTelemetry console, click Cluster Configurations. In the Data Capturing Settings section of the Cluster Configurations tab, check whether Enable All or Enable by Default is selected.
Check the quota configuration
In the left-side navigation pane of the Managed Service for OpenTelemetry console, click Cluster Configurations. In the Quota configuration section of the Cluster Configurations tab, check whether the amount of reported data reaches the configured quota. If the amount of reported data reaches the quota, increase the quota.
Check whether the reporting feature is enabled for a specific application
On the Applications page, click the name of the application that you want to manage. In the left-side navigation pane of the application details page, click Application Settings. On the Custom Configuration tab, check whether Enable or Don't Set is selected as Capture Data in the Data Capturing Settings section.
If you select Enable All or Disable All in the Data Capturing Settings section of the Cluster Configurations tab, the configuration in the Data Capturing Settings section of a specific application does not take effect. The global configuration prevails.
If you select Don't Set as Capture Data for an application, the reporting status of the application is the same as that of the cluster to which the application belongs.
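The precedence between the global setting and the application setting can be summarized in a few lines of code. The sketch below is only a model of the rules described above, not an API of the console. The function name is hypothetical, and the handling of any label not mentioned in this topic (for example, a global default that disables capture, or an application-level disable option) is an assumption.

```python
# Labels as they appear in the console text above.
GLOBAL_ENABLE_ALL = "Enable All"
GLOBAL_DISABLE_ALL = "Disable All"
GLOBAL_ENABLE_BY_DEFAULT = "Enable by Default"
APP_ENABLE = "Enable"
APP_DONT_SET = "Don't Set"


def is_capture_enabled(global_setting: str, app_setting: str) -> bool:
    """Model whether data of one application is captured."""
    # Enable All and Disable All override every application setting.
    if global_setting == GLOBAL_ENABLE_ALL:
        return True
    if global_setting == GLOBAL_DISABLE_ALL:
        return False
    # Don't Set means the application follows the cluster default.
    # Assumption: any global value other than Enable by Default disables capture.
    if app_setting == APP_DONT_SET:
        return global_setting == GLOBAL_ENABLE_BY_DEFAULT
    # Assumption: any application value other than Enable disables capture.
    return app_setting == APP_ENABLE
```

If this model returns False for your combination of settings, the application does not report data regardless of the client configuration.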
Check the Simple Log Service data source
The data of Managed Service for OpenTelemetry is stored in Simple Log Service projects within your account. If the number of projects in Simple Log Service reaches the upper limit, data fails to be reported.
Solutions:
Release Simple Log Service projects that are not in use.
Submit a ticket to increase the maximum number of projects in Simple Log Service.
Check the status of the monitoring task
If you are prompted that the monitoring task is in an abnormal state or is not enabled, submit a ticket.
Troubleshoot issues if data fails to be reported over HTTP
Troubleshoot issues based on the returned HTTP status code
Check the returned HTTP status code in the Managed Service for OpenTelemetry console or log files, and troubleshoot the failure accordingly.
HTTP status code 403: The server refuses to authorize the request.
The endpoint or token that you enter is incorrect. To obtain the correct endpoint and token, perform the following operations: In the left-side navigation pane of the Managed Service for OpenTelemetry console, click Cluster Configurations. On the page that appears, click the Access point information tab.
The URL that you enter for the Zipkin client contains /v2/spans, which needs to be removed.
HTTP status code 405: The amount of data reported reaches the upper limit that you specify.
You can increase the upper limit in the Quota configuration section of the Cluster Configurations tab.
HTTP status code 406: The cluster collection feature is disabled.
You can enable cluster collection in the Ingestion Configuration section of the Cluster Configurations tab.
HTTP status code 400: The data format is not supported.
The value of the Content-Type header can only be application/json or application/x-thrift. Otherwise, a Bad Request error is returned.
Examples of unsupported data:
Both the key and value of a tag must be in the string format, but the reported value is a JSON array.
The spans must be in the JSON array format, but the reported data is a JSON object.
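The two HTTP 400 examples above can be caught on the client side before the data is sent. The following sketch assumes a Zipkin-style JSON payload in which each span is an object with an optional tags object. The function name find_payload_problems is hypothetical, and the check covers only the two cases listed above, not the full data format.

```python
import json
from typing import List


def find_payload_problems(body: str) -> List[str]:
    """Return human-readable problems found in a JSON span payload."""
    try:
        spans = json.loads(body)
    except json.JSONDecodeError as exc:
        return [f"body is not valid JSON: {exc.msg}"]
    # The spans must be a JSON array, not a single JSON object.
    if not isinstance(spans, list):
        return ["spans must be a JSON array, got " + type(spans).__name__]
    problems = []
    for i, span in enumerate(spans):
        if not isinstance(span, dict):
            problems.append(f"span {i} is not a JSON object")
            continue
        tags = span.get("tags", {})
        if not isinstance(tags, dict):
            problems.append(f"span {i}: tags is not a JSON object")
            continue
        # JSON keys are always strings, so only the values need checking.
        for key, value in tags.items():
            if not isinstance(value, str):
                problems.append(f"span {i}: tag {key!r} has a non-string value")
    return problems
```

An empty list means that neither of the two listed problems was found. It does not guarantee that the server accepts the payload.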
Troubleshoot issues based on the returned HTTP error message
If the following message appears, APISIX fails to report data by using OpenTelemetry:
The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.
APISIX cannot directly report data to Managed Service for OpenTelemetry by using OpenTelemetry. You must use OpenTelemetry Collector to forward data to Managed Service for OpenTelemetry.
Troubleshoot issues if data fails to be reported over gRPC
If you report data over gRPC, you can check the returned gRPC status code in the Managed Service for OpenTelemetry console or log files. For more information about gRPC status codes, see Status codes and their use in gRPC.
This section describes how to troubleshoot common gRPC-related issues.
The reporting times out
Error message:
Failed to export spans. The request could not be executed. Full error message: timeout
Troubleshooting:
Check the network connectivity.
Increase the timeout period for reporting data by using an SDK or agent based on your business requirements.
The permission verification fails
Error message:
Failed to export spans. Server responded with gRPC status code 7. Error message:
Troubleshooting:
gRPC status code 7 is PERMISSION_DENIED. Check whether the value of the Authentication field in the gRPC request header is consistent with the authentication token in the Managed Service for OpenTelemetry console.
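For the timeout and permission cases above, the following sketch shows one way to set both the timeout period and the Authentication header when you use the OpenTelemetry SDK for Python. It assumes that the opentelemetry-exporter-otlp-proto-grpc package is installed and that its OTLPSpanExporter accepts the endpoint, headers, and timeout arguments, with the timeout given in seconds. The helper names and the 30-second value are illustrative. Check the documentation of your SDK version before you rely on them.

```python
def build_exporter_kwargs(endpoint: str, token: str, timeout_s: int = 30) -> dict:
    """Build keyword arguments for an OTLP gRPC span exporter."""
    # A token copied with stray whitespace is a common cause of status code 7.
    if not token or token != token.strip():
        raise ValueError("token is empty or has surrounding whitespace")
    return {
        "endpoint": endpoint,
        # Header name taken from the troubleshooting step above.
        "headers": f"Authentication={token}",
        # Assumed to be in seconds; raise it if exports time out.
        "timeout": timeout_s,
    }


def make_exporter(endpoint: str, token: str, timeout_s: int = 30):
    """Create the exporter; requires the OpenTelemetry SDK to be installed."""
    # Imported lazily so that build_exporter_kwargs works without the SDK.
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
        OTLPSpanExporter,
    )
    return OTLPSpanExporter(**build_exporter_kwargs(endpoint, token, timeout_s))
```

Pass the endpoint and token exactly as they are displayed in the console. A longer timeout only helps if the network connectivity check already succeeds.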
MeterSender of the SkyWalking client reports an error
Error message:
MeterSender : Send meters to collector fail with a grpc internal exception. org.apache.skywalking.apm.dependencies.io.grpc.StatusRuntimeException: UNIMPLEMENTED: Method not found: skywalking.v3.MeterReportService/collect
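Cause: The SkyWalking client sends metrics to the Managed Service for OpenTelemetry server, which does not provide the skywalking.v3.MeterReportService/collect method named in the error message.
Solution: Disable metric reporting in the SkyWalking client.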
Troubleshoot issues if the trace data in the console is not as expected
Use the agent or SDK for SkyWalking to report data
Why are the events of some frameworks and middleware in an application not tracked?
Check whether the plug-ins for these frameworks are missing from the plug-in path of the SkyWalking agent. Also check whether the versions of these plug-ins match the versions of the frameworks used in the application.
For example, the plug-in path of SkyWalking v8 or later is ${agent-path}/agent-8.x/plugins. If a required plug-in is not available in the plugins directory, you can copy it from the bootstrap-plugins or optional-plugins folder to the plugins directory, or download it from the community.
Check whether only the SkyWalking agent is added to the application. Conflicts may occur in the instrumentation logic if you add multiple agents to the application.
Why is my trace broken?
Check whether asynchronous tracing is used. For more information about how to resolve a broken trace in asynchronous tracing, see Trace Cross Thread.
Why is the length of my entire trace shorter than expected?
You can increase the maximum number of spans that can be reported by the SkyWalking agent by modifying the value of collector.agent.service_graph.batch_size in the ${agent-path}/agent-8.x/config/agent.config file.
Use the agent or SDK for OpenTelemetry to report data
Why is my trace broken?
Check whether asynchronous tracing is used. To resolve a broken trace in asynchronous tracing, we recommend that you update the version of OpenTelemetry or use the SpanLinks API of OpenTelemetry. You can specify a span as the parent span of another span to correlate the two spans, or explicitly pass the trace context to another application based on context propagation.
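A trace typically breaks in asynchronous code because the worker thread does not see the context of the thread that submitted the task. The following sketch illustrates the principle of explicit context passing with the Python standard library only. It uses contextvars, which the OpenTelemetry API for Python is understood to use for its runtime context by default. The variable current_trace_id is a stand-in for the real trace context, and the sketch does not call any OpenTelemetry API.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the active trace context (hypothetical name).
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


def read_trace_id():
    """Return the trace ID visible to the code that runs this function."""
    return current_trace_id.get()


def run_without_propagation(executor):
    # The worker thread runs in its own context, so the value set by the
    # submitting thread is usually not visible there.
    return executor.submit(read_trace_id).result()


def run_with_propagation(executor):
    # Capture the caller's context and run the task inside that copy.
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, read_trace_id).result()


current_trace_id.set("trace-abc")
with ThreadPoolExecutor(max_workers=1) as pool:
    lost = run_without_propagation(pool)
    kept = run_with_propagation(pool)
```

Here kept holds the trace ID of the submitting thread, whereas lost normally falls back to the default. With the OpenTelemetry SDK, the equivalent step is to capture the current context before you hand work to another thread and attach it there, or to link the spans.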