Best Practices for Serverless Monitoring on Alibaba Cloud

This article outlines best practices and tools for monitoring Alibaba Cloud serverless workloads.

Serverless architectures offer strong scalability, cost-effectiveness, and development agility, but they make monitoring harder. Functions are ephemeral, they run in a distributed manner across many services, and the underlying infrastructure is abstracted away as a "black box". Together, these traits call for a proactive and comprehensive monitoring strategy.

Alibaba Cloud offers a strong set of services that, when used well, give you valuable insight into your serverless applications. In this post, I outline guidelines for serverless monitoring on Alibaba Cloud so that your applications run reliably and you can detect and fix issues quickly.

Pillars of Serverless Observability

Metrics: Quantifiable data points that indicate what is happening in the system, such as invocations, errors, latency, memory usage, and CPU usage.

Logs: Detailed, time-stamped records of events that explain what happened. They are essential for debugging and for understanding application behavior.

Traces: End-to-end views of requests as they traverse multiple serverless functions and other services, helping you understand the flow and pinpoint performance bottlenecks.

Alibaba Cloud Services for Serverless Monitoring

Alibaba Cloud provides several services that together can form a complete monitoring strategy for serverless applications:

Function Compute (FC): Alibaba Cloud's core FaaS offering, which has built-in monitoring capabilities.

CloudMonitor: An all-in-one monitoring service that collects metrics from various Alibaba Cloud products, including Function Compute. It offers dashboards, alarms, and event monitoring.

Simple Log Service (SLS): A full-fledged log management service for log data collection, ingestion, delivery, querying, and analysis.

Application Real-Time Monitoring Service (ARMS): A cloud-native observability platform with application monitoring, browser monitoring, Prometheus service, etc., for end-to-end tracking and performance insight.

Tracing Analysis: A distributed tracing service that helps you monitor and diagnose performance bottlenecks in your distributed applications, including serverless functions.

EventBridge: A serverless event bus that routes events from different sources to various targets, enabling automated responses.

Best Practices for Effective Serverless Monitoring

1. Leverage Built-in Function Compute Metrics and Alarms

Alibaba Cloud Function Compute automatically provides a number of essential metrics through CloudMonitor.

Monitor Core Metrics: Focus on some key performance indicators (KPIs), such as:

Invocations: Number of times your function executes.

Errors: Number of failed invocations.

Latency/Execution Duration: How long your function takes to execute.

Cold Starts: Number of times a new instance of your function has to be initialized (this adds to latency).

Memory Usage: How much memory your function consumes.

Concurrent Invocations: Number of invocations running at the same time.
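If you want to record cold starts yourself, one common technique is a module-level flag, since module state persists while an instance is reused. The following is a minimal sketch, not a reference implementation: the `handler(event, context)` signature and the field names `cold_start` and `duration_ms` are assumptions made for illustration.

```python
import json
import time

# Module-level state survives between invocations that reuse the same
# instance, so this is True only for the first invocation on a new instance.
_is_cold_start = True


def handler(event, context):
    """Hypothetical entry point; the (event, context) signature is assumed."""
    global _is_cold_start
    cold_start = _is_cold_start
    _is_cold_start = False

    started = time.perf_counter()
    result = {"ok": True}  # placeholder for the real work
    duration_ms = (time.perf_counter() - started) * 1000.0

    # Emit one structured line per invocation so cold starts can be counted later.
    print(json.dumps({"cold_start": cold_start, "duration_ms": round(duration_ms, 3)}))
    return {"cold_start": cold_start, "result": result}
```

Calling `handler` twice in the same process reports a cold start only on the first call, which mirrors how a reused instance behaves.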

Set Up Smart Alarms: Set alarms in CloudMonitor for critical thresholds. For example:

● High error rate on a specific function.
● Latency spikes.
● Unusual drop or increase in invocation counts (may indicate a problem with the trigger).
● Too many cold starts.
● Memory usage nearing function limits.
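To make the idea of a threshold alarm concrete, here is a small sketch of the evaluation logic behind a "high error rate" rule. It is plain Python and does not call CloudMonitor; the names `PeriodStats`, `error_rate`, and `should_alarm`, and the "N consecutive periods" rule, are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeriodStats:
    """Aggregated counts for one evaluation period (for example, one minute)."""
    invocations: int
    errors: int


def error_rate(stats: PeriodStats) -> float:
    """Fraction of failed invocations; 0.0 when there was no traffic."""
    if stats.invocations == 0:
        return 0.0
    return stats.errors / stats.invocations


def should_alarm(periods: List[PeriodStats], threshold: float, consecutive: int) -> bool:
    """True when the last `consecutive` periods all exceed `threshold`."""
    if consecutive <= 0 or len(periods) < consecutive:
        return False
    recent = periods[-consecutive:]
    return all(error_rate(p) > threshold for p in recent)


if __name__ == "__main__":
    history = [
        PeriodStats(invocations=100, errors=1),
        PeriodStats(invocations=120, errors=9),
        PeriodStats(invocations=90, errors=8),
        PeriodStats(invocations=110, errors=12),
    ]
    # The last three periods are all above 5%, so this rule would fire.
    print(should_alarm(history, threshold=0.05, consecutive=3))
```

Requiring several consecutive breaches is a design choice that trades a slightly slower alert for fewer false alarms on short spikes.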

Utilize Dashboards: Set up custom dashboards in CloudMonitor to visualize these metrics in real-time so that you can gain a holistic view of your serverless application's health.

2. Centralize and Analyze Logs with Simple Log Service (SLS)

Logs are your first line of defense in debugging serverless applications.

Standardize Logging Format: Have your functions emit structured logs (e.g. JSON) with consistent fields so that they can be easily parsed and analyzed in SLS. Include information like request_id, function_name, timestamp, log_level, error_message, etc.

Enable Automatic Log Collection: Function Compute can deliver function logs to SLS automatically once logging is configured for the function with a target project and Logstore. Logtail, the SLS collection agent, is generally meant for servers and containers rather than for functions.
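As an illustration of such a format, the sketch below uses Python's standard `logging` module to emit one JSON object per log line with the fields listed above. The `JsonFormatter` and `build_logger` names are hypothetical helpers, and how the request ID is obtained in your runtime is left as an assumption (here it is passed explicitly through `extra`).

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object with consistent fields."""

    def __init__(self, function_name: str):
        super().__init__()
        self.function_name = function_name

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "log_level": record.levelname,
            "function_name": self.function_name,
            # request_id is attached by the caller through `extra`.
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the exception text when one was logged.
            entry["error_message"] = str(record.exc_info[1])
        return json.dumps(entry)


def build_logger(function_name: str) -> logging.Logger:
    logger = logging.getLogger(function_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines from the root logger
    if not logger.handlers:
        stream = logging.StreamHandler()
        stream.setFormatter(JsonFormatter(function_name))
        logger.addHandler(stream)
    return logger


if __name__ == "__main__":
    log = build_logger("order-processor")  # hypothetical function name
    log.info("order processed", extra={"request_id": "req-123"})
```

Keeping the field names identical across all functions is what makes a single index and a single set of queries usable for the whole application.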

Create Log Indexes: Create indexes in SLS for the fields most frequently queried, to make searches and analyses faster.

Create Dashboards and Alarms in SLS: In addition to running ad hoc queries, build SLS dashboards that show trends in the logs (for example, error log count over time) and set up alarms on particular patterns in the logs (for example, messages containing "critical errors").
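The "error log count over time" trend is a simple aggregation. The sketch below reproduces it locally in plain Python rather than in SLS query syntax, assuming each log line is a JSON object with `log_level` and an ISO 8601 `timestamp` string; `errors_per_minute` is a hypothetical helper name.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def errors_per_minute(lines: Iterable[str]) -> Dict[str, int]:
    """Count ERROR-level entries per minute bucket ("YYYY-MM-DDTHH:MM")."""
    counts: Counter = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured logs
        if not isinstance(entry, dict) or entry.get("log_level") != "ERROR":
            continue
        # The first 16 characters of an ISO 8601 timestamp identify the minute.
        minute = str(entry.get("timestamp", ""))[:16]
        counts[minute] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        '{"timestamp": "2024-01-01T10:00:05+00:00", "log_level": "ERROR"}',
        '{"timestamp": "2024-01-01T10:00:40+00:00", "log_level": "ERROR"}',
        '{"timestamp": "2024-01-01T10:01:10+00:00", "log_level": "INFO"}',
        "plain text line",
    ]
    print(errors_per_minute(sample))
```

A dashboard chart plots these per-minute counts, and an alarm fires when a bucket exceeds a limit you choose.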

Correlate Logs to Other Data: Leverage the request_id or trace_id in your logs to connect them to metrics and traces for a holistic view of an event.
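Correlation amounts to grouping records from different sources by a shared identifier. The following sketch assumes each record is a dictionary carrying a `request_id` and a sortable `timestamp`; the `source` field and the `group_by_request` helper are hypothetical names used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_request(entries: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group records that share a request_id and order each group by time."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for entry in entries:
        request_id = entry.get("request_id")
        if request_id is not None:
            grouped[request_id].append(entry)
    for events in grouped.values():
        events.sort(key=lambda e: e.get("timestamp", ""))
    return dict(grouped)


if __name__ == "__main__":
    records = [
        {"source": "log", "request_id": "req-1", "timestamp": "2024-01-01T10:00:02", "message": "db timeout"},
        {"source": "trace", "request_id": "req-1", "timestamp": "2024-01-01T10:00:01", "span": "query-orders"},
        {"source": "log", "request_id": "req-2", "timestamp": "2024-01-01T10:00:03", "message": "ok"},
    ]
    for request_id, events in group_by_request(records).items():
        print(request_id, [e["source"] for e in events])
```

With the records lined up per request, a slow span and the error log it produced appear side by side.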

3. Monitor External Dependencies

Serverless functions rarely run in isolation; they frequently interact with databases, message queues, APIs, and other cloud services.

● Monitor Dependent Services: Track the health and performance of the services your serverless functions depend on, such as OSS, ApsaraDB RDS, and Message Queue. CloudMonitor can help here.

● Track API Gateway Data: Where functions are invoked through API Gateway, monitor API Gateway metrics such as request count, latency, and error rates.

● Extend Tracing Across Dependencies: Ensure your tracing setup covers calls to external services so that you get a complete view of each transaction.

Other tools that can be integrated with Alibaba Cloud services for additional performance insight:

Honeycomb

Middleware

Prometheus

Conclusion:

Serverless monitoring on Alibaba Cloud is not a "set and forget" activity; it requires a deliberate, ongoing effort that combines the capabilities of CloudMonitor, Simple Log Service, ARMS, and Tracing Analysis. Implementing these best practices can give you deep insight into your serverless applications, help you identify and resolve issues quickly, and help you optimize your cloud-native workloads on Alibaba Cloud.


Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.


Neel_Shah
