
Serverless architectures offer strong scalability, cost-effectiveness, and development agility, but they make monitoring harder. Functions are ephemeral, they execute in a distributed manner across many services, and the underlying infrastructure is abstracted away as a "black box", so a proactive and comprehensive monitoring strategy is necessary.
Alibaba Cloud offers a strong set of services that, when used well, give you valuable insight into your serverless applications. In this post, I outline guidelines for serverless monitoring on Alibaba Cloud so that your applications run reliably and you can detect and fix issues quickly.
Metrics: Quantifiable data points that show what is happening in the system, such as invocations, errors, latency, memory usage, and CPU usage.
Logs: Detailed, time-stamped records of events that explain what happened. They are essential for debugging and for understanding how the application behaves.
Traces: End-to-end views of requests as they traverse multiple serverless functions and other services, helping you understand the flow and pinpoint performance bottlenecks.
Alibaba Cloud provides several services that together can form a complete monitoring strategy for serverless applications:
● Function Compute (FC): Alibaba Cloud's core FaaS product, with built-in monitoring capabilities.
● CloudMonitor: An all-in-one monitoring service that collects metrics from various Alibaba Cloud products, including Function Compute. It offers dashboards, alarms, and event monitoring.
● Simple Log Service (SLS): A full-fledged log management service for log data collection, ingestion, delivery, querying, and analysis.
● Application Real-Time Monitoring Service (ARMS): A cloud-native observability platform that includes application monitoring, browser monitoring, and a Prometheus service for end-to-end tracking and performance insight.
● Tracing Analysis: A distributed tracing service that helps you monitor and diagnose performance bottlenecks in distributed applications, including serverless functions.
● EventBridge: A serverless event bus that routes events from different sources to various targets, which enables automated responses.
Alibaba Cloud Function Compute automatically provides a number of essential metrics through CloudMonitor.
Monitor Core Metrics: Focus on key performance indicators (KPIs), such as:
● Invocations: Number of times your function executes.
● Errors: Number of failed invocations.
● Latency/Execution Duration: How long your function takes to execute.
● Cold Starts: Number of times a new instance of your function has to be initialized, which adds latency.
● Memory Usage: How much memory your function uses.
● Concurrent Invocations: Number of executions running at the same time.
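To make these KPIs concrete, here is a minimal sketch of how they could be derived from raw invocation records. The `Invocation` record and the `summarize` helper are hypothetical names invented for this illustration; they are not part of Function Compute or CloudMonitor, which compute these metrics for you.

```python
import math
from dataclasses import dataclass


@dataclass
class Invocation:
    """Hypothetical record of one function execution (illustration only)."""
    duration_ms: float
    failed: bool
    cold_start: bool
    memory_mb: float


def summarize(invocations):
    """Aggregate raw invocation records into the KPIs listed above."""
    total = len(invocations)
    if total == 0:
        return {"invocations": 0, "errors": 0, "error_rate": 0.0,
                "p95_duration_ms": 0.0, "cold_starts": 0, "max_memory_mb": 0.0}

    durations = sorted(i.duration_ms for i in invocations)
    # Nearest-rank percentile: the smallest sample with at least
    # 95% of all samples at or below it.
    rank = math.ceil(0.95 * total)
    errors = sum(i.failed for i in invocations)

    return {
        "invocations": total,
        "errors": errors,
        "error_rate": errors / total,
        "p95_duration_ms": durations[rank - 1],
        "cold_starts": sum(i.cold_start for i in invocations),
        "max_memory_mb": max(i.memory_mb for i in invocations),
    }
```

Tracking a high percentile such as p95 alongside the error rate tends to surface cold-start and dependency slowdowns that an average would hide.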
Set Up Smart Alarms: Set alarms in CloudMonitor for critical thresholds. For example:
● High error rate on a specific function.
● Latency spikes.
● Unusual drop or increase in invocation counts (may indicate a problem with the trigger).
● Too many cold starts.
● Memory usage nearing function limits.
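The alarm conditions above can be thought of as simple threshold rules evaluated against a window of metrics. The sketch below shows that logic locally in plain Python. It does not call the CloudMonitor API; the rule names, metric names, and threshold values are all assumptions chosen for illustration, and real thresholds should come from your own baselines.

```python
import operator
from dataclasses import dataclass


@dataclass
class AlarmRule:
    """Hypothetical threshold rule (not a CloudMonitor object)."""
    name: str
    metric: str
    comparator: str  # ">" or "<"
    threshold: float


_COMPARATORS = {">": operator.gt, "<": operator.lt}

# Illustrative thresholds only; tune them to your workload.
RULES = [
    AlarmRule("high-error-rate", "error_rate", ">", 0.05),
    AlarmRule("latency-spike", "p95_duration_ms", ">", 1000),
    AlarmRule("invocation-drop", "invocations", "<", 10),
    AlarmRule("too-many-cold-starts", "cold_start_ratio", ">", 0.2),
    AlarmRule("memory-near-limit", "memory_utilization", ">", 0.9),
]


def evaluate(rules, metrics):
    """Return the names of rules whose metric breaches its threshold."""
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            continue  # metric was not reported in this window
        if _COMPARATORS[rule.comparator](value, rule.threshold):
            fired.append(rule.name)
    return fired
```

Keeping rules as data rather than code makes it easier to review thresholds and to mirror them in whatever alarm configuration you maintain in CloudMonitor.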
Utilize Dashboards: Set up custom dashboards in CloudMonitor to visualize these metrics in real-time so that you can gain a holistic view of your serverless application's health.
Logs are your first line of defense in debugging serverless applications.
● Standardize Logging Format: Make sure that structured logs (e.g. JSON) with consistent fields are emitted by your functions so that they can be easily parsed and analyzed in SLS. Include information like request_id, function_name, timestamp, log_level, error_message, etc.
● Use Logtail for Automatic Collection: For Function Compute, SLS can be configured to collect logs automatically. Set up Logtail to ship logs from your functions into a designated Logstore.
● Create Log Indexes: Create indexes in SLS for the fields most frequently queried, to make searches and analyses faster.
● Create Dashboards and Alarms in SLS: Beyond running ad hoc queries, build SLS dashboards that show trends in the logs (for example, error log count over time) and set up alarms on particular log patterns (for example, messages containing "critical errors").
● Correlate Logs to Other Data: Leverage the request_id or trace_id in your logs to connect them to metrics and traces for a holistic view of an event.
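As a rough sketch of the structured logging and correlation practices above, the following uses Python's standard `logging` module to emit one JSON object per log entry with consistent fields. `JsonFormatter` and `get_logger` are hypothetical helper names, and the field set simply mirrors the fields suggested above; nothing here is specific to the SLS SDK.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object with consistent fields."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc).isoformat(),
            "log_level": record.levelname,
            # These two attributes are attached by the LoggerAdapter below.
            "function_name": getattr(record, "function_name", None),
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        }
        if record.exc_info:
            entry["error_message"] = str(record.exc_info[1])
        return json.dumps(entry)


def get_logger(function_name, request_id):
    """Return a logger that stamps every entry with the correlation fields."""
    logger = logging.getLogger("app")
    if not logger.handlers:  # avoid duplicate handlers on warm invocations
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logging.LoggerAdapter(
        logger, {"function_name": function_name, "request_id": request_id})
```

Inside a function handler you would call `get_logger` once per invocation, passing the function name and request ID that your runtime exposes (where exactly those values live on the context object is an assumption to verify against the Function Compute runtime documentation). Because every entry then carries the same `request_id`, a single field query in SLS can pull together all log lines for one request.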
Serverless functions rarely run in isolation; they frequently interact with databases, message queues, APIs, and other cloud services.
● Monitor Dependent Services: Track the external services your serverless functions depend on, such as OSS, ApsaraDB RDS, and Message Queue. CloudMonitor can help here.
● Track API Gateway Data: If your functions are exposed through API Gateway, monitor API Gateway metrics such as request count, latency, and error rates.
● Extend Tracing Across Dependencies: Ensure your tracing setup extends to external service calls so that you get a complete view of each transaction.
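To illustrate what extending tracing to dependency calls means, here is a minimal, library-free sketch that times each external call as a span under a shared `trace_id`. The `Trace` class is a hypothetical stand-in written for this post; it is not the Tracing Analysis SDK, and in a real application you would more likely use an instrumentation client that Tracing Analysis accepts.

```python
import time
import uuid
from contextlib import contextmanager


class Trace:
    """Collects timed spans that share one trace_id (illustration only)."""

    def __init__(self, trace_id=None):
        self.trace_id = trace_id or uuid.uuid4().hex
        self.spans = []

    @contextmanager
    def span(self, name):
        """Time the wrapped block and record it, even if it raises."""
        start = time.perf_counter()
        status = "ok"
        try:
            yield
        except Exception:
            status = "error"
            raise
        finally:
            self.spans.append({
                "trace_id": self.trace_id,
                "name": name,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000,
            })

    def slowest(self):
        """Return the span with the longest duration, or None if empty."""
        return max(self.spans, key=lambda s: s["duration_ms"], default=None)
```

Wrapping each database query, object storage call, or downstream API request in `with trace.span("..."):` makes it possible to see which dependency dominates a slow request, and logging the same `trace_id` ties those spans back to the log entries for that request.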
Some other tools to integrate with Alibaba tools for better performance and insights:
Serverless monitoring on Alibaba Cloud is not a "set and forget" activity; it is an ongoing practice that combines the capabilities of CloudMonitor, Simple Log Service, ARMS, and Tracing Analysis. Implementing these best practices gives you deep insight into what is happening in your serverless applications, helps you identify and resolve issues quickly, and lets you optimize the operation of your cloud-native workloads on Alibaba Cloud.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.