Container orchestration tools such as Kubernetes and Docker Swarm allow developers to manage and monitor enterprise production and development environments made up of multiple containerized applications. Both tools are built directly into Alibaba Cloud Container Service.
In this tutorial, we will show you how to build a Kubernetes application environment that runs on Alibaba Cloud's Container Service for Kubernetes. Alibaba Cloud's Container Service for Kubernetes is made up of a Virtual Private Cloud (VPC) containing clusters of Alibaba Cloud Elastic Compute Service (ECS) instances, on which Kubernetes orchestrates and manages industrial-scale containerized environments.
Before you can run Kubernetes containerized application environments on Alibaba Cloud's Container Service for Kubernetes, you need an Alibaba Cloud Container Service Kubernetes cluster.
Alibaba Cloud's Container Service is a scalable, reliable, high-performance container management service that allows you to orchestrate and manage containerized application lifecycles with either Kubernetes or Docker Swarm.
Alibaba Cloud's Container Service offers multiple application release methods, including continuous integration, and it also supports a microservices architecture. Container Service for Kubernetes provides enterprise-level performance and flexibility for the management of Kubernetes containerized applications at every stage of the development lifecycle.
Alibaba Cloud's Container Service for Kubernetes simplifies cluster creation and management and makes it easy to scale up. It also integrates automatically with Alibaba Cloud's virtualization, storage, network, and security services, which improves and simplifies the overall running environment for Kubernetes containerized applications.
Alibaba Cloud's Container Service is one of the first cloud container services to have passed the Certified Kubernetes Conformance Program.
To learn more about Kubernetes (K8s), refer to Create a Containerized App on Alibaba Cloud Container Service for Kubernetes.
This article describes how to rapidly construct a GPU monitoring solution based on Prometheus and Grafana on Alibaba Cloud Container Service for Kubernetes.
When you train artificial intelligence (AI) models on an Alibaba Cloud Container Service for Kubernetes cluster built on GPU ECS hosts, you need to know the GPU status of each pod. For example, you may need to know the video memory usage, GPU usage, and GPU temperature to ensure the stability of services.
What Is Prometheus?
Prometheus is an open-source service monitoring system and a time series database. Since its inception in 2012 and its release as open source on GitHub in 2015, Prometheus has been adopted by many companies and organizations. Prometheus joined the Cloud Native Computing Foundation (CNCF) in 2016 as the second hosted project, after Kubernetes. It became a CNCF graduated project in August 2018.
As a next-generation open-source solution, Prometheus embodies many operations and maintenance (O&M) ideas that align with those of Google SRE.
Set Up Container Service for Kubernetes
Prerequisites: You have created a Kubernetes cluster consisting of GPU ECS hosts through Container Service.
Log on to the Container Service console and select Container Service - Kubernetes. Choose Application > Deployment and click Create by Template.
Select your GPU cluster and namespace. (For example, you can select the kube-system namespace.) Fill in the YAML configuration template to deploy Prometheus and GPU-Exporter.
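Once Prometheus and the exporter are running, GPU readings can be pulled through the Prometheus HTTP API (`/api/v1/query`). The following is a minimal sketch rather than part of the original solution: the service address and the metric name `gpu_temperature_celsius` are assumptions made for illustration, and the real names depend on the template and exporter you deploy. A canned response is used so that the sketch runs without a cluster.

```python
import json
from urllib.parse import urlencode

# Hypothetical values: the service address and the metric name depend on
# the template you deploy and on the exporter you use.
PROMETHEUS_URL = "http://prometheus.kube-system.svc:9090"
GPU_TEMP_METRIC = "gpu_temperature_celsius"


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    data = body["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample looks like {"metric": {labels}, "value": [timestamp, "value"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def hottest_gpus(samples, limit_celsius):
    """Return the samples above the given limit, hottest first."""
    hot = [s for s in samples if s[1] > limit_celsius]
    return sorted(hot, key=lambda s: s[1], reverse=True)


# A canned response in the shape Prometheus returns (pod names are made up).
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": GPU_TEMP_METRIC, "pod": "train-a", "gpu": "0"},
             "value": [1600000000.0, "71"]},
            {"metric": {"__name__": GPU_TEMP_METRIC, "pod": "train-b", "gpu": "0"},
             "value": [1600000000.0, "86"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url(PROMETHEUS_URL, GPU_TEMP_METRIC))
    samples = parse_instant_vector(SAMPLE_RESPONSE)
    for labels, value in hottest_gpus(samples, 80.0):
        print(labels["pod"], labels["gpu"], value)
```

In a real cluster, the URL built by `build_query_url` would be fetched with an HTTP client and the response body passed to `parse_instant_vector`. Grafana issues the same kind of query when it renders a dashboard panel.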
This article describes some common methods and best practices for container log processing by taking Docker as an example.
Docker, Inc. (formerly known as dotCloud, Inc.) released Docker as an open source project in 2013. Container products represented by Docker then quickly became popular around the world thanks to features such as good isolation, high portability, low resource consumption, and rapid startup. The following figure shows the search trends for Docker and OpenStack starting from 2013.
Container technology brings many conveniences, for example in application deployment and delivery. It also brings many challenges for log processing, such as:
This article describes some common methods and best practices for container log processing by taking Docker as an example. Summarized by the Alibaba Cloud Log Service team from many years of work in the log processing field, these methods and practices are:
To collect logs, you must first figure out where the logs are stored. This article shows you how to collect NGINX and Tomcat container logs.
NGINX generates two log files: access.log and error.log. The NGINX Dockerfile redirects access.log and error.log to STDOUT and STDERR, respectively.
Tomcat generates multiple log files, including catalina.log, access.log, manager.log, and host-manager.log. The Tomcat Dockerfile does not redirect these logs to the standard output. Instead, they are stored inside the containers.
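The difference matters for collection because Docker only captures what a container writes to STDOUT and STDERR. As a minimal sketch, assuming the container uses Docker's default `json-file` logging driver (an assumption, since the text does not name a driver), each captured line is a JSON object with `log`, `stream`, and `time` fields. The sample lines below are made up for illustration.

```python
import json
from collections import defaultdict


def split_streams(json_log_lines):
    """Group container log entries by stream ("stdout" or "stderr")."""
    streams = defaultdict(list)
    for raw in json_log_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        entry = json.loads(raw)
        # The "log" field keeps the trailing newline written by the process.
        streams[entry["stream"]].append(entry["log"].rstrip("\n"))
    return dict(streams)


# Made-up sample lines in the json-file format.
SAMPLE = [
    '{"log":"GET /index.html 200\\n","stream":"stdout","time":"2020-01-01T00:00:00.000000000Z"}',
    '{"log":"open() failed (2: No such file)\\n","stream":"stderr","time":"2020-01-01T00:00:01.000000000Z"}',
    '{"log":"GET /missing 404\\n","stream":"stdout","time":"2020-01-01T00:00:02.000000000Z"}',
]

if __name__ == "__main__":
    for stream, lines in split_streams(SAMPLE).items():
        print(stream, lines)
```

With NGINX, access entries arrive on `stdout` and error entries on `stderr`, so a collector reading this stream sees both. Tomcat's files never reach this stream, so they have to be collected from inside the container or from a mounted volume.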
In this tutorial, we will see how Cloud Monitor can notify us when an exception occurs in an application.
The Alibaba Cloud Monitor service monitors cloud resource usage and the health of applications and resources by collecting various monitoring metrics. In this tutorial, we will see how Cloud Monitor can notify us when an exception occurs in an application. The sample application used to demonstrate this Cloud Monitor use case is a simple web application for uploading files to an OSS bucket.
The application writes error logs to Alibaba Cloud Log Service, and the logs are pulled into Cloud Monitor to monitor the sample application for exceptions.
A few more use cases are possible using the same idea.
In the design below, I have implemented a scenario where the system administrator is notified by email if any failure occurs while the file upload web application is in use. The failure could be a loss of connection between OSS and the application, or any other OSS technical or permission issue.
The file upload web application is designed to write error log messages to Log Service. Cloud Monitor monitors the log messages written to Log Service. If Log Monitor finds that the number of error logs exceeds the threshold count, it automatically sends an email alert to the system administrator stating that the web application encountered an error.
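The alerting rule itself is simple to state in code. The sketch below only illustrates the "count errors in a window and compare against a threshold" logic. It is not how Cloud Monitor is configured or implemented, and the `LogEntry` type, the window, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    timestamp: float  # seconds since the epoch
    level: str        # for example "INFO" or "ERROR"
    message: str


def count_errors(entries, window_start, window_end):
    """Count ERROR entries whose timestamp falls in [window_start, window_end)."""
    return sum(
        1
        for e in entries
        if e.level == "ERROR" and window_start <= e.timestamp < window_end
    )


def should_alert(entries, window_start, window_end, threshold):
    """Alert only when the error count exceeds the threshold."""
    return count_errors(entries, window_start, window_end) > threshold


# Made-up log entries for a 60-second window starting at t=0.
ENTRIES = [
    LogEntry(5.0, "INFO", "upload ok"),
    LogEntry(12.0, "ERROR", "OSS connection lost"),
    LogEntry(30.0, "ERROR", "OSS permission denied"),
    LogEntry(75.0, "ERROR", "OSS connection lost"),  # outside the window
]

if __name__ == "__main__":
    print(should_alert(ENTRIES, 0.0, 60.0, threshold=1))
```

In the tutorial's design, the counting and the email delivery are handled by the managed services, so the application only has to write its error messages to Log Service.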
This tutorial is easy to understand and follow if you are familiar with the following technologies and concepts:
Click here to learn the basics of Cloud Message service
Python Programming Knowledge and Flask Web Framework
This course is designed to help IT companies that want to containerize business applications, as well as cloud computing engineers and operations & maintenance engineers who want to learn how to diagnose problems in and monitor containerized applications. By taking this course, you can fully understand what problem diagnosis for containerized applications involves, the common problems of containerized applications, the basic workflow of diagnosing problems, the monitoring schemes and common tools for containerized applications, and the visual monitoring scheme based on Alibaba Cloud Container Service.
This course helps you quickly master the Alibaba Cloud Monitoring & Management services so that you can manage resources on Alibaba Cloud efficiently. It mainly explains the functions and basic usage of two services, Alibaba Cloud ActionTrail and Cloud Monitor, and demonstrates how to operate them.
This course is designed to help IT companies that want to containerize business applications, as well as cloud computing engineers and operations & maintenance engineers who want to learn about performance testing and optimization of containerized applications. By taking this course, you can fully understand what performance testing and optimization are, their main objects, the common methods, basic procedures, and common tools for performance testing and optimization of containerized applications, and how to carry out performance testing based on Alibaba Cloud Container Service.
Resource monitoring is the most common monitoring operation in Container Service for Kubernetes. You can conveniently monitor the usage of resources such as CPU, memory, and network. Container Service has integrated CloudMonitor for resource monitoring. By default, CloudMonitor is installed and integrated for all newly created clusters.
Event monitoring is another monitoring method in Kubernetes. It makes up for the shortcomings of resource monitoring in timeliness, accuracy, and scenario coverage. The core design concept of Kubernetes is the state machine. Transitions between states generate corresponding events. Specifically, a Normal event is generated when the state machine changes to a desired state, and a Warning event is generated when it changes to an unexpected state. Developers can obtain events to diagnose cluster exceptions and problems in real time.
Maintained by Alibaba Cloud Container Service, kube-eventer is an open-source event emitter that sends Kubernetes events to systems such as DingTalk and Log Service. It also provides filter conditions of different levels to realize real-time event collection, targeted alerting, and asynchronous archiving. For more information, see kube-eventer.
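To illustrate the idea of filtering events by level before alerting, here is a minimal sketch that works on plain dictionaries shaped like Kubernetes Event objects (with the `type`, `reason`, `message`, and `involvedObject` fields). It is not kube-eventer's implementation or configuration syntax, and the function name and sample events are hypothetical.

```python
def select_events(events, event_type="Warning", reasons=None):
    """Keep events of the given type, optionally restricted to some reasons."""
    selected = []
    for event in events:
        if event.get("type") != event_type:
            continue
        if reasons is not None and event.get("reason") not in reasons:
            continue
        selected.append(event)
    return selected


def format_alert(event):
    """Render one event as a short, single-line alert message."""
    obj = event.get("involvedObject", {})
    return (
        f"[{event['type']}] {obj.get('kind', '?')}/{obj.get('name', '?')}: "
        f"{event['reason']} - {event['message']}"
    )


# Made-up events in the shape of Kubernetes Event objects.
EVENTS = [
    {"type": "Normal", "reason": "Scheduled", "message": "Assigned pod to node",
     "involvedObject": {"kind": "Pod", "name": "web-1"}},
    {"type": "Warning", "reason": "BackOff", "message": "Back-off restarting failed container",
     "involvedObject": {"kind": "Pod", "name": "web-2"}},
    {"type": "Warning", "reason": "FailedScheduling", "message": "Insufficient memory",
     "involvedObject": {"kind": "Pod", "name": "web-3"}},
]

if __name__ == "__main__":
    for event in select_events(EVENTS, reasons={"BackOff"}):
        print(format_alert(event))
```

Filtering on `type` alone gives a broad alert on every unexpected state change, while adding `reasons` narrows the alert to specific failure modes.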
This course is designed to help IT companies who want to containerize their business applications, as well as cloud computing engineers and cloud computing enthusiasts who want to learn container technologies.
Navicat Monitor uses an agentless architecture to monitor your MySQL and MariaDB servers and collect metrics at regular intervals. It collects process metrics such as CPU load, RAM usage, and a variety of other resources over SSH/SNMP. Navicat Monitor can be installed on any local computer or virtual machine and does not require any software installation on the servers being monitored.
A PaaS platform for a variety of application deployment options and microservices solutions to help you monitor, diagnose, operate and maintain your applications
CloudMonitor collects monitoring metrics of Alibaba Cloud resources as well as custom metrics. The service can be used to detect the availability of your service and allows you to set alarms on specific metrics. CloudMonitor enables you to view and fully understand the usage of your cloud resources and the status and health of your business, so that you can act promptly to ensure the availability of your application when an alarm is triggered.
Alibaba Clouder - September 7, 2020
Alibaba Cloud Container Service for Kubernetes is a fully managed cloud container management service that supports native Kubernetes and integrates with other Alibaba Cloud products.
Build business monitoring capabilities with real time response based on frontend monitoring, application monitoring, and custom business monitoring capabilities.