Community Blog Getting Started with OpenKruiseGame

Getting Started with OpenKruiseGame

This article introduces OpenKruiseGame and its installation and design concept.


OpenKruiseGame (OKG) is a multicloud-oriented, open source Kubernetes workload specialized for game servers. It is a sub-project of the open source workload project OpenKruise of the Cloud Native Computing Foundation (CNCF) in the gaming field. OpenKruiseGame makes the cloud-native transformation of game servers easier, faster, and stabler.


What Is OpenKruiseGame?

OpenKruiseGame is a custom Kubernetes workload designed specially for game server scenarios. It simplifies the cloud-native transformation of game servers. Compared with the built-in workloads of Kubernetes, such as Deployment and StatefulSet, OpenKruiseGame provides common game server management features, such as hot update, in-place update, and management of specified game servers. In addition, OpenKruiseGame connects game servers to cloud service providers, matchmaking services, and O&M platforms. It automatically integrates features such as logging, monitoring, network, storage, elasticity, and matching by using low-code or zero-code technologies during the cloud-native transformation of game servers. With the consistent delivery standard of Kubernetes, OpenKruiseGame implements centralized management of clusters on multiple clouds and hybrid clouds. OpenKruiseGame is a fully open source project. It allows developers to customize workloads and build the release and O&M platforms for game servers by using custom development. OpenKruiseGame can use Kubernetes templates or call APIs to use or extend features. It can also connect to delivery systems, such as KubeVela, to implement the orchestration and full lifecycle management of game servers on a GUI.

Why Is OpenKruiseGame Needed?

Kubernetes is an application delivery and O&M standard in the cloud-native era. The capabilities of Kubernetes such as declarative resource management, auto scaling, and consistent delivery in a multi-cloud environment can provide support for game server scenarios that cover fast server activation, cost control, version management, and global reach. However, certain features of game servers make it difficult to adapt game servers for Kubernetes. For example:

  • Hot update or hot reload To ensure a better game experience for players, many game servers are updated by using hot update or hot reload. However, for various workloads of Kubernetes, the lifecycle of pods is consistent with that of images. When an image is published, pods are recreated. When pods are recreated, issues may occur such as interruptions to player battles and changes in the network metadata of player servers.
  • O&M for specified game servers Game servers are stateful in most scenarios. For example, when a player versus player (PVP) game is updated or goes offline, only game servers without online active players can be changed; when game servers for a player versus environment (PVE) game are suspended or merged, you can perform operations on game servers with specific IDs.
  • Network models suitable for games The network models in Kubernetes are implemented by declaring Services. In most cases, the network models are applicable to stateless scenarios. For network-sensitive game servers, a solution with high-performance gateways, fixed IP addresses and ports, or lossless direct connections is more suitable for actual business scenarios.
  • Game server orchestration The architecture of game servers has become increasingly complex. The player servers for many massive multiplayer online role-playing games (MMORPGs) are combinations of game servers with different features and purposes, such as a gateway server responsible for network access, a central server for running the game engine, and a policy server responsible for game scripts and gameplay. Different game servers have different capacities and management policies. Hence, it is difficult to describe and quickly deliver all the game servers by using a single workload type. The preceding challenges make it difficult to implement cloud-native transformation of game servers. OpenKruiseGame is aimed to abstract the common requirements of the gaming industry, and use the semantic method to make the cloud-native transformation of various game servers simple, efficient, and secure.

List of Core Features

OpenKruiseGame has the following core features:

  • Hot update based on images and hot reload of configurations
  • Update, deletion, and isolation of specified game servers
  • Multiple built-in network models (fixed IP address and port, lossless direct connection, and global acceleration)
  • Auto scaling
  • Automated O&M (service quality)
  • Independent of cloud service providers
  • Complex game server orchestration

Install OpenKruiseGame

Installation Instructions

Installing OpenKruiseGame requires Kruise and Kruise-Game to be installed and Kubernetes version >= 1.16.

Install Kruise

We recommend that you use Helm V3.5 or later to install Kruise.

# Firstly add openkruise charts repository if you haven't do this.
$ helm repo add openkruise https://openkruise.github.io/charts/
# [Optional]
$ helm repo update
# Install the latest version.
$ helm install kruise openkruise/kruise --version 1.5.0

Install Kruise-Game

$ helm install kruise-game openkruise/kruise-game --version 0.6.1

Upgrade Kruise-Game

$ helm upgrade kruise-game openkruise/kruise-game --version 0.6.1 [--force]


Optional: Install/Upgrade with Customized Configurations

The following table lists the configurable parameters of the kruise-game chart and their default values.


Specify each parameter using the --set key=value[,key=value] argument to helm install.

For example,

Optional: The Local Image for China

If you are in China and have problem to pull image from official DockerHub, you can use the registry hosted on Alibaba Cloud:

$ helm install kruise-game https://... --set image.repository=registry.cn-hangzhou.aliyuncs.com/acs/kruise-game-manager

Uninstall OpenKruiseGame

Note that this will lead to all resources created by kruise-game, including webhook configurations, services, namespace, CRDs and CR instances kruise-game controller, to be deleted!

Please do this ONLY when you fully understand the consequence.

To uninstall kruise-game if it is installed with helm charts:

$ helm uninstall kruise-game
release "kruise-game" uninstalled


Q: Error no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

A: This is because the cluster does not have the prometheus operator installed. enabling the playsuit monitoring feature requires the prometheus operator to be installed on the Kubernetes cluster. If you do not use this feature, you can set prometheus.enabled to false during installation (the default is true)

Q: Error CustomResourceDefinition "poddnats.alibabacloud.com" in namespace "" exists and cannot be imported into the cureent release

A: This is because the CRD is already installed in the cluster and you can set cloudProvider.installCRD to false during installation (default is true)

Design Concept

Purpose of Open Source OpenKruiseGame

I have been working on cloud-native services from Swarm to Kubernetes since 2015. The types of loads running on container clusters range from websites and API services in the early phase to transcoding and AI training later, and then to Metaverse, Web3, and graphical applications. We have witnessed how cloud-native technology is changing industries one by one. However, gaming is a very special industry. A large-scale game includes different roles such as gateways, platform servers, game servers, and matching services. Many game companies have performed cloud-native transformation on services such as platform servers and gateways. However, the containerization of game servers is relatively slow. After I talked with many game developers and O&M personnel, I have found that this situation is roughly attributable to the following key reasons:

  1. Replacing the deployment architecture of a running game server leads to an excessively high risk return ratio.
  2. Some core features, such as game rolling update and merger and suspension of specified servers, are missing during cloud-native transformation of game servers.
  3. We lack best practices and success stories on cloud-native transformation of game servers.

To solve the above problems, we have joined forces with a number of game companies such as Lingxi Games, abstracted the general capabilities for cloud-native transformation of game servers, and developed the open source project OpenKruiseGame. We hope to deliver best practices on cloud-native transformation of game servers to more game developers in an open source project that is independent of all cloud vendors. We also hope that more and more game companies, studios, and developers can join the community, discussing practical problems and scenarios with others and sharing experience in cloud-native transformation of game servers.

— Liu Zhongwei, initiator of the OpenKruiseGame project, Alibaba Cloud Container Service for Kubernetes(ACK)

Lingxi Games has fully embraced cloud-native architecture. In the process of cloud-native transformation, we have clearly realized that game servers are different from other Web applications, and game server management in Kubernetes clusters is very complex. The management feature provided by Kubernetes-native workloads can hardly meet the daily O&M requirements of game servers. Deployment cannot generate fix IDs and thus is not appropriate for StatefulSet features. Whereas, StatefulSet cannot perform management on specified game servers flexibly. To overcome these difficulties, we have developed Platform as a Service (PaaS), which provides game server orchestration and management features to realize efficient O&M actions such as server activation and updates.

— Feng Moujie, head of Lingxi Games Container Cloud, Alibaba Group

As a large-scale game distribution platform, Bilibili has a large number of internal and external game projects with a heterogeneous architecture, and these projects need to be managed and maintained. In the context of reducing cost and enhancing efficiency, it is imperative to migrate game projects from traditional virtual machines to Kubernetes. However, the native Kubernetes is relatively weak in the face of scenarios such as game rolling updates, multi-environment management, abstraction of partition servers for roll server games, and service access to Internet traffic. A low-cost and efficient cross-cloud solution is required to improve the preceding situation. OpenKruiseGame is developed based on OpenKruise and provides features such as fixed ID and in-place upgrade which are appealing to gaming scenarios. This offers an alternative choice for the containerization of game servers.

— Li Ning, head of Game O&M Team, Bilibili

When you try to perform cloud-native transformation of game servers, take network conditions as your primary concern. As game servers are migrated from virtual machines to containers, fixed IP addresses are required to ensure that the machine IP-based O&M method works in Kubernetes. The external service mode also becomes more complex, because it is no longer as simple as exposing ports on virtual machines. Besides, the state of each process of a game server in the pod can hardly be perceived. The recreation policy of native Kubernetes affects the game stability. Therefore, a specific perception policy is required for you to take different actions for different detection results.

— Sheng Hao, head of Game Cloud Platform, Guanying Mutual Entertainment

Why Is OpenKruiseGame a Workload?


The key to cloud-native transformation of game servers is to address two concerns: lifecycle management and O&M management of game servers. Kubernetes has built-in general workload models such as Deployment, StatefulSet, and Job. However, the management of game server states is more fine-grained and more precise. For example, for game servers, a rolling update mechanism is required to ensure a shorter game interruption, and in-place updates are required to ensure that the network-based metadata information remains unchanged. A mechanism to ensure logouts occur only in zero-player game servers during auto scaling and the capability to allow to manually perform O&M, diagnosis, and isolation on game servers are required. The preceding requirements cannot be met by using only the built-in workloads of Kubernetes.

The workloads in Kubernetes also act as an important hub for seamless integration with infrastructure. For example, you can use the Annotations fields to implement automatic connection of the monitoring system and log system to applications, use the nodeSelector field to schedule underlying resources and bind applications to these resources, and use the labels fields to record metadata information such as groups so as to replace the traditional Configuration Management Database (CMDB) system. Therefore, custom workloads are suitable for different types of applications in Kubernetes. OpenKruiseGame is a Kubernetes workload that is designed for gaming scenarios and allows developers to perform better lifecycle management and O&M management of game servers. Moreover, developers can make advantages of capabilities of cloud products by using OpenKruiseGame without the need to develop any additional code.

Design Concept of OpenKruiseGame

OpenKruiseGame consists of only two CustomResourceDefinition (CRD) objects: GameServerSet and GameServer. The design concept of OpenKruiseGame is based on state control, which divides different responsibilities into different workloads for control.

  • GameServerSet (lifecycle management)

Refers to the abstraction of lifecycle management for a group of game servers. It is mainly used for lifecycle control such as replica number management and game server launch.

  • GameServer (management and O&M operations on specified game servers)

Refers to the abstraction of O&M and management operations on a specified game server. It is mainly used for O&M and management operations such as update sequence control, state control of the game server, and network changes of the game server.

After understanding the design concept of OpenKruiseGame, you can quickly draw some interesting inferences. Here are some examples:

  • Will a game server be deleted if a GameServer object is accidentally deleted?

No. A game server is not deleted if a GameServer object is accidentally deleted. GameServer only records the state information about different O&M operations on a game server. If GameServer is deleted, another GameServer object that uses the default settings is created. In this case, your GameServer is recreated based on the default configurations of the game server template defined in GameServerSet.

  • How do we better integrate the matching service and auto scaling to prevent players from being forced to log out?

The service quality capability can be used to convert players' tasks of a game to the state of GameServer. The matching service perceives the state of GameServer and controls the number of replicas for the scale-in or scale-out. GameServerSet also determines the sequence of deletion based on the state of GameServer, thus achieving smooth logout.

Deployment Architecture of OpenKruiseGame


The deployment model of OpenKruiseGame consists of three parts:

1. OpenKruiseGame controller
It performs lifecycle management of GameServerSet and GameServer. OpenKruiseGame controller has a built-in Cloud Provider module to adapt to the differences of different cloud service providers in scenarios such as network plug-ins. This allows OpenKruiseGame to universally deploy a set of codes for all scenarios.

2. OpenKruise controller
It performs lifecycle management of pods. It is a dependent component of OpenKruiseGame and the OpenKruiseGame users and developers do not need to manage the controller.

3.  OpenKruiseGame O&M console [to be built]
It provides the O&M console and APIs for developers who want to use OpenKruiseGame in a visualized way. It allows you to perform lifecycle management and orchestration on game servers.

If you like OpenKruiseGame, give it a star on GitHub!

0 0 0
Share on

You may also like


Related Products

  • Function Compute

    Alibaba Cloud Function Compute is a fully-managed event-driven compute service. It allows you to focus on writing and uploading code without the need to manage infrastructure such as servers.

    Learn More
  • Gaming Solution

    When demand is unpredictable or testing is required for new features, the ability to spin capacity up or down is made easy with Alibaba Cloud gaming solutions.

    Learn More
  • Managed Service for Prometheus

    Multi-source metrics are aggregated to monitor the status of your business and services in real time.

    Learn More
  • Cloud Database Solutions for Gaming

    Alibaba Cloud’s world-leading database technologies solve all data problems for game companies, bringing you matured and customized architectures with high scalability, reliability, and agility.

    Learn More