edit-icon download-icon

Scenarios

Last Updated: Jan 16, 2018

As one of the most widely used middleware products, ACM has been adopted extensively to manage configurations of Alibaba’s internal applications since 2008. ACM has a wide range of use cases in many core scenarios. This document describes a few typical scenarios where ACM is used.

Configuration management under the microservice application architecture

Under the microservice architecture, configuration management (such as DB_URL access information, service connection pool, and internal cache size of services) becomes cumbersome as the number of applications and machines increases. In this case, configuration distribution across multiple machines in a single application, and application-to-application configuration dependency can be great challenges.

In traditional architecture, the entire application must be re-packaged and published again even though only one configuration item is modified. This process is complex and error-prone, as shown in the following figure.

ACM_0003

In an ACM-based microservice scenario, important application configuration information is published to ACM. The release of new configurations does not require configuration packaging. The applications take effect immediately after the new configurations are released, as shown in the following figure.

ACM_0004

The use of ACM as a configuration center brings microservices the following benefits:

  • All configurations are centralized, which makes it easier to manage the configurations of large amount of applications.
  • Configuration publishing is not dependent on version updates, making it more flexible to modify configurations.
  • ACM supports gray release and rollback, improving the security of configuration updates in a microservice architecture.

Service governance under the distributed architecture

Under various distributed architectures, it is critical to optimize service governance based on a certain RPC framework (such as RESTful, HSF, and Dubbo). Service governance functions such as service routing, rate limiting and throttling, service degradation, and authentication can be achieved using the configuration center.

Take request throttling and service degradation as an example. During Alibaba’s Double 11 events, each operation related to request throttling or service degradation requires response within seconds. This can be achieved using ACM.

In this process, the server of each RPC listens for the service rate limiting information by registering listeners through ACM. When an application requires rate throttling, the administrator performs request control in the service governance console. Then, the service governance system pushes the throttling information through ACM to the target application server to enable the corresponding configuration to take appropriate throttling action.

ACM_0005

ACM brings the following benefits to service governance under distributed architectures:

  • Good performance. Listens for service governance information through configuration push without compromising performance.
  • Quick response. Service governance information can be pushed in seconds.
  • Quick error correction. When service rate throttling or degradation information is pushed incorrectly, configuration can be rolled back in seconds.

Dynamic configuration push in business scenarios

Another typical scenario of ACM is to help speed up the response of web pages to promotion activities, so as to reduce development costs and improve efficiency.

Take e-commerce operations as an example. By embedding the ACM configuration (such as the third-party library version number and the static resource URL) in the frontend Javascript, operation staff can modify ACM configuration rules using operation tools to bring frontend Javascript presentation into effect when running promotional activities.

ACM_0011

ACM brings the following benefit to configuration push in business scenarios:

  • Decoupling static business code from business scenarios using configurations to greatly optimize the operation-related application publishing process.

Algorithm adjustment for real-time big data computing

In real-time big data computing, calculation parameters are adjusted dynamically to obtain the most accurate real-time calculation results.

Take an APM monitoring system in Alibaba as an example. The monitoring system dynamically adjusts the threshold values of businesses to control the real-time computing system and calculate business alarms. The threshold value modification must be completed in real time without application downtime. The calculated threshold values of the monitoring system are pushed under ACM rules.

ACM_0006

ACM brings the following benefit to real-time big data computing scenarios:

  • Dynamic configuration of calculation parameters, which can take effect quickly with little impact on performance.

Multi-region active-active architecture in enterprise Internet architectures

Multi-region active-active solution is an advanced disaster recovery architecture in enterprise Internet architectures. Compared with the traditional disaster recovery architecture, it features quick business recovery, low capacity requirements, and effective and simple maintenance. Currently, the multi-region active-active architecture has been widely used in companies like Alibaba and Ele.me.

In Alibaba, the core algorithms, ID shards and relevant routing rules of the multi-region active-active architecture are all pushed by ACM dynamically. The related clients and servers, such as RPC, MQ, and DB are embedded with the routing path. During a disaster recovery drill test or when a real disaster occurs, the administrator only needs to dynamically push the rules, and these rules will have an affect on every architecture component. This is shown in the following figure.

acm_dr

The use of ACM brings the following benefits to applications in multi-region active-active architectures:

  • The infrastructure and the disaster recovery logic are decoupled. The specific routing logic is determined by the disaster recovery switching rules.
  • Theoretically, disaster recovery switching rules can be pushed to hundreds of thousands of machines and take effect in seconds.
Thank you! We've received your feedback.