Well-Architected Framework: Failure Emergency Management

Last Updated: Sep 25, 2023

The failure management system is a set of control processes applied throughout the life cycle of a fault. It covers five areas: basic fault data management (fault level definitions, monitoring coverage of emergency scenarios, service group and duty roster management, and fault subscription management); fault discovery (24/7 monitoring duty and intelligent baseline alarms); fault emergency coordination (fault notifications and updates, and a fault emergency coordination group); fault recovery (recommendations for root cause and rapid recovery); and fault review (fault review specifications and fault data operations).
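As a rough illustration of the life cycle described above, the following Python sketch models a fault moving through the discovery, emergency coordination, recovery, and review phases in order. The class names, the `level` field, and the `advance` method are hypothetical and are not part of any product API. Basic fault data management is treated here as supporting configuration (for example, the source of the fault level) rather than as a phase, which is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class FaultPhase(Enum):
    """Life cycle phases named in the text, in order (hypothetical model)."""
    DISCOVERY = 1
    EMERGENCY_COORDINATION = 2
    RECOVERY = 3
    REVIEW = 4


@dataclass
class Fault:
    title: str
    level: str  # hypothetical label such as "P1", taken from fault level definitions
    phase: FaultPhase = FaultPhase.DISCOVERY
    history: list = field(default_factory=list)

    def advance(self) -> FaultPhase:
        """Move to the next phase; raise if the fault is already in review."""
        if self.phase is FaultPhase.REVIEW:
            raise ValueError("fault is already in the final phase")
        self.history.append(self.phase)
        self.phase = FaultPhase(self.phase.value + 1)
        return self.phase
```

For example, a fault created with `Fault(title="checkout latency", level="P1")` starts in `FaultPhase.DISCOVERY`, and each call to `advance()` moves it one phase forward while recording the phases it has passed through.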