For scheduled PolarDB O&M events, such as database software upgrades and hardware maintenance, you are notified by text message, voice call, email, or internal message, and in the console. You can view the details of each scheduled event, including the event type, task ID, cluster name, and switchover time. You can also manually change the switchover time.
Usage notes
Events fall into the following levels based on urgency:
[S0: Urgent] Risk fixing: Events at this level are unexpected events that need to be fixed at the earliest opportunity to prevent faults in most cases, such as urgent replacement, upgrades, or updates of faulty versions, host exception fixes, and SSL certificate upgrades before expiration. Event notifications may be sent three days or more in advance and the window for changing the scheduled switchover time is short.
[S1: Scheduled] System maintenance: Events at this level are resolution of low-risk issues or scheduled upgrades of software and hardware in most cases. Event notifications are sent more than three days in advance and you can cancel the events.
To ensure that you can receive notifications of scheduled O&M events, select the notification methods and configure the contacts for ApsaraDB Fault or Maintenance Notifications in the Message Center console. We recommend that you specify database O&M personnel as the contacts. The notification methods include Email and Internal Messages. We recommend that you select Email to improve the success rate of notifications.
Figure 1 Entry for Message Settings in the Message Center console
Figure 2 Notification settings for ApsaraDB Fault or Maintenance Notifications
If you want to be informed of O&M events at the earliest opportunity or want to customize event-driven O&M automation, you can use CloudMonitor to configure system event subscriptions. Then, cloud database services push CloudMonitor system events related to the lifecycle of O&M events, such as subscription, start, end, and cancellation. For more information, see Manage event subscription policies (recommended). For information about CloudMonitor system events to which you can subscribe, see the "Appendix 1 CloudMonitor-related system events" section of this topic.
Sample CloudMonitor event:
{
  "eventId": "c864b30b-7f69-5f04-b0e7-8dfb0eabcfd9", // The event ID. The same event has the same ID.
  "product": "RDS", // The service code.
  "reason": "Host software/hardware upgrade", // The cause of the event.
  "extra": {
    "impactEn": "Transient instance disconnection", // The impact of the event.
    "eventCode": "rds_apsaradb_transfer", // The code of the type of the O&M event.
    "eventNameEn": "Instance migration", // The name of the O&M event.
    "switchTime": "2024-09-15T01:30:00+08:00", // The scheduled switchover time, which is the time when a transient connection occurs on the instance if a switchover is performed.
    "startTime": "2024-09-14T21:30:00+08:00", // The scheduled start time of the event, which is the time when the event enters the scheduling queue and waits to be executed.
    "cancelCode": "OutOfGoodPerfBySoftHardwareUpgrade", // The cancellation risk code. For more information, see the "Appendix 2 Detailed cause codes and cancellation risks" section of this topic.
    "detailCode": "HostSoftHardwareUpgrade", // The detailed cause code. For more information, see the "Appendix 2 Detailed cause codes and cancellation risks" section of this topic.
    "instanceInfo": ""
  },
  "instanceId": "rm-2ze9d66o65q1g02g6", // The instance ID.
  "eventType": "Maintenance",
  "instanceComment": "rm-2ze9d66o65q1g02g6", // The alias of the instance.
  "instanceType": "Instance",
  "publishTime": "2024-09-10T16:01:47+08:00"
}
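For event-driven automation, a subscriber might parse the pushed event and work out how much preparation time separates the start time from the switchover time. The following is a minimal sketch, assuming the event arrives as a comment-free JSON string with the field names shown in the sample. The `summarize_event` helper is hypothetical and not part of any SDK.

```python
import json
from datetime import datetime

# Trimmed copy of the sample event, without the explanatory comments
# (strict JSON does not allow // comments).
SAMPLE_EVENT = """
{
  "eventId": "c864b30b-7f69-5f04-b0e7-8dfb0eabcfd9",
  "product": "RDS",
  "extra": {
    "eventCode": "rds_apsaradb_transfer",
    "switchTime": "2024-09-15T01:30:00+08:00",
    "startTime": "2024-09-14T21:30:00+08:00",
    "cancelCode": "OutOfGoodPerfBySoftHardwareUpgrade"
  },
  "instanceId": "rm-2ze9d66o65q1g02g6",
  "eventType": "Maintenance"
}
"""


def summarize_event(raw: str) -> dict:
    """Hypothetical helper: extract the scheduling fields from one event."""
    event = json.loads(raw)
    extra = event["extra"]
    # The timestamps carry a UTC offset, so the parsed values are timezone-aware.
    start = datetime.fromisoformat(extra["startTime"])
    switch = datetime.fromisoformat(extra["switchTime"])
    return {
        "instance_id": event["instanceId"],
        "event_code": extra["eventCode"],
        "start_time": start,
        "switch_time": switch,
        # Gap between entering the scheduling queue and the switchover.
        "prep_minutes": int((switch - start).total_seconds() // 60),
    }


summary = summarize_event(SAMPLE_EVENT)
print(summary["instance_id"], summary["prep_minutes"])
```

A handler built this way could, for example, alert the on-call engineer shortly before `switch_time`.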
Procedure
Log on to the console of the database service of the instance or cluster that you want to manage.
In the left-side navigation pane, choose . In the top navigation bar, select the region in which the instance or cluster resides.
On the Scheduled Events page, view the information about events. By default, events in the Planned state are displayed. You can click the Completed and Canceled tabs to switch between historical completed and canceled events. The following table describes the event attributes.
Attribute
Example
Description
Level
Risk fixing
Events fall into the risk fixing and system maintenance levels based on urgency.
Status
Pending
The scheduling status of the event. Take note of the following statuses:
Waiting Setting Time: The execution time of the event is not set, and you must configure it based on your business requirements. If you do not configure it by the specified deadline, the system automatically cancels the event instead of executing it.
Pending: The event waits until the scheduled start time is reached.
Executing: The event is being executed as scheduled. In this case, you cannot perform manual intervention. To terminate the event in an urgent manner, submit a ticket. Unknown risks may occur if non-standard operations are performed.
Successful: The event is successfully executed.
Canceled: The execution of the event fails or is canceled. The following list describes common cancellation causes:
User cancellation (UserCancel): The execution of the event is canceled in the console or by calling API operations.
User response timeout (UserResponseTimeout): The event is automatically canceled because the time settings of the event are not configured by the deadline.
Cancellation for database management (SupervisorCancel): The event initiator cancels the execution of the event for database management.
On-demand avoidance cancellation (AvoidCancel): The event no longer needs to be executed because the risk is mitigated or the current status of the instance or cluster no longer requires it. For example, no update is required if the instance or cluster already runs the latest version.
Automatic cancellation by the system (AutoCancel): The execution of the event is canceled because the system determines that the instance or cluster does not meet the conditions for execution during regular checks on scheduled events. For example, the current status of the instance or cluster is abnormal and action commands cannot be issued.
Execution timeout (ExecuteTimeout): The event enters the execution queue but the execution is not complete within the expected time.
Execution failure (ExecuteFail): The event fails during execution due to an unknown exception.
Event type
Minor version update
The type of the event. For more information, see the "Event types and impacts" section of this topic.
Cause
-
The cause of the event. For more information, see the "Appendix 2 Detailed cause codes and cancellation risks" section of this topic.
Business impact
Transient connections
The business impact of the event. Different types of events have different impacts on your business. For more information, see the "Event types and impacts" section of this topic.
O&M suggestions
Make sure that your applications are configured to automatically reconnect to your instance or cluster and pay attention to the impacts on your business.
The O&M suggestions for the event. The O&M suggestions vary based on events. For more information, see the "Appendix 1 CloudMonitor-related system events" section of this topic.
Start time
-
The scheduled start time of the event, which is the time when the event enters the scheduling queue. Before the start time, the event does not affect the instance or cluster. After the start time, you can still access the instance or cluster. However, you cannot perform instance-level or cluster-level operations, such as changing instance or cluster configurations and migrating the instance or cluster across zones. This attribute is empty if the event is in the Waiting Setting Time state.
Scheduled switchover time
-
The scheduled switchover time, which is the time when a transient connection occurs on the instance or cluster if a primary/secondary switchover or link switchover is performed. The time is an estimated value. Switchovers are expected to occur around the time. In extreme cases such as switching back to the original zone, two switchovers may occur.
Note: In most cases, steps such as event scheduling and data preparation take time before the switchover, so the start time is earlier than the switchover time. The time difference may vary based on database services.
Deadline
-
The latest time by which you can configure the time settings for execution of the event. The switchover time that you want to use cannot be later than this time.
Cancelable
Yes
To block this event, you can cancel it. In most cases, this feature is available for system O&M events.
Important: In most cases, scheduled events are issued by the cloud database management system during regular inspections. If you cancel an event once, a new event may be issued during the next inspection cycle. Frequent cancellations may result in increased risks. We recommend that you select an appropriate time to execute an event based on your business conditions rather than canceling the event. For information about the cancellation risks, see the "Appendix 2 Detailed cause codes and cancellation risks" section of this topic.
Schedule changeable
Yes
In most cases, you can change the execution time of events. In a few scenarios where the window for urgently fixing high risks is short, you cannot change the execution time of events.
(Optional) Reschedule events.
Select the events whose execution time you want to change and click Schedule Event. On the page that appears, configure one of the following settings:
Immediate execution: specifies the current time as the start time of the events. Then, the events enter the execution queue and are immediately executed.
Switchover at a specified time: allows you to select an appropriate switchover time based on the configurable switchover time range. The start time is automatically calculated based on the switchover time. The new start time cannot be earlier than the current time. Otherwise, the switchover time cannot be changed.
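The constraints on a specified switchover time can be checked before you submit a change. The sketch below assumes that the start time equals the switchover time minus a fixed preparation period. The actual calculation is performed by the service, and the `prep` value and the `validate_switch_time` function are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def validate_switch_time(
    switch_time: datetime,
    now: datetime,
    deadline: datetime,
    prep: timedelta,
) -> tuple[bool, str]:
    """Hypothetical check of a requested switchover time.

    prep is an assumed preparation period between start and switchover.
    """
    start_time = switch_time - prep
    if start_time < now:
        return False, "start time would be earlier than the current time"
    if switch_time > deadline:
        return False, "switchover time is later than the deadline"
    return True, "ok"


tz = timezone(timedelta(hours=8))
now = datetime(2024, 9, 10, 16, 0, tzinfo=tz)
deadline = datetime(2024, 9, 20, 0, 0, tzinfo=tz)
prep = timedelta(hours=4)

print(validate_switch_time(datetime(2024, 9, 15, 1, 30, tzinfo=tz), now, deadline, prep))
print(validate_switch_time(datetime(2024, 9, 10, 18, 0, tzinfo=tz), now, deadline, prep))
```

The second call is rejected because a switchover two hours from now leaves less than the assumed four hours of preparation.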
(Optional) Change the recurring time window settings.
Click Recurring Time Window Settings in the upper right corner of the event list.
In most cases, the execution time of a scheduled event of an instance or cluster is automatically calculated based on the maintenance window of the instance or cluster. For information about how to configure the maintenance window for an ApsaraDB RDS instance, a Tair (Redis OSS-compatible) instance, an ApsaraDB for MongoDB instance, and a PolarDB cluster, see Configure a maintenance window, Configure a maintenance window, Specify a maintenance window, and Set a maintenance window, respectively. You can also specify a custom recurring time window based on your O&M requirements. If the system initiates an event, the execution time of the event is preferentially calculated based on the specified time window.
You can set the recurring time window by month or week. For example, if you set the recurring time window to 02:00 to 03:00 on Monday and Tuesday every week and the time window for a scheduled event to this Tuesday through next Sunday, the range for the switchover time of the event includes 02:00 to 03:00 on this Tuesday and 02:00 to 03:00 on next Monday. In most cases, the switchover is preferentially performed on this Tuesday.
Important: This configuration applies only to new events. If you want to change the execution time of an existing event, click Configure Execution Time.
This configuration helps calculate the execution time of events only at the system maintenance level. The actual execution time is subject to the time displayed in the event list.
This configuration is an account-level configuration. The configuration takes effect on all database services that support Recurring Time Window Settings.
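How a weekly recurring time window narrows the possible switchover times can be illustrated with a short calculation. This is a sketch under stated assumptions: the dates are made up, the event time window is taken as an inclusive range of days, and `candidate_windows` is a hypothetical helper rather than the scheduling logic used by the service.

```python
from datetime import date, datetime, time, timedelta


def candidate_windows(
    first_day: date,
    last_day: date,
    weekdays: set[int],
    window_start: time,
    window_end: time,
) -> list[tuple[datetime, datetime]]:
    """Return the recurring windows that fall inside [first_day, last_day].

    weekdays uses Python's convention: Monday is 0 and Sunday is 6.
    """
    windows = []
    day = first_day
    while day <= last_day:
        if day.weekday() in weekdays:
            windows.append(
                (datetime.combine(day, window_start), datetime.combine(day, window_end))
            )
        day += timedelta(days=1)
    return windows


# Recurring window: 02:00 to 03:00 on Monday and Tuesday every week.
# Assumed event time window: Tuesday 2024-09-10 through Monday 2024-09-16.
windows = candidate_windows(
    date(2024, 9, 10), date(2024, 9, 16), {0, 1}, time(2, 0), time(3, 0)
)
for start, end in windows:
    print(start.isoformat(), "to", end.isoformat())
```

With these inputs, the candidates are the Tuesday at the start of the range and the Monday at its end. The earliest candidate corresponds to the slot that is preferred in most cases.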
(Optional) Cancel scheduled events.
Select the events that you want to cancel and click Cancel Scheduled Event. On the page that appears, read and confirm the cancellation risks, and then click Confirm.
Event types and impacts
Event Type | Impact Type | Impact Description |
Cluster Migration. Note: Planned O&M operations initiated due to host risks, hardware warranty expiration, or operating system upgrades. The system migrates the cluster to a new server node. This includes non-high availability clusters and read-only clusters. | Cluster Transient Connection | The impacts occur after the scheduled switchover time starts. Note: Pending events usually result in cluster switchover operations, which are executed during the cluster maintenance window after the scheduled switchover time. |
Primary/Secondary Switchover. Note: Planned O&M operations initiated due to host risks, hardware warranty expiration, or operating system upgrades. The system initiates a primary/secondary node switchover. This applies only to high availability clusters. | Cluster Transient Connection | Same as for Cluster Migration. |
Cluster Parameter Adjustment. Note: Planned O&M operations initiated due to known parameter risks. The system modifies parameters of the cluster. If the modified parameters include parameters that require a restart, the cluster is restarted. | Cluster Transient Connection | Same as for Cluster Migration. |
Host Risk Fix. Note: Fixes fault risks on the host to which the cluster belongs. | Cluster Transient Connection | Same as for Cluster Migration. |
SSL Certificate Renewal. Note: To keep the cluster secure and stable, this operation is initiated when the SSL certificate of the cluster is about to expire. | Cluster Transient Connection | Same as for Cluster Migration. |
Backup Mode Upgrade. Note: To provide faster backup and recovery, the backup mode of the cluster is switched from logical backup to physical database table backup. | Cluster Transient Connection | Same as for Cluster Migration. |
Zone Migration. Note: Upgrade and technical transformation of physical infrastructure in some existing regions and zones. | Cluster Transient Connection | Same as for Cluster Migration. |
Minor Version Upgrade. Note: To improve user experience, cloud databases release minor versions of clusters from time to time to add features or fix known issues. | Cluster Transient Connection | The impacts occur after the scheduled switchover time starts. Note: Pending events usually result in cluster switchover operations, which are executed during the cluster maintenance window after the scheduled switchover time. |
Minor Version Upgrade | Differences Between Minor Engine Versions | Different minor engine versions contain different updates. Make sure that you understand the differences between the current minor engine version and the version to which you plan to update. For more information, see the relevant release notes. |
Proxy Minor Version Upgrade. Note: To improve user experience, cloud databases release minor versions of proxy nodes from time to time to add proxy features or fix known issues. | Cluster Transient Connection | The impacts occur after the scheduled switchover time starts. Note: Pending events usually result in cluster switchover operations, which are executed during the cluster maintenance window after the scheduled switchover time. |
Proxy Minor Version Upgrade | Differences Between Minor Versions | Different minor versions contain different updates. Make sure that you understand the differences between the current minor version and the version to which you plan to update. For more information, see the relevant release notes. |
Network Upgrade. Note: Network hardware is upgraded to improve the performance and stability of the cluster network. | Cluster Transient Connection | The impacts occur after the scheduled switchover time starts. Note: Pending events usually result in cluster switchover operations, which are executed during the cluster maintenance window after the scheduled switchover time. |
Network Upgrade | Change of Virtual IP Addresses (VIPs) | During some network upgrades that involve cross-zone migration, the VIP of the cluster may change. This can interrupt connections if the client uses the VIP to connect to the cloud database. Note: To prevent any impact, use the domain name format of the connection address provided by the cluster and make sure that the DNS cache is disabled on both the application and its host server. |
Storage Gateway Upgrade. Note: Upgrading the storage gateway can improve the storage performance and stability of the cluster. | I/O Jitter | You may experience temporary I/O jitter and increased SQL latency. These effects are not expected to last longer than 3 seconds. |
Activate Seamless Migration Feature. Note: Seamless migration is activated to improve user experience. | Parameter Adjustment | No impact. Note: This adjustment does not require a restart and does not disrupt your ongoing operations. |
Proxy Migration. Note: To improve the stability of the proxy node, the host on which the proxy resides is upgraded or maintained. | Proxy Node Migration | During proxy node migration, the cluster endpoint and custom endpoints may experience a brief interruption that does not exceed 10 seconds. |
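Because most of these event types cause a transient connection, applications benefit from reconnecting automatically. The following is a generic retry sketch with exponential backoff. It is not tied to any particular database driver: `connect` and `operation` stand in for your own connection factory and query function, and the retry count and delays are assumed values that you would tune for your workload.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_reconnect(
    connect: Callable[[], object],
    operation: Callable[[object], T],
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation on a fresh connection, retrying on ConnectionError.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    last_error: Exception | None = None
    for attempt in range(retries + 1):
        try:
            conn = connect()  # open a new connection on every attempt
            return operation(conn)
        except ConnectionError as exc:
            last_error = exc
            if attempt < retries:
                sleep(base_delay * (2 ** attempt))
    assert last_error is not None
    raise last_error
```

Connecting through the domain name of the endpoint on each attempt, rather than through a cached IP address, also lets the retry pick up a changed VIP.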
FAQ
FAQ about notifications
FAQ about the start time and switchover time
FAQ about event operations
FAQ about other issues
API references
API | Description |
DescribePendingMaintenanceActions | Retrieves the number of pending maintenance actions for various task types. |
ModifyPendingMaintenanceAction | Modifies the rescheduling time for a pending maintenance action. |
DescribePendingMaintenanceAction | Provides details on scheduled maintenance actions. |