
SAP System High Availability Maintenance Guide

Last Updated: May 16, 2020


Release history

Version: 1.0
Revision date: 2019-04-15

Overview

This topic describes O&M scenarios and countermeasures for SAP applications or ECS instances for SAP HANA deployed based on SUSE Linux Enterprise High Availability Extension 12 (SUSE HAE 12). O&M scenarios include upgrading or downgrading the configuration of ECS instances, upgrading SAP applications or databases, regular maintenance for primary or secondary nodes, and node failover.

In SAP systems managed by SUSE HAE, to perform maintenance on a cluster node, you need to stop the resources on the node and migrate them to another node, or stop or restart the node. If the node controls cluster resources, you also need to transfer control of those resources to another node.

The following example describes the maintenance operations for SAP HANA HA instances. You can perform similar operations to maintain ABAP SAP Central Services (ASCS) HA instances and SAP HA databases.

For more information about SUSE HAE operations, see the following manual:

For more information about SAP HANA system replication (SR) configuration, see the following manual:

Scenarios

The following figure shows the architecture of SUSE HAE.


Pacemaker provides multiple configuration items for various maintenance scenarios:

Set the cluster to the maintenance mode
You can set the global property maintenance-mode to true to switch all resources in the cluster to the maintenance mode. In this case, the cluster stops monitoring these resources.

Set a node to the maintenance mode
You can set all resources on a specified node to the maintenance mode at once. In this case, the cluster stops monitoring these resources.

Set a node to the standby mode
After a node is set to the standby mode, resources cannot run on it. In this case, all resources on this node need to be migrated to another node. If no other node is available, these resources must be stopped. At the same time, the cluster stops monitoring operations for this node, except for the operations for which the role parameter is set to Stopped.
With this feature, you can stop a node but keep its resources running on another node.

Set a resource to the maintenance mode
You can set a resource to the maintenance mode. In this case, the cluster stops monitoring this resource. With this feature, you can stop the cluster from monitoring a resource while you adjust the services managed by this resource.

Set a resource to the unmanaged mode
You can set a resource to the unmanaged mode by setting the is-managed parameter to false. In this case, the cluster stops managing this resource. This means you can adjust the services managed by this resource, but the cluster keeps monitoring the resource and sends alerts if any issues occur. If you want the cluster to stop monitoring a resource while you adjust the services managed by it, set the resource to the maintenance mode instead.

1. Exception handling for the primary node

When an exception occurs on the primary node, SUSE HAE triggers a primary/secondary switchover by promoting the secondary node Node B to be the primary node. However, the former primary node Node A is still registered as the primary node. Therefore, after Node A is recovered, you need to configure SAP HANA SR, register Node A as the secondary node, and then start Pacemaker.

In the following example, the primary node is saphana-01 while the secondary node is saphana-02.

1.1 View the normal status of SUSE HAE.

Log on to a node. Run the crm status command to view the normal status of SUSE HAE.

  # crm status
  Stack: corosync
  Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 14:33:22 2019
  Last change: Mon Apr 15 14:33:19 2019 by root via crm_attribute on saphana-01
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-01
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-01
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-01 ]
      Slaves: [ saphana-02 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]
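If you check this status from a monitoring script rather than by eye, the key line is the one starting with "Online:". The following sketch uses a hypothetical helper, check_online (not part of SUSE HAE or crmsh), that reads a crm status transcript on stdin and verifies that every expected node is listed there:

```shell
# Hypothetical helper: confirm that all expected nodes appear on the
# "Online:" line of a `crm status` transcript read from stdin.
# Real usage would be: crm status | check_online saphana-01 saphana-02
check_online() {
  local line node
  line=$(grep '^Online:' | head -n 1)
  for node in "$@"; do
    case "$line" in
      *" $node "*) ;;                       # node is on the Online line
      *) echo "NOT ONLINE: $node"; return 1 ;;
    esac
  done
  echo "ALL ONLINE"
}

# Sample line taken from the status output above
echo 'Online: [ saphana-01 saphana-02 ]' | check_online saphana-01 saphana-02
```

This only inspects the first "Online:" line, which is sufficient for a two-node cluster; nodes listed under "OFFLINE:" or "standby" are reported as not online.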

1.2 When an exception occurs on the primary node, SUSE HAE promotes the secondary node to be the primary node.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 14:40:43 2019
  Last change: Mon Apr 15 14:40:41 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-02 ]
  OFFLINE: [ saphana-01 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Stopped: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-02 ]
      Stopped: [ saphana-01 ]

1.3 After the former primary node is recovered, configure SAP HANA SR and register the node as the secondary node.

You must properly configure SAP HANA SR and the primary and secondary nodes. Incorrect configuration may cause data to be overwritten or lost.

Log on to the former primary node as the SAP HANA instance user and configure SAP HANA SR.

  h01adm@saphana-01:/usr/sap/H01/HDB00> hdbnsutil -sr_register --remoteHost=saphana-02 --remoteInstance=00 --replicationMode=syncmem --name=saphana-01 --operationMode=logreplay
  adding site ...
  checking for inactive nameserver ...
  nameserver saphana-01:30001 not responding.
  collecting information ...
  updating local ini files ...
  done.

1.4 Check the status of the STONITH block device (SBD).

Ensure that the status of each SBD slot is clear. If a slot is still in the reset state, send a clear message to it.

  # sbd -d /dev/vdc list
  0 saphana-01 reset saphana-02
  1 saphana-02 reset saphana-01
  # sbd -d /dev/vdc message saphana-01 clear
  # sbd -d /dev/vdc message saphana-02 clear
  # sbd -d /dev/vdc list
  0 saphana-01 clear saphana-01
  1 saphana-02 clear saphana-01
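The `sbd ... list` output has one line per slot: slot number, node name, state, and the node that wrote the last message. A small sketch of how you might find slots that still need clearing (the filter function list_uncleared is hypothetical, not an sbd subcommand):

```shell
# Hypothetical filter: print the node name of every SBD slot whose
# state is not "clear". With a real device you would feed it:
#   sbd -d /dev/vdc list | list_uncleared
list_uncleared() {
  # stdin lines: "<slot> <node> <state> [<writer>]"
  awk '$3 != "clear" { print $2 }'
}

# Sample taken from the listing above, before the slots were cleared
list_uncleared <<'EOF'
0 saphana-01 reset saphana-02
1 saphana-02 reset saphana-01
EOF
```

Each printed node would then get an explicit `sbd -d /dev/vdc message <node> clear`.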

1.5 Start Pacemaker. SUSE HAE will automatically start SAP HANA.

  # systemctl start pacemaker

The former secondary node saphana-02 is now the primary node, and the recovered saphana-01 runs as the secondary node. The SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:10:58 2019
  Last change: Mon Apr 15 15:09:56 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

1.6 Check the status of SAP HANA SR.

1.6.1 Run the built-in Python script of SAP HANA to check the status of SAP HANA SR.

Log on to the current primary node as the SAP HANA instance user to check the status of SAP HANA SR. Ensure that the replication status for every process is ACTIVE.

  saphana-02:~ # su - h01adm
  h01adm@saphana-02:/usr/sap/H01/HDB00> cdpy
  h01adm@saphana-02:/usr/sap/H01/HDB00/exe/python_support> python systemReplicationStatus.py
  | Database | Host | Port | Service Name | Volume ID | Site ID | Site Name | Secondary | Secondary | Secondary | Secondary | Secondary | Replication | Replication | Replication |
  | | | | | | | | Host | Port | Site ID | Site Name | Active Status | Mode | Status | Status Details |
  | -------- | ---------- | ----- | ------------ | --------- | ------- | ---------- | ---------- | --------- | --------- | ---------- | ------------- | ----------- | ----------- | -------------- |
  | SYSTEMDB | saphana-02 | 30001 | nameserver | 1 | 2 | saphana-02 | saphana-01 | 30001 | 1 | saphana-01 | YES | SYNCMEM | ACTIVE | |
  | H01 | saphana-02 | 30007 | xsengine | 3 | 2 | saphana-02 | saphana-01 | 30007 | 1 | saphana-01 | YES | SYNCMEM | ACTIVE | |
  | H01 | saphana-02 | 30003 | indexserver | 2 | 2 | saphana-02 | saphana-01 | 30003 | 1 | saphana-01 | YES | SYNCMEM | ACTIVE | |
  status system replication site "1": ACTIVE
  overall system replication status: ACTIVE
  Local System Replication State
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  mode: PRIMARY
  site id: 2
  site name: saphana-02
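For scripted checks, the simplest signal in this output is the summary line near the end. The sketch below wraps a grep for it in a hypothetical function, replication_active; note that systemReplicationStatus.py also encodes the state in its exit code (15 is commonly documented as "active"), but grepping the summary line is version-agnostic:

```shell
# Hypothetical wrapper: decide whether system replication is healthy
# from systemReplicationStatus.py output read on stdin. Real usage:
#   python systemReplicationStatus.py | replication_active
replication_active() {
  grep -q "^overall system replication status: ACTIVE"
}

# Sample summary line from the output above
if echo "overall system replication status: ACTIVE" | replication_active; then
  echo "replication healthy"
else
  echo "replication degraded"
fi
```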

1.6.2 Use the SAPHanaSR tool provided by SUSE to check the replication status. Ensure that the value of sync_state for the secondary node is SOK.

  saphana-02:~ # SAPHanaSR-showAttr
  Global cib-time
  --------------------------------
  global Mon Apr 15 15:17:12 2019
  Hosts clone_state lpa_h01_lpt node_state op_mode remoteHost roles site srmode standby sync_state version vhost
  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  saphana-01 DEMOTED 30 online logreplay saphana-02 4:S:master1:master:worker:master saphana-01 syncmem SOK 2.00.020.00.1500920972 saphana-01
  saphana-02 PROMOTED 1555312632 online logreplay saphana-01 4:P:master1:master:worker:master saphana-02 syncmem off PRIM 2.00.020.00.1500920972 saphana-02
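In this table, the primary shows sync_state PRIM and a healthy secondary shows SOK. A minimal sketch of an automated check (the function sync_ok is hypothetical, and the sample lines below are trimmed with literal "..." placeholders):

```shell
# Hypothetical check on SAPHanaSR-showAttr host lines read from stdin:
# print each host whose sync_state column reads SOK.
sync_ok() {
  awk '/ SOK / { print $1 }'
}

# Host lines trimmed from the output above ("..." marks omitted columns)
sync_ok <<'EOF'
saphana-01 DEMOTED 30 online logreplay saphana-02 ... syncmem SOK ...
saphana-02 PROMOTED 1555312632 online logreplay saphana-01 ... syncmem off PRIM ...
EOF
```

An empty result means no node is currently in the SOK state, i.e. the secondary is not in sync.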

1.7 (Optional) Clean up the failcount.

A resource will be automatically restarted if it fails, but each failure raises the failcount of the resource. If a migration threshold has been set for that resource, the node will no longer be allowed to run the resource when the number of failures has reached the migration threshold. In this case, you need to clean up the failcount manually.

You can use the following syntax to clean up the failcount:

  # crm resource cleanup [resource name] [node]

After the rsc_SAPHana_HDB resource of saphana-01 is recovered, you need to run the following command to clean up the failcount:

  # crm resource cleanup rsc_SAPHana_HDB saphana-01
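The migration-threshold rule described above reduces to a simple comparison: once the failcount reaches the threshold, the node is banned from running the resource until the count is cleaned up. An illustrative sketch (the helper needs_cleanup is hypothetical; the real counters come from the cluster, e.g. via crm_failcount):

```shell
# Illustrative sketch of the migration-threshold rule: a node may no
# longer run a resource once failcount >= migration-threshold.
needs_cleanup() {
  # $1 = current failcount, $2 = migration-threshold
  [ "$1" -ge "$2" ]
}

# Example: failcount 3 against a threshold of 3
if needs_cleanup 3 3; then
  # In a real cluster you would now run, for example:
  #   crm resource cleanup rsc_SAPHana_HDB saphana-01
  echo "threshold reached: clean up the failcount"
fi
```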

2. Exception handling for the secondary node

If an exception occurs on the secondary node, the primary node is not affected. In this case, SUSE HAE does not trigger primary/secondary switchover. After you recover the secondary node and start Pacemaker, SUSE HAE will automatically start SAP HANA. The primary and secondary nodes remain unchanged. You do not need to change any configurations.

In the following example, the primary node is saphana-02 while the secondary node is saphana-01.

2.1 View the normal status of SUSE HAE.

Log on to a node. Run the crm status command to view the normal status of SUSE HAE.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:34:52 2019
  Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

2.2 After the secondary node is recovered, check the SBD status, and then start Pacemaker.

  # systemctl start pacemaker

The primary and secondary nodes remain unchanged. The SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:43:28 2019
  Last change: Mon Apr 15 15:43:25 2019 by root via crm_attribute on saphana-01
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

2.3 Check the status of SAP HANA SR.

2.4 (Optional) Clean up the failcount.

3. Shutdown maintenance for primary and secondary nodes

Set the cluster to the maintenance mode. Stop the secondary node and then the primary node.

In the following example, the primary node is saphana-02 while the secondary node is saphana-01.

3.1 View the normal status of SUSE HAE.

Log on to a node. Run the crm status command to view the normal status of SUSE HAE.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:34:52 2019
  Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

3.2 Set the cluster and primary/secondary resource group to the maintenance mode.

Log on to the primary node, and set the cluster to the maintenance mode.

  # crm configure property maintenance-mode=true

Set the primary/secondary resource group to the maintenance mode. In this example, the master/slave resource is rsc_SAPHana_HDB and the clone resource is rsc_SAPHanaTopology_HDB.

  # crm resource maintenance rsc_SAPHana_HDB true
  Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
  Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=true
  # crm resource maintenance rsc_SAPHanaTopology_HDB true
  Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
  Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=true

3.3 The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 16:02:13 2019
  Last change: Mon Apr 15 16:02:11 2019 by root via crm_resource on saphana-02
  2 nodes configured
  6 resources configured
  *** Resource management is DISABLED ***
  The cluster will not attempt to start, stop or recover services
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02 (unmanaged)
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02 (unmanaged)
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB] (unmanaged)
      rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
      rsc_SAPHana_HDB (ocf::suse:SAPHana): Master saphana-02 (unmanaged)
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB] (unmanaged)
      rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
      rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-02 (unmanaged)
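Before stopping SAP HANA by hand, it is worth confirming that the cluster really has stopped managing resources; otherwise SUSE HAE may fence the node or restart the database. A minimal sketch, assuming a saved `crm status` transcript and a hypothetical helper in_maintenance, keys off the banner shown above:

```shell
# Hypothetical safety check: confirm the cluster is in maintenance mode
# by looking for the banner `crm status` prints. Real usage:
#   crm status | in_maintenance && echo "safe to stop SAP HANA manually"
in_maintenance() {
  grep -q "Resource management is DISABLED"
}

# Sample banner line from the status output above
if echo "*** Resource management is DISABLED ***" | in_maintenance; then
  echo "safe to stop SAP HANA manually"
fi
```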

3.4 Stop SAP HANA on the secondary node and then on the primary node. Stop the ECS instance for shutdown maintenance.

Log on to the secondary node as the SAP HANA instance user and stop SAP HANA. Then, log on to the primary node as the SAP HANA instance user and stop SAP HANA.

  saphana-01:~ # su - h01adm
  h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
  hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
  Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
  15.04.2019 16:46:42
  Stop
  OK
  Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
  15.04.2019 16:46:54
  WaitforStopped
  OK
  hdbdaemon is stopped.
  saphana-02:~ # su - h01adm
  h01adm@saphana-02:/usr/sap/H01/HDB00> HDB stop
  hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
  Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
  15.04.2019 16:47:05
  Stop
  OK
  Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
  15.04.2019 16:47:35
  WaitforStopped
  OK
  hdbdaemon is stopped.

3.5 Start SAP HANA on the primary node and then on the secondary node. Recover the cluster and primary/secondary resource group from the maintenance mode.

Log on to the primary node and start Pacemaker. Then, log on to the secondary node and start Pacemaker.

  # systemctl start pacemaker

Recover the cluster and primary/secondary resource group from the maintenance mode.

  saphana-02:~ # crm configure property maintenance-mode=false
  saphana-02:~ # crm resource maintenance rsc_SAPHana_HDB false
  Performing update of 'maintenance' on 'msl_SAPHana_HDB', the parent of 'rsc_SAPHana_HDB'
  Set 'msl_SAPHana_HDB' option: id=msl_SAPHana_HDB-meta_attributes-maintenance name=maintenance=false
  saphana-02:~ # crm resource maintenance rsc_SAPHanaTopology_HDB false
  Performing update of 'maintenance' on 'cln_SAPHanaTopology_HDB', the parent of 'rsc_SAPHanaTopology_HDB'
  Set 'cln_SAPHanaTopology_HDB' option: id=cln_SAPHanaTopology_HDB-meta_attributes-maintenance name=maintenance=false

SUSE HAE automatically starts SAP HANA on the primary and secondary nodes. The primary and secondary nodes remain unchanged.

3.6 The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 16:56:49 2019
  Last change: Mon Apr 15 16:56:43 2019 by root via crm_attribute on saphana-01
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-01
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

3.7 Check the status of SAP HANA SR.

3.8 (Optional) Clean up the failcount.

4. Shutdown maintenance for the primary node

Set the secondary node to the maintenance mode and then to the standby mode before you stop the primary node for maintenance.

In the following example, the primary node is saphana-02 while the secondary node is saphana-01.

4.1 View the normal status of SUSE HAE.

Log on to a node. Run the crm status command to view the normal status of SUSE HAE.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:34:52 2019
  Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

4.2 Set the secondary node to the maintenance mode and then to the standby mode.

In this example, the secondary node is saphana-01. First, set saphana-01 to the maintenance mode.

  # crm node maintenance saphana-01

Then, set saphana-01 to the standby mode.

  # crm node standby saphana-01

The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-01 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 17:07:56 2019
  Last change: Mon Apr 15 17:07:38 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Node saphana-01: standby
  Online: [ saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-01 (unmanaged)
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
      Masters: [ saphana-02 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
      Started: [ saphana-02 ]

4.3 Stop SAP HANA on the primary node. Then, stop the ECS instance for shutdown maintenance.

Log on to the primary node as the SAP HANA instance user and stop SAP HANA.

  saphana-02:~ # su - h01adm
  h01adm@saphana-02:/usr/sap/H01/HDB00> HDB stop
  hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
  Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
  15.04.2019 16:47:05
  Stop
  OK
  Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
  15.04.2019 16:47:35
  WaitforStopped
  OK
  hdbdaemon is stopped.

4.4 Start SAP HANA on the primary node. Recover the secondary node to the normal status.

Log on to the primary node and start Pacemaker.

  # systemctl start pacemaker

If the rsc_sbd resource is not on the primary node, you need to migrate the resource to the primary node.

In this example, the primary node is saphana-02, but the rsc_sbd resource is on saphana-01. Therefore, you need to migrate the resource to saphana-02.

  rsc_sbd (stonith:external/sbd): Started saphana-01
  # crm resource migrate rsc_sbd saphana-02
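This check-then-migrate step can be scripted against the rsc_sbd line of a `crm status` transcript. A minimal sketch, using a hypothetical helper sbd_migrate_cmd that only prints the migrate command (it does not run it):

```shell
# Hypothetical helper: read the rsc_sbd line of a `crm status`
# transcript from stdin and print the migrate command if the resource
# is not running on the desired node.
sbd_migrate_cmd() {
  # $1 = desired node
  awk -v want="$1" '/rsc_sbd/ {
    node = $NF                       # last field is the hosting node
    if (node != want) print "crm resource migrate rsc_sbd " want
  }'
}

# Sample line from the status above: rsc_sbd runs on saphana-01
echo "rsc_sbd (stonith:external/sbd): Started saphana-01" | sbd_migrate_cmd saphana-02
```

Printing no output means the resource is already on the desired node and no migration is needed.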

The primary and secondary nodes remain unchanged.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 17:57:56 2019
  Last change: Mon Apr 15 17:57:22 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Node saphana-01: standby
  Online: [ saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
      Masters: [ saphana-02 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
      Started: [ saphana-02 ]

Recover the secondary node to the normal status.

  saphana-02:~ # crm node ready saphana-01
  saphana-02:~ # crm node online saphana-01

SUSE HAE starts SAP HANA on the secondary node. The primary and secondary nodes remain unchanged.

4.5 The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 18:02:33 2019
  Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

4.6 Check the status of SAP HANA SR.

4.7 (Optional) Clean up the failcount.

5. Shutdown maintenance for the secondary node

Set the secondary node to the maintenance mode.

In the following example, the primary node is saphana-02 while the secondary node is saphana-01.

5.1 View the normal status of SUSE HAE.

Log on to a node. Run the crm status command to view the normal status of SUSE HAE.

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 15:34:52 2019
  Last change: Mon Apr 15 15:33:50 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

5.2 Set the secondary node to the maintenance mode.

  # crm node maintenance saphana-01

The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 18:18:10 2019
  Last change: Mon Apr 15 18:17:49 2019 by root via crm_attribute on saphana-01
  2 nodes configured
  6 resources configured
  Node saphana-01: maintenance
  Online: [ saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      rsc_SAPHana_HDB (ocf::suse:SAPHana): Slave saphana-01 (unmanaged)
      Masters: [ saphana-02 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      rsc_SAPHanaTopology_HDB (ocf::suse:SAPHanaTopology): Started saphana-01 (unmanaged)
      Started: [ saphana-02 ]

5.3 Stop SAP HANA on the secondary node. Then, stop the ECS instance for shutdown maintenance.

Log on to the secondary node as the SAP HANA instance user and stop SAP HANA.

  saphana-01:~ # su - h01adm
  h01adm@saphana-01:/usr/sap/H01/HDB00> HDB stop
  hdbdaemon will wait maximal 300 seconds for NewDB services finishing.
  Stopping instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function Stop 400
  15.04.2019 16:47:05
  Stop
  OK
  Waiting for stopped instance using: /usr/sap/H01/SYS/exe/hdb/sapcontrol -prot NI_HTTP -nr 00 -function WaitforStopped 600 2
  15.04.2019 16:47:35
  WaitforStopped
  OK
  hdbdaemon is stopped.

5.4 Start SAP HANA on the secondary node. Recover the secondary node to the normal status.

Log on to the secondary node and start Pacemaker.

  # systemctl start pacemaker

Recover the secondary node to the normal status.

  saphana-02:~ # crm node ready saphana-01

SUSE HAE starts SAP HANA on the secondary node. The primary and secondary nodes remain unchanged.

The current SUSE HAE status is as follows:

  # crm status
  Stack: corosync
  Current DC: saphana-02 (version 1.1.16-4.8-77ea74d) - partition with quorum
  Last updated: Mon Apr 15 18:02:33 2019
  Last change: Mon Apr 15 18:01:31 2019 by root via crm_attribute on saphana-02
  2 nodes configured
  6 resources configured
  Online: [ saphana-01 saphana-02 ]
  Full list of resources:
  rsc_sbd (stonith:external/sbd): Started saphana-02
  rsc_vip (ocf::heartbeat:IPaddr2): Started saphana-02
  Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
      Masters: [ saphana-02 ]
      Slaves: [ saphana-01 ]
  Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
      Started: [ saphana-01 saphana-02 ]

5.5 Check the status of SAP HANA SR.

5.6 (Optional) Clean up the failcount.