Cluster resources for Pacemaker
There is a cluster built with Pacemaker/Corosync. The cluster consists of three nodes and has no cluster resources yet.
Let's try to set up some Dummy resources.
Before loading the crm file, the initial status is shown below (crm_mon -1 prints the status once and exits, -f shows fail counts, and -A shows node attributes).
[root@vm01 ~]# crm_mon -1fA
Stack: corosync
Current DC: vm03.localdomain (version 1.1.21-1.el7-f14e36f) - partition with quorum
Last updated: Wed Apr 29 17:10:15 2020
Last change: Wed Apr 29 16:58:02 2020 by hacluster via crmd on vm03.localdomain

3 nodes configured
0 resources configured

Online: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

No active resources

Node Attributes:
* Node vm01.localdomain:
* Node vm02.localdomain:
* Node vm03.localdomain:

Migration Summary:
* Node vm03.localdomain:
* Node vm02.localdomain:
* Node vm01.localdomain:
[root@vm01 ~]#
This is the crm configuration file. It disables STONITH, defines three Dummy primitives (resource1 and resource2 grouped as grp, and resource3 cloned as clnResource so it runs on every node), and constrains grp to prefer vm01 (score 300) over vm02 (200) over vm03 (100), to run together with the clone, and to order clnResource before grp.
[root@vm01 ~]# cat dummy.crm
### Cluster Option ###
property stonith-enabled="false"

### Resource Defaults ###
rsc_defaults resource-stickiness="INFINITY" \
    migration-threshold="1"

### Group Configuration ###
group grp \
    resource1 \
    resource2

### Clone Configuration ###
clone clnResource \
    resource3

### Primitive Configuration ###
primitive resource1 ocf:heartbeat:Dummy \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

primitive resource2 ocf:heartbeat:Dummy \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

primitive resource3 ocf:heartbeat:Dummy \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

### Resource Location ###
location rsc_location-1 grp \
    rule 300: #uname eq vm01.localdomain \
    rule 200: #uname eq vm02.localdomain \
    rule 100: #uname eq vm03.localdomain

### Resource Colocation ###
colocation rsc_colocation-1 INFINITY: grp clnResource

### Resource Order ###
order rsc_order-1 0: clnResource grp symmetrical=false
[root@vm01 ~]#
Load the crm configuration file.
[root@vm01 ~]# crm configure load update dummy.crm [root@vm01 ~]#
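As a quick sanity check (an aside, not part of the original procedure), the loaded configuration can also be verified with the crm_verify tool that ships with Pacemaker; -L checks the live CIB, -V adds verbosity, and the command prints nothing on success:

[root@vm01 ~]# crm_verify -L -V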
After loading the crm file, let's check the status.
[root@vm01 ~]# crm_mon -1fA
Stack: corosync
Current DC: vm03.localdomain (version 1.1.21-1.el7-f14e36f) - partition with quorum
Last updated: Wed Apr 29 17:10:30 2020
Last change: Wed Apr 29 17:10:27 2020 by root via cibadmin on vm01.localdomain

3 nodes configured
5 resources configured

Online: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Active resources:

 Resource Group: grp
     resource1  (ocf::heartbeat:Dummy): Started vm01.localdomain
     resource2  (ocf::heartbeat:Dummy): Started vm01.localdomain
 Clone Set: clnResource [resource3]
     Started: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Node Attributes:
* Node vm01.localdomain:
* Node vm02.localdomain:
* Node vm03.localdomain:

Migration Summary:
* Node vm03.localdomain:
* Node vm02.localdomain:
* Node vm01.localdomain:
[root@vm01 ~]#
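As an aside, the placement of an individual resource can also be checked with the low-level crm_resource tool, which prints the node the resource is currently running on:

[root@vm01 ~]# crm_resource --locate -r resource1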
Then confirm the configuration with the "crm configure show" command. Note that crmsh normalizes the notation: the ocf:heartbeat: class prefix is omitted, INFINITY is displayed as inf, and the loaded property and rsc_defaults settings are merged into the cib-bootstrap-options and rsc-options sets.
[root@vm01 ~]# crm configure show
node 1: vm01.localdomain
node 2: vm02.localdomain
node 3: vm03.localdomain

### Primitive Configuration ###
primitive resource1 Dummy \
    op start interval=0s timeout=300s on-fail=restart \
    op monitor interval=10s timeout=60s on-fail=restart \
    op stop interval=0s timeout=300s on-fail=block

primitive resource2 Dummy \
    op start interval=0s timeout=300s on-fail=restart \
    op monitor interval=10s timeout=60s on-fail=restart \
    op stop interval=0s timeout=300s on-fail=block

primitive resource3 Dummy \
    op start interval=0s timeout=300s on-fail=restart \
    op monitor interval=10s timeout=60s on-fail=restart \
    op stop interval=0s timeout=300s on-fail=block

### Group Configuration ###
group grp resource1 resource2

### Clone Configuration ###
clone clnResource resource3

### Resource Location ###
location rsc_location-1 grp \
    rule 300: #uname eq vm01.localdomain \
    rule 200: #uname eq vm02.localdomain \
    rule 100: #uname eq vm03.localdomain

### Resource Colocation ###
colocation rsc_colocation-1 inf: grp clnResource

### Resource Order ###
order rsc_order-1 0: clnResource grp symmetrical=false

property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.21-1.el7-f14e36f \
    cluster-infrastructure=corosync \
    stonith-enabled=false

### Resource Defaults ###
rsc_defaults rsc-options: \
    resource-stickiness=INFINITY \
    migration-threshold=1
[root@vm01 ~]#
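If you ever need the raw XML that is actually stored in the CIB rather than the crmsh notation, it can be dumped with cibadmin (the output file name here is just an example):

[root@vm01 ~]# cibadmin --query > /tmp/cib.xml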
Try a failover
Delete the state file to cause a failure of the cluster resource, as follows.
[root@vm01 ~]# rm -f /var/run/resource-agents/Dummy-resource1.state
[root@vm01 ~]#
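Why does deleting this file cause a failure? The ocf:heartbeat:Dummy agent manages no real process: its monitor action simply checks whether its state file exists. The following is a simplified sketch of that logic, not the actual agent source (the real script lives at /usr/lib/ocf/resource.d/heartbeat/Dummy):

# Simplified sketch of the Dummy agent's monitor action (illustrative only).
dummy_monitor() {
    # OCF_RESKEY_state defaults to /var/run/resource-agents/Dummy-<instance>.state
    if [ -f "${OCF_RESKEY_state}" ]; then
        return 0   # OCF_SUCCESS: the resource is considered running
    fi
    return 7       # OCF_NOT_RUNNING: the 'not running' (7) reported below
}

Removing the file therefore makes the next 10-second monitor return 7, and with migration-threshold=1 a single failure is enough to move grp off vm01.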
We can find the failed action.
[root@vm01 ~]# crm_mon -f1A
Stack: corosync
Current DC: vm03.localdomain (version 1.1.21-1.el7-f14e36f) - partition with quorum
Last updated: Wed Apr 29 17:27:35 2020
Last change: Wed Apr 29 17:10:27 2020 by root via cibadmin on vm01.localdomain

3 nodes configured
5 resources configured

Online: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Active resources:

 Resource Group: grp
     resource1  (ocf::heartbeat:Dummy): Started vm02.localdomain
     resource2  (ocf::heartbeat:Dummy): Started vm02.localdomain
 Clone Set: clnResource [resource3]
     Started: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Node Attributes:
* Node vm01.localdomain:
* Node vm02.localdomain:
* Node vm03.localdomain:

Migration Summary:
* Node vm03.localdomain:
* Node vm02.localdomain:
* Node vm01.localdomain:
   resource1: migration-threshold=1 fail-count=1 last-failure='Wed Apr 29 17:27:19 2020'

Failed Resource Actions:
* resource1_monitor_10000 on vm01.localdomain 'not running' (7): call=18, status=complete, exitreason='No process state file found',
    last-rc-change='Wed Apr 29 17:27:19 2020', queued=0ms, exec=0ms
[root@vm01 ~]#
The point to note is the "Failed Resource Actions" section, together with the fail-count shown in the Migration Summary.
Migration Summary:
* Node vm03.localdomain:
* Node vm02.localdomain:
* Node vm01.localdomain:
   resource1: migration-threshold=1 fail-count=1 last-failure='Wed Apr 29 17:27:19 2020'

Failed Resource Actions:
* resource1_monitor_10000 on vm01.localdomain 'not running' (7): call=18, status=complete, exitreason='No process state file found',
    last-rc-change='Wed Apr 29 17:27:19 2020', queued=0ms, exec=0ms
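The failure is also recorded in the Pacemaker logs. On a default CentOS 7 setup these messages usually end up in /var/log/messages (the path is an assumption and depends on your logging configuration):

[root@vm01 ~]# grep resource1_monitor /var/log/messages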
Check the fail-count for resource1 -> but the value is 0.
[root@vm01 ~]# crm resource failcount resource1 show vm01.localdomain
scope=status  name=fail-count-resource1 value=0
[root@vm01 ~]#
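The value of 0 looks odd given the fail-count=1 reported by crm_mon. Since Pacemaker 1.1.17, fail counts are stored per operation under attribute names such as fail-count-resource1#monitor_10000, so querying the plain fail-count-resource1 attribute, as crmsh does above, returns 0. Querying through crm_failcount should aggregate the per-operation values (expected to report value=1 here, though the exact behavior depends on the Pacemaker version):

[root@vm01 ~]# crm_failcount --query --resource resource1 --node vm01.localdomain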
Try a cleanup -> OK. The failed action is cleared, and grp is running on vm01.localdomain again.
[root@vm01 ~]# crm resource cleanup resource1 vm01.localdomain
Cleaned up resource1 on vm01.localdomain
Cleaned up resource2 on vm01.localdomain
Waiting for 1 reply from the CRMd. OK
[root@vm01 ~]#

[root@vm01 ~]# crm_mon -f1A
Stack: corosync
Current DC: vm03.localdomain (version 1.1.21-1.el7-f14e36f) - partition with quorum
Last updated: Wed Apr 29 17:43:27 2020
Last change: Wed Apr 29 17:43:17 2020 by hacluster via crmd on vm01.localdomain

3 nodes configured
5 resources configured

Online: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Active resources:

 Resource Group: grp
     resource1  (ocf::heartbeat:Dummy): Started vm01.localdomain
     resource2  (ocf::heartbeat:Dummy): Started vm01.localdomain
 Clone Set: clnResource [resource3]
     Started: [ vm01.localdomain vm02.localdomain vm03.localdomain ]

Node Attributes:
* Node vm01.localdomain:
* Node vm02.localdomain:
* Node vm03.localdomain:

Migration Summary:
* Node vm03.localdomain:
* Node vm02.localdomain:
* Node vm01.localdomain:
[root@vm01 ~]#
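For reference, the same cleanup can be performed with the low-level crm_resource tool, which is essentially what crmsh runs under the hood:

[root@vm01 ~]# crm_resource --cleanup --resource resource1 --node vm01.localdomain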
That's all.