diff --git a/docs/notes/ucs-dev-group-services.md b/docs/notes/ucs-dev-group-services.md
index 6c963577a66679af112d27c1b1aa84dc95aa6d7a..2b8eeaab832c703c0eebd68a6d4bcc8ad1da9f0e 100644
--- a/docs/notes/ucs-dev-group-services.md
+++ b/docs/notes/ucs-dev-group-services.md
@@ -18,49 +18,90 @@ The services run by DevOps currently includes:
 The majority of the UCS dev group services are clustered using
 [Pacemaker](https://clusterlabs.org/pacemaker/man/).
 
-To make a release without incurring downtime the following steps can be taken:
+To make a release without incurring downtime the following steps can be taken.
+The examples refer to the Lookup service (`ucs-ibis` package) and need substituting as
+appropriate.
+
+### Start with the standby node
+
+!!! note
+    Which node is on standby can be determined by running `crm_mon -1` or looking at the service's
+    `/adm/status` page.
 
 1. Put the cluster into [maintenance mode](https://xahteiwi.eu/resources/hints-and-kinks/maintenance-active-pacemaker-clusters/),
    this ensures pacemaker does not try to make changes to the cluster in response to a node being
    unavailable.
-
-```
-crm configure property maintenance-mode=true
-```
-
-2. On the standby node (can be determined by running `crm_mon`) run the software upgrade, in the
-   case of Lookup this would look like:
-
-```
-zypper ref
-zypper lu
-zypper up ucs-ibis
-service tomcat6 stop
-service tomcat6 start
-```
-
-3. Once the upgrade has been successfully deployed to the standby node, move the service
-   over to the current standby node by editing the pacemaker configuration:
-
-```
-crm configure edit
-# shift node weights so that the current standby node is the preferred service
-# verify that the service has moved to the previous standby:
-crm_mon
-```
-
-4. Make the software upgrade on the node which has now become the standby.
-
-5. Reconfigure the cluster to its previous state, and move out of maintenance mode:
-
-```
-crm configure edit
-# shift node weights so that the current standby node is the preferred service
-# verify that the service has moved to the previous standby:
-crm_mon
-# move out of maintenance mode
-crm configure property maintenance-mode=false
-```
+    ```
+    crm configure property maintenance-mode=true
+    # Check for "unmanaged" status
+    crm_mon -1
+    ```
+
+2. Release the lock on the package.
+    ```
+    # List locked packages
+    zypper ll
+    # Release lock on appropriate package
+    zypper rl ucs-ibis
+    ```
+
+3. Run the software upgrade.
+    ```
+    # Refresh repositories
+    zypper ref
+    # List updates available
+    zypper lu
+    # Update application specific package
+    zypper up ucs-ibis
+    # Restart tomcat (a single "restart" doesn't always work)
+    service tomcat6 stop
+    service tomcat6 start
+    ```
+
+4. Check the service is running on the updated node.
+    - see `https://{node url}/adm/status`
+    - check application functionality
+
+5. Move the service out of maintenance mode.
+    ```
+    crm configure property maintenance-mode=false
+    # Check for removal of "unmanaged" status
+    crm_mon -1
+    ```
+
+6. Reapply the lock to the package.
+    ```
+    # Reapply lock on appropriate package
+    zypper al ucs-ibis
+    # List locked packages to check
+    zypper ll
+    ```
+
+### Move to current live node
+
+1. Move the service to the already updated standby node.
+    ```
+    crm configure edit
+    # shift node weights so that the current standby node is the preferred service
+    # verify that the service has moved to the previous standby:
+    crm_mon -1
+    ```
+
+2. Repeat steps 1 to 6 above, i.e.
+    - put the service back in maintenance mode
+    - unlock the package
+    - update the package
+    - check success
+    - remove the service from maintenance mode
+    - relock the package
+
+3. Move the service back to this node.
+    ```
+    crm configure edit
+    # shift node weights so that this node is the preferred service
+    # verify that the service has moved back:
+    crm_mon -1
+    ```
 
 Following the above should allow a software upgrade to be deployed without any downtime. These
 steps will not work if the upgrade includes a breaking change to an external data source, e.g.