hacluster #33

Corosync/Pacemaker

Overview

The hacluster subordinate charm provides corosync and pacemaker cluster
configuration for principal charms which support the container-scoped
hacluster relation.

The charm will only configure HA once more than one service unit is
present.

Usage

NOTE: The hacluster subordinate charm requires multicast network support, so
this charm will NOT work in EC2 or in other clouds which block multicast
traffic. It is intended for use in MAAS-managed environments of physical
hardware.

To deploy the charm:

juju deploy hacluster mysql-hacluster

To enable HA clustering support (for mysql for example):

juju deploy -n 2 mysql
juju deploy -n 3 ceph
juju set mysql vip="192.168.21.1"
juju add-relation mysql ceph
juju add-relation mysql mysql-hacluster

The principal charm must have explicit support for the hacluster interface
in order for clustering to occur; otherwise nothing actually gets configured.

Settings

It is best practice to set cluster_count to the number of expected units in the
cluster. The charm will build the cluster without this setting; however, race
conditions may occur in which one node is not yet aware of the total number of
relations to other hacluster units, causing the corosync and pacemaker services
to fail to complete startup.

Setting cluster_count helps guarantee the hacluster charm waits until all
expected peer relations are available before building the corosync cluster.

HA/Clustering

There are two mutually exclusive high availability options: using virtual
IP(s) or DNS.

To use virtual IP(s) the clustered nodes must be on the same subnet such that
the VIP is a valid IP on the subnet for one of the node's interfaces and each
node has an interface in said subnet. The VIP becomes a highly-available API
endpoint.
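
The subnet requirement above can be checked before deployment. The following is a minimal sketch using Python's standard ipaddress module; the function name vip_is_valid and the node addresses are hypothetical and are not part of the charm.

```python
import ipaddress


def vip_is_valid(vip, node_interfaces):
    """Return True if the VIP lies in the subnet of every node's interface.

    node_interfaces holds one CIDR string per node, e.g. '192.168.21.10/24'.
    """
    address = ipaddress.ip_address(vip)
    return all(
        address in ipaddress.ip_interface(cidr).network
        for cidr in node_interfaces
    )


# Hypothetical interface addresses for a two-node cluster.
nodes = ['192.168.21.10/24', '192.168.21.11/24']
same_subnet = vip_is_valid('192.168.21.1', nodes)   # VIP shares the /24
other_subnet = vip_is_valid('10.0.0.5', nodes)      # VIP is elsewhere
```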

DNS high availability does not require the clustered nodes to be on the same
subnet, but it has several prerequisites. Currently the DNS HA feature is only
available for MAAS 2.0 or greater environments. MAAS 2.0 requires Juju 2.0 or
greater. The MAAS 2.0 client requires Ubuntu 16.04 or greater. The clustered
nodes must have static or "reserved" IP addresses registered in MAAS. The DNS
hostname(s) must be pre-registered in MAAS before use with DNS HA.

The charm will throw an exception if it is running on a version of Ubuntu
earlier than Xenial 16.04.
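
A release check of this kind could look roughly like the sketch below. The function name require_xenial_or_later and the use of RuntimeError are assumptions made for illustration; the charm's actual exception type is not stated here.

```python
def require_xenial_or_later(release):
    """Raise if an Ubuntu release string such as '14.04' predates 16.04.

    Hypothetical helper: the exception type is an assumption.
    """
    major, minor = (int(part) for part in release.split('.')[:2])
    if (major, minor) < (16, 4):
        raise RuntimeError('Ubuntu %s is older than Xenial 16.04' % release)


require_xenial_or_later('16.04')  # passes silently
```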

Usage for Charm Authors

The hacluster interface supports a number of different cluster configuration
options.

Mandatory Relation Data (deprecated)

Principal charms should provide basic corosync configuration:

corosync_bindiface: The network interface to use for cluster messaging.
corosync_mcastport: The multicast port to use for cluster messaging.

However, these can also be provided via configuration on the hacluster charm
itself. If configuration is provided directly to the hacluster charm, it
will be preferred over these relation options from the principal charm.
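
The precedence rule can be pictured with a small sketch. This is illustrative only and is not the charm's own code; the function name corosync_setting and the sample values are hypothetical.

```python
def corosync_setting(key, charm_config, relation_data):
    """Prefer hacluster charm config; fall back to principal relation data."""
    value = charm_config.get(key)
    if value is None or value == '':
        value = relation_data.get(key)
    return value


# Hypothetical values: the charm config sets only the bind interface.
charm_config = {'corosync_bindiface': 'eth1'}
relation_data = {'corosync_bindiface': 'eth0', 'corosync_mcastport': 5406}

bindiface = corosync_setting('corosync_bindiface', charm_config, relation_data)
mcastport = corosync_setting('corosync_mcastport', charm_config, relation_data)
```

Here bindiface comes from the charm config and mcastport falls back to the relation data.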

Resource Configuration

The hacluster interface provides support for a number of different ways
of configuring cluster resources. All examples are provided in Python.

NOTE: The hacluster charm interprets the data provided as Python dicts, so
it is also possible to provide these as literal strings from charms written
in other languages.
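
As a rough illustration of what "literal strings" means, the sketch below serialises a dict to its Python literal form and parses it back. Using ast.literal_eval on the receiving side is an assumption about how such a string could be interpreted safely, not a statement about the charm's implementation.

```python
import ast

# A Python charm can serialise the dict with repr().
resources = {'res_mysqld': 'upstart:mysql'}
wire_value = repr(resources)

# A charm in another language can emit the same text by hand.
handwritten = "{'res_haproxy_lsb': 'lsb:haproxy'}"

# Assumed parsing step: literal_eval only accepts Python literals.
parsed = ast.literal_eval(wire_value)
parsed_handwritten = ast.literal_eval(handwritten)
```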

init_services

Services which will be managed by pacemaker once the cluster is created:

init_services = {
    'res_mysqld':'mysql',
    }

These services will be stopped prior to configuring the cluster.

resources

Resources are the basic cluster resources that will be managed by pacemaker.
In the mysql charm, this includes a block device, the filesystem, a virtual
IP address and the mysql service itself:

resources = {
    'res_mysql_rbd':'ocf:ceph:rbd',
    'res_mysql_fs':'ocf:heartbeat:Filesystem',
    'res_mysql_vip':'ocf:heartbeat:IPaddr2',
    'res_mysqld':'upstart:mysql',
    }

resource_params

Parameters which should be used when configuring the resources specified:

# config, SERVICE_NAME, KEYFILE and DATA_SRC_DST are defined by the mysql charm.
resource_params = {
    'res_mysql_rbd':'params name="%s" pool="images" user="%s" secret="%s"' %
                    (config['rbd-name'], SERVICE_NAME, KEYFILE),
    'res_mysql_fs':'params device="/dev/rbd/images/%s" directory="%s" '
                   'fstype="ext4" op start start-delay="10s"' %
                    (config['rbd-name'], DATA_SRC_DST),
    'res_mysql_vip':'params ip="%s" cidr_netmask="%s" nic="%s"' %
                    (config['vip'], config['vip_cidr'], config['vip_iface']),
    'res_mysqld':'op start start-delay="5s" op monitor interval="5s"',
    }

groups

Resources which should be managed as a single set of resources on the same
service unit:

groups = {
    'grp_mysql':'res_mysql_rbd res_mysql_fs res_mysql_vip res_mysqld',
    }

clones

Resources which should run on every service unit participating in the cluster:

clones = {
    'cl_haproxy': 'res_haproxy_lsb'
    }
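
Because groups and clones refer to resources by name, a charm author may want to check the names before sending the relation data. The sketch below is one possible check, assuming every group or clone member should be a key of resources and that each clone wraps a single resource; the function name undefined_members is hypothetical.

```python
def undefined_members(resources, groups, clones):
    """Return names used by groups or clones that are missing from resources."""
    referenced = set()
    for members in groups.values():
        referenced.update(members.split())  # group members are space-separated
    referenced.update(clones.values())      # assumes one resource per clone
    return sorted(referenced - set(resources))


resources = {
    'res_mysql_rbd': 'ocf:ceph:rbd',
    'res_mysql_fs': 'ocf:heartbeat:Filesystem',
    'res_mysql_vip': 'ocf:heartbeat:IPaddr2',
    'res_mysqld': 'upstart:mysql',
}
groups = {'grp_mysql': 'res_mysql_rbd res_mysql_fs res_mysql_vip res_mysqld'}
clones = {'cl_haproxy': 'res_haproxy_lsb'}

missing = undefined_members(resources, groups, clones)
```

With the example dicts from this section, missing contains only res_haproxy_lsb, because the clone example refers to a resource that the mysql resources dict does not define.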

Configuration

corosync_mcastport (int)
    Default multicast port number that will be used to communicate between
    HA Cluster nodes. Only used when corosync_transport = multicast.

nagios_servicegroups (string)
    A comma-separated list of nagios servicegroups. If left empty, the
    nagios_context will be used as the servicegroup.

stonith_enabled (string)
    Enable resource fencing (aka STONITH) for every node in the cluster.
    This requires that MAAS credentials be provided and that each node's
    power parameters are properly configured in its inventory.

    Default: False

corosync_key (string)
    This value will become the Corosync authentication key. To generate
    a suitable value use:

      sudo corosync-keygen
      sudo cat /etc/corosync/authkey | base64 -w 0

    This configuration element is mandatory and the service will fail on
    install if it is not provided. The value must be base64 encoded.

    Default: 64RxJNcCkwo8EJYBsaacitUvbQp5AW4YolJi5/2urYZYp2jfLxY+3IUCOaAUJHPle4Yqfy+WBXO0I/6ASSAjj9jaiHVNaxmVhhjcmyBqy2vtPf+m+0VxVjUXlkTyYsODwobeDdO3SIkbIABGfjLTu29yqPTsfbvSYr6skRb9ne0=

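
Since the value must be base64 encoded, it can be worth validating before setting it. The sketch below uses only the Python standard library; the function name is_valid_corosync_key is hypothetical, and the check confirms only that the string decodes, not that corosync will accept the key.

```python
import base64
import binascii


def is_valid_corosync_key(value):
    """Return True if value is non-empty, strictly valid base64."""
    try:
        return len(base64.b64decode(value, validate=True)) > 0
    except (binascii.Error, ValueError):
        return False


# Stand-in key material for the example; real keys come from corosync-keygen.
encoded = base64.b64encode(b'x' * 128).decode('ascii')
good = is_valid_corosync_key(encoded)
bad = is_valid_corosync_key('not base64!')
```
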
cluster_count (int)
    Number of peer units required to bootstrap cluster services.

    If fewer than 3 is specified, the cluster will be configured to
    ignore any quorum problems; with 3 or more units, quorum will be
    enforced and services will be stopped in the event of a loss
    of quorum. It is best practice to set this value to the expected
    number of units to avoid potential race conditions.

    Default: 3

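
The quorum rule can be written as a one-line mapping. The sketch below expresses it in terms of pacemaker's no-quorum-policy property values ('ignore' and 'stop'); tying the description to that property is an assumption, and the function name is hypothetical.

```python
def no_quorum_policy(cluster_count):
    """Map cluster_count to a pacemaker no-quorum-policy value.

    Fewer than 3 units: quorum problems are ignored.
    3 or more units: services stop when quorum is lost.
    """
    return 'stop' if cluster_count >= 3 else 'ignore'


two_node = no_quorum_policy(2)
three_node = no_quorum_policy(3)
```
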
service_stop_timeout (int)
    Systemd override value for the corosync and pacemaker service stop
    timeout, in seconds. Set the value to -1 to turn off the timeout for
    the services.

    Default: 60

maas_credentials (string)
    MAAS credentials (required for STONITH).

maas_url (string)
    MAAS API endpoint (required for STONITH).

nagios_context (string)
    Used by the nrpe-external-master subordinate charm.
    A string that will be prepended to the instance name to set the host
    name in nagios. For instance, the hostname would be something like:

      juju-postgresql-0

    If you're running multiple environments with the same services in them
    this allows you to differentiate between them.

    Default: juju

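
The hostname example suggests the context is joined to the unit name with a hyphen. The sketch below reproduces that example; the exact composition rule, the function name nagios_hostname and the unit name 'postgresql/0' are assumptions inferred from the sample hostname above.

```python
def nagios_hostname(context, unit_name):
    """Prefix a Juju unit name such as 'postgresql/0' with nagios_context."""
    return '%s-%s' % (context, unit_name.replace('/', '-'))


hostname = nagios_hostname('juju', 'postgresql/0')
```
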
service_start_timeout (int)
    Systemd override value for the corosync and pacemaker service start
    timeout, in seconds. Set the value to -1 to turn off the timeout for
    the services.

    Default: 180

corosync_transport (string)
    Two modes are supported: multicast (udp) and unicast (udpu).

    Default: unicast

netmtu (int)
    Specifies the corosync.conf network mtu. If unset, the default
    corosync.conf value is used (currently 1500). See 'man corosync.conf'
    for detailed information on this config option.

corosync_mcastaddr (string)
    Multicast IP address to use for exchanging messages over the network.
    If multiple clusters are on the same bindnetaddr network, this value
    can be changed. Only used when corosync_transport = multicast.

    Default: 226.94.1.1

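
When changing this value, the replacement must still be a multicast address. A minimal check with Python's ipaddress module might look like this; the function name is_multicast is hypothetical.

```python
import ipaddress


def is_multicast(address):
    """Return True if the address falls in a multicast range."""
    return ipaddress.ip_address(address).is_multicast


default_ok = is_multicast('226.94.1.1')      # the charm default
unicast_addr = is_multicast('192.168.21.1')  # an ordinary host address
```
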
monitor_interval (string)
    Time period between checks of resource health. It consists of a number
    and a time factor, e.g. 5s = 5 seconds, 2m = 2 minutes.

    Default: 5s

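
To make the number-plus-factor format concrete, the sketch below converts such a string to seconds. It handles only the two factors mentioned above (s and m); the function name interval_seconds is hypothetical and this is not how the charm itself processes the value.

```python
import re

_FACTORS = {'s': 1, 'm': 60}  # only the factors named in the description


def interval_seconds(value):
    """Convert an interval such as '5s' or '2m' to a number of seconds."""
    match = re.fullmatch(r'(\d+)([sm])', value.strip())
    if match is None:
        raise ValueError('unsupported interval: %r' % value)
    number, factor = match.groups()
    return int(number) * _FACTORS[factor]


five_seconds = interval_seconds('5s')
two_minutes = interval_seconds('2m')
```
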
maas_source (string)
    PPA for python3-maas-client:

      - ppa:maas/stable
      - ppa:maas/next

    The last option should be used in conjunction with the key configuration
    option.
    Used when service_dns is set on the principal charm for DNS HA.

    Default: ppa:maas/stable

debug (boolean)
    Enable debug logging.

corosync_bindiface (string)
    Default network interface to which the HA cluster will bind for
    communication with the other members of the HA cluster. Defaults to the
    network interface hosting the unit's private-address. Only used when
    corosync_transport = multicast.

monitor_host (string)
    One or more IPs, separated by spaces, that will be used as a safety check
    for avoiding split-brain situations. Nodes in the cluster will ping these
    IPs periodically. A node that cannot ping monitor_host will not run
    shared resources (VIP, shared disk...).

prefer-ipv6 (boolean)
    If True, enables IPv6 support. The charm will expect network interfaces
    to be configured with an IPv6 address. If set to False (default), IPv4
    is expected.

    NOTE: these charms do not currently support the IPv6 privacy extension.
    In order for this charm to function correctly, the privacy extension
    must be disabled and a non-temporary address must be
    configured/available on your network interface.