Description

The CharmScaler is an autoscaler for Juju applications. Based on Elastisys'
autoscaling engine, it rightsizes your application deployments using
sophisticated auto-scaling algorithms to ensure that the application runs
cost-efficiently and is responsive at all times, even in the face of sudden
load spikes. At times of high anticipated load your charm is reinforced with
additional units -- units that are automatically decommissioned as the
pressure on your application goes down.

Overview

The Elastisys CharmScaler is an autoscaler for Juju applications. It
automatically scales your charm by adding units at times of high load and by
removing units at times of low load.

The initial edition of the CharmScaler features a simplified version of
Elastisys' autoscaling engine (described below), without its predictive
capabilities and with limited scaling metric support. Work is underway on a more
fully-featured CharmScaler, but no release date has been set yet.

The initial CharmScaler edition scales the number of units of your applications
based on the observed CPU usage. These CPU metrics are collected from your
application by a Telegraf agent, which pushes the metrics into an InfluxDB
backend, from where they are consumed by the CharmScaler.

The CharmScaler is available both free-of-charge and as a subscription service.
The free version comes with a size restriction which currently limits the size
of the scaled application to four units. Subscription users will see no such
size restrictions. For more details refer to the Subscription
section below.

If you are eager to try out the CharmScaler, head directly to the
Quickstart section. If you want to learn more about the
Elastisys autoscaler, read on ...

Introducing the Elastisys Autoscaler

User experience is king. You want to offer your users a smooth ride. From a
performance perspective, this translates into providing them with a responsive
service. As response times increase you will see more and more users leaving,
perhaps for competing services.

An application can be tuned in many ways, but one critical aspect is to make
sure that it runs on sufficient hardware, capable of bearing the weight that is
placed on your system. However, resource planning is notoriously hard and
involves a lot of guesswork. A fixed "peak-dimensioned" infrastructure is
certain to have you overspending most of the time and, what's worse, you can
never be sure that it actually will be able to handle the next load surge.
Ideally, you want to run with just the right amount of resources at all times.
It is plain to see that such a process involves a lot of planning and manual
labor.

Elastisys automates this process with a sophisticated autoscaler. The Elastisys
autoscaler uses proactive scaling algorithms, based on state-of-the-art
research, that predictively offer just-in-time capacity. That is, it can
provision servers in advance so that the right amount of capacity is available
when it is needed, not when you realize that it's needed (by then your
application may already be suffering). Research has shown that there is no
single scaling algorithm to rule them all. Different workload patterns require
different algorithms. The Elastisys autoscaler is armed with a growing
collection of such algorithms.

The Elastisys autoscaler already supports a
wide range of clouds and platforms.
With the addition of the Juju CharmScaler, which can scale any Juju application
charm, integration with your application has never been easier. Whether it's a
Wordpress site, a Hadoop cluster, a Kubernetes cluster, OpenStack compute
nodes, or your own custom-made application charm, hooking it up to be scaled by
the Elastisys autoscaler is straightforward.

Read more about Elastisys' cloud automation platform at
https://elastisys.com.

Subscription

The free edition limits the size of the scaled application to four units. To
remove this restriction you need to become a paying subscription user. Juju is
currently in beta and does not yet support commercial charms.
Once Juju is officially released, the CharmScaler will be available as a
subscription service. Until then, you can contact us and we will help you set
up a temporary subscription arrangement.

For upgrading to a premium subscription, for a customized solution, or for
general questions or feature requests, feel free to contact Elastisys at
contact@elastisys.com.

Quickstart

If you can't wait to get started, the following minimal example (relying on
configuration defaults) will let you start scaling your charm right away. For a
description of the CharmScaler and further details on its configuration, refer
to the sections below.

Minimal config.yaml example

charmscaler:
  juju_api_endpoint: "[API address]:17070"
  juju_model_uuid: "[uuid]"
  juju_username: "[username]"
  juju_password: "[password]"
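
If you prefer to generate this file rather than edit it by hand, the following is a minimal sketch of one way to do so. The helper name `render_config` and the placeholder check are illustrative assumptions and are not part of the CharmScaler; only the four option names come from the example above.

```python
import json

# The four options used in the minimal config.yaml example.
REQUIRED = (
    "juju_api_endpoint",
    "juju_model_uuid",
    "juju_username",
    "juju_password",
)


def render_config(values, application="charmscaler"):
    """Return config.yaml text for the given option values (hypothetical helper).

    Raises ValueError if an option is missing or still looks like a
    "[placeholder]" copied from the example.
    """
    missing = [
        key for key in REQUIRED
        if not values.get(key) or str(values[key]).startswith("[")
    ]
    if missing:
        raise ValueError("missing or placeholder values: " + ", ".join(missing))

    lines = [f"{application}:"]
    for key in REQUIRED:
        # json.dumps yields a double-quoted string, which is also valid YAML.
        lines.append(f"  {key}: {json.dumps(str(values[key]))}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; substitute your own controller details.
    print(render_config({
        "juju_api_endpoint": "10.0.0.1:17070",
        "juju_model_uuid": "00000000-0000-0000-0000-000000000000",
        "juju_username": "admin",
        "juju_password": "example-password",
    }), end="")
```

The output can be redirected to config.yaml and passed to juju deploy with the --config option as shown in the next step.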

Deploy and relate the charms

juju deploy cs:~elastisys/charmscaler --config=config.yaml
juju deploy cs:~chris.macnaughton/influxdb
juju deploy telegraf
juju deploy [charm]

juju relate charmscaler:db-api influxdb:api
juju relate telegraf:influxdb-api influxdb:api
juju relate telegraf:juju-info [charm]:juju-info
juju relate charmscaler:juju-info [charm]:juju-info

How the CharmScaler operates

[Figure: CharmScaler flow]

The image above illustrates the flow of the CharmScaler when scaling a
Wordpress application. Scaling decisions executed by the CharmScaler are
dependent on a load metric. In this case it looks at the CPU usage of machines
where Wordpress instances are deployed.

Metrics are collected by the Telegraf agent, which is deployed as a subordinate
charm attached to the Wordpress application. This means that whenever the
Wordpress application is scaled out, another Telegraf collector will be
deployed as well and automatically start pushing new metrics to InfluxDB.

The CharmScaler will ask InfluxDB for new metric datapoints at every poll
interval (configured using the metric_poll_interval option). From these load
metrics the CharmScaler decides how many units are needed by your application.
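
To make the polling step more concrete, here is a rough sketch of how per-unit CPU datapoints could be reduced to a single load value. It is not the CharmScaler's actual implementation: the function `settled_average`, the tuple layout of the datapoints, and the choice to use the latest settled datapoint per unit are all assumptions for illustration. The only ideas taken from this document are that the average is computed over all charm units and that datapoints younger than metric_data_settling_interval (see the Configuration list) are not requested.

```python
from statistics import mean


def settled_average(datapoints, now, settling_interval=15):
    """Average CPU usage over all units, ignoring datapoints that are too fresh.

    datapoints: iterable of (unit_name, timestamp_seconds, cpu_percent).
    Returns None when no unit has a settled datapoint yet.
    (Hypothetical helper; the data layout is an assumption.)
    """
    latest = {}  # unit name -> (timestamp, cpu) of its newest settled datapoint
    for unit, timestamp, cpu in datapoints:
        if now - timestamp < settling_interval:
            continue  # may be incomplete: not all agents have reported yet
        if unit not in latest or timestamp > latest[unit][0]:
            latest[unit] = (timestamp, cpu)
    if not latest:
        return None
    return mean([cpu for _, cpu in latest.values()])


if __name__ == "__main__":
    points = [
        ("wordpress/0", 80, 50.0),
        ("wordpress/1", 80, 70.0),
        ("wordpress/0", 95, 100.0),  # only 5 seconds old at now=100: skipped
    ]
    print(settled_average(points, now=100))  # 60.0
```

Skipping the freshest datapoints trades a little reaction time for a lower risk of acting on partial data, which is the rationale given for metric_data_settling_interval.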

In the case of Wordpress it is necessary to distribute the load across all of
the units using a load balancer. If you haven't already, check out the Juju
documentation page on charm scaling.

Configuration explained

The CharmScaler's configuration consists of three main parts:
juju, scaling and alerts.

Juju

The CharmScaler manages the number of units of the scaled charm via the Juju
controller. To be able to do that it needs to authenticate with the controller.
Controller authentication credentials are passed to the CharmScaler through
options prefixed with juju_.

Note that in the foreseeable future, passing these kinds of credentials to the
CharmScaler may no longer be necessary. Instead of requiring you to manually
type in the authentication details, one could envision Juju giving the charm
access through relations or something similar.

Scaling

The CharmScaler has a number of config options that control the autoscaler's
behavior. Those options are prefixed with either scaling_ or metric_.
metric_ options control the way metrics are fetched and processed while the
scaling_ options control when and how the charm units are scaled.

The scaling algorithm available in this edition of the CharmScaler is a
rule-based one that looks at CPU usage. At each iteration (configured using the
scaling_interval option) the following rules are considered by the autoscaler
before making a scaling decision:

  1. scaling_cooldown - Has enough time passed since the last scale-event
    (scale in or out) occurred?
  2. scaling_cpu_[max/min] - Is the CPU usage above/below the set limit?
  3. scaling_period_[up/down]scale - Has the CPU usage been above/below
    scaling_cpu_[max/min] for a long enough period of time?

If all three rules above are satisfied, either a scale-out or a scale-in occurs
and a unit is automatically added to or removed from the scaled charm.
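
The three rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the autoscaler's real code: the names `ScalingConfig`, `held_for` and `decide` are hypothetical, thresholds are compared strictly, and "long enough" is interpreted as every sample in the evaluation period breaching the limit. The option names and default values are the ones documented in this README.

```python
from dataclasses import dataclass


@dataclass
class ScalingConfig:
    # Defaults mirror the option defaults listed in this README.
    scaling_cooldown: int = 300
    scaling_cpu_max: int = 80
    scaling_cpu_min: int = 20
    scaling_period_upscale: int = 60
    scaling_period_downscale: int = 120
    scaling_units_min: int = 1
    scaling_units_max: int = 4


def held_for(samples, now, period, breached):
    """True if the samples span `period` seconds and all recent ones breach.

    samples: list of (timestamp_seconds, cpu_percent), oldest first.
    """
    if not samples or now - samples[0][0] < period:
        return False  # not enough history to cover the whole period
    recent = [cpu for t, cpu in samples if t >= now - period]
    return bool(recent) and all(breached(cpu) for cpu in recent)


def decide(samples, now, last_scale_time, units, cfg):
    """Return +1 (add a unit), -1 (remove a unit) or 0 (do nothing)."""
    # Rule 1: respect the cooldown after the last scale-event.
    if last_scale_time is not None and now - last_scale_time < cfg.scaling_cooldown:
        return 0
    # Rules 2 and 3: CPU above the maximum for the whole upscale period.
    if units < cfg.scaling_units_max and held_for(
        samples, now, cfg.scaling_period_upscale,
        lambda cpu: cpu > cfg.scaling_cpu_max,
    ):
        return 1
    # Rules 2 and 3: CPU below the minimum for the whole downscale period.
    if units > cfg.scaling_units_min and held_for(
        samples, now, cfg.scaling_period_downscale,
        lambda cpu: cpu < cfg.scaling_cpu_min,
    ):
        return -1
    return 0


if __name__ == "__main__":
    hot = [(t, 90.0) for t in range(0, 70, 10)]  # 90% CPU from t=0 to t=60
    print(decide(hot, now=60, last_scale_time=None, units=2, cfg=ScalingConfig()))  # 1
```

In this sketch the unit limits (scaling_units_min and scaling_units_max) are checked alongside the rules, so a breach at the size limit results in no action.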

Note that configuring the scaling algorithm is a balancing act -- one always
needs to balance the need to scale "quickly enough" against the need to avoid
"jumpy behavior". Too frequent scale-ups/scale-downs could have a negative
impact on overall performance/system stability.

The default behavior adds a new unit when the average CPU usage (over all charm
units) has exceeded 80% for at least one minute. If you want to make the
CharmScaler quicker to respond to changes, you can, for example, lower the
threshold to 60% and the evaluation period to 30 seconds:

juju config charmscaler scaling_cpu_max=60
juju config charmscaler scaling_period_upscale=30

Similarly, the default behavior removes a unit when the average CPU usage
has been under 20% (scaling_cpu_min) for at least two minutes
(scaling_period_downscale). Typically, it is preferable to allow the
application to be overprovisioned for some time to prevent situations where we
are too quick to scale down, only to realize that the load dip was only
temporary and that we need to scale back up again. We can, for instance, set
the evaluation period preceding scale-downs a bit longer (five minutes) via:

juju config charmscaler scaling_period_downscale=300

Finally, changing the amount of time required between two scaling decisions can
be done via:

juju config charmscaler scaling_cooldown=300

This parameter should, however, be kept long enough to give scaling decisions a
chance to take effect, before a new scaling decision is triggered.

Alerts

Lastly, the options with the alert_ prefix are used to enable CharmScaler
alerts (these are turned off by default).

Alerts are used to notify the outside world (such as the charm owner) of
notable scaling events or error conditions. Alerts are, for example, sent
(with severity-level ERROR) if there are problems reaching the Juju
controller. Alerts are also sent (with severity-level INFO) when a scaling
decision has been made.

This edition of the CharmScaler includes email alerts, which are configured by
entering the details of the SMTP server through which the autoscaler should
send the alert email messages.
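
As an illustration of how the alert_ options fit together, the sketch below filters an alert by severity level and builds the email message. It is an assumption-laden example, not the CharmScaler's code: the function `build_alert` and the subject format are hypothetical, while the option names (alert_sender, alert_receivers, alert_levels, name) and the space-separated value format come from the Configuration list.

```python
from email.message import EmailMessage


def build_alert(level, text, *, sender, receivers,
                name="CharmScaler",
                levels="INFO NOTICE WARN ERROR FATAL"):
    """Build an alert email, or return None if `level` is filtered out.

    sender    -- value of alert_sender
    receivers -- value of alert_receivers (space-separated addresses)
    levels    -- value of alert_levels (space-separated severity levels)
    name      -- value of the name option, used in the subject line
    (Hypothetical helper; the subject format is an assumption.)
    """
    if level not in levels.split():
        return None
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(receivers.split())
    message["Subject"] = f"[{name}] {level}"
    message.set_content(text)
    return message


if __name__ == "__main__":
    alert = build_alert(
        "ERROR",
        "Juju controller is unreachable.",
        sender="charmscaler@example.com",
        receivers="ops@example.com owner@example.com",
        levels="WARN ERROR FATAL",
    )
    print(alert)
```

A message built this way could then be handed to Python's smtplib (for example `smtplib.SMTP(host, port).send_message(alert)`, or `smtplib.SMTP_SSL` when alert_smtp_ssl is enabled), using the alert_smtp_host, alert_smtp_port, alert_smtp_username and alert_smtp_password values.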

Known limitations

When deploying on the LXD provider

Due to missing support for the Docker LXC profile in Juju you need to apply it
manually.

See: https://bugs.launchpad.net/juju/+bug/1552815

InfluxDB co-location

The Docker layer currently does not support installing Docker Compose in a
virtual environment. Until this is supported, InfluxDB cannot be co-located
with the CharmScaler charm because of dependency conflicts.


By using the Elastisys CharmScaler, you agree to its license and privacy
statement.

Configuration

scaling_cpu_min
(int) CPU usage threshold below which the load is considered low enough to scale down the number of units.
Default: 20

scaling_period_downscale
(int) Number of seconds that the CPU usage needs to be lower than the threshold before scaling down.
Default: 120

scaling_units_max
(int) Maximum number of units to keep in the pool.
Default: 4

install_from_upstream
(boolean) Toggle installation from the Ubuntu archive vs the Docker PPA.

http_proxy
(string) URL to use for HTTP_PROXY to be used by Docker. Only useful in closed environments where a proxy is the only option for routing to the registry to pull images.

juju_model_uuid
(string) Juju model UUID.

docker-opts
(string) Extra options to pass to the Docker daemon, e.g. --insecure-registry.

scaling_interval
(int) Seconds between each scaling decision.
Default: 10

juju_refresh_interval
(int) How often the CharmScaler should sync against the Juju model.
Default: 5

juju_password
(string) Juju account password.

metric_data_settling_interval
(int) The minimum age (in seconds) of requested data points. When requesting recent aggregate metric data points, there is always a risk of seeing partial/incomplete results before metric values from all sources have been registered. The value to set for this field depends on the reporting frequency of the monitoring agents, but as a general rule of thumb it can be set to about 1.5 times the length of the agents' reporting interval.
Default: 15

juju_api_endpoint
(string) Juju controller API endpoint.

no_proxy
(string) Comma-separated list of destinations (either domain names or IP addresses) that should be accessed directly rather than through the configured proxy.

https_proxy
(string) URL to use for HTTPS_PROXY to be used by Docker. Only useful in closed environments where a proxy is the only option for routing to the registry to pull images.

alert_enabled
(boolean) Toggle e-mail alerts on/off.

alert_smtp_username
(string) Username to authenticate with the SMTP server.

enable-cgroups
(boolean) Enable GRUB cgroup overrides cgroup_enable=memory swapaccount=1. WARNING: changing this option will reboot the host. Use with caution on production services.

alert_sender
(string) E-mail address that alert mails should be sent from.

nagios_servicegroups
(string) A comma-separated list of nagios servicegroups. If left empty, the nagios_context will be used as the servicegroup.

alert_smtp_host
(string) SMTP hostname.

metric_poll_interval
(int) Seconds between polls for new metric values.
Default: 10

charmpool_url
(string) URL to the Charmpool component. By default both the autoscaler and the pool are run in the same Docker network and will reach each other by their local hostnames.
Default: http://charmpool:80

alert_smtp_ssl
(boolean) Use SSL when connecting to the SMTP host.

juju_username
(string) Juju account username.

scaling_cooldown
(int) Time (in seconds) before making another scaling decision from the time of the last up- or downscale. This is useful to prevent extra resizes due to slow teardowns or, in particular, upstarts.
Default: 300

port_autoscaler
(int) Port on which the Autoscaler API should be served.
Default: 8097

alert_smtp_port
(int) SMTP port.
Default: 25

name
(string) The name of the service. It mainly shows up in the alert e-mails and is also useful to distinguish between multiple CharmScaler charms.
Default: CharmScaler

nagios_context
(string) Used by the nrpe subordinate charms. A string that will be prepended to the instance name to set the host name in nagios. So for instance the hostname would be something like: juju-myservice-0. If you're running multiple environments with the same services in them, this allows you to differentiate between them.
Default: juju

alert_smtp_password
(string) Password to authenticate with the SMTP server.

scaling_cpu_max
(int) CPU usage threshold at which the number of units should be scaled up.
Default: 80

alert_receivers
(string) Space-separated list of e-mail addresses that should receive alerts.

scaling_units_min
(int) Minimum number of units to keep in the pool.
Default: 1

alert_levels
(string) Alert levels that should trigger alert mails to be sent out.
Default: INFO NOTICE WARN ERROR FATAL

scaling_period_upscale
(int) Number of seconds that the CPU usage needs to be higher than the threshold before scaling up.
Default: 60