cassandra

Description

Cassandra is a distributed (peer-to-peer) system for the management and
storage of structured data.


Overview

The Apache Cassandra database is the right choice when you need scalability
and high availability without compromising performance. Linear scalability
and proven fault-tolerance on commodity hardware or cloud infrastructure
make it the perfect platform for mission-critical data. Cassandra's support
for replicating across multiple datacenters is best-in-class, providing lower
latency for your users and the peace of mind of knowing that you can survive
regional outages.

Cassandra's ColumnFamily data model offers the convenience of column indexes
with the performance of log-structured updates, strong support for materialized
views, and powerful built-in caching.

See cassandra.apache.org for more information.

Usage

Cassandra deployments are relatively simple in that they consist of a set of
Cassandra nodes which seed from each other to create a ring of servers:

juju deploy --repository . local:cassandra
juju add-unit -n 2 cassandra

The service units will deploy and will form a single ring.

Cassandra exposes its API through Apache Thrift, a software framework for
scalable cross-language services development.

See the Cassandra documentation (cassandra.apache.org) for more details of how
to use this API.

The Cassandra project recommends using one of the many existing client
libraries - see the ClientOptions wiki page for more details.
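
Before pointing a client library at a unit, it can help to confirm that the
Thrift client port is reachable. The following is a minimal sketch using only
the Python standard library; the function name is hypothetical, the default
port 9160 is the charm's client-port default, and a successful TCP connection
only shows that something is listening, not that Thrift requests will succeed.

```python
import socket


def thrift_port_open(host, port=9160, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    This is only a reachability check (an illustrative helper, not part of
    the charm); it does not speak the Thrift protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

For example, `thrift_port_open("10.0.0.5")` would probe a unit at the
hypothetical address 10.0.0.5 on the default client port.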

To relate the Cassandra charm to a service that understands how to talk to
Cassandra using Thrift::

juju deploy --repository . local:service-that-needs-cassandra
juju add-relation service-that-needs-cassandra cassandra

Known Limitations and Issues

Changing the configuration of a deployed Cassandra cluster is supported;
however, each Cassandra node is restarted as the changes are applied, which
may result in outages.

Configuration

By default, Cassandra makes a reasonable guess at the Java memory settings
that fit the machine it has been deployed on.
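
As a rough illustration of what such a guess looks like, the sketch below
mirrors the calculation found in the stock cassandra-env.sh shipped with
Cassandra 1.x. It is an assumption that the charm's auto-memory mode defers to
that script unchanged; the function name is hypothetical and the charm's own
hooks may differ.

```python
from typing import Tuple


def auto_memory_mb(system_memory_mb: int, cpu_cores: int) -> Tuple[int, int]:
    """Return (max_heap_mb, new_gen_mb) per the stock cassandra-env.sh rule.

    Assumption: this reproduces the upstream script of the Cassandra 1.x
    series, not necessarily what the charm does.
    """
    # Heap: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    max_heap = max(half, quarter)
    # New generation: min(100 MB per CPU core, 1/4 of the heap)
    new_gen = min(100 * cpu_cores, max_heap // 4)
    return max_heap, new_gen
```

Under this rule a machine with 16384 MB of RAM and 8 cores gets a 4096 MB heap
and an 800 MB new generation.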

The charm does support manual configuration of Java memory settings - see the
config.yaml file for more details::

cassandra:
    auto-memory: false
    heap-size: 8G
    new-gen-size: 250M

However, be aware that it's recommended that Cassandra always remains in
'real' memory and is never swapped out to disk, so keep this in mind when
changing these options.

Cassandra sets both its minimum and maximum heap size on startup, so it
pre-allocates all heap memory to avoid freezes during operation (these would
otherwise happen during normal operation as more memory is allocated to the
heap).
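
In JVM terms, pinning the minimum and maximum heap to the same value means
passing identical -Xms and -Xmx settings, with -Xmn for the new generation.
The sketch below shows how the heap-size and new-gen-size values could map to
those standard JVM flags; the function name is hypothetical, and whether the
charm builds its options exactly this way is an assumption.

```python
import re

# JVM size strings such as 512M or 8G (illustrative validation only).
_JVM_SIZE = re.compile(r"\d+[kKmMgG]")


def heap_jvm_opts(heap_size, new_gen_size):
    """Build JVM flags with equal minimum and maximum heap sizes.

    Hypothetical helper: -Xms/-Xmx/-Xmn are standard JVM options, but the
    charm's own mapping from its config values is not shown in this README.
    """
    for value in (heap_size, new_gen_size):
        if not _JVM_SIZE.fullmatch(value):
            raise ValueError(f"not a JVM size string: {value!r}")
    # Same value for -Xms and -Xmx so the heap never has to grow at runtime.
    return [f"-Xms{heap_size}", f"-Xmx{heap_size}", f"-Xmn{new_gen_size}"]
```

With the values from the example configuration, `heap_jvm_opts("8G", "250M")`
returns `['-Xms8G', '-Xmx8G', '-Xmn250M']`.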

Contact Information

Cassandra

Configuration

extra-jvm-opts
(string) string to be appended to JVM_OPTS, e.g.: -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar
cluster-port
(int) Cluster communication port
7000
io-scheduler
(string) Set kernel io scheduler for persistent storage. Only used when volume-ephemeral-storage is False. https://www.kernel.org/doc/Documentation/block/switching-sched.txt
cfq
use-simpleauth
(boolean) If True, the auth-passwd64 and auth-access64 configs (base64 encoded) are used to set up simple authentication by adding to JVM_OPTS: -Dpasswd.properties=/etc/cassandra/passwd.properties -Daccess.properties=/etc/cassandra/access.properties (see http://www.datastax.com/docs/1.0/configuration/authentication)
force-seed-nodes
(string) A comma separated list of seed nodes. This is useful if the cluster being created in this juju environment is part of a larger cluster and the seed nodes are remote.
volume-ephemeral-storage
(boolean) If False, a configure-error state will be raised if volume-map[$JUJU_UNIT_NAME] is not set (see "volume-map" below). If True, service units won't try to use "volume-map" (and related variables) to mount and use external (EBS) volumes, so storage lifetime will equal that of the VM, i.e. it is ephemeral. YOU'VE BEEN WARNED.
True
prefer_local
(boolean) Used with endpoint_snitch=GossipingPropertyFileSnitch to prefer the internal ip when possible, as the Ec2MultiRegionSnitch does. Used with cassandra >= 1.2.x
volume-dev-regexp
(string) Block device for attached volumes as seen by the VM, will be "scanned" for an unused device when "volume-map" is valid for the unit.
/dev/vd[b-z]
jmx-port
(int) JMX management port
7199
apt-repo-key
(string) Apt repository key, typically needed for apt-repo-spec.
4BD736A82B5C1B00
volume-map
(string) YAML map, e.g. "{ cassandra/0: vol-0000010, cassandra/1: vol-0000016 }". A service unit will raise a "configure-error" condition if no volume-map value is set for it; a human is expected to set the value properly to resolve the error.
auth-access64
(string) base64 encoded content to be written to /etc/cassandra/access.properties, created by e.g.: juju set cassandra auth-access64="$(base64 ./access.props)"
new-gen-size
(string) Size of Java new generation memory, for example 100M. Only used if auto-memory = false.
100M
token-map-by-volid
(string) YAML map as e.g. "{ vol-00000012: 107950406921370402326527496543482482275, vol-00000013: 150485702786487710259449322472453508707 }". Set initial_token according to the name of the attached volume using this map. Can only be used when using persistent storage and cannot be used if token-map-by-unitname is also set. Useful when rebalancing a ring by hand
allow-single-node
(boolean) Allow Cassandra to start in a single-node configuration. When deploying a new service with more than one initial unit (e.g. juju deploy -n 2), this should be set to false.
nagios_heapchk_warn_pct
(int) The percentage of heap used that triggers a nagios warning
80
token-map-by-unitname
(string) YAML map as e.g. "{ cassandra/0: 107950406921370402326527496543482482275, cassandra/1: 150485702786487710259449322472453508707 }". Set initial_token according to the unit name using this map. Cannot be used if token-map-by-volid is also set. If persistent storage is being used then use token-map-by-volid instead. Useful when rebalancing a ring by hand
client-port
(int) Thrift clients port
9160
units-to-update
(string) Comma separated list of unit numbers to update (i.e. modify /etc setup and trigger cassandra restart on config-change or upgrade-charm), or "all".
all
stream-throughput
(int) Throttles all outbound streaming file transfers on nodes to the given total throughput in Mbps. This is necessary because Cassandra does mostly sequential IO when streaming data during bootstrap or repair, which can lead to saturating the network connection and degrading rpc performance. When unset, the default is 200 Mbps or 25 MB/s. 0 to disable throttling.
200
apt-repo-spec
(string) Apt repository to install cassandra package(s) from.
deb http://www.apache.org/dist/cassandra/debian 12x main
datacenter
(string) The datacenter used by the endpoint_snitch, e.g. "DC1"
auth-passwd64
(string) base64 encoded content to be written to /etc/cassandra/passwd.properties, created by e.g.: juju set cassandra auth-passwd64="$(base64 ./passwd.props)"
endpoint_snitch
(string) The Cassandra endpoint snitch to use. The charm currently supports SimpleSnitch and GossipingPropertyFileSnitch.
org.apache.cassandra.locator.SimpleSnitch
auto-memory
(boolean) Automatically configure memory options based on the specification of the server infrastructure it's running on.
True
heap-size
(string) Total size of Java memory heap, for example 1G or 512M. Only used if auto-memory = false.
1G
nagios_context
(string) A string that will be prepended to the instance name to set the host name in nagios. For instance, the hostname would be something like: juju-cassandra-0. If you're running multiple environments with the same services in them, this allows you to differentiate between them.
juju
partitioner
(string) The cassandra partitioner to use
org.apache.cassandra.dht.RandomPartitioner
dc_suffix
(string) Add a suffix to a datacenter name. Used by the Ec2Snitch and Ec2MultiRegionSnitch to append a string to the EC2 region name. Used with cassandra >= 1.2.x
extra_packages
(string) Extra packages to install. A space delimited list of packages.
num-tokens
(int) Number of tokens per node for Cassandra 1.2+. Ignored for earlier versions. 0 disables.
256
cluster-name
(string) Name of the Cassandra Cluster - don't change yet!
Cassandra Cluster
compaction-throughput
(int) Throttles compaction to the given total throughput (in MB/sec) across the entire system. The faster you insert data, the faster you need to compact in order to keep the sstable count down, but in general, setting this to 16 to 32 times the rate you are inserting data is more than sufficient. Setting this to 0 disables throttling. Note that this accounts for all types of compaction, including validation compaction.
16
rack
(string) The rack used by the endpoint_snitch, e.g. "Rack1"
nagios_heapchk_crit_pct
(int) The percentage of heap used that triggers a nagios critical alert
90
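
The token-map-by-unitname option above expects one initial token per unit.
When rebalancing a ring by hand with the default RandomPartitioner, tokens are
commonly spaced evenly across the range 0 to 2**127. The sketch below computes
such tokens and formats them as the YAML map the option expects. The function
names are hypothetical, and it assumes units are numbered consecutively from 0
and that a single token per node is wanted rather than num-tokens.

```python
def balanced_tokens(node_count):
    """Evenly spaced initial tokens for RandomPartitioner (range 0..2**127)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring = 2 ** 127
    return [i * ring // node_count for i in range(node_count)]


def token_map_by_unitname(service, node_count):
    """Format tokens as a YAML map keyed by unit name, e.g. cassandra/0.

    Assumption: unit numbers run from 0 to node_count - 1, which need not
    hold if units have been removed and re-added.
    """
    tokens = balanced_tokens(node_count)
    entries = ", ".join(
        f"{service}/{i}: {token}" for i, token in enumerate(tokens)
    )
    return "{ " + entries + " }"
```

The resulting string can then be supplied as the token-map-by-unitname value,
for instance with juju set in the same way as the auth-access64 example above.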