Readme
Overview
--------
HBase is the Hadoop database. Think of it as a distributed, scalable Big Data
store.
Use HBase when you need random, realtime read/write access to your Big Data.
This project's goal is the hosting of very large tables -- billions of rows X
millions of columns -- atop clusters of commodity hardware.
HBase is an open-source, distributed, versioned, column-oriented store modeled
after Google's Bigtable: A Distributed Storage System for Structured Data by
Chang et al. Just as Bigtable leverages the distributed data storage provided
by the Google File System, HBase provides Bigtable-like capabilities on top of
Hadoop and HDFS.
HBase provides:
* Linear and modular scalability.
* Strictly consistent reads and writes.
* Automatic and configurable sharding of tables.
* Automatic failover support between RegionServers.
* Convenient base classes for backing Hadoop MapReduce jobs with HBase tables.
* Easy-to-use Java API for client access.
* Block cache and Bloom Filters for real-time queries.
* Query predicate push down via server side Filters.
* Thrift gateway and a REST-ful Web service that supports XML, Protobuf,
  and binary data encoding options.
* Extensible JRuby-based (JIRB) shell.
* Support for exporting metrics via the Hadoop metrics subsystem to files
  or Ganglia, or via JMX.
See http://hbase.apache.org for more information.
This charm provides the HBase master and RegionServer roles, which form
part of an overall HBase deployment.
Usage
-----
An HBase deployment consists of an HBase master service and one or more
HBase RegionServer services::

    juju deploy hbase hbase-master
    juju deploy hbase hbase-regioncluster-01

To function correctly, the HBase master and RegionServer services require a
relation with ZooKeeper. Use the zookeeper charm to create a functional
ZooKeeper quorum and then relate it to this charm::

    juju deploy zookeeper hbase-zookeeper
    juju add-unit -n 2 hbase-zookeeper
    juju add-relation hbase-master hbase-zookeeper
    juju add-relation hbase-regioncluster-01 hbase-zookeeper

Remember that quorums should have an odd number of units, starting from 3.
A single unit will work, but it provides no resilience.
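The reasoning behind odd-sized quorums can be checked with a little arithmetic. The sketch below assumes the usual ZooKeeper rule that a strict majority of units must be available; the function names ``zk_majority`` and ``zk_tolerated_failures`` are hypothetical helpers and are not part of the charm.

```shell
# Sketch only: quorum arithmetic for a ZooKeeper ensemble of n units.
# Assumes a strict majority, floor(n/2) + 1, must stay available.
zk_majority() {
  echo $(( $1 / 2 + 1 ))
}

# Number of units that can be lost while a majority still remains.
zk_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  echo "units=$n majority=$(zk_majority "$n") tolerated_failures=$(zk_tolerated_failures "$n")"
done
```

Under this rule a 4-unit quorum tolerates no more failures than a 3-unit one, which is why odd sizes are preferred.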
The hbase services also require the services of an hdfs namenode; these are
provided by the hadoop charm.
HBase requires that append mode is enabled in DFS. This can be set by providing
a config.yaml file::

    hdfs-namenode:
      hbase: True
    hdfs-datacluster-01:
      hbase: True

It is important to ensure that both the master and the slave services have
the same configuration in this deployment scenario::

    juju deploy --config config.yaml hadoop hdfs-namenode
    juju deploy --config config.yaml hadoop hdfs-datacluster-01
    juju add-relation hdfs-namenode:namenode hdfs-datacluster-01:datanode

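One way to keep the Hadoop services on identical settings is to generate the config.yaml stanzas from a single list of service names. This is a minimal sketch; ``make_hadoop_config`` is a hypothetical helper, not something shipped with either charm.

```shell
# Sketch only: print the same "hbase: True" stanza for every service name given,
# so the namenode and datanode services cannot drift apart.
make_hadoop_config() {
  for service in "$@"; do
    printf '%s:\n  hbase: True\n' "$service"
  done
}

make_hadoop_config hdfs-namenode hdfs-datacluster-01
```

Redirecting the output to config.yaml produces a file that can be passed to ``juju deploy --config``.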
The hadoop services can also support mapreduce - please see the hadoop charm
for more details.
The namenode can then be related to the hbase deployment::

    juju add-relation hdfs-namenode:namenode hbase-master:namenode
    juju add-relation hdfs-namenode:namenode hbase-regioncluster-01:namenode

Once the HBase services have been related to both ZooKeeper and HDFS, they
can be related to each other::

    juju add-relation hbase-master:master hbase-regioncluster-01:regionserver

At this point the role of each service is fixed and CANNOT be changed.
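The ordering of the relations above (ZooKeeper first, then HDFS, then master to regionserver) can be sketched as a small script. The ``run`` wrapper, the ``DRY_RUN`` variable and the ``relate_hbase_cluster`` function are hypothetical names introduced for illustration. With ``DRY_RUN`` left at its default of 1 the commands are only printed, not executed.

```shell
# Sketch only: replay the relation order described in this README.
# DRY_RUN is a hypothetical switch; 1 (default) prints commands, 0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

relate_hbase_cluster() {
  master="$1"; region="$2"; zk="$3"; namenode="$4"
  # 1. Both HBase services need ZooKeeper.
  run juju add-relation "$master" "$zk"
  run juju add-relation "$region" "$zk"
  # 2. Both HBase services need the HDFS namenode.
  run juju add-relation "${namenode}:namenode" "${master}:namenode"
  run juju add-relation "${namenode}:namenode" "${region}:namenode"
  # 3. Only then relate master and regionserver; this fixes their roles.
  run juju add-relation "${master}:master" "${region}:regionserver"
}

relate_hbase_cluster hbase-master hbase-regioncluster-01 hbase-zookeeper hdfs-namenode
```

Keeping the master to regionserver relation as the last step matters because the roles cannot be changed once it is made.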
It is also possible to run more than one HBase master service unit::

    juju add-unit hbase-master

The masters will coordinate through zookeeper to establish control of the
cluster and will re-coordinate if one of the master service units disappears.
You can also add additional regionservers::

    juju add-unit -n 2 hbase-regioncluster-01

The charm also supports use of the thrift, avro and rest gateways. Any hbase
service can be used in this way by associating another service with it::

    juju add-relation hush:thrift hbase-regioncluster-01:thrift

Alternatively, you can deploy a separate gateway server::

    juju deploy hbase hbase-thrift
    juju add-relation hbase-thrift hbase-zookeeper
    juju add-relation hush:thrift hbase-thrift:thrift

The thrift, avro and rest gateways all operate over HTTP and are stateless, so
they can be used with haproxy::

    juju deploy haproxy rest-gateway
    juju add-relation rest-gateway hbase-regioncluster-01:rest

Rolling Restarts
----------------
Restarting an HBase deployment is potentially disruptive, so the charm will NOT
automatically restart HBase when the following events occur:
* Zookeeper service units joining or departing relations.
* Upgrading the charm or changing the configuration.
However, the charm will update configuration files and automatically set up
SSH key authentication between nodes within a service deployment and from the
master service to regionserver services.
The charm provides a rolling restart script, which will restart your HBase
deployment in a controlled fashion::

    juju ssh hbase-master/0 hbase-rolling-restart

If any inconsistencies are found in HBase, the restart will not happen. The script
also supports restarting only the regionservers::

    juju ssh hbase-master/0 hbase-rolling-restart --rs-only

or only the masters::

    juju ssh hbase-master/0 hbase-rolling-restart --master-only

This script must be run from an HBase master.
Changes
-------
========== ========== ======================================================
Date       Author     Change
========== ========== ======================================================
2012/05/22 James Page Made myself the maintainer (revno 19)
2012/04/25 James Page Tweaked README for use from charmstore (revno 18)
2012/04/24 James Page Switch to stable PPA (revno 17)
2012/03/15 James Page Merged documentation changes from bbcmicrocomputer (revno 16)
2012/03/12 James Page Tweaks for juju changes (revno 15)
2012/03/03 James Page Fixed up rest, avro and thrift hooks and added support for installing pig with hbase (revno 14)
2012/03/03 James Page Updated port details for avro (revno 13)
2012/03/02 James Page Added jobtracker interface and handlers pt 2 (revno 12)
2012/03/02 James Page Added jobtracker interface and handlers (revno 11)
========== ========== ======================================================