Apache Accumulo is a highly scalable, structured, distributed key/value store for high-performance data storage and retrieval.
Apache™ Accumulo is a high-performance data storage and retrieval system with
cell-level access control. It is a scalable implementation of Google's Bigtable
design that runs on top of Apache Hadoop® and Apache ZooKeeper.
Cell-level access control is important for organizations with complex policies
governing who is allowed to see data. It enables the intermingling of different
data sets with different access control policies and proper handling of
individual data sets that have some sensitive portions.
Without Accumulo, those policies are difficult to enforce systematically.
Accumulo encodes those rules for each individual data cell and allows fine-grained
control over data access.
Accumulo is well suited as the data storage component of Big Data solutions in
healthcare, finance, and security.
Hortonworks Accumulo requires access to a Hadoop cluster and a ZooKeeper quorum.
Deploy Hadoop cluster:
juju deploy hdp-hadoop yarn-hdfs-master
juju deploy hdp-hadoop compute-node
juju add-relation yarn-hdfs-master:namenode compute-node:datanode
juju add-relation yarn-hdfs-master:resourcemanager compute-node:nodemanager
Deploy ZooKeeper quorum:
juju deploy hdp-zookeeper
juju add-unit -n 2 hdp-zookeeper
Deploy Accumulo cluster:
juju deploy hdp-accumulo accumulo-master
juju deploy hdp-accumulo tablet-servers
juju add-relation accumulo-master:accumulo-server tablet-servers:tabletserver
juju add-relation accumulo-master:zookeeper hdp-zookeeper:zookeeper
juju add-relation tablet-servers:zookeeper hdp-zookeeper:zookeeper
juju add-relation accumulo-master:namenode yarn-hdfs-master:namenode
juju add-relation tablet-servers:namenode yarn-hdfs-master:namenode
juju add-unit -n 2 compute-node
juju add-unit -n 3 tablet-servers
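The deployment steps above can be collected into one script. The following is a minimal sketch rather than part of the charm: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here. By default the script only prints the `juju` commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: replay the deployment commands from this README in order.
# DRY_RUN and run() are illustrative helpers, not part of Juju or the charm.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

deploy_hadoop() {
    run juju deploy hdp-hadoop yarn-hdfs-master &&
    run juju deploy hdp-hadoop compute-node &&
    run juju add-relation yarn-hdfs-master:namenode compute-node:datanode &&
    run juju add-relation yarn-hdfs-master:resourcemanager compute-node:nodemanager
}

deploy_zookeeper() {
    run juju deploy hdp-zookeeper &&
    run juju add-unit -n 2 hdp-zookeeper
}

deploy_accumulo() {
    run juju deploy hdp-accumulo accumulo-master &&
    run juju deploy hdp-accumulo tablet-servers &&
    run juju add-relation accumulo-master:accumulo-server tablet-servers:tabletserver &&
    run juju add-relation accumulo-master:zookeeper hdp-zookeeper:zookeeper &&
    run juju add-relation tablet-servers:zookeeper hdp-zookeeper:zookeeper &&
    run juju add-relation accumulo-master:namenode yarn-hdfs-master:namenode &&
    run juju add-relation tablet-servers:namenode yarn-hdfs-master:namenode &&
    run juju add-unit -n 2 compute-node &&
    run juju add-unit -n 3 tablet-servers
}

# Hadoop and ZooKeeper come first because Accumulo relates to both.
deploy_all() {
    deploy_hadoop && deploy_zookeeper && deploy_accumulo
}

deploy_all
```

Running the script as-is lists the commands so they can be reviewed before anything is deployed.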
From any command line:
$ juju ssh accumulo-master/0
HDFS validation from the Accumulo master
Remote HDFS cluster health:
$ su hdfs -c 'hdfs dfsadmin -report'
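The full report is verbose. As a quicker go/no-go check, one option is to ask the NameNode whether it has left safe mode. The sketch below assumes the `hdfs` client is on the `PATH` and that the command runs as the `hdfs` user, as in the report command above. `safemode_is_off` and `hdfs_ready` are hypothetical helper names.

```shell
#!/bin/sh
# Succeeds if the text on stdin reports that safe mode is off.
# `hdfs dfsadmin -safemode get` prints a line such as "Safe mode is OFF".
safemode_is_off() {
    grep -q 'Safe mode is OFF'
}

# Ask the NameNode for its safe mode state.
# Fails early if the hdfs client is not installed on this unit.
hdfs_ready() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs client not found" >&2
        return 1
    fi
    su hdfs -c 'hdfs dfsadmin -safemode get' | safemode_is_off
}
```

For example, `hdfs_ready && echo "HDFS is up"` prints the message only when the NameNode reports that safe mode is off.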
Accumulo install and configuration
Accumulo must be initialized to create the structures it uses internally to locate
data across the cluster. HDFS is required to be configured and running before
Accumulo can be initialized.
Once HDFS is started, initialization can be performed by executing:
This script will prompt for a name for this instance of Accumulo. The instance
name is used to identify a set of tables and instance-specific settings. The
script will then write some information into HDFS so Accumulo can start properly.
The initialization script will prompt you to set a root password. Once Accumulo
is initialized it can be started.
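As a sketch of the initialization step: on a stock Accumulo 1.x install, the initializer is invoked as `bin/accumulo init` under the Accumulo home directory. The `ACCUMULO_HOME` default below is an assumption about where the charm installs Accumulo and may need adjusting. `accumulo_init` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Assumed install location; override ACCUMULO_HOME if the charm uses another path.
ACCUMULO_HOME="${ACCUMULO_HOME:-/usr/lib/accumulo}"

# Run the interactive initializer, which prompts for the instance name
# and the root password.
accumulo_init() {
    if [ ! -x "$ACCUMULO_HOME/bin/accumulo" ]; then
        echo "accumulo launcher not found under $ACCUMULO_HOME/bin" >&2
        return 1
    fi
    "$ACCUMULO_HOME/bin/accumulo" init
}
```

Call `accumulo_init` once on the master unit, after HDFS is up. Initialization is a one-time step for a given instance.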
Run Accumulo:
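A sketch of the start step, assuming the Accumulo 1.x script layout in which `bin/start-all.sh` starts the Accumulo processes for the instance. The `ACCUMULO_HOME` default is an assumption about the install path, and `accumulo_start` is a hypothetical wrapper name.

```shell
#!/bin/sh
# Assumed install location; override ACCUMULO_HOME if the charm uses another path.
ACCUMULO_HOME="${ACCUMULO_HOME:-/usr/lib/accumulo}"

# Start Accumulo with the stock 1.x start script, if it is present.
accumulo_start() {
    script="$ACCUMULO_HOME/bin/start-all.sh"
    if [ ! -x "$script" ]; then
        echo "start script not found: $script" >&2
        return 1
    fi
    "$script"
}
```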
View the Accumulo native UI:
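In Accumulo 1.x the monitor web UI listens on port 50095 by default. The sketch below builds the URL to open in a browser, assuming that default and that the monitor runs on the master unit. `monitor_url` is a hypothetical helper name, and the port can be overridden if the deployment configures a different one.

```shell
#!/bin/sh
# Build the monitor URL for a given host.
# 50095 is the default monitor port in Accumulo 1.x (an assumption for this charm).
monitor_url() {
    host="$1"
    port="${2:-50095}"
    echo "http://${host}:${port}/"
}
```

For example, `monitor_url <address-of-accumulo-master>` prints the address to visit. The unit's address is shown by `juju status`.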
Amir Sanjar email@example.com