8 machines, 17 units
Apache Drill lets you query a range of less traditional data sources. As detailed above, these do not have to be SQL databases; they might be CSV files, JSON files, data stored in Hadoop, or a combination of all three and more. Drill also lets you query multiple data sources as a single entity, so you can, for example, combine customer data from your CRM with sales export data stored on a shared file server in a single view. This gives users better insight into data that would otherwise be analysed separately.
HBase is designed to run on top of HDFS and provide Google Bigtable-like access to the data stored in it. It is well suited to fast read and write operations on large datasets, with high throughput and low input/output latency. HBase is not SQL compatible, however, so to use it with more traditional tools a layer such as Apache Drill is needed to provide compatibility.
This bundle is a basic HBase deployment with Apache Drill. It is designed to allow easy deployment of a scalable NoSQL OLAP analysis setup. Deploying this bundle creates the following units:
There are two easy ways to deploy this bundle.
Click the Add to model button at the top of this page, then click the Deploy changes button and follow the on-screen instructions.
Deploy this bundle using juju:
juju deploy ~spiculecharms/drill-hadoop
juju expose apache-drill
To make use of this bundle you first need to import data into HBase. This can be done in a number of ways.
Once the data is available, you can query it through the predefined connection in your Apache Drill Storage pool.
You can then run SQL queries like this:
select CONVERT_FROM(row_key, 'UTF8') from `juju_hbase`.`tab4`
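The same query can also be submitted programmatically through Drill's REST interface. The following is a minimal Python sketch, not part of the bundle itself. It assumes that the exposed apache-drill unit serves Drill's web interface on its default port 8047, that authentication is not enabled, and that `localhost` stands in for the unit's real address. The names `DRILL_URL`, `build_query_request` and `run_query` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Assumption: replace localhost with the public address of the exposed
# apache-drill unit; 8047 is Drill's default web/REST port.
DRILL_URL = "http://localhost:8047/query.json"

# The example query from this page, reading row keys from HBase table tab4.
SQL = "select CONVERT_FROM(row_key, 'UTF8') from `juju_hbase`.`tab4`"


def build_query_request(sql, url=DRILL_URL):
    """Build a POST request carrying the SQL text as a JSON payload."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL, timeout=30):
    """Send the query to Drill and return the result rows as a list of dicts."""
    request = build_query_request(sql, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Drill's JSON response lists the result rows under the "rows" key.
    return body.get("rows", [])
```

With a reachable Drill instance, `for row in run_query(SQL): print(row)` would print one dictionary per HBase row. Separating request construction from sending keeps the payload easy to inspect without a running cluster.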
More information can be found on the Drill Documentation website.