apache oozie #5

  • By bigdata-dev
  • Latest version (#5)
  • trusty
  • Stable
  • Edge

Description

Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.
Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by
time (frequency) and data availabilty.
Oozie is integrated with the rest of the Hadoop stack supporting several
types of Hadoop jobs out of the box (such as Java map-reduce, Streaming
map-reduce, Pig, Hive, Sqoop and Distcp) as well as system specific jobs.
Oozie is a scalable, reliable and extensible system..


Overview

Oozie, Workflow Engine for Apache Hadoop.

Oozie v3 is a server based Bundle Engine that provides a higher-level oozie
abstraction that will batch a set of coordinator applications. The user will
be able to start/stop/suspend/resume/rerun a set coordinator jobs in the bundle
level resulting a better and easy operational control.

Oozie provides:

  • Oozie is a workflow scheduler system to manage Apache Hadoop jobs.
  • Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.
  • Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time
    (frequency) and data availabilty.
  • Oozie is integrated with the rest of the Hadoop stack supporting several types
    of Hadoop jobs out of the box (such as Java map-reduce, Streaming map-reduce,
    Pig, Hive, Sqoop and Distcp) as well as system specific jobs (such as Java programs
    and shell scripts).
  • Oozie is a scalable, reliable and extensible system.

See [http://oozie.apache.org]http://oozie.apache.org) for more information.

Usage

TBD

Configuration

resources_mirror
(string) URL from which to fetch resources (e.g., Hadoop binaries) instead of Launchpad.
heap
(int) The maximum heap size (in MB) used by the hadoop client jvm. If you experience out of memory (OOM) errors when running jobs, consider increasing this value.
1024