Bigdata Dev Spark

By Juju Big Data Development
Big Data

Channel	Revision	Published	Runs on
latest/stable	46	18 Mar 2021	Ubuntu 16.04
latest/edge	46	18 Mar 2021	Ubuntu 16.04

Platform: Ubuntu 16.04

Actions:

connectedcomponent

Run the Spark Bench ConnectedComponent benchmark.
decisiontree

Run the Spark Bench DecisionTree benchmark.
kmeans

Run the Spark Bench KMeans benchmark.
linearregression

Run the Spark Bench LinearRegression benchmark.
list-jobs

List scheduled periodic jobs.
logisticregression

Run the Spark Bench LogisticRegression benchmark.
matrixfactorization

Run the Spark Bench MatrixFactorization benchmark.
pagerank

Calculate PageRank for a sample data set.
Params
- iterations string
  
  Number of iterations for the SparkPageRank job.
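
To illustrate what the iterations parameter controls, here is a minimal sketch in plain Python (not Spark, and not the charm's own code). It assumes the update rule used by the SparkPageRank example bundled with Spark, where each node's new rank is 0.15 plus 0.85 times the contributions it receives. The function name and the sample graph are hypothetical.

```python
from collections import defaultdict


def pagerank(links, iterations=10):
    """Rank the nodes of a directed graph given as {node: [outgoing neighbors]}.

    Simplified sketch: every node in `links` keeps a rank, even if it
    receives no contributions in a given round.
    """
    ranks = {node: 1.0 for node in links}
    for _ in range(int(iterations)):
        contribs = defaultdict(float)
        # Each node splits its current rank evenly among its neighbors.
        for node, neighbors in links.items():
            for neighbor in neighbors:
                contribs[neighbor] += ranks[node] / len(neighbors)
        # Assumed update rule: damping factor 0.85, base rank 0.15.
        ranks = {node: 0.15 + 0.85 * contribs[node] for node in links}
    return ranks


# Hypothetical three-page graph, for illustration only.
sample_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
sample_ranks = pagerank(sample_links, iterations=10)
```

More iterations let the ranks settle closer to their fixed point at the cost of a longer job.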
pca

Run the Spark Bench PCA benchmark.
pregeloperation

Run the Spark Bench PregelOperation benchmark.
remove-job

Remove a job previously scheduled for repeated execution.
Params
- action-id string
  
  The ID returned by the action that scheduled the job.
Required: action-id
restart-spark-job-history-server

Restart the Spark job history server.
shortestpaths

Run the Spark Bench ShortestPaths benchmark.
smoke-test

Verify that Spark is working by calculating Pi.
sparkpi

Calculate Pi.
Params
- partitions string
  
  Number of partitions to use for the SparkPi job.
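
As a rough illustration of how the partitions parameter relates to the estimate, the sketch below reproduces the Monte Carlo idea in plain Python, assuming the job follows the SparkPi example bundled with Spark: sample random points in a square and count how many fall inside the unit circle. The function name and the samples_per_partition value are assumptions made for this example. In Spark the samples would be spread across partitions and processed in parallel, while here they simply run in one loop.

```python
import random


def estimate_pi(partitions=2, samples_per_partition=100_000, seed=None):
    """Estimate Pi by sampling points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    # More partitions means more total samples in this sketch.
    total = int(partitions) * samples_per_partition
    inside = 0
    for _ in range(total):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    # Circle area over square area is Pi / 4.
    return 4.0 * inside / total
```

The estimate is random, so repeated runs give slightly different values unless a seed is fixed.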
sql

Run the Spark Bench SQL benchmark.
start-spark-job-history-server

Start the Spark job history server.
stop-spark-job-history-server

Stop the Spark job history server.
stronglyconnectedcomponent

Run the Spark Bench StronglyConnectedComponent benchmark.
submit

Submit a job to Spark.
Params
- class string
  
  If a JAR is given, this should be the name of the class within the JAR to run.
- cron string
  
  Schedule the job to run periodically, according to the given cron rule. For example, "*/5 * * * *" runs the job every 5 minutes.
- extra-params string
  
  Additional parameters to pass to spark-submit. For example: "--executor-memory 1000M --supervise".
- job string
  
  URL to a JAR or Python file. This can be any URL supported by spark-submit, such as a remote URL, an hdfs:// path (if connected to HDFS), etc.
- job-args
  
  Arguments for the job.
- packages string
  
  Comma-separated list of packages to include.
- py-files string
  
  Comma-separated list of Python packages to include.
Required: job
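
The parameters above can be assembled into a command line as in the following sketch. It assumes the Juju 2.x syntax `juju run-action <unit> <action> key=value ...` (newer Juju releases use `juju run` instead) and a unit named spark/0. The helper name, the JAR URL, and the class name are hypothetical placeholders. The helper only builds the argument list and does not execute anything.

```python
import shlex

# Parameter names listed for the submit action, excluding the required "job".
OPTIONAL_PARAMS = {"class", "cron", "extra-params", "job-args", "packages", "py-files"}


def build_submit_command(unit, job, params=None):
    """Return the argv list for running the submit action on a unit."""
    if not job:
        raise ValueError("the 'job' parameter is required")
    params = dict(params or {})
    unknown = set(params) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    argv = ["juju", "run-action", unit, "submit", f"job={job}"]
    # Sort for a stable, reproducible command line.
    for name in sorted(params):
        argv.append(f"{name}={params[name]}")
    return argv


# Hypothetical job: run a class from a JAR every 5 minutes.
example = build_submit_command(
    "spark/0",
    "hdfs:///user/ubuntu/example.jar",
    {"class": "com.example.Main", "cron": "*/5 * * * *"},
)
print(shlex.join(example))  # shell-quoted form, safe to paste into a terminal
```

A scheduled job can later be cancelled with remove-job, passing the action ID that this submit call returned.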
svdplusplus

Run the Spark Bench SVDPlusPlus benchmark.
svm

Run the Spark Bench SVM benchmark.