How to launch a Gearpump cluster on YARN
Upload gearpump-2.12-0.9.0.zip to a remote HDFS folder; we suggest putting it under
Make sure the user's home directory on HDFS already exists and that the user has full read and write permissions on it. For example, user gear's home directory is
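The two preparation steps above can be sketched with standard `hdfs dfs` commands. This is a minimal sketch, not the project's own tooling: the function name `hdfs_prepare_cmds` is hypothetical, the home directory is passed in explicitly rather than assumed, and changing ownership with `-chown` is only one way to grant the user read and write access (it usually requires HDFS superuser rights). The function only prints the commands so they can be reviewed before running.

```shell
# Print the HDFS commands that prepare a user's home directory and stage the
# Gearpump package. Nothing is executed here; review the output, then pipe it
# to `sh` if it looks right.
#   $1: user name
#   $2: HDFS home directory of that user (supplied by the caller)
#   $3: local path of the Gearpump zip
#   $4: HDFS directory that should receive the zip
hdfs_prepare_cmds() {
  user=$1
  home=$2
  zip=$3
  target=$4
  printf 'hdfs dfs -mkdir -p %s\n' "$home"
  printf 'hdfs dfs -chown %s %s\n' "$user" "$home"
  printf 'hdfs dfs -mkdir -p %s\n' "$target"
  printf 'hdfs dfs -put -f %s %s/\n' "$zip" "$target"
}

# Example (the /user/gear home path is an assumption based on the common
# /user/<name> layout; /usr/lib/gearpump matches the launch command below):
#   hdfs_prepare_cmds gear /user/gear gearpump-2.12-0.9.0.zip /usr/lib/gearpump
```

Arguments containing spaces would need extra quoting before the output is piped to a shell.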
Put the YARN configurations under the classpath. Before calling
yarnclient launch, make sure you have put all YARN configuration files under the classpath. Typically, you can just copy all files under
$HADOOP_HOME/etc/hadoop from one of the YARN cluster machines to
$HADOOP_HOME points to the Hadoop installation directory.
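The copy step can be sketched as a small shell function. The function name `copy_yarn_conf` and its destination argument are hypothetical; pass whichever directory your Gearpump installation places on the classpath.

```shell
# Copy every file under $HADOOP_HOME/etc/hadoop into a destination directory.
#   $1: destination directory (hypothetical argument; use the directory that
#       your Gearpump installation puts on the classpath)
copy_yarn_conf() {
  dest=$1
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
  src="$HADOOP_HOME/etc/hadoop"
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  # "$src"/. copies the directory's contents, including dot files.
  mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

Failing early when `HADOOP_HOME` is unset or the source directory is missing avoids launching with an incomplete set of configuration files.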
Launch the gearpump cluster on YARN
yarnclient launch -package /usr/lib/gearpump/gearpump-2.12-0.9.0.zip
If you don't specify package path, it will read default package-path (
NOTE: You may need to execute
chmod +x bin/* in shell to make the script files executable.
After launching, you can browse the Gearpump UI via the YARN resource manager dashboard.
How to configure the resource limitation of Gearpump cluster
Before launching a Gearpump cluster, please change configuration section
gear.conf to configure the resource limits, such as:
- The number of worker containers.
- The YARN container memory size for worker and master.
How to submit an application to a Gearpump cluster
To submit the jar to the Gearpump cluster, we first need to know the Master address, so we need to get an active configuration file first.
There are two ways to get an active configuration file:
Option 1: specify "-output" option when you launch the cluster.
yarnclient launch -package /usr/lib/gearpump/gearpump-2.12-0.9.0.zip -output /tmp/mycluster.conf
The console output will look like this:
==Application Id: application_1449802454214_0034
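Later commands need this application ID, so it can be convenient to capture it from the console output. The following is a minimal sketch that assumes the launch output contains a line in the format shown above; the function name `extract_app_id` is hypothetical, and whether yarnclient writes that line to stdout or stderr is not stated here, which is why the usage comment merges both streams.

```shell
# Read launch output on stdin and print the first YARN application ID found
# on a line containing "Application Id:". The whole input is consumed so that
# the upstream command is never cut off by a closed pipe.
extract_app_id() {
  awk '/Application Id:/ && !found && match($0, /application_[0-9]+_[0-9]+/) {
    print substr($0, RSTART, RLENGTH)
    found = 1
  }'
}

# Example:
#   app_id=$(yarnclient launch -package /usr/lib/gearpump/gearpump-2.12-0.9.0.zip \
#              -output /tmp/mycluster.conf 2>&1 | extract_app_id)
```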
Option 2: Query the active configuration file
yarnclient getconfig -appid <yarn application id> -output /tmp/mycluster.conf
The YARN application ID can be found in the output of the launch command or on the YARN dashboard.
After you have downloaded the configuration file, you can launch an application with that config file.
gear app -jar examples/wordcount-2.12-0.9.0.jar -conf /tmp/mycluster.conf
To run a Storm application over Gearpump on YARN, please store the configuration file with
-output application.conf and then launch the Storm application with
storm -jar examples/storm-2.12-0.9.0.jar storm.starter.ExclamationTopology exclamation
Now the application is running. To check this:
gear info -conf /tmp/mycluster.conf
To start a UI server, run:
services -conf /tmp/mycluster.conf
The default username and password are "admin:admin". See UI Authentication to learn how to manage users.
How to add/remove machines dynamically
The Gearpump YARN tool lets you add and remove machines dynamically. Here are the steps:
First, query to get active resources.
yarnclient query -appid <yarn application id>
The console output shows how many workers and masters there are. For example:
masters:
container_1449802454214_0034_01_000002(IDHV22-01:35712)
workers:
container_1449802454214_0034_01_000003(IDHV22-01:35712)
container_1449802454214_0034_01_000006(IDHV22-01:35712)
To add a new worker machine, you can do:
yarnclient addworker -appid <yarn application id> -count 2
This will add two new worker machines. Run the query command from the first step to check whether the change has taken effect.
To remove old machines, use:
yarnclient removeworker -appid <yarn application id> -container <worker container id>
The worker container ID can be found in the output of the first step. For example, "container_1449802454214_0034_01_000006" is a valid container ID.
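Picking container IDs out of the query output by hand is error-prone, so a small filter can help. The sketch below assumes the output has the "masters:" and "workers:" layout shown in the example above, with each entry written as a container ID followed by "(host:port)"; the function name `list_workers` is hypothetical.

```shell
# Read `yarnclient query` output on stdin and print one worker container ID
# per line. Entries may be separated by spaces or by newlines.
list_workers() {
  tr -s ' \t' '\n' | awk '
    $0 == "masters:" { in_workers = 0; next }
    $0 == "workers:" { in_workers = 1; next }
    in_workers && /^container_/ {
      sub(/\(.*$/, "")   # drop the trailing "(host:port)"
      print
    }'
}

# Example:
#   yarnclient query -appid <yarn application id> | list_workers
#   yarnclient query -appid <yarn application id> | list_workers | wc -l
```

The second example counts workers, which is a quick way to confirm that an addworker or removeworker call has taken effect.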
To kill a cluster:
yarnclient kill -appid <yarn application id>
NOTE: If the application is not launched successfully, then this command won't work. Please use "yarn application -kill
To check the Gearpump version:
yarnclient version -appid <yarn application id>