magnify
Home Tutorials GraphLab cluster deployment – quick start

GraphLab cluster deployment – quick start

Note: the MPI section of this toturial is based on this excellent tutorial.

Preliminaries:

  •  Mpi should be installed

Step 0: Install GraphLab on one of your cluster nodes.

Using the instructions here on your master node (one of your cluster machines)

Step 1: start MPI

a) Create a file called .mpd.conf in your home directory (only once)
This file should contain a secret password using the format:
secretword=some_password_as_you_wish

b) Verify that MPI is working by running the daemon (only once)

c) Spawn the cluster. Create a file named machines with the list of machines you like to deploy.
Run (for mpich2)

d) Copy GraphLab files to all machines.

On the node you installed GraphLab on, run the following commands to copy GraphLab files to the rest of the machines:

d1) Verify you have the machines files from section 1c) in your root folder.

d2) Copy the GraphLab files
cd ~/graphlabapi/release/toolkits;  ~/graphlabapi/scripts/mpirsync
cd ~/graphlabapi/deps/local; ~/graphlabapi/scripts/mpirsync

Step 2: Run GraphLab ALS

This step runs ALS (alternating least squares) in a cluster using small netflix susbset.
It first downloads the data from the web: http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train and http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate, and runs 5 alternating least squares iterations. After the run is completed, you can login into any of the nodes and view the output files in the folder ~/graphlabapi/release/toolkits/collaborative_filtering/
The algorithm operation is explained in detail here.

Now run GraphLab:

Where -n is the number of MPI nodes, and –ncpus is the number of deployed cores on each MPI node.

machines is a file which includes a list of the machines you like to deploy on (each machine in a new line)

Note: this section assumes you have a network storage (ns) folder where the input can be stored.
Alternatively, you can split the input into several disjoint files, and store the subsets on the cluster machines.

Note: Don’t forget to change /path/to/als and /some/ns/folder to your actual folder path!

Note: For openmpi, use –hostfile instead of -f .

 

Step 3:

Fine tuning graphlab deployment.

 

Errors and their resolution:

Error:

/mnt/info/home/daroczyb/als: error while loading shared libraries: libevent_pthreads-2.0.so.5: cannot open shared object file: No such file or directory

Solution:

You should define LD_LIBRARY_PATH to point to the location of libevent_pthreads, this is done with the -x mpi command, for example:

Error:

/mnt/info/home/daroczyb/als: error while loading shared libraries: libjvm.so: cannot open shared object file: No such file or directory

Solution:

Point LD_LIBRARY_PATH to the location of libjvm.so using the -x mpi command:

 Error:

Solution:

You should verify the executable is found on the same path on all machines.

Error: a prompt asking for password when running mpiexec

Solution: Use the following instructions to allow connection with a public/private key pair (no password).

Error:

Solution:
Verify you classpath includes all hadoop required folders.