GraphLab deployment on a single multicore machine – quick start
Note: the MPI section of this tutorial is based on this excellent tutorial.
Preliminaries:
MPI should be installed
Step 0: Install GraphLab on one of your cluster nodes.
Follow the instructions here on your master node (one of your cluster machines).
Step 1: Run GraphLab ALS
This step runs ALS (alternating least squares) on a small Netflix subset, using the cores of a single machine.
It first downloads the data from the web:
http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train
http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate
It then runs 5 alternating least squares iterations. After the run completes, the output files are created in the running folder
(the folder ~/graphlabapi/release/toolkits/collaborative_filtering/).
The algorithm operation is explained in detail here.
    cd ~/graphlabapi/release/toolkits/collaborative_filtering/
    mkdir smallnetflix
    cd smallnetflix/
    wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.train
    wget http://www.select.cs.cmu.edu/code/graphlab/datasets/smallnetflix_mm.validate
    cd ..
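Before starting the run, it can help to confirm that both downloads succeeded. The following is a minimal sketch, not part of GraphLab: check_dataset is a hypothetical helper name, and it only tests that the two files named above exist and are non-empty.

```shell
# Hypothetical helper (not part of GraphLab): verify that both dataset
# files exist in the given folder and are non-empty.
check_dataset() {
    dir="$1"
    for f in smallnetflix_mm.train smallnetflix_mm.validate; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            return 1
        fi
    done
    echo "dataset OK in $dir"
}

# Example: run from the collaborative_filtering folder after the downloads.
check_dataset ./smallnetflix || echo "re-run the wget commands" >&2
```

The check does not validate the file contents; it only catches a failed or interrupted download.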
Now run GraphLab:
    ./als --matrix ./smallnetflix/ --max_iter=5 --ncpus=1
where --ncpus is the number of deployed cores.
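To use every core on the machine rather than a fixed count, the core count can be detected in the shell and passed to --ncpus. This is a sketch under assumptions: it relies on nproc (GNU coreutils) or getconf being available, NCPUS is a variable name chosen here for illustration, and the als flags are exactly the ones shown in the command above.

```shell
# Detect the number of available cores; fall back to 1 if neither
# nproc nor getconf is available (assumption: one of them usually is).
NCPUS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "using $NCPUS core(s)"

# Run ALS only if the binary is present in the current folder.
if [ -x ./als ]; then
    ./als --matrix ./smallnetflix/ --max_iter=5 --ncpus="$NCPUS"
else
    echo "als binary not found; run this from the collaborative_filtering folder" >&2
fi
```

Whether using all cores is the fastest setting depends on the machine and the dataset, so it may be worth comparing a few values of --ncpus.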
For any questions or bug fixes about this tutorial, please email: graphlabapi@googlegroups.com

