Disk-based large-scale graph computation
UPDATE: see the GraphChi v0.2 announcement: http://code.google.com/p/graphchi/wiki/GraphChiVersion0p2Release
GraphChi[huahua] is a spin-off of the GraphLab[rador's retriever] project.
GraphChi can run very large graph computations on a single machine by using a novel algorithm for processing the graph from disk (SSD or hard drive). Programs for GraphChi are written in a vertex-centric model similar to GraphLab's. GraphChi runs vertex-centric programs asynchronously (i.e., changes written to edges are immediately visible to subsequent computation) and in parallel. GraphChi also supports streaming graph updates and changing the graph structure while computing.
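To give a feel for the asynchronous vertex-centric model, here is a minimal in-memory sketch in Python. It is not the GraphChi API: the function name `connected_components`, the edge-value layout, and the sequential sweep are all assumptions made for illustration, and nothing here touches the disk. Each vertex update reads the values on its incident edges and writes its new label back to them, so vertices updated later in the same sweep already see the change.

```python
# Illustrative sketch only: NOT the GraphChi API. All names are hypothetical.
# Connected components by label propagation, with values stored on edges
# and updated asynchronously (writes are visible within the same sweep).

def connected_components(num_vertices, edge_list):
    """Return (labels, sweeps); each label is the smallest id in the component."""
    # Every edge carries a mutable value; start above any valid vertex id.
    edge_value = [num_vertices] * len(edge_list)

    # For each vertex, the indices of its incident edges (direction ignored).
    incident = [[] for _ in range(num_vertices)]
    for idx, (u, v) in enumerate(edge_list):
        incident[u].append(idx)
        incident[v].append(idx)

    label = list(range(num_vertices))
    sweeps = 0
    changed = True
    while changed:
        changed = False
        sweeps += 1
        for v in range(num_vertices):
            # "Update function" for vertex v: read the incident edge values...
            new_label = min([label[v]] + [edge_value[e] for e in incident[v]])
            if new_label != label[v]:
                label[v] = new_label
                changed = True
            # ...and write the result back to the edges right away, so that
            # neighbours processed later in this sweep already see it.
            for e in incident[v]:
                if edge_value[e] != new_label:
                    edge_value[e] = new_label
                    changed = True
    return label, sweeps


if __name__ == "__main__":
    labels, sweeps = connected_components(7, [(0, 1), (1, 2), (2, 3), (5, 6)])
    print(labels)  # prints [0, 0, 0, 0, 4, 5, 5]
```

In a synchronous model, a label would travel only one hop per sweep. Here the label 0 reaches the whole path 0-1-2-3 within the first sweep because each write is immediately visible to the next vertex.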
GraphChi makes web-scale graph computation, such as the analysis of social networks, available to anyone with a modern laptop or PC. It saves you the hassle and cost of working with a distributed cluster or cloud services. We find it much easier to debug applications on a single computer than to work out how a distributed algorithm is executed. If you do require the processing power of high-performance clusters, GraphChi can be an excellent tool for developing and debugging your algorithms before deploying them to the cluster. GraphChi supports most of the new GraphLab v2.1 API (with some restrictions), making the transition easy.
Remarkably, in some cases GraphChi can solve larger problems in reasonable time than many available distributed frameworks can. GraphChi also runs efficiently on servers with plenty of memory, and can use multiple disks in parallel by striping the data.
The source code and documentation of GraphChi for C++ are available at the Google Code project pages:
After downloading the source code, the best way to get started is to study the example applications: http://code.google.com/p/graphchi/wiki/ExampleApps
An early version for Java is available at http://code.google.com/p/graphchi-java
The paper for GraphChi, published at OSDI 2012, is available here: Download PDF
Slides for Aapo’s talk at OSDI: OSDI talk slides
MIT Technology Review’s article about GraphChi:
Danny Bickson’s blog post about the collaborative filtering toolkit for GraphChi (thanks Danny!):