Mizan: Optimizing Graph Mining in Large Parallel Systems

Date: Friday, February 10, 2012 - 10:18am

UCSB COMPUTER SCIENCE DEPARTMENT

Monday, February 27, 2012
3:30 – 4:30 PM
Computer Science Conference Room, Harold Frank Hall Rm. 1132

HOST: Amr El Abbadi

SPEAKER: Panos Kalnis
Division of Mathematical and Computer Sciences and Engineering, King
Abdullah University of Science and Technology (KAUST).

Title: Mizan: Optimizing Graph Mining in Large Parallel Systems

Abstract:

Extracting information from graphs, from finding shortest paths to
complex graph mining, is essential for many applications. Due to the
sheer size of modern graphs (e.g., social networks), processing must be
done on large parallel computing infrastructures (e.g., the cloud).
Earlier approaches relied on the MapReduce framework, which proved
inadequate for graph algorithms. Recently, the message-passing model
(e.g., Pregel) has emerged. Although the Pregel model has many
advantages, it is agnostic to the graph properties and the architecture
of the underlying computing infrastructure, leading to suboptimal
performance.

In this talk, I will present Mizan, a layer between the users’ code and
the computing infrastructure. Mizan considers the structure of the input
graph and the architecture of the infrastructure in order to: (i) decide
whether it is beneficial to generate a near-optimal partitioning of the
graph in a pre-processing step, and (ii) choose between typical
point-to-point message passing and a novel approach that puts computing
nodes in a virtual overlay ring. We deployed Mizan on a small local
Linux cluster, on the cloud (256 virtual machines in Amazon EC2), and on
an IBM BlueGene/P supercomputer (1024 CPUs). Mizan executes common
algorithms on very large graphs up to one order of magnitude faster than
MapReduce-based implementations and up to 4 times faster than
implementations relying on Pregel-like hash-based graph partitioning.

Bio:

Panos Kalnis is an associate professor in the Division of Mathematical
and Computer Sciences and Engineering at King Abdullah University of
Science and Technology (KAUST). In 2009 he was a visiting assistant
professor in the Dept. of Computer Science, Stanford University. Before
that, he was an assistant professor in the Dept. of Computer Science,
National University of Singapore (NUS). In the past he was involved in
the design and testing of VLSI chips at the Computer Technology
Institute, Greece. He also worked at several companies on database
design, e-commerce projects, and web applications. He received his
Diploma in Computer Engineering from the Computer Engineering and
Informatics Dept., University of Patras, Greece in 1998 and his PhD
from the Computer Science Dept., Hong Kong University of Science and
Technology (HKUST) in 2002. His research interests include Databases,
Cloud Computing, Distributed Systems, Large Graphs and Data Privacy.