Query Execution in Column-Oriented Database Systems

Date: 
Thursday, March 22, 2007 - 2:23pm


Daniel Abadi – Faculty Candidate
MIT Computer Science and Artificial Intelligence Laboratory
Date: Wednesday, April 4, 2007
Time: 3:00pm-4:00pm
Location: ESB 2001

Abstract:

Recent research on column-oriented database systems (DBMSs) has shown
that these systems can outperform existing row-oriented DBMSs by one to
two orders of magnitude on read-mostly query workloads like those found
in data warehouses, decision support, and customer relationship
management systems. In this talk, I will discuss this exciting new class
of database systems and will provide an overview of the C-Store system
that we have developed over the past two years at MIT. I will then
focus on the design of the column-oriented query execution engine I have
developed. In particular, I will discuss the impact on query performance
of tuple construction (stitching together attributes from multiple
columns into a row-oriented “tuple”) and of operating on compressed
data. Tuple construction allows column-oriented DBMSs to offer a
standards-compliant relational database interface (e.g., ODBC or JDBC);
however, if done at the wrong point in a query plan, it incurs a
significant performance penalty. Similarly, data compression can improve
query performance by an order of magnitude by trading cheap CPU cycles
for expensive I/O bandwidth.
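
As a rough illustration of the two ideas above, the following Python sketch
contrasts building tuples before filtering with building them only for the
positions that survive a filter, and then counts matches directly on a
run-length-encoded column. It is a toy under stated assumptions, not C-Store
code: the table, the column names, the choice of run-length encoding, and
every function name are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; all names and data are hypothetical, not C-Store code.

# A tiny table stored column by column.
columns = {
    "region": ["east", "east", "west", "west", "west", "east"],
    "price":  [10, 20, 30, 40, 50, 60],
    "qty":    [1, 2, 3, 4, 5, 6],
}

def early_materialization(cols, region):
    """Stitch every row into a tuple first, then filter and aggregate."""
    rows = list(zip(cols["region"], cols["price"], cols["qty"]))
    return sum(price for r, price, _qty in rows if r == region)

def late_materialization(cols, region):
    """Filter on one column, then fetch only the needed values by position."""
    positions = [i for i, r in enumerate(cols["region"]) if r == region]
    return sum(cols["price"][i] for i in positions)

def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def count_matching_rle(runs, target):
    """Count matching rows directly on the runs, without decompressing."""
    return sum(length for value, length in runs if value == target)
```

Both materialization functions return the same total, but the late variant
never touches the qty column and reads price values only at matching
positions. Likewise, count_matching_rle inspects one entry per run rather
than one per row, which is the kind of saving that operating on compressed
data is meant to capture.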

Biography:

Daniel Abadi is a Ph.D. student at the Massachusetts Institute of
Technology. Before attending MIT, he spent a year at Cambridge
University in England on a Churchill Scholarship, where he received an
M.Phil. He received his B.S. from Brandeis University. His research
interests are in database system design, implementation, and evaluation.

Host: Amr El Abbadi