UpSizeR: Synthetically Scaling Up a Given Database State

Date: 
Monday, April 5, 2010 - 9:50am

UCSB COMPUTER SCIENCE DEPARTMENT PRESENTS:

Monday, April 5, 2010
3:30 – 4:30
Computer Science Conference Room, Harold Frank Hall Rm. 1132

HOST: Jianwen Su

SPEAKER: Y.C. Tay (National University of Singapore)

Title: UpSizeR: Synthetically Scaling Up a Given Database State

Abstract:

E-commerce and social networking services must ensure that their systems
are scalable. Engineering for rapid growth requires intensive testing
with scaled-up datasets. Although such a dataset is
synthetically generated, it must be similar to a real dataset if it is
to be useful.

This talk presents UpSizeR, a tool for scaling up relational databases.
Given a database state D and a positive number s, UpSizeR generates a
synthetic state D’ that is s times the size of D, yet similar to D in
terms of query results. UpSizeR does this by extracting inter-column
and inter-row information from D. UpSizeR can also be used by an
enterprise to make a synthetic copy (s=1) of its proprietary dataset for
a vendor, or scale down a production dataset (s<1) for non-production
testing. Experiments with Flickr data show good agreement between
crawled data and UpSizeR output for various sizes.
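
As a rough illustration only, and not UpSizeR's actual algorithm (which
the abstract does not specify), the sketch below shows one way that
"s times the size, yet similar" could be read for a single parent/child
relationship. It creates about s times as many parents and draws each
one's child count from the distribution observed in D. The function
name and the toy users/photos data are hypothetical.

```python
import random
from collections import Counter

def scale_child_counts(parent_ids, child_fk, s, seed=0):
    """Return {synthetic_parent_id: child_count} for a state about s times as large.

    Hypothetical helper: parent_ids lists the parent keys in D, and child_fk
    holds one foreign-key value per child row in D.
    """
    rng = random.Random(seed)
    per_parent = Counter(child_fk)
    # Empirical children-per-parent distribution, including parents with none.
    degrees = [per_parent.get(pid, 0) for pid in parent_ids]
    # Scale the number of parents by s (at least one parent is kept).
    n_new = max(1, round(s * len(parent_ids)))
    # Each synthetic parent samples its child count from the observed distribution.
    return {new_id: rng.choice(degrees) for new_id in range(n_new)}

# Toy state D: four users, six photos (each entry is the owning user's id).
users = [1, 2, 3, 4]
photos_owner = [1, 1, 1, 2, 4, 4]
scaled = scale_child_counts(users, photos_owner, s=2.5)
```

Under this assumption the expected number of child rows also grows by
roughly a factor of s, because every synthetic parent samples from the
same observed distribution. Correlations across columns or across
several tables are not captured by this sketch.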

However, UpSizeR currently cannot scale the social network topology in
Flickr. This leads to the Attribute Value Correlation Problem: If D
records data from a social network, how do the social interactions
affect correlation among attribute values in D?

Bio:

Y.C. Tay received his BSc from the University of Singapore and PhD from
Harvard University. He is a professor in the Departments of Mathematics
and Computer Science at the National University of Singapore.
His main research interest is performance modeling (database transactions, wireless
protocols, traffic equilibrium, cache misses). Other interests include distributed
protocols and their correctness proofs. He is currently on sabbatical at UCLA.