Report ID
Report Authors
Stratos Dimopoulos, Chandra Krintz, Rich Wolski
Report Date

In this report, we investigate and characterize the behavior of “big” and “fast” data analysis frameworks in multi-tenant, shared settings in which computing resources (CPU and memory) are limited. Such settings and frameworks are frequently employed in both public and private cloud deployments. Resource constraints stem from both physical limitations (private clouds) and what the user is willing to pay (public clouds). Because of these constraints, users increasingly attempt to maximize resource utilization and sharing in these settings. To understand how popular analytics frameworks behave and interfere with each other under such constraints, we investigate the use of Mesos to provide fair resource sharing for resource-constrained private cloud systems. We empirically evaluate such systems using multi-tenant Hadoop, Spark, and Storm workloads. Our results show that in constrained environments, there is significant performance interference that manifests in multiple ways. First, Mesos is unable to achieve fair resource sharing for many configurations. Moreover, application performance across competing frameworks depends on Mesos offer order and is highly variable. Finally, we find that resource allocation among tenants that employ coarse-grained and fine-grained framework scheduling can lead to a form of deadlock for fine-grained frameworks and underutilization of system resources.