Report ID
Report Authors
Efstratios Dimopoulos
Report Date

The vast proliferation of monitoring and sensing devices equipped with Internet connectivity, commonly known as the “Internet of Things” (IoT), generates an unprecedented volume of data, which requires Big Data Analytics Systems (BDAS) to process it and extract actionable insights. The large diversity of IoT data processing applications requires the deployment of multiple processing frameworks under the coordination of a resource allocator. To enable prompt actuation, these applications must meet deadlines, and their processing takes place near where the data is generated, in private clouds or edge computing clusters, which have limited resources.

In resource-constrained, multi-analytics settings, several issues related to the combined use of open-source BDAS, originally designed for resource-rich, standalone clusters, remain unaddressed. Specifically, the behavior of open-source BDAS is unknown when they are used in combination under the coordination of a cluster manager and the available resources are limited. Moreover, existing allocation policies are not suited to meeting deadlines in resource-constrained settings without wasting resources or requiring particular repetitive job patterns. Lastly, in such settings fair-share policies cannot reliably preserve fairness.

To satisfy deadlines and achieve allocation fairness in resource-constrained clusters for multi-analytics, we employ predictive resource allocation and admission control. We evaluate the performance and behavior of BDAS in resource-constrained multi-analytics clusters and identify the root causes of their interference. Moreover, we design admission control and resource allocation mechanisms suitable for resource managers. Allocation decisions adapt to changing cluster conditions to satisfy deadlines and preserve fairness in resource-constrained multi-analytics settings. We evaluate our approach with trace-based simulations and production workloads and show that it satisfies more deadlines, preserves fairness, and utilizes the cluster more efficiently than existing fair-share allocators designed for resource managers.