The vast proliferation of monitoring and sensing devices equipped with Internet connectivity, commonly known as the “Internet of Things” (IoT), generates an unprecedented volume of data, which requires Big Data Analytics Systems (BDAS) to process it and extract actionable insights. The large diversity of IoT data processing applications requires the deployment of multiple processing frameworks under the coordination of a resource allocator. To enable prompt actuation, these applications must meet deadlines, and their processing takes place where the data is generated: in private clouds or edge computing clusters, which have limited resources. In such settings, popular open-source BDAS, originally designed for resource-rich, standalone clusters, behave in ways that are not well understood; existing allocation policies are not suited to meeting deadlines, and fair-share policies cannot reliably preserve fairness.
We propose to evaluate the performance and behavior of BDAS in resource-constrained multi-analytics clusters and to understand the root causes of their interference. With this experience and the insights we extract, we will then pursue new admission control and resource allocation mechanisms that are better suited to the resource constraints of the next generation of IoT analytics deployments. Such resource allocation decisions must adapt to changing cluster conditions to satisfy deadlines and preserve fairness in multi-analytics settings. In this talk, we give an overview of our approach, which uses trace-based simulations and production workloads to compare existing resource manager allocators against our proposed extensions, which balance deadline satisfaction with fairness across competing analytics frameworks.