Scalable Analytics Systems for Multi-Tier IoT Deployments with Applications in Agriculture
- Author(s): Golubovic, Nevena
- Advisor(s): Krintz, Chanda
- Wolski, Rich
- et al.
Recent technological advances in environmental and personal sensing, monitoring, and data analytics are fueling remarkable innovation in data-driven actuation, decision support, and adaptive control based on the "Internet of Things" (IoT). However, to become truly transformative, IoT must exploit vastly heterogeneous combinations of compute, storage, and networking capabilities provisioned at multiple scales, from low-cost, resource-restricted devices to public clouds. Low-latency applications and continuous telemetry from devices in our environment require services to be distributed to "the edge" of the network while, at the same time, security and programmer-productivity requirements call for a uniform, efficient, end-to-end system.
In this dissertation, we investigate the design and implementation of a scalable, end-to-end system for data-driven IoT applications that spans the IoT tiers: sensors, edge, and cloud. We do so in a problem-driven fashion and target the domain of agriculture, specializing the system for data analytics and machine learning techniques that are applicable to precision farming. Our work is novel in that our system integrates popular cloud services and machine learning technologies using open source software and provides scalable building blocks for common analysis tasks, e.g., clustering and regression, in a way that can be tailored to specific problems that growers and farm consultants face. In addition, the system automates the placement of analytics deployments across edge and cloud tiers. For the sensing tier, we develop a novel approach that extends the capability of sensor platforms by "synthesizing" information from multiple other sensors. The result, we believe, is a holistic, easy-to-use system for data ingress, analysis, and visualization that integrates new insights in sensing and in distributed, scalable systems, and that is applicable to a range of agricultural settings, applications, and low-latency, sustainable solutions.