Towards a Scalable and Qualitative Spatial Regionalization
- Alrashid, Hussah
- Advisor(s): Magdy, Amr
Abstract
The growing volume of data generated in today’s world presents new challenges in analyzing spatial data. Spatial regionalization is one of the traditional methods employed in numerous applications to analyze spatial data. However, many existing regionalization techniques lack scalability and support only small datasets. Moreover, because regionalization problems are NP-hard, all current techniques are standalone methods that provide approximate solutions. This dissertation focuses on developing scalable regionalization methods that address spatial regionalization problems efficiently, and on assessing the quality of the solutions generated by existing regionalization techniques.
First, we tackle the scalability issue of an existing regionalization problem named MP-regions. The inherently sequential nature of the MP-regions algorithm makes it difficult to parallelize. To address this issue, we develop PAGE (Parallel Scalable Regionalization Framework), an efficient framework for solving the MP-regions problem at scale. PAGE partitions a set of spatial areas efficiently while maintaining spatial contiguity, and performs regionalization in parallel while maintaining high-quality solutions. PAGE also modifies the existing local search algorithms to make the optimization of solution quality more efficient. The experimental evaluation of PAGE demonstrates its ability to solve the MP-regions problem effectively and efficiently.
Second, we propose a new statistical inference problem named SISR (Statistical Inference for Spatial Regionalization) to assess the quality of heuristic-based regionalization techniques. To solve SISR, we propose PRRP (P-Regionalization through Recursive Partitioning). PRRP generates a reference distribution of sample solutions with a user-defined number of regions and with region cardinalities similar to those of the assessed solution. The key novelty is maintaining the spatial contiguity of areas at all times, which enables growing each region to a specified cardinality. PRRP is extensively evaluated on real-world datasets, and the results show its ability to solve the SISR problem.
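The idea of growing a spatially contiguous region to a prescribed cardinality can be illustrated with a minimal sketch. This is not the PRRP algorithm itself: the function names (`grid_adjacency`, `grow_region`, `is_contiguous`), the rook-adjacency grid, and the random frontier expansion are all assumptions made for illustration only.

```python
import random
from collections import deque


def grid_adjacency(rows, cols):
    """Build rook (edge-sharing) adjacency for a rows x cols grid of areas.

    The grid is a stand-in for real polygon adjacency (an assumption).
    """
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            adjacency[(r, c)] = [
                (i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols
            ]
    return adjacency


def grow_region(adjacency, unassigned, target_size, rng):
    """Grow one contiguous region of `target_size` areas from a random seed.

    Returns the set of areas in the region, or None when the frontier runs
    out before the target cardinality is reached.
    """
    seed = rng.choice(sorted(unassigned))
    region = {seed}
    frontier = {n for n in adjacency[seed] if n in unassigned}
    while len(region) < target_size:
        if not frontier:
            return None
        nxt = rng.choice(sorted(frontier))
        region.add(nxt)
        frontier.discard(nxt)
        # Only neighbors of the region may join, so contiguity always holds.
        frontier.update(
            n for n in adjacency[nxt] if n in unassigned and n not in region
        )
    return region


def is_contiguous(adjacency, region):
    """Check that `region` forms a single connected component."""
    if not region:
        return False
    start = next(iter(region))
    seen = {start}
    queue = deque([start])
    while queue:
        area = queue.popleft()
        for n in adjacency[area]:
            if n in region and n not in seen:
                seen.add(n)
                queue.append(n)
    return seen == region


if __name__ == "__main__":
    adjacency = grid_adjacency(4, 4)
    region = grow_region(adjacency, set(adjacency), 5, random.Random(0))
    print(sorted(region), is_contiguous(adjacency, region))
```

Because each new area is drawn only from the current frontier, the region is contiguous at every step. Repeating the procedure with different random seeds would give a set of sample regions of equal cardinality, which is the kind of input a reference distribution needs.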
Finally, we present RegioNinja, a usable, unified system designed to handle seeding regionalization queries efficiently. The system offers a flexible and scalable structure that addresses seeding regionalization queries already presented in the literature, while also allowing the exploration of new combinations of queries that have not been addressed before. Its introduction moves the study of spatial regionalization from an era of individualized algorithms towards comprehensive system-level support. The experimental assessment of the system shows its ability to solve regionalization queries for different spatial regionalization problems effectively while hiding implementation complexity from end users.