Geographic Modeling with One-Class Data
- Author(s): Li, Wenkai
- Advisor(s): Guo, Qinghua, et al.
Geographic modeling refers to spatially explicit modeling of the probability of presence for a specific geographic event. With geographic one-class data (GOCD), absence data are often unavailable or difficult to obtain, which challenges traditional statistical modeling methods because they require both presence and absence data. The objective of this research is therefore to address the challenges raised by GOCD and to develop effective techniques for modeling the spatial distributions of geographic events using GOCD. A novel presence and background learning (PBL) algorithm was developed to model the probability of presence conditional on predictor variables, and a novel accuracy assessment method was developed for model selection, threshold selection, and model evaluation. Both methods require only presence-background data, not absence data, and hence they are appropriate for GOCD. To investigate their effectiveness, the PBL and accuracy assessment methods were applied to ecological niche modeling using simulated datasets and to one-class remote sensing classification using real datasets. Experimental results show that the PBL method successfully models the conditional probability of presence and produces high classification accuracy, outperforming existing methods such as one-class support vector machines and maximum entropy; the new accuracy assessment method performs similarly to the traditional F-measure, which requires both presence and absence data, and it shows promise in model selection, threshold selection, and accuracy assessment. I conclude that the proposed methods are promising for addressing the major challenges raised by GOCD. Because they do not require absence data, they should have important applications in geographic modeling with GOCD.
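
To illustrate why the traditional F-measure mentioned above depends on absence data, the following minimal sketch computes it from labeled presence and absence samples. It is not the proposed accuracy assessment method; the function name `f_measure` and the sample labels are hypothetical and chosen only for illustration. Counting false positives requires sites known to be absences, which is exactly what GOCD lacks.

```python
def f_measure(y_true, y_pred):
    """Traditional F-measure for binary labels (1 = presence, 0 = absence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # presences predicted present
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # absences predicted present
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # presences predicted absent
    if tp == 0:
        return 0.0  # no correct presence predictions: F is defined as 0 here
    precision = tp / (tp + fp)  # needs fp, hence known absences
    recall = tp / (tp + fn)     # needs presence data only
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference labels and thresholded model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 0, 0, 1]
score = f_measure(y_true, y_pred)
```

Recall can be estimated from presence samples alone, whereas precision cannot, because the `fp` count is undefined without labeled absences. That gap is what an assessment method based on presence-background data has to work around.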