Methods of examining the fit of multi-dimensional point process models using residual analysis are proposed. One method involves rescaled residuals, obtained by transforming points along one coordinate to form a homogeneous Poisson process inside a random irregular boundary. Both vertical and horizontal forms of this rescaling are discussed. We also present a different method of residual analysis, involving thinning the point process according to the conditional intensity to form a homogeneous Poisson process on the original, untransformed space. These methods for assessing goodness-of-fit are applied to point process models for the space-time-magnitude distribution of earthquake occurrences, using in particular the multi-dimensional version of Ogata's epidemic-type aftershock sequence (ETAS) model and a 30-year catalog of 580 earthquakes occurring in Bear Valley, California, as an example. The thinned residuals suggest that the fit of the model may be significantly improved by using an anisotropic spatial distance function in the estimation of the spatially varying background rate. Using rescaled residuals, it is shown that the temporal-magnitude distribution of aftershock activity is not separable, and that in particular, in contrast to the ETAS model, the triggering density of earthquakes appears to depend on the magnitude of the secondary events. The residual analysis highlights that the fit of the space-time ETAS model may be substantially improved by allowing the parameters governing the triggering density to vary for earthquakes of different magnitudes. Such modifications are important since the ETAS model is widely used in seismology for hazard analysis.
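The thinning idea in the abstract above can be illustrated with a minimal sketch. It assumes a one-dimensional process on [0, 1] and a toy intensity chosen only for illustration; the function names, the intensity, and the lower bound `b` are hypothetical and are not taken from the paper. Each point is kept with probability `b / intensity(point)`, where `b` is a lower bound on the intensity, so that under a correct model the kept points should resemble a homogeneous Poisson process with rate `b`.

```python
import random


def thinned_residuals(points, intensity, b, rng):
    """Keep each point independently with probability b / intensity(point)."""
    kept = []
    for p in points:
        lam = intensity(p)
        if lam < b:
            raise ValueError("b must not exceed the intensity at any point")
        if rng.random() < b / lam:
            kept.append(p)
    return kept


def simulate_inhomogeneous(intensity, lam_max, t_end, rng):
    """Simulate a Poisson process on [0, t_end] by thinning a rate-lam_max process."""
    points, t = [], 0.0
    while True:
        t += rng.expovariate(lam_max)
        if t > t_end:
            return points
        if rng.random() < intensity(t) / lam_max:
            points.append(t)


def toy_intensity(t):
    """Hypothetical intensity, rising from 50 to 150 over [0, 1]."""
    return 50.0 + 100.0 * t


rng = random.Random(0)
points = simulate_inhomogeneous(toy_intensity, 150.0, 1.0, rng)
residuals = thinned_residuals(points, toy_intensity, 50.0, rng)
print(len(points), "points,", len(residuals), "thinned residuals")
```

If the fitted intensity is wrong, the thinned residuals would tend to show clustering or gaps where the model under- or over-predicts, which is the kind of departure the residual analysis is meant to reveal.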

# Your search: "author:Schoenberg, Frederic P"

## Scholarly Works (47 results)

We consider conditions under which parametric estimates of the intensity of a spatial-temporal point process are consistent. Although the actual point process being estimated may not be Poisson, an estimate involving maximizing a function that corresponds exactly to the log-likelihood if the process is Poisson is consistent under certain simple conditions. A second estimate based on weighted least squares is also shown to be consistent under quite similar assumptions. The conditions for consistency are simple and easily verified, and examples are provided to illustrate the extent to which consistent estimation may be achieved. An important special case is when the point processes being estimated are in fact Poisson, though other important examples are explored as well.
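As a rough illustration of the first estimate, the Poisson log-likelihood for points t_1, ..., t_n observed on [0, T] is the sum of log intensities at the points minus the integral of the intensity over the window. The sketch below assumes the simplest possible case, a constant intensity `mu` in one dimension, with made-up event times; the names and data are hypothetical and not from the paper. A grid search over `mu` recovers the closed-form maximizer n / T.

```python
import math


def poisson_loglik(points, intensity, integral):
    """Sum of log-intensities at the points minus the integrated intensity."""
    return sum(math.log(intensity(p)) for p in points) - integral


def constant_rate_loglik(points, mu, t_end):
    """Poisson log-likelihood for a constant rate mu on [0, t_end]."""
    return poisson_loglik(points, lambda _t: mu, mu * t_end)


# Hypothetical event times on the window [0, 5].
events = [0.5, 1.2, 3.3, 4.1]
t_end = 5.0

# Grid search over candidate rates 0.01, 0.02, ..., 3.00.
grid = [i / 100 for i in range(1, 301)]
best_mu = max(grid, key=lambda mu: constant_rate_loglik(events, mu, t_end))
print(best_mu, len(events) / t_end)
```

For richer parametric intensities the same objective would normally be maximized numerically, with the integral term computed analytically or by quadrature.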

For models used to describe multi-dimensional marked point processes with covariates, the high number of parameters typically involved and the high dimensionality of the process can make model evaluation, construction, and estimation using maximum likelihood quite difficult. Conditions are explored here under which parameters governing one set of coordinates or covariates affecting a multi-dimensional marked point process may be estimated separately. The resulting estimates are, under the given conditions, similar to maximum likelihood estimates.

This study uses Poisson regression techniques to analyse the location of biotechnology companies throughout the USA. Three hypotheses are considered: that firms locate in population centres in order to attract workers, that they locate near colleges and universities where potential workers are likely to be better educated, and that they locate in close proximity to research-oriented universities and institutes because high-technology firms frequently spin off from these research centres. We find that clusters do tend to be located near population centres, colleges and universities, but the influence of research-based universities is particularly striking. This highlights a powerful policy instrument for regions hoping to promote high-tech industrial clusters: the creation and maintenance of a first-rate research-oriented university. While these ideas have been suggested in the past, our approach to defining, measuring, and analysing these variables provides new insights into their significance, and also suggests avenues for future research.

Simple point processes are often characterized by their associated compensators or conditional intensities. For non-simple point processes, however, the conditional intensity and compensator do not uniquely determine the distribution of the process. Various ways of characterizing non-simple multivariate point processes are discussed here, some important classes of separable non-simple processes are investigated, and methods of simplification involving thinning, rescaling, and changing the mark space are presented.

Studying the spatial organization of the molecules in T cells following activation of the T cell antigen receptor can improve our understanding of how the spatial structure of the molecules is associated with T cell activation states. In particular, it has been found that the formation of the signaling complex is tightly related to proper signal transduction and T cell activation during the immune response. The purpose of this work is to discover the relationship between the spatial structure of the molecules in the signaling complex and the state of the T cell by extracting biological knowledge from cellular imaging data using spatial point process methods. In this thesis, we present discoveries on spatial distributions and attributes of proteins in microcluster and non-microcluster areas of three activated T cells and compare the spatial distributions in the two kinds of area. The thesis also proposes some possible biological hypotheses based on strong statistical evidence from spatial point pattern analysis.

The extent to which Hawkes point process models can more accurately characterize the evolution of a disease epidemic than a standard compartmental model such as SEIR is investigated. Maximum likelihood estimation was used to fit SEIR model parameters to Ebola outbreak data in West Africa in 2014 from the World Health Organization (WHO). Projections using simulation were then conducted using the Poisson-leaping Tau Method (Cao et al. 2007) to evaluate the fit. The projections and rate function were compared to Hawkes point process estimation and simulation over the same data and projection scale. Results indicate that Hawkes models outperformed SEIR in predicting the spread of Ebola in West Africa with a 38% reduction in RMSE for weekly case estimation across all countries (total RMSE of 59.6 cases/week using SEIR compared to 37.2 for Hawkes). An analysis using the first 75% of the data for estimation and the subsequent 25% of the data for evaluation shows that the improved fit from Hawkes modeling cannot be attributed to overfitting.
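The reported 38% reduction can be checked directly from the two RMSE totals quoted in the abstract above. The sketch below also includes a generic RMSE helper; the helper and the function names are illustrative only, and no case data from the study are reproduced here.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# RMSE totals quoted in the abstract (cases/week).
seir_rmse = 59.6
hawkes_rmse = 37.2
reduction = relative_reduction(seir_rmse, hawkes_rmse)
print(f"{100 * reduction:.1f}% reduction")
```

The ratio works out to roughly 37.6%, which is consistent with the rounded 38% figure.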

The self-exciting Hawkes point process model (Hawkes, 1971) has been used to describe and forecast communicable diseases. This dissertation has two parts. First, we introduce the non-parametric version of the recursive model (Schoenberg, 2019), an adaptation of the Hawkes model which allows for variable productivity, or disease reproduction rate. Here, we extend the data-driven non-parametric EM method of Marsan & Lengliné (2008) in order to fit the recursive model without assuming a particular functional form for the productivity. We then evaluate the ability of the non-parametric recursive model to fit and forecast cases of mumps in Pennsylvania compared to that of other point process models and a variation of the commonly used SIR (Susceptible, Infected, Recovered) compartmental model. Second, we examine increasing surges of the COVID-19 pandemic using the HawkesN model with an exponential kernel (Rizoiu et al., 2018), which assumes a finite susceptible population, is considered stationary when the reproduction number K is greater than one, and has interpretable terms similar to those of the SIR model (Kresin et al., 2021). The HawkesN model is fit using a least squares method introduced in Schoenberg (2021), which is an effective method when an epidemiologic dataset only provides a daily case count rather than a specific time of infection. We first examine the doubling time of COVID-19 in California during three notable surges using the HawkesN model and compare its predictive ability to that of an adaptation of the SIR compartmental model. Secondly, we compare HawkesN to the same compartmental model in forecasting cases of SARS-CoV-2 for all fifty states nationwide. This larger study is intended to guide further work in improving the predictive ability of HawkesN.
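To make the finite-population idea concrete, here is a minimal sketch of a HawkesN-style conditional intensity with an exponential kernel. The exact parameterization is an assumption made for illustration: a background rate `mu`, productivity `kappa`, decay rate `beta`, and the whole intensity scaled by the fraction of the population not yet infected. The dissertation's own parameterization and fitting procedure may differ, and all parameter values below are hypothetical.

```python
import math


def hawkesn_intensity(t, event_times, mu, kappa, beta, population):
    """Assumed HawkesN-style intensity at time t given earlier event times.

    intensity = (1 - N_t / population) * (mu + sum of kappa*beta*exp(-beta*(t - s)))
    where the sum runs over events s < t and N_t counts those events.
    """
    past = [s for s in event_times if s < t]
    excitation = sum(kappa * beta * math.exp(-beta * (t - s)) for s in past)
    susceptible_fraction = max(0.0, 1.0 - len(past) / population)
    return susceptible_fraction * (mu + excitation)


# Hypothetical parameters: one earlier case at time 0, population of 10.
rate = hawkesn_intensity(1.0, [0.0], mu=0.0, kappa=2.0, beta=1.0, population=10)
print(rate)
```

As the count of past events approaches the population size, the scaling factor drives the intensity to zero, which is the mechanism that mirrors depletion of susceptibles in SIR-type models.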