This dissertation develops new empirical methods that inform two questions in energy policy: How will the locations of photovoltaic (PV) systems affect the need for flexible resources in power systems? And how much energy is currently being conserved by varying set point schedules in residential households?

Chapters 2 and 3 focus on the first question regarding the siting of PV systems. Chapter 2 presents, fits, and validates a model of variability and uncertainty in PV generation that is useful for estimating future flexibility needs in power systems. I call this the ``volatility state model'', due to its reliance on latent states that I refer to as volatility states. Specifically, the model (a) accounts for spatial correlation, (b) predicts metrics of variability and uncertainty that are directly relevant to grid operation and planning, and (c) predicts boundaries on distribution tails that are consistent with observed data.

I find that PV variability distributions are roughly Gaussian after conditioning on volatility states, which is helpful for finding the degree of spatial smoothing. I also propose a method for simulating volatility states that results in a very good upper bound for the probability of extreme events. The model can therefore be used as a tool for planning the additional reserve capacity required to balance solar variability from yet-to-be-built systems.
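
As an illustrative sketch of this conditioning (the notation here is assumed for exposition and is not taken from the chapters), let $s$ denote the volatility state, $\pi_s$ its probability, and $\Delta$ a PV variability metric such as a ramp or a forecast error. If $\Delta$ is approximately Gaussian given $s$, then its marginal distribution is a mixture,
\[
  p(\Delta) = \sum_{s} \pi_s \, \mathcal{N}\!\left(\Delta;\, \mu_s, \sigma_s^2\right),
\]
which can have heavier tails than a single Gaussian with the same variance. Under the further simplifying assumption of $N$ equally sized sites with a common pairwise correlation $\rho_s$ within state $s$, the variance of the site-averaged metric $\bar{\Delta}$ is
\[
  \mathrm{Var}\!\left(\bar{\Delta} \mid s\right) = \frac{\sigma_s^2}{N}\left[\,1 + (N-1)\,\rho_s\,\right],
\]
which falls toward $\rho_s \sigma_s^2$ rather than zero as $N$ grows. This is the sense in which spatial correlation limits the smoothing available from dispersion.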

Chapter 3 applies the volatility state model to predict the need for reserve generation---load following and regulation---in California under different locational scenarios for PV. I find that clustering PV into small areas exacerbates the need for reserves, primarily because hourly forecast errors are spatially correlated. The benefits of dispersion diminish, and they can be saturated with a relatively small number of large utility-scale systems: 25 systems of 500 MW each. However, these systems need to be adequately separated, which has implications for the construction costs of the transmission needed to reach them. I also identify trade-offs between locations that minimize variability and uncertainty and locations that maximize energy or capacity value. The largest trade-off in California is actually between energy and capacity value: areas of the state with the greatest energy resource tend to be cloudy on summer afternoons, when peak demand---driven by air conditioning---tends to be greatest.

In Chapter 4, I explore the ability of statistical models fit to AMI data---which I refer to as ``utility meter models''---to predict the largest end use of electricity in residential homes: heating, ventilation, and air conditioning (HVAC). Specifically, I evaluate the models' abilities to predict the timing of HVAC use, the efficiency of operation, and the amount of energy consumed. I begin by presenting a general utility meter model; my goal is to create a form that (a) directly relates to physical models of heat dynamics in buildings, and (b) encompasses utility meter models already in the literature. I then fit and validate multiple variations of this model---some similar to those in the literature, and some of my own design---using data from air conditioners, thermostats, and residential electricity sub-meters. I test four specific aspects of the general model: whether to use daily or hourly data, whether to allow expected energy use to be discontinuous with respect to outdoor temperature, whether to use binary latent states to classify when HVAC is on, and which probability distribution shape to assume for model disturbances.
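
To make these four aspects concrete, one simple form that such a model could take is sketched below. It is an assumed illustration for cooling only, not the general model developed in the chapter, and every symbol in it is introduced here for exposition:
\[
  E_t = \beta_0 + z_t \left[\, \alpha + \beta_c \left(T^{\mathrm{out}}_t - T_b\right)^{+} \right] + \varepsilon_t,
  \qquad z_t \in \{0, 1\},
\]
where $E_t$ is metered energy in interval $t$ (a day or an hour), $\beta_0$ is non-HVAC base load, $z_t$ is a binary latent state indicating whether HVAC is on, $T^{\mathrm{out}}_t$ is outdoor temperature, $T_b$ is a balance temperature, $(x)^{+} = \max(x, 0)$, and $\varepsilon_t$ is a disturbance whose distribution is left unspecified. In this sketch, a nonzero $\alpha$ produces a jump in energy use when HVAC switches on, and fixing $z_t = 1$ with $\alpha = 0$ reduces the form to a conventional change-point regression.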

I find a large benefit to combining models fit to daily and hourly data; models perform best when days that are classified as without HVAC energy use cannot contain hours that are classified with HVAC. I also find that the distribution shape assumed for model disturbances greatly affects model classifications and parameters, with kernel density estimates of these distributions outperforming the traditional normal distributions. Finally, I find that applying a post-fitting process that disaggregates model residuals---attributing part to HVAC and part to other end uses---increases estimates of cooling energy use.

To conclude Chapter 4, I attempt to recover indoor temperature dynamics in homes metered by AMI. Though my models do not estimate these dynamics endogenously, I can infer them from the model outputs: the building is likely heating when estimated cooling energy use is less than that required to maintain a steady state, and vice versa. I find that I can infer intra-day changes in temperature well, and inter-day changes in temperature weakly.
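
The inference rule can be illustrated with a first-order thermal balance. This is a minimal sketch under assumed notation (a single thermal capacitance $C$, envelope resistance $R$, internal and solar gains $Q_t$, cooling electricity $P_t$, and coefficient of performance $\eta$), not the model fit in the chapter:
\[
  C \, \frac{dT^{\mathrm{in}}_t}{dt} = \frac{T^{\mathrm{out}}_t - T^{\mathrm{in}}_t}{R} + Q_t - \eta P_t .
\]
Setting the left-hand side to zero gives the steady-state cooling power
\[
  P^{\mathrm{ss}}_t = \frac{1}{\eta} \left[ \frac{T^{\mathrm{out}}_t - T^{\mathrm{in}}_t}{R} + Q_t \right],
\]
so that $P_t < P^{\mathrm{ss}}_t$ implies $dT^{\mathrm{in}}_t/dt > 0$ (the building is warming) and $P_t > P^{\mathrm{ss}}_t$ implies $dT^{\mathrm{in}}_t/dt < 0$ (the building is cooling).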

Chapter 4 lays the groundwork for a future study, in which I hope to estimate the thermostat set point schedules of a large sample of households metered by AMI. This future work will provide an important empirical estimate of the energy currently being saved as a result of variable set point schedules. It will also provide an important behavioral baseline for the current practices of consumers; the results of energy efficiency or demand response programs should be measured against this baseline.