Lawrence Berkeley National Laboratory
Extracting baseline electricity usage using gradient tree boosting
- Author(s): Kim, T
- Lee, D
- Choi, J
- Spurlock, A
- Sim, A
- Todd, A
- Wu, K
- et al.
Published Web Locationhttps://sdm.lbl.gov/oapapers/DataCom2015_kim_report.pdf
© 2015 IEEE. To understand how specific interventions affect a process observed over time, we need to control for the other factors that influence outcomes. Such a model that captures all factors other than the one of interest is generally known as a baseline. In our study of how different pricing schemes affect residential electricity consumption, the baseline would need to capture the impact of outdoor temperature along with many other factors. In this work, we examine a number of different data mining techniques and demonstrate Gradient Tree Boosting (GTB) to be an effective method to build the baseline. We train GTB on data prior to the introduction of new pricing schemes, and apply the known temperature following the introduction of new pricing schemes to predict electricity usage with the expected temperature correction. Our experiments and analyses show that the baseline models generated by GTB capture the core characteristics over the two years with the new pricing schemes. In contrast to the majority of regression based techniques which fail to capture the lag between the peak of daily temperature and the peak of electricity usage, the GTB generated baselines are able to correctly capture the delay between the temperature peak and the electricity peak. Furthermore, subtracting this temperature-adjusted baseline from the observed electricity usage, we find that the resulting values are more amenable to interpretation, which demonstrates that the temperature-adjusted baseline is indeed effective.