With advances in energy metering, communication, and analytic software technologies, providers of Energy Management and Information Systems (EMIS) are opening new frontiers in building energy efficiency. Through their engagement platforms and interfaces, EMIS products can enable energy savings through multiple strategies, including equipment operational improvements and upgrades, and occupant behavioral changes. These products often quantify whole-building savings relative to a baseline period using methods that predict energy consumption from key parameters such as ambient weather conditions and operation schedule. These automated baseline models streamline the measurement and verification (M&V) process and are of critical importance to owners and utility program stakeholders implementing multi-measure energy efficiency programs.

This paper presents the results of a PG&E Emerging Technology program, undertaken to advance capabilities in evaluating EMIS products for building-level baseline energy modeling. A general methodology to evaluate baseline model performance was developed and used with hourly whole-building energy data from nearly 400 small and large commercial buildings. Evaluation metrics describing model accuracy were identified and assessed for their appropriateness in describing baseline model performance, as well as their usefulness for identifying and pre-screening buildings for whole-building savings estimation suitability. The state of five public-domain models was assessed using the methodology and test data set, and implications for whole-building M&V are described. Finally, a protocol was developed to test EMIS vendors' proprietary models while navigating practical issues concerning test data security, vendor intellectual property, and maintaining appropriate testing blinds, while processing a large data set. Ongoing work entails stakeholder vetting, demonstration of the test procedures with new baseline models solicited from the public, and publication of the results for industry adoption.
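
As an illustration of what an accuracy metric for a baseline model can look like, the sketch below computes two statistics that are widely used in M&V practice: the coefficient of variation of the root mean squared error, CV(RMSE), and the normalized mean bias error, NMBE. This abstract does not name the metrics that were evaluated, so the choice of these two is an assumption. The function names, the use of n rather than n minus the number of model parameters in the denominator, and the sign convention (actual minus predicted) are also assumptions made for simplicity.

```python
import math


def cv_rmse(actual, predicted):
    """CV(RMSE) in percent: RMSE of the predictions divided by mean actual load.

    Assumption: the mean squared error is divided by n, not by n minus the
    number of model parameters as some guidelines specify.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return 100.0 * math.sqrt(mse) / mean_actual


def nmbe(actual, predicted):
    """NMBE in percent: mean signed error divided by mean actual load.

    Assumption: error is taken as actual minus predicted, so a positive
    value means the model under-predicts on average.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    bias = sum(a - p for a, p in zip(actual, predicted)) / n
    return 100.0 * bias / mean_actual


if __name__ == "__main__":
    # Hypothetical hourly loads in kWh, for illustration only (not study data).
    metered = [10.0, 20.0, 30.0]
    modeled = [12.0, 18.0, 30.0]
    print(f"CV(RMSE) = {cv_rmse(metered, modeled):.2f}%")
    print(f"NMBE     = {nmbe(metered, modeled):.2f}%")
```

In this toy case the over- and under-predictions cancel, so NMBE is zero while CV(RMSE) is not. This is one reason a bias metric and a scatter metric are usually reported together when judging a baseline model.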