Evaluations of nonintrusive load monitoring (NILM) algorithms and technologies have mostly occurred in constrained, artificial environments; few field evaluations of NILM products have taken place in actual buildings under normal operating conditions. This paper describes a field evaluation of a state-of-the-art NILM product tested in eight homes. The match rate metric, a technique recommended by a technical advisory group, was used to measure the NILM's success in identifying specific loads and the accuracy of its energy consumption estimates. A performance assessment protocol was also developed to address common issues with NILM mislabeling and ground-truth comparisons that have not been sufficiently addressed in past evaluations. The NILM product's estimates were compared to the submetered consumption of eight major appliances. Overall, the product showed good performance in disaggregating the energy consumption of electric water heaters, which included both electric resistance and heat-pump water heaters, but only fair accuracy for refrigerators, dryers, and air conditioners. Performance was poor for cooking equipment, furnace fans, clothes washers, and dishwashers. Moreover, the product was often unable to detect major loads: typically, two or more appliances went undetected in a home. At least two dryers, furnace fans, and air conditioners went undetected across the eight homes. On the other hand, the dishwasher was detected in all homes where available or monitored. The key findings were qualitatively compared to those of past field evaluations. Potential areas for improvement in NILM product performance were identified, along with areas where complementary technologies may be able to aid in load-disaggregation applications.