Escherichia coli (E. coli) levels in streams are a public health indicator, so it is important to be able to explain why E. coli levels are sometimes high and sometimes low. Using citizen science data from Fall Creek in central New York, we found that principal component analysis (PCA) and partial least squares (PLS) regression complement each other: PCA provided insights into the drivers of E. coli, and PLS regression provided a mechanism for predicting E. coli levels. We found that stormwater, temperature/season, and shallow subsurface flow are the three dominant processes driving the fate and transport of E. coli. PLS regression modeling provided very good predictions under stormwater conditions (R² = 0.85 for log(E. coli concentration) and R² = 0.90 for log(E. coli loading)); predictions under baseflow conditions were less robust. However, in our case, both E. coli concentration and E. coli loading were significantly higher under stormwater conditions, so predicting high-flow E. coli hazards is probably more important than predicting low-flow ones. In addition to previously reported indicators of in-stream E. coli levels, nitrate-/nitrite-nitrogen and soluble reactive phosphorus were found to be good indicators. These findings suggest management practices that could reduce E. coli concentrations and loads in streams and, eventually, reduce the risk of waterborne disease outbreaks.
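
As a rough illustration of the kind of PLS prediction and R² evaluation summarized above, the following minimal sketch fits a single-component PLS1 model to a log-transformed response using only the Python standard library. It is not the model used in this study: the predictor choice, the function names (`pls1_one_component`, `predict`, `r_squared`), and every numeric value are hypothetical placeholders invented for the example, and predictors are only mean-centered (a real analysis would normally also scale them and choose the number of components by cross-validation).

```python
import math


def pls1_one_component(X, y):
    """Fit a one-component PLS1 model (illustrative sketch only).

    X is a list of rows of predictor values; y is a list of responses.
    Returns (predictor means, response mean, regression coefficients).
    """
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Mean-center predictors and response (no variance scaling here).
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in predictor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centered response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    coef = [q * wj for wj in w]
    return x_means, y_mean, coef


def predict(model, X):
    """Predict responses for new rows of X from a fitted model."""
    x_means, y_mean, coef = model
    return [
        y_mean + sum((row[j] - x_means[j]) * coef[j] for j in range(len(coef)))
        for row in X
    ]


def r_squared(y, y_hat):
    """Coefficient of determination between observed and predicted values."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical placeholder data, NOT measurements from Fall Creek.
# Columns: nitrate-/nitrite-nitrogen, soluble reactive phosphorus (arbitrary units).
X = [
    [0.8, 0.010],
    [1.1, 0.018],
    [1.6, 0.035],
    [2.1, 0.060],
    [1.0, 0.015],
    [2.4, 0.072],
]
# Hypothetical E. coli concentrations, modeled on a log10 scale.
ecoli = [120.0, 450.0, 2300.0, 9800.0, 310.0, 15000.0]
y = [math.log10(c) for c in ecoli]

model = pls1_one_component(X, y)
r2 = r_squared(y, predict(model, X))
print(f"Training R^2 on log10(E. coli): {r2:.2f}")
```

The R² computed here is a training-set value on placeholder data and says nothing about the results reported above. In practice an established implementation, such as `PLSRegression` in scikit-learn, would be preferable to hand-written code.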