Multi-model ensemble strategy is a means to exploit the diversity of skillful predictions from different models. This paper studies the use of Bayesian model averaging (BMA) scheme to develop more skillful and reliable probabilistic hydrologic predictions from multiple competing predictions made by several hydrologic models. BMA is a statistical procedure that infers consensus predictions by weighing individual predictions based on their probabilistic likelihood measures, with the better performing predictions receiving higher weights than the worse performing ones. Furthermore, BMA provides a more reliable description of the total predictive uncertainty than the original ensemble, leading to a sharper and better calibrated probability density function (PDF) for the probabilistic predictions. In this study, a nine-member ensemble of hydrologic predictions was used to test and evaluate the BMA scheme. This ensemble was generated by calibrating three different hydrologic models using three distinct objective functions. These objective functions were chosen in a way that forces the models to capture certain aspects of the hydrograph well (e.g., peaks, mid-flows and low flows). Two sets of numerical experiments were carried out on three test basins in the US to explore the best way of using the BMA scheme. In the first set, a single set of BMA weights was computed to obtain BMA predictions, while the second set employed multiple sets of weights, with distinct sets corresponding to different flow intervals. In both sets, the streamflow values were transformed using Box-Cox transformation to ensure that the probability distribution of the prediction errors is approximately Gaussian. A split sample approach was used to obtain and validate the BMA predictions. The test results showed that BMA scheme has the advantage of generating more skillful and equally reliable probabilistic predictions than original ensemble. The performance of the expected BMA predictions in terms of daily root mean square error (DRMS) and daily absolute mean error (DABS) is generally superior to that of the best individual predictions. Furthermore, the BMA predictions employing multiple sets of weights are generally better than those using single set of weights. © 2006 Elsevier Ltd. All rights reserved.