The hydration of the bentonite barrier in the early stage of a geologic nuclear waste repository is critical to the repository's long-term performance and safety, because the bentonite might be permanently altered, which would subsequently affect the function of the barrier. Large-scale in situ testing integrated with modeling analysis is an effective way to study the key processes affecting the hydration of a bentonite barrier. In this paper, by comparing coupled thermal, hydrological, mechanical, and chemical (THMC) models with data from a long-term in situ test, we attempt to pinpoint the importance of non-Darcian flow, thermal osmosis, and hydro-mechanical coupling (porosity and permeability change due to swelling) to the hydration rate of the bentonite barrier under heating conditions. We found that a TH model equipped with non-Darcian flow severely underestimates the relative humidity and water content measured in the bentonite. Calibration of the parameters associated with relative permeability overshadows the contribution of non-Darcian flow, and non-Darcian flow under unsaturated conditions is not yet fully understood. An empirical relationship between saturated permeability and dry density was found to work better than a saturated permeability expressed as a function of effective stress in matching the relative humidity data, the water content data, and the chloride concentration in pore water. We also found that chemical data are helpful in calibrating the THM model. The relevance of thermal osmosis to the hydration process, in terms of matching models and data, remains an open question: although a THMC model with thermal osmosis matches all THMC data well, a similar goodness of fit can be achieved by a THMC model without thermal osmosis but with lower permeability. 
We learned that the robustness of a model can be increased by testing it against long-term data and multiple types of data. Given that non-uniqueness is inevitable, more independent measurements of key parameters, together with multi-scale and multi-physics tests, may help approximate the right model for evaluating the safety of the repository.