Electronic health records (EHRs) and linked biobanks have tremendous potential to advance biomedical research and ultimately improve the health of future generations. Repurposing EHR data for research is not without challenges, however. In this paper, we describe the processes and considerations necessary to successfully access and utilize a data warehouse for research. Although imperfect, data warehouses are a powerful tool for harnessing a large amount of data to phenotype disease. They will have increasing relevance and applications in clinical research with growing sophistication in processes for EHR data abstraction, biobank integration, and cross-institutional linkage.