Over the past decade, there has been a steady increase in studies leveraging genomic and related molecular profiling technologies to identify drug targets and characterize drug effects for particular diseases. This is particularly true for cancer, where researchers can potentially identify genetic and other factors in tumors that indicate appropriate treatments. Many of these studies have used high-throughput drug screening (HTS) strategies applied to, e.g., cancer cell lines that have been profiled at the genetic and other levels. These strategies allow researchers to assess the effects of many different drugs and compounds on cell lines exhibiting wide variation in their genomic profiles, so that associations between elements of those profiles and drug response can be identified. Unfortunately, given the manner in which HTS studies are pursued and the amount of data they generate, problems such as false positives and loss of statistical power must be addressed. These problems fall into three general categories: (i) the reliability of the screening data and of the procedures used to generate them; (ii) the statistical analysis methods used to identify associations between drug responses and other factors measured on the cells used in the screening; and (iii) the differences in genetic architecture among the cell lines used. I have taken a data-driven approach to address each of these concerns.
First, I assessed the reliability and reproducibility of HTS data using data from two different collaborations. One set of analyses involved melanoma cell lines subjected to drug screens in two independent laboratories. I assessed the proportion of variation that could be explained by laboratory and technical effects associated with the design of the experiments and found that, when these sources of variation are quantified and controlled for, signals beyond “noise” can be detected that reflect true drug response. A second set of analyses involved HTS to identify drugs that influence lifespan in Caenorhabditis elegans. As with the study of the melanoma cell lines, I examined the variability in the screening outcome data that could be attributed to plate and plate-specific effects.
Second, I considered different ways of statistically analyzing dose-response data arising from HTS experiments. I evaluated the performance of nonlinear mixed effects (NLME) models relative to traditional models based on an analysis of IC50 values derived from individual cell line drug response profiles. Through simulation studies as well as applications to actual data, I found that testing for differences in dose-response curves using NLME models has greater statistical power to detect gene associations with drug response than tests based on traditional IC50 values.
Third, I assessed differences in gene co-expression among cell lines used in drug screening studies. Such differences can dramatically affect the identification of gene/drug relationships. I found evidence that the way genes are related to each other differs between cell lines used in screening experiments, and I showed how this “re-wiring” of genes can affect the interpretation of drug screen data and the identification of drug targets. Because increasing emphasis will be placed on choosing the right treatment for an individual based on his or her genetic and related profile, which is the goal of “personalized,” “individualized” and “precision” medicine, I believe my analyses and approaches will motivate future studies and lead to more reliable drug screening strategies and results.