QTL Study Design from an Information Perspective
We examine the efficiency of different genotyping and phenotyping strategies in inbred line crosses from an information perspective. This provides a mathematical framework for the statistical aspects of QTL experimental design, while guiding our intuition. Our central result is a simple formula that quantifies the fraction of missing information for any genotyping strategy in a backcross. It includes, as a special case, selective genotyping of only the phenotypically extreme individuals. The formula depends on the squared phenotype and on the uncertainty in our knowledge of the genotypes at a locus. We use this result to answer a variety of questions. First, we examine the cost-information tradeoff as the marker density and the proportion of phenotypically extreme individuals genotyped are varied. Then we evaluate the information content of selective phenotyping designs and the impact of measurement error in phenotyping. A second simple formula quantifies the information content of any combined phenotyping and genotyping design. We extend our results to multi-genotype crosses, such as the F2 intercross, and to multiple-QTL models. We find that when the QTL effect is small, any contrast in a multi-genotype cross benefits from selective genotyping in the same manner as in a backcross. The benefit remains in the presence of a second unlinked QTL with small effect (explaining less than 20% of the variance), but diminishes if the second QTL has a large effect. Software for performing power calculations for backcross and F2 intercross designs, incorporating selective genotyping and marker spacing, is available [in related files].
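To give a feel for the selective genotyping case, the following minimal sketch computes how much information is retained when only the phenotypic extremes of a backcross are genotyped. It is an illustration under stated assumptions, not the formula of this paper: the QTL effect is taken to be small, the phenotype is standardized and approximately normal, each individual's contribution to the information is taken to be proportional to its squared phenotype, genotyped individuals have fully known genotypes at the locus, and ungenotyped individuals contribute nothing. The function name `info_retained` and the parameter `alpha` are hypothetical.

```python
from statistics import NormalDist


def info_retained(alpha: float) -> float:
    """Approximate fraction of information retained when the most extreme
    fraction `alpha` of individuals (alpha/2 in each tail) is genotyped.

    Assumption (not taken from the paper): information per individual is
    proportional to y**2 for a standard normal phenotype y, so the retained
    fraction is E[y^2 ; |y| > z] = alpha + 2 * z * pdf(z), where z is the
    upper alpha/2 quantile of the standard normal distribution.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)  # selection threshold
    return alpha + 2.0 * z * std_normal.pdf(z)


if __name__ == "__main__":
    # Tabulate retained information for a few selection fractions.
    for alpha in (0.1, 0.25, 0.5, 1.0):
        print(f"fraction genotyped = {alpha:.2f}, "
              f"information retained = {info_retained(alpha):.3f}")
```

Under these assumptions the retained fraction always exceeds the fraction genotyped, which is the intuition behind genotyping the extremes first; the gap narrows once marker spacing, large effects, or additional QTL are taken into account.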