eScholarship
Open Access Publications from the University of California

Classification of Acute Leukemia Based on DNA Microarray Gene Expressions Using Partial Least Squares

Abstract

Analysis of microarray data, when starting from raw gene expression intensity data, often takes two main steps. First, preprocess the data by rescaling and standardizing so that overall intensities for each array are equivalent. Second, apply statistical methodologies to answer scientific questions of interest. In this paper, for the data pre-processing step, we introduce a thresholding algorithm for rescaling each array. The second step involves statistical classification and dimension reduction methodologies. For this we introduce the method of partial least squares (PLS) and apply it to the leukemia microarray data set of Golub et al. (1999). We also discuss the use of principal components analysis (PCA), quadratic discriminant analysis (QDA) and logistic discrimination (LD). Finally, we discuss other potential applications of PLS in analyzing gene expression data that address prediction of a target gene, prediction of the reaction in cell lines, assessment of patient survival, and generalisations in predicting multiple classes.
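The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the details of the thresholding method, so the floor and ceiling values, the log transform, and all function names here are assumptions chosen for illustration. The second part computes only the first PLS component for a single 0/1 response, using the standard result that the weight vector is proportional to X^T y for centered X and y.

```python
import math

def threshold_and_standardize(array, floor=100.0, ceiling=16000.0):
    """Clip one array's intensities to [floor, ceiling], log-transform, then
    standardize to mean 0 and unit variance. The bounds and the log transform
    are illustrative assumptions, not values taken from the paper."""
    logged = [math.log10(min(max(v, floor), ceiling)) for v in array]
    mean = sum(logged) / len(logged)
    sd = math.sqrt(sum((v - mean) ** 2 for v in logged) / (len(logged) - 1))
    return [(v - mean) / sd for v in logged]

def center_columns(rows):
    """Subtract each column (gene) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def first_pls_component(X, y):
    """First PLS component for a single response: the weight vector w is
    proportional to X^T y (both centered), and the scores are t = X w."""
    Xc = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    w = [sum(xij * yi for xij, yi in zip(col, yc)) for col in zip(*Xc)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(xij * wj for xij, wj in zip(row, w)) for row in Xc]
    return w, t

# Toy data (made up): 4 samples (arrays) x 3 genes; labels 0/1 stand in
# for two classes.
raw = [
    [120.0, 900.0, 20000.0],
    [80.0, 1100.0, 15000.0],
    [3000.0, 950.0, 150.0],
    [2500.0, 1000.0, 90.0],
]
labels = [0, 0, 1, 1]
X = [threshold_and_standardize(row) for row in raw]
weights, scores = first_pls_component(X, labels)
```

On this toy input the scores of the two classes fall on opposite sides of zero, so the single PLS component acts as a one-dimensional summary of the genes that could then be passed to a classifier such as logistic discrimination or QDA.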
