UC San Diego
A Fully Bayesian Approach to Logistic Regression
- Author(s): Shin, Joanne
- Advisor(s): Coleman, Todd P
- et al.
Binary logistic regression is often used in clinical applications to predict the occurrence of medical conditions within a patient population. The unknown regression weights are typically approximated by point estimates, which discards information about the underlying posterior distribution of the weights. We propose a method that views the logistic regression model from a Bayesian perspective and takes into account the full posterior of the unknown regression coefficients when computing the probability of belonging to the positive class. We refer to this as the Fully Bayesian method. It allows us to quantify the uncertainty in our probability calculations. The work in this paper builds on Kim and Ma’s previous work, in which they demonstrated efficiently solvable fully Bayesian estimation techniques. By solving a convex Kullback-Leibler divergence problem, they obtained a mapping from any log-concave prior to its corresponding posterior distribution, enabling one to draw independent samples from the posterior with ease. Having the full posterior reveals how credible a prediction is and can be used to define an abstain strategy. The data set was created from a subsample of de-identified patient data from Kaiser Permanente and is highly imbalanced between patients who have and have not been diagnosed with asthma. The results show that the Fully Bayesian scheme achieves higher overall accuracy than the point estimate method.
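To make the contrast between the two approaches concrete, the following is a minimal sketch and not the implementation from this work. It assumes that independent posterior samples of the weights are already available (here they are drawn from a placeholder Gaussian rather than obtained through the Kullback-Leibler mapping), and the function names, the standard-deviation-based abstain rule, and the thresholds are all hypothetical choices made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def point_estimate_proba(x, w_hat):
    """Positive-class probability from a single weight vector."""
    return sigmoid(x @ w_hat)

def fully_bayesian_proba(x, w_samples):
    """Average the positive-class probability over posterior samples.

    w_samples has shape (num_samples, num_features). The spread of the
    per-sample probabilities serves as a simple uncertainty measure.
    """
    probs = sigmoid(w_samples @ x)
    return probs.mean(), probs.std()

def predict_or_abstain(x, w_samples, threshold=0.5, max_std=0.15):
    """Return 0 or 1, or None to abstain when the spread is too large.

    The abstain rule and both cutoffs are illustrative assumptions.
    """
    mean, std = fully_bayesian_proba(x, w_samples)
    if std > max_std:
        return None
    return int(mean >= threshold)

# Placeholder posterior samples (assumption: a Gaussian stand-in,
# not samples produced by the method described in the abstract).
rng = np.random.default_rng(0)
w_samples = rng.normal(loc=[1.0, -2.0], scale=0.5, size=(2000, 2))
x = np.array([0.5, 0.1])

mean, std = fully_bayesian_proba(x, w_samples)
point = point_estimate_proba(x, w_samples.mean(axis=0))
print(f"fully Bayesian: {mean:.3f} (std {std:.3f}), point estimate: {point:.3f}")
print("decision:", predict_or_abstain(x, w_samples))
```

When every posterior sample is identical, the averaged probability reduces to the point estimate and the spread is zero, so any difference between the two outputs reflects posterior uncertainty in the weights.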