Predicting Surgical Site Infections Using Machine Learning Approaches with Further Investigation of Bias
Open Access Publications from the University of California

UC Davis

UC Davis Electronic Theses and Dissertations


The digitalization of healthcare records has made patient-centered records, commonly known as Electronic Health Records (EHRs), readily available and has opened opportunities for secondary analysis. EHRs capture patient demographics, laboratory values, vital signs, the medications administered throughout treatment, the type of surgery the patient underwent, outcomes, and much more. This readily available data has enabled the fast-growing fields of Machine Learning and Artificial Intelligence to build reliable patient statistics and gain useful insights that support healthcare providers in making better decisions, thereby improving the quality of healthcare.

This Thesis demonstrates one such use of EHR data: predicting the onset of Surgical Site Infections (SSIs). The objective is to predict and stratify patients who are at risk of developing an SSI by applying various Machine Learning (ML) methods. SSIs are infections that occur at the site of surgery within 30 to 90 days of the procedure, depending on the type of procedure. They account for about 20% of all Hospital-Acquired Infections (HAIs) and have an enormous effect on patients, hospitals, and public health in general. Predicting who may be at risk of an SSI can help clinicians take preventive measures before the infection sets in. Because data is available at both the pre- and post-operative stages, ML methods can be applied at each stage, which helps in drawing better insights.

Using this readily available data has its own downsides. The data carries inherent bias that could have been introduced at different stages of the data acquisition process. Left unaddressed, these biases can creep into the algorithms we employ and result in biased decisions, and this unintentional bias may be unfair to certain groups of the patient population. Identifying and mitigating this bias will result in a “fair” predictive model. In this Thesis, we analyze and demonstrate the issue of ML bias in using retrospective patient data.
