BACKGROUND: As healthcare increasingly digitizes, streaming waveform data is becoming available from a variety of sources, but there remains a paucity of performant clinical decision support systems. For example, in the intensive care unit (ICU), existing automated alarm systems typically rely on simple thresholding that results in frequent false positives. Recurrent false-positive alerts create distrust of alarm mechanisms, which can be directly detrimental to patient health. To improve patient care in the ICU, we need alert systems that are both pervasive and accurate, so that they are informative and trusted by providers.
OBJECTIVE: We aimed to develop a machine learning-based classifier to detect abnormal waveform events, using mechanical ventilation waveform analysis and the detection of harmful forms of ventilation delivery as the use case. We specifically focused on detecting injurious subtypes of patient-ventilator asynchrony (PVA).
METHODS: Using a dataset of breaths recorded from 35 different patients, we used machine learning to create computational models that automatically detect and classify two types of injurious PVA: double trigger asynchrony (DTA) and breath stacking asynchrony (BSA). We examined the use of the synthetic minority over-sampling technique (SMOTE) to overcome class imbalance, varied methods for feature selection, and the use of ensemble methods to optimize model performance.
RESULTS: We created an ensemble classifier that detects DTA at a sensitivity/specificity of 0.960/0.975, BSA at a sensitivity/specificity of 0.944/0.987, and non-PVA events at a sensitivity/specificity of 0.967/0.980.
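The per-class sensitivity/specificity figures above are naturally read as one-vs-rest metrics over the three labels (DTA, BSA, non-PVA). The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `one_vs_rest_metrics` and the toy labels are illustrative assumptions.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return (sensitivity, specificity), treating `positive` as the
    positive class and every other label as negative."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive breath correctly flagged
            else:
                fn += 1  # positive breath missed
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # negative breath correctly left alone
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels for six breaths (not data from the study).
y_true = ["DTA", "DTA", "BSA", "non-PVA", "non-PVA", "non-PVA"]
y_pred = ["DTA", "BSA", "BSA", "non-PVA", "non-PVA", "DTA"]

for label in ("DTA", "BSA", "non-PVA"):
    sens, spec = one_vs_rest_metrics(y_true, y_pred, label)
    print(f"{label}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting each class separately matters here because a classifier can score well on the abundant non-PVA breaths while missing the rarer asynchrony classes.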
CONCLUSIONS: Our results suggest that it is possible to create a high-performing machine learning-based model for detecting PVA in mechanical ventilator waveform data despite both intra-patient and inter-patient variability in waveform patterns and the presence of clinical artifacts such as cough and suction procedures. Our work highlights the importance of addressing class imbalance in clinical data sets and of combining statistical methods with expert knowledge in feature selection.
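To illustrate the class-imbalance step, the core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified, standard-library illustration of that idea, not the implementation used in the study; the function name `smote_sketch`, the two-dimensional feature vectors, and the parameter defaults are assumptions. In practice an established implementation such as `imblearn.over_sampling.SMOTE` would normally be used.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate `n_new` synthetic points, each lying on the line segment
    between a random minority sample and one of its k nearest minority
    neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature vectors for a rare asynchrony class.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority, n_new=5)
print(new_points)
```

Oversampling of this kind is generally applied to the training split only, so that synthetic samples do not leak into the data used to estimate sensitivity and specificity.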