The rapid advancement of deep learning has significantly impacted the medical domain, benefiting various applications including clinical decision-making, personalized treatment, and medical education. Deep learning applications in medicine can be categorized by the data types they use: 1) Numerical measurements modeling: building models on numerical clinical measurements, including static and time-series data; 2) Natural Language Processing (NLP): training models on medical textual data such as doctor-patient conversations and clinical notes; and 3) Multimodal learning: leveraging data from multiple modalities to enhance a model's medical capability and performance. This thesis presents work in all three categories, aiming to advance AI systems that assist clinicians in improving healthcare outcomes and efficiency.
In numerical measurements modeling, despite the effectiveness of deep learning models for decision support, many studies rely on extensive public datasets and overlook the data scarcity typical of small hospital settings. We address this gap by applying domain adaptation techniques to improve mortality prediction for ICU patients when training data are limited.
Concerning NLP, while Large Language Models (LLMs) such as ChatGPT and GPT-4 have shown promising results, privacy concerns restrict their direct use in healthcare. To alleviate these concerns, we propose distilling medical knowledge from LLMs into local models for decision support. Furthermore, instruction tuning has become crucial for aligning LLMs with human intent and has shown potential in medical applications. However, existing medical LLMs overlook the diversity of their tuning data, limiting their ability to follow varied medical instructions and to generalize. This thesis presents a novel approach to generating a diverse, machine-generated medical instruction-following dataset and demonstrates that a model tuned on this dataset achieves superior performance in both the medical and general domains.
For multimodal learning, although multimodal data have improved medical predictions, two challenges persist: modeling the irregularity within each modality, and integrating irregular time information into the multimodal representation. We introduce strategies that address both challenges in multimodal electronic health records to enhance predictions for ICU patients.
Finally, we summarize the key findings and discuss future research directions to push the boundaries of deep learning in medical applications.