Self-supervised Learning with Radiology Images and Reports

Developing a system that learns meaningful representations of patient radiology data for downstream tasks is one challenge in the healthcare domain that machine learning can address (Ghassemi et al., 2020). This is particularly important because manually labeled radiology datasets are scarce, so leveraging unlabeled data is especially advantageous for pre-training models. Recent studies have shown that it is better to pre-train exclusively on healthcare datasets than to use transfer learning from large non-healthcare models, which risks incorporating non-medical features into the representation (Krishnan et al., 2022), further emphasizing the need for self-supervised learning in healthcare. Moreover, self-supervised pre-training can leverage multimodal data, such as radiology text reports and X-ray images, to further enrich the learned representations. Radiology reports, often drawn from the MIMIC-CXR dataset (Johnson et al., 2019), are especially useful because they contain information-rich sections such as Findings and Impressions; however, it is unclear how much contrastive learning frameworks actually learn from these important sections.

In this project, we will examine the importance of the different radiology report sections and modify an existing contrastive learning framework to improve the quality of the representations learned from the radiology text data. [Paper] [GitHub repository]
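
To make the section-ablation idea concrete, here is a minimal sketch of how the Findings and Impression sections might be pulled out of a free-text MIMIC-CXR-style report. The header list and the `split_report_sections` helper are illustrative assumptions, not the project's actual preprocessing code:

```python
import re

# MIMIC-CXR reports typically use all-caps headers such as "FINDINGS:"
# and "IMPRESSION:". This header list is an illustrative assumption,
# not an exhaustive inventory of the dataset's sections.
SECTION_HEADERS = ["EXAMINATION", "INDICATION", "COMPARISON",
                   "TECHNIQUE", "FINDINGS", "IMPRESSION"]

def split_report_sections(report: str) -> dict:
    """Return a {section_name: section_text} mapping for one report."""
    pattern = r"({}):".format("|".join(SECTION_HEADERS))
    parts = re.split(pattern, report)
    # re.split with a capturing group yields
    # [preamble, header, body, header, body, ...]
    return {header.strip(): body.strip()
            for header, body in zip(parts[1::2], parts[2::2])}

report = """INDICATION: Cough and fever.
FINDINGS: The lungs are clear. No pleural effusion.
IMPRESSION: No acute cardiopulmonary process."""

sections = split_report_sections(report)
print(sections["IMPRESSION"])  # -> No acute cardiopulmonary process.
```

With sections isolated like this, one can pre-train with individual sections (or combinations of them) as the text view and compare downstream performance, which is the kind of ablation the project describes.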
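
For context on the kind of objective such frameworks optimize, below is a minimal PyTorch sketch of a symmetric InfoNCE loss over paired image and report embeddings, in the style of CLIP-like image-text models. The function name, temperature value, and random inputs are assumptions for illustration; the specific framework modified in the project may differ:

```python
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(img_emb: torch.Tensor,
                                txt_emb: torch.Tensor,
                                temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of matched image/report pairs.

    img_emb, txt_emb: (batch, dim) projections from the image encoder
    and text encoder, respectively.
    """
    # Cosine similarity between every image and every report in the batch.
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # (batch, batch)

    # The i-th image matches the i-th report, so the targets are the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image
    return 0.5 * (loss_i2t + loss_t2i)

# Toy usage with random tensors standing in for encoder outputs.
img = torch.randn(8, 128)
txt = torch.randn(8, 128)
print(image_text_contrastive_loss(img, txt).item())
```

Under this objective, whatever text is fed to the text encoder determines which report content the image representations are pulled toward, which is why the choice of report section matters.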