Benchmarks and methods for 3D medical image retrieval

Asma Ben Abacha; Alberto Santamaria-Pang; H. Lee; Jameson Merkow; Qin-Lei Cai; Surya Teja Devarakonda; A. Islam; Julia Gong; Matthew P Lungren; Reza Forghani; Alekh Jindal; Thomas Lin; Noel Codella; Ivan Tarapov

Scientific Reports | April 2026


The increasing use of medical imaging in healthcare settings adds to radiologists' workload, yet it also offers an opportunity to improve healthcare outcomes if leveraged effectively. Artificial Intelligence (AI)-based 3D medical image retrieval has the potential to ease this burden by offering evidence-based diagnostics and predictions that extend radiologists' capacity and accuracy, while also supporting output verification for safety and regulatory compliance. Despite its promise, the field of 3D medical image retrieval lacks established evaluation benchmarks, comprehensive datasets, and rigorous evaluation studies. This paper aims to address these gaps by introducing the first benchmark for 3D Medical Image Retrieval (3D-MIR) and evaluating various pre-trained models and implementation approaches for retrieval. The benchmark includes four anatomies (Liver, Colon, Pancreas, and Lung) imaged using computed tomography (CT). A range of 3D image search strategies is explored, including those that use aggregated 2D slices or 3D volumes as queries (Image-to-Image) and those that use text embeddings from popular foundation models as queries (Text-to-Image). Additionally, novel multi-modal and supervised fine-tuning approaches are investigated to generate multi-modal embeddings for 3D image retrieval. The paper provides quantitative and qualitative assessments of each approach, along with an in-depth discussion offering insights for future research and solutions to support clinical decision-making and healthcare applications. To foster advancement in this field, the benchmark, models, and code are made publicly available.
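As a rough illustration of the Image-to-Image strategy based on aggregated 2D slices, the sketch below pools per-slice embeddings into one vector per volume and ranks indexed volumes by cosine similarity to a query. It is not the paper's implementation: mean pooling as the aggregation step, the function names (`mean_pool`, `cosine_similarity`, `retrieve`), and the toy two-dimensional embeddings are all assumptions made for this example. In practice the slice embeddings would come from a pre-trained image encoder.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def mean_pool(slice_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Average per-slice embeddings into one volume-level embedding.

    Mean pooling is an assumed aggregation choice for this sketch.
    """
    if not slice_embeddings:
        raise ValueError("a volume needs at least one slice embedding")
    dim = len(slice_embeddings[0])
    count = len(slice_embeddings)
    return [sum(s[i] for s in slice_embeddings) / count for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: Sequence[float], index: Dict[str, Vector], top_k: int = 3) -> List[str]:
    """Return the IDs of the top_k indexed volumes most similar to the query."""
    ranked = sorted(
        index,
        key=lambda volume_id: cosine_similarity(query, index[volume_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy slice embeddings (hypothetical values standing in for encoder outputs).
volumes = {
    "vol_a": [[1.0, 0.0], [0.8, 0.2]],
    "vol_b": [[0.0, 1.0], [0.2, 0.8]],
    "vol_c": [[0.5, 0.5], [0.5, 0.5]],
}

# Build the index: one pooled embedding per volume.
index = {volume_id: mean_pool(slices) for volume_id, slices in volumes.items()}

# The query volume is embedded and pooled the same way as the indexed volumes.
query = mean_pool([[0.9, 0.1], [1.0, 0.0]])
results = retrieve(query, index, top_k=2)
print(results)
```

A Text-to-Image variant would keep the same ranking step and swap the query vector for a text embedding, provided the text and image encoders share an embedding space.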