CVPR 2024 Workshop on Data Curation and Augmentation in Medical Imaging

Event: CVPR 2024 Workshop · Duration: 226 min

Abstract

The CVPR 2024 Workshop on Data Curation and Augmentation in Medical Imaging (DCAinMI) focused on the critical role of high-quality data in advancing computer vision and AI applications in medical imaging. The workshop explored innovative strategies for data generation, curation, and augmentation, addressing challenges such as domain specificity, data scarcity, and class imbalance. Presentations highlighted the synergistic development between computer vision advancements and medical imaging, showcasing successful applications and discussing future directions for robust and generalizable AI models in healthcare. Key topics included the impact of AI-generated content, the development of complex AI agent systems, and the ethical considerations surrounding protected attributes in medical image analysis. The event also featured an awards ceremony recognizing outstanding contributions in the field.

Speakers

  • Sockey Chen — Program Chair
  • Dr. James Zou — Stanford University
  • Dr. Anthony Jarc — Intuitive Surgical Inc.
  • Fiona R. Kolbinger — Purdue University
  • Abril Corona-Figueroa — Durham University
  • Soham Gadgil — University of Washington
  • Hallee Wong — MIT CSAIL
  • Yumnah Hasan — University College Cork, Ireland
  • James Holcomb — UT Southwestern Medical Center

Talks (11)

  • 00:01:00 Sockey Chen: Opening Remarks
    • Introduces the workshop, highlights the importance of computer vision and medical imaging, discusses challenges in medical imaging data, and outlines the workshop’s topics and schedule.
  • 01:10:00 Dr. James Zou: Data challenges and opportunities in the era of generative AI
    • Explores the impact of AI-generated data on research and the necessity of curating high-quality datasets for training generative AI models, particularly in medical imaging, and introduces novel methods for optimizing complex AI agent systems.
  • 01:43:00 Dr. Anthony Jarc: Data and ML in robotic surgery: Translation to clinical environments
    • Discusses the evolution of surgical interactions, the capabilities of robotic platforms, and the role of data science and AI/ML in improving patient care through objective metrics and insights from surgical procedures.
  • 02:18:20 Fiona R. Kolbinger: Strategies to Improve Real-World Applicability of Laparoscopic Anatomy Segmentation Models
    • Investigates how training data composition and class weights impact segmentation performance in laparoscopic anatomy segmentation, highlighting the benefits of negative data for small and large organs.
  • 02:38:00 Abril Corona-Figueroa: Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling
    • Presents a 2D to 3D image translation framework for CT reconstruction from X-ray projections, focusing on retaining 2D information and generalizing with limited data, and discusses its application to dense correspondence on angiograms.
  • 02:41:00 Soham Gadgil: Discovering mechanisms underlying AI prediction of protected attributes via data auditing
    • Investigates how AI models predict protected attributes like sex from medical images, highlighting the impact of domain shifts on generalization and proposing methods like counterfactual image generation and concept differential analysis to understand the underlying reasoning process.
  • 02:47:00 Hallee Wong: ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image
    • Introduces ScribblePrompt, an interactive segmentation tool designed for flexibility with multiple interaction types, accurate correction incorporation, generalization to unseen data, and fast inference using an efficient CNN.
  • 02:54:00 Yumnah Hasan: A Comparative Analysis of Implicit Augmentation Techniques for Breast Cancer Diagnosis Using Multiple Views
    • Analyzes nine implicit augmentation methods for addressing class imbalance in breast cancer diagnosis from multiple mammography views, evaluating them with GoogLeNet and Haralick features on the DDSM and WBC datasets.
  • 03:00:00 James Holcomb: Advancing Brain Tumor Analysis: Curating a High-Quality MRI Dataset for Deep Learning-Based Molecular Marker Profiling
    • Discusses the importance of high-quality MRI datasets for advancing deep learning in brain tumor analysis, focusing on curating a comprehensive dataset with detailed clinical and molecular profiles to improve prognostic accuracy and patient-specific therapeutic approaches.
  • 03:04:00 Sockey Chen: Awards Ceremony
    • Announces the award winners for Best Paper, Best Paper Runner-up, Bench-to-Bedside Award, and Best Poster Award, recognizing outstanding contributions to data curation and augmentation in medical imaging.
  • 03:08:00 Sockey Chen: Closing Remarks
    • Concludes the oral presentations and invites attendees to the poster session, emphasizing the importance of continued collaboration and engagement in the field.
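
The laparoscopic segmentation talk above concerns how class weights affect performance on organs of different sizes. As a rough illustration only, the sketch below computes inverse-frequency class weights and a weighted cross-entropy on a toy set of pixel labels. The weighting formula, the function names, and the toy data are assumptions made for this example; the summary does not state which weighting scheme the speaker used.

```python
from collections import Counter
from math import log


def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = N / (K * count).

    N is the number of labelled pixels and K the number of classes,
    so rarer classes receive larger weights. (Illustrative scheme,
    not taken from the talk.)
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {c: total / (num_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood over pixels.

    probs[i] maps each class name to the predicted probability for pixel i.
    The sum is normalised by the total weight of the observed labels.
    """
    numerator = sum(weights[y] * -log(p[y]) for p, y in zip(probs, labels))
    denominator = sum(weights[y] for y in labels)
    return numerator / denominator


# Toy "image" of 10 pixels: 8 background, 2 belonging to a small organ.
pixel_labels = ["background"] * 8 + ["organ"] * 2
weights = inverse_frequency_weights(pixel_labels)

# A model that is maximally unsure about every pixel.
uniform_probs = [{"background": 0.5, "organ": 0.5} for _ in pixel_labels]
loss = weighted_cross_entropy(uniform_probs, pixel_labels, weights)

print(weights)
print(loss)
```

With this scheme the two organ pixels together carry the same total weight as the eight background pixels, which is one common way to stop a small structure from being ignored during training.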

Key Takeaways

  • High-quality, diverse, and representative datasets are fundamental for developing robust and generalizable AI models in medical imaging, especially given the increasing prevalence of AI-generated content.
  • Novel data curation and augmentation strategies, including simulating data from 3D models and leveraging expert-annotated social media content, are crucial for overcoming data scarcity and variability challenges.
  • Explainable AI techniques and methods for understanding AI’s reasoning process are essential for building trust, mitigating bias, and ensuring the responsible deployment of AI in clinical settings.
  • The integration of AI/ML into robotic surgery platforms offers significant opportunities for enhancing surgical performance, providing objective feedback for training, and improving patient outcomes through data-driven insights.
  • Collaborative efforts between clinicians, AI engineers, and industry partners are vital for translating research into clinically relevant and impactful solutions, addressing the unique challenges of medical image analysis.

Methods / Models / Datasets Mentioned

  • SAM
  • SAM-Med3D
  • DINO
  • RAD-DINO
  • CheXpert
  • MedPix
  • ChatGPT
  • LLMs
  • da Vinci (Intuitive Surgical robotic platform)
  • PLIP (Pathology Language-Image Pre-training)
  • TextGrad
  • CT
  • X-ray
  • DRR (Digitally Reconstructed Radiographs)
  • CCTA (Coronary Computed Tomography Angiography)
  • MedSAM
  • MIDeepSeg
  • ScribblePrompt
  • ScribblePrompt-UNet
  • ScribblePrompt-SAM
  • ResNet50
  • Counterfactual Image Generation
  • GoogLeNet
  • Haralick features
  • DDSM (Digital Database for Screening Mammography)
  • WBC (Wisconsin Breast Cancer) dataset
  • 1D-CNN
  • MLP
  • ADASYN
  • BSMOTE
  • S-ENN
  • SMOTE
  • S-Tomek
  • SVM-S
  • Mixup
  • STEM
  • STEM/Mixup
  • EHR (Electronic Health Records)
  • XNAT
  • FeTS (Federated Tumor Segmentation)
  • IDH mutation
  • 1p/19q co-deletion
  • MGMT promoter status
  • Grounded-SAM
  • Cutie Video Object Segmentation
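
Mixup appears in the list above among the augmentation methods compared for class imbalance. As a minimal sketch of the general technique (not of the configuration used in the talk), the example below blends two feature vectors and their one-hot labels with a coefficient drawn from a Beta(alpha, alpha) distribution. The function name, the alpha value, and the toy vectors are assumptions made for illustration.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.4, rng=random):
    """Blend two samples and their one-hot labels.

    lam is drawn from Beta(alpha, alpha); the same lam is applied to the
    features and to the labels, so the mixed label stays a valid
    probability vector.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Two toy feature vectors with one-hot labels [benign, malignant].
benign_x, benign_y = [1.0, 0.0, 2.0], [1.0, 0.0]
malignant_x, malignant_y = [0.0, 1.0, 4.0], [0.0, 1.0]

rng = random.Random(0)  # fixed seed so the example is repeatable
mixed_x, mixed_y, lam = mixup(benign_x, benign_y, malignant_x, malignant_y, rng=rng)

print(lam, mixed_x, mixed_y)
```

Small alpha values concentrate lam near 0 or 1, so most mixed samples stay close to one of the two originals; larger values produce more evenly blended samples.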

Topics

Medical Imaging · Computer Vision · Data Curation · Data Augmentation · Generative AI · Explainable AI · Robotic Surgery · Brain Tumor Analysis · Breast Cancer Diagnosis · Interactive Segmentation
