DEF-AI-MIA Workshop at CVPR 2024

Event: CVPR 2024 · Duration: 306 min

Abstract

The opening segment features the opening remarks and several presentations from the DEF-AI-MIA Workshop at CVPR 2024. The workshop focuses on domain adaptation, explainability, and fairness in AI for medical image analysis, and includes a competition on COVID-19 diagnosis from CT scans. Talks cover unsupervised domain adaptation for histology and skin lesion diagnosis, source-free domain adaptation for object localization, multi-scale interpretable deep learning for mammography, zero-shot medical image segmentation, and automatic biomarker extraction from medical images.

Another segment presents a series of short talks on applications of AI in medical imaging. Topics include enhancing cell segmentation with uncertainty-informed active learning, developing interpretable deep learning models for mass margin classification in mammography, using complex style image transformations for domain generalization, and building prototype-based interpretable networks for glaucoma detection. The segment also covers the interpretation of COVID-19 lateral flow tests using foundation models, an efficient transformer for 3D medical image segmentation, and a multiple instance learning framework for robust medical diagnosis.

A further segment explores multimodal AI data fusion in healthcare, emphasizing the integration of diverse data types such as medical records, scans, and genomics for improved diagnostic and prognostic predictions. It introduces two techniques: the Multi-modal Outer Arithmetic Block (MOAB) and Flattened Outer Arithmetic Attention (FOAA). MOAB uses bilinear fusion with arithmetic operations to intermingle features and is shown to improve the separation of brain tumor grades. FOAA extends these ideas to attention mechanisms and is reported to outperform existing methods on both brain tumor and breast tumor datasets.

Another segment introduces the Interactive Medical Image Learning (IMIL) Framework, which uses targeted clinician feedback to improve model performance and interpretability. It also presents residual-based language models as ‘free boosters’ for biomedical imaging tasks, reporting improved performance across various medical image analysis challenges. In addition, it introduces LaPA, a Latent Prompt Assist Model for Medical Visual Question Answering, designed to improve the accuracy and interpretability of medical image analysis through latent prompts and multi-modal fusion. Finally, it presents an approach to fine-grained medical activity recognition in trauma resuscitation using actor tracking, aimed at more accurate and efficient monitoring and decision-making in critical medical scenarios.

Speakers

  • Dimitrios Kollias — NTUA
  • Ruby Wood — University of Oxford
  • Janet Wang — Tulane University
  • Alexis Guichemerre — ETS Montreal
  • Julia Yang — Duke University
  • Sidra Aleem — Dublin City University
  • Ronald M. Summers — NIH
  • Greg Slabaugh — Professor of Computer Vision and AI, Director of the Digital Environment Research Institute (DERI) at Queen Mary University of London
  • Bob Zhang

Talks (20)

  • 00:00:00 — Dimitrios Kollias: Domain adaptation, Explainability, Fairness in AI for Medical Image Analysis (DEF-AI-MIA) Workshop
    • Introduction to the DEF-AI-MIA workshop, its scope, aims, and competition challenges, including thanks to sponsors and introduction of organizers.
  • 00:15:45 — Ruby Wood: Cluster Triplet Loss for Unsupervised Domain Adaptation on Histology Images
    • Presents a method using cluster triplet loss for unsupervised domain adaptation on histology images to predict patient response to radiotherapy.
  • 00:21:55 — Janet Wang: Achieving Reliable and Fair Skin Lesion Diagnosis via Unsupervised Domain Adaptation
    • Investigates the effectiveness of unsupervised domain adaptation (UDA) for training skin lesion classifiers on various public datasets, especially when labeled data from the target set is unavailable, and for improving fairness.
  • 00:25:00 — Alexis Guichemerre: Source-free Domain Adaptation of Weakly-supervised Object Localization Models for Histology
    • Discusses source-free domain adaptation (SFDA) methods in the context of weakly-supervised object localization (WSOL) for histology images, exploring different SFDA techniques and their performance.
  • 00:30:50 — Julia Yang: FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
    • Introduces FPN-IAIA-BL, a multi-scale interpretable deep learning model designed to classify mass margins in digital mammography, focusing on interpretability and localization at different scales.
  • 00:30:50 — Sidra Aleem: Test-Time Adaptation with SALIP: A Cascade of SAM and CLIP for Zero-Shot Medical Image Segmentation
    • Presents a method called SALIP that combines SAM and CLIP models for zero-shot medical image segmentation, focusing on test-time adaptation.
  • 00:31:15 — Ronald M. Summers: Automatic Extraction of Biomarkers Through Deep Learning and Explainable Disease Diagnosis
    • Discusses the automatic extraction of biomarkers from medical images using deep learning, emphasizing explainability for disease diagnosis and its application in large-scale body composition analysis.
  • 01:16:55 — David Anglada-Rotger: Enhancing Ki-67 Cell Segmentation with Dual U-Net Models: A Step Towards Uncertainty-Informed Active Learning
    • This talk presents a dual U-Net model for Ki-67 cell segmentation that incorporates uncertainty-informed active learning to improve performance.
  • 02:31:30 — Julia Yang, Alina Jade Barnett, Jon Donnelly, Satvik Kishore, Jerry Fang, Fides Regina Schwartz, Chaofan Chen, Joseph Y. Lo, Cynthia Rudin: FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography
    • This talk introduces a multi-scale interpretable deep learning model (FPN-IAIA-BL) for classifying mass margins in digital mammography.
  • 02:33:00 — Greg Slabaugh: Multimodal AI in Healthcare: Attention, Salience, Global/local analysis
    • Introduction to multimodal data in healthcare and the need for AI data fusion.
  • 03:11:30 — Nikolaos Spanos, Anastasios Arsenos, Paraskevi-Antonia Theofilou, Paraskevi Tzouveli, Athanasios Voulodimos, Stefanos Kollias: Complex Style Image Transformations for Domain Generalization in Medical Images
    • This talk explores complex style image transformations within an augmentation framework to improve domain generalization in medical image analysis.
  • 03:42:30 — Mohana Singh, BS Vivek, Jayavardhana Gubbi, Arpan Pal: Prototype-based Interpretable Network for Glaucoma Detection
    • This talk proposes a prototype-based interpretable network for glaucoma detection, focusing on learning class-specific prototypes.
  • 03:49:30 — Dimitrios Kollias: Interactive Medical Image Learning (IMIL) Framework
    • This talk introduces the Interactive Medical Image Learning (IMIL) Framework, a novel approach to medical image analysis that leverages targeted clinician feedback to improve model performance and interpretability.
  • 03:55:10 — Bob Zhang: Residual-based Language Models are Free Boosters for Biomedical Imaging Tasks
    • This talk presents a novel approach using residual-based language models as ‘free boosters’ for biomedical imaging tasks, demonstrating their effectiveness in improving performance across various medical image analysis challenges.
  • 04:00:29 — Tiancheng Gu: LaPA: Latent Prompt Assist Model For Medical Visual Question Answering
    • This talk introduces LaPA, a Latent Prompt Assist Model for Medical Visual Question Answering, designed to improve the accuracy and interpretability of medical image analysis by leveraging latent prompts and multi-modal fusion.
  • 04:05:15 — Wenjin Zhang: Focusing on What Matters: Fine-grained Medical Activity Recognition for Trauma Resuscitation via Actor Tracking
    • This talk presents a novel approach to fine-grained medical activity recognition in trauma resuscitation using actor tracking, aiming to improve the accuracy and efficiency of monitoring and decision-making in critical medical scenarios.
  • 04:07:30 — Stuti Pandey, Josh Myers-Dean, Jarek Reynolds, Danna Gurari: Interpreting COVID Lateral Flow Tests’ Results with Foundation Models
    • This talk investigates the use of modern foundation models for interpreting COVID-19 lateral flow test results, focusing on identifying and grounding test components.
  • 04:57:30 — Jakub Laszczyk, Mohamed: Using counterfactual information for breast classification diagnosis
    • This talk explores the use of counterfactual information to improve breast classification diagnosis.
  • 06:32:30 — Shehan Perera, Pouyan Navard, Alper Yilmaz: SegFormer3D: An Efficient Transformer for 3D Medical Image Segmentation
    • This talk introduces SegFormer3D, a lightweight and efficient transformer architecture for 3D medical image segmentation.
  • 06:57:30 — D. J. Araújo, M. R. Verdelho, A. Bissoto, J. C. Nascimento, C. Santiago, C. Barata: Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis
    • This talk proposes a multiple instance learning framework that uses key patches for robust medical diagnosis, addressing spurious correlations in attention maps.

Key Takeaways

  • The DEF-AI-MIA workshop addresses critical challenges in applying AI to medical imaging, focusing on robustness, interpretability, and ethical considerations like fairness.
  • Various novel approaches are presented, including unsupervised domain adaptation techniques leveraging clustering and contrastive learning, and multi-scale interpretable models for specific medical tasks.
  • The importance of explainability and localization in medical AI is highlighted, especially for clinical adoption and trust.
  • Large-scale body composition analysis using automated AI tools on CT scans shows promise for predicting cardiovascular risk and overall survival.
  • Uncertainty-informed active learning can significantly enhance cell segmentation performance in medical imaging.
  • Interpretable deep learning models are crucial for clinical adoption, especially in complex tasks like mass margin classification.
  • Domain generalization techniques, including style transformations and augmentation, are vital for robust AI models across diverse medical datasets.
  • Foundation models show promise in interpreting medical test results, but challenges remain in ensuring accuracy and explainability.
  • Multimodal data fusion is crucial in healthcare to leverage correlations and complementarity across diverse data types for improved diagnostic and prognostic predictions.
  • The Multi-modal Outer Arithmetic Block (MOAB) is a novel bilinear fusion technique that effectively intermingles unimodal and multimodal features, leading to better data separation and classification.
  • Flattened Outer Arithmetic Attention (FOAA) extends MOAB’s arithmetic operations into an attention mechanism, achieving state-of-the-art performance in multimodal medical image analysis tasks.
  • These fusion techniques show promise in complex medical problems like brain tumor grading and rheumatoid arthritis, where traditional single-modality approaches may neglect crucial clinical context.
  • The IMIL Framework significantly improves model accuracy and calibration by incorporating clinician feedback, demonstrating a 4% increase in accuracy with only 4% of the dataset augmented.
  • Residual-based language models can act as ‘free boosters’ for biomedical imaging tasks, enhancing performance across various medical image analysis challenges.
  • The LaPA model, utilizing latent prompts and multi-modal fusion, shows exceptional performance in medical visual question answering, outperforming state-of-the-art methods.
  • Fine-grained medical activity recognition in trauma resuscitation can be effectively achieved through actor tracking, leading to more clinically relevant attention and calibrated confidence in AI models.
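
The outer arithmetic fusion idea behind MOAB, noted in the takeaways above, can be illustrated with a small NumPy sketch. This is not the speakers' implementation: the function name `outer_arithmetic_fusion`, the choice of four operations (product, quotient, sum, difference), the appended constants, and the `eps` stabilizer are all assumptions made for illustration. It only shows how pairwise arithmetic between two modality feature vectors can yield a stack of interaction maps that still contains each modality's original features.

```python
import numpy as np

def outer_arithmetic_fusion(a, b, eps=1e-6):
    """Fuse two 1-D feature vectors into a stack of pairwise-interaction maps.

    Hypothetical helper for illustration only; not taken from the talk.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Assumption: append a neutral element to each vector so that the last
    # row/column of every map carries the unimodal features (unchanged up to
    # sign, or up to eps for the quotient): 1 for product and quotient,
    # 0 for sum and difference.
    a1, b1 = np.append(a, 1.0), np.append(b, 1.0)
    a0, b0 = np.append(a, 0.0), np.append(b, 0.0)
    product = np.outer(a1, b1)                    # a_i * b_j
    quotient = a1[:, None] / (b1[None, :] + eps)  # a_i / b_j, eps guards exact zeros only
    total = a0[:, None] + b0[None, :]             # a_i + b_j
    difference = a0[:, None] - b0[None, :]        # a_i - b_j
    # Stack the four maps along a leading "channel" axis.
    return np.stack([product, quotient, total, difference])

# Example: a 2-D image embedding fused with a 3-D genomic embedding.
fused = outer_arithmetic_fusion([1.0, 2.0], [3.0, 4.0, 5.0])
print(fused.shape)
```

For inputs of length 2 and 3, the result has shape (4, 3, 4): one channel per operation, with the extra row and column holding the unimodal features. In a full model, a stack like this would typically be passed to further learned layers; that downstream design is outside the scope of this sketch.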

Methods / Models / Datasets Mentioned

  • ADDA (Adversarial Discriminative Domain Adaptation)
  • Ablation-CAM
  • Active Learning
  • AdaDSA
  • Agatston Score
  • Augmentation
  • BMI (Body Mass Index)
  • CAM (Class Activation Map)
  • CDCL (Cross-Domain Contrastive Learning)
  • CLIP (Contrastive Language-Image Pre-training)
  • CNN
  • CNN-based patch encoders
  • Cluster Triplet Loss
  • ConvNeXt
  • CutMix
  • CutOut
  • DANN (Domain Adversarial Neural Network)
  • DeepMHL
  • Distill-SODA
  • ERM (Empirical Risk Minimization)
  • FOAA
  • FPN (Feature Pyramid Network)
  • FPN-IAIA-BL
  • FRS (Framingham Risk Score)
  • Foundation Models
  • Grad-CAM
  • Grad-CAM++
  • Graph Neural Network
  • Grounding
  • IAIA-BL (Interpretable AI Algorithm for Breast Lesions)
  • Image Captioning
  • K-Means
  • KF
  • LaPA
  • LayerCAM
  • M2F
  • M3SDA (Moment Matching for Multi-Source Domain Adaptation)
  • MDAN (Multi-Source Domain Adversarial Networks)
  • MLP
  • MMD (Maximum Mean Discrepancy)
  • MOAB
  • MixUp
  • Multi-scale deep learning
  • MultiCoFusion
  • Multiple Instance Learning (MIL)
  • Pathomic
  • Prototype-based learning
  • RNN
  • ResNet 3D
  • ResNet-50
  • SAM (Segment Anything Model)
  • SFDA (Source-Free Domain Adaptation)
  • SHOT
  • SRDC
  • SS-CAM
  • Score-CAM
  • SegFormer3D
  • Style Transfer
  • TS-CAM
  • Transformers
  • Triplet Centre Loss
  • U-Net
  • UMAP
  • Video Swin Transformer
  • Video-MAE
  • Vision Transformers (ViT)
  • WSOL (Weakly-Supervised Object Localization)
  • YOLOv8
  • t-SNE

Topics

Active Learning · Activity Recognition · Attention Mechanisms · Bilinear Fusion · Biomarker Extraction · Biomedical Imaging · Body Composition Analysis · Brain Tumor Grading · COVID-19 Diagnosis · Clinician Feedback · Data Fusion · Deep Learning · Domain Adaptation · Domain Generalization · Explainable AI · Fairness in AI · Foundation Models · Glaucoma Detection · Healthcare AI · Histology Image Analysis · Interpretability · Interpretable AI · Language Models · Mammography · Medical Image Analysis · Model Performance · Multimodal AI · Precision Medicine · Rheumatoid Arthritis · Segmentation · Skin Lesion Diagnosis · Trauma Resuscitation · Visual Question Answering · Weakly-supervised Object Localization · Zero-shot Segmentation

