Scalable Real-Time Abnormal Event Detection

Event: CVPR Workshop 2024 · Duration: 398 min

Abstract

This segment covers the introduction to the VAND 2.0 workshop, including its organizers, program committee, and submission statistics. It then features two talks: ‘SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection’ by Mathis Kruse, which introduces a method for detecting anomalies in multi-pose 3D objects using Gaussian Splatting, and ‘Advancing Visual Anomaly Detection: A Comprehensive Open-Source Approach’ by Samet Akcay, which presents Anomalib, an open-source library for anomaly detection. The segment concludes with the beginning of Shai Avidan’s talk, ‘Everything but Anomaly Detection,’ which delves into the conceptual understanding of anomalies and representation.

This segment covers two presentations on anomaly detection. The first speaker introduces a method for coarse-grain anomaly detection using normalizing flows on pose graphs, emphasizing its efficiency and robustness, and proposes a ‘Novel Class Discovery’ approach for aerial imagery. The second speaker presents ‘DMR: Disentangling Marginal Representations for Out-of-Distribution Detection’, which tackles the problem of overconfidence in OOD detection through a pipeline involving encoding, marginal feature manipulation, and synthetic data generation, showcasing improved performance on benchmark datasets.

This segment features two presentations. The first introduces the Text-Align Anomaly Backbone (TAB) model, a novel pre-training framework designed for industrial inspection tasks, leveraging text-image alignment to enhance anomaly detection and defect classification. The second presentation introduces BMAD, a new benchmark for medical anomaly detection, which includes six diverse datasets from five common medical domains and supports 15 state-of-the-art algorithms for standardized evaluation.

This segment features two research talks on scalable real-time abnormal event detection in video, followed by an introduction to the VAND 2.0 Challenge at CVPR 2024. The first talk presents a knowledge distillation approach, leveraging object-centric models to train a fast frame-level anomaly detector. The second talk introduces a self-distilled masked auto-encoder architecture, enhanced with motion gradient weighting and synthetic anomalies to improve efficiency and accuracy. The segment concludes with an overview of the VAND 2.0 Challenge, detailing its categories for robust anomaly detection in real-world applications and few-shot learning for logical/structural detection, emphasizing the need for models adaptable to domain shifts.

This segment features presentations from the winning teams of the VAND 2.0 Challenge at CVPR 2024. The first talk introduces ARNet, the 1st place solution for Category 1, which focuses on robust anomaly detection under real-world variations using a reconstruction-based network with a foreground predictor and synthetic data augmentation. The subsequent talks detail the top solutions for Track 2, the VLM Anomaly Challenge, with Ziyu Bao presenting the 2nd place approach utilizing segment-aligned features and Zhaopeng Gu presenting the 1st place AnomalyMoE, a Mixture of Experts framework for few-shot anomaly detection. The segment concludes with a wrap-up and information about related CVPR events.

Speakers

  • Paul Bergmann — MedUni Wien
  • Mathis Kruse — Leibniz University Hannover, Institute for Information Processing
  • Samet Akcay — AI Research Engineer & Scientist, Intel
  • Shai Avidan — School of Electrical Engineering, Tel-Aviv University
  • Dasol Choi — Yonsei University, MODULABS
  • Dongbin Na — Pohang University of Science and Technology
  • Ho-Weng Lee — National Tsing Hua University
  • Shang-Hong Lai — National Tsing Hua University
  • Jinan Bao — University of Alberta
  • Hanshi Sun — University of Alberta
  • Hanqiu Deng — University of Alberta
  • Yinsheng He — University of Alberta
  • Zhaoxiang Zhang — University of Alberta
  • Xingyu Li — University of Alberta
  • Radu Tudor Ionescu — University of Bucharest, Romania; SecurifAI, Romania
  • Florinel-Alin Croitoru — University of Bucharest, Romania
  • Nicolae-Cătălin Ristea — University of Bucharest, Romania
  • Fahad Shahbaz Khan — MBZ University of Artificial Intelligence, UAE
  • Mubarak Shah — University of Central Florida, US
  • Paula Ramos, PhD — AI Evangelist/CV Scientist, Intel
  • Dick Ameln, MSc — AI Research Engineer/Scientist, Intel
  • Ashwin Vaidya, MSc — AI Research Engineer/Scientist, Intel
  • Babar Hussain — TCL CORPORATE RESEARCH, HK
  • Ziyu Bao — Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Objecteye Inc.
  • Zhaopeng Gu — Foundation Model Research Center, Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Objecteye Inc.
  • Dick Ameln — AI Research Engineer/Scientist, Utrecht, NL

Talks (15)

  • 00:00:00 — Paul Bergmann: Introduction to VAND 2.0 Workshop
    • Introduces the VAND 2.0 workshop, its organizers, program committee, submission statistics, schedule, and feedback mechanisms.
  • 00:03:06 — Mathis Kruse: SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection
    • Presents SplatPose, a method for pose-agnostic 3D anomaly detection using 3D Gaussian Splatting to learn pose-invariant normality and detect anomalies in multi-pose settings.
  • 00:15:59 — Samet Akcay: Advancing Visual Anomaly Detection: A Comprehensive Open-Source Approach
    • Introduces Anomalib, an open-source library designed to address challenges in visual anomaly detection by providing a comprehensive toolkit for designing, developing, and deploying deep learning anomaly detection algorithms, emphasizing reproducibility and ease of use.
  • 01:08:38 — Shai Avidan: Everything but Anomaly Detection
    • Discusses the fundamental aspects of anomaly detection, emphasizing the importance of representation and defining ‘anomaly with respect to what,’ and introduces a graph embedding approach for video anomaly detection.
  • 01:19:34 — Dasol Choi: Coarse-Grain Anomaly Detection
    • This talk introduces a method for coarse-grain anomaly detection using normalizing flows on pose graphs, highlighting its compact and real-time performance, and addresses the ‘Novel Class Discovery’ problem in aerial imagery by focusing on ‘anomaly existence’.
  • 02:13:34 — Dasol Choi: DMR: Disentangling Marginal Representations for Out-of-Distribution Detection
    • This presentation introduces DMR, a method for Out-of-Distribution (OOD) detection that addresses overconfidence by disentangling marginal representations using latent operations and multiple latent mixup, demonstrating superior performance on various datasets.
  • 02:39:08 — Ho-Weng Lee: TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks
    • This talk introduces the Text-Align Anomaly Backbone (TAB) model, a pre-training framework that leverages text-image alignment for improved industrial anomaly detection and defect classification.
  • 03:21:15 — Jinan Bao: BMAD: Benchmarks for Medical Anomaly Detection
    • This talk introduces BMAD, a comprehensive and standardized benchmark for medical anomaly detection, including six datasets from five medical domains and supporting 15 state-of-the-art algorithms.
  • 03:58:42 — Radu Tudor Ionescu: Scalable Real-Time Abnormal Event Detection
    • This talk introduces a method for scalable real-time abnormal event detection in video using knowledge distillation from object-centric models to a fast frame-level model, incorporating adversarial training.
  • 04:07:00 — Radu Tudor Ionescu: Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
    • This talk presents a self-distilled masked auto-encoder architecture for efficient video anomaly detection, leveraging motion gradient weighting and synthetic anomalies for improved performance and scalability.
  • 04:14:17 — Paula Ramos, PhD: VAND 2.0 Challenge at CVPR
    • Introduction to the VAND 2.0 Challenge at CVPR, sponsored by Intel, outlining its categories and evaluation process for visual anomaly detection.
  • 04:15:31 — Dick Ameln, MSc: The Challenge (Category 1 & 2 details)
    • Detailed explanation of the VAND 2.0 Challenge categories, focusing on adapt & detect (robust anomaly detection) and VLM anomaly challenge (few-shot learning for logical and structural detection), including dataset creation and evaluation metrics.
  • 05:18:31 — Babar Hussain: VAND 2.0: Challenge Category 1 - Adapt & Detect ARNet for Robust Anomaly Detection
    • Presents ARNet, the winning solution for VAND 2.0 Challenge Category 1, focusing on robust anomaly detection under real-world variations using a reconstruction-based network with a foreground predictor and synthetic data augmentation.
  • 05:33:46 — Ziyu Bao: Segment-aligned Features Impose Logical Constraints
    • Presents the 2nd place solution for VAND 2.0 Challenge Track 2, which uses pre-trained visual-language models with segment-aligned features for few-shot anomaly detection and for differentiating anomaly types.
  • 05:38:46 — Zhaopeng Gu: AnomalyMoE: Few-shot Anomaly Detection Using Mixture of Experts
    • Presents the 1st place solution for VAND 2.0 Challenge Track 2, AnomalyMoE, which uses a Mixture of Experts framework for few-shot anomaly detection, combining various strategies to detect logical and structural anomalies.
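
The knowledge-distillation idea summarized for the 03:58:42 talk (a slow object-centric model supervising a fast frame-level model) can be illustrated with a toy sketch. This is not the speakers' implementation: `slow_teacher`, `distill_student`, the synthetic frame features, and the linear student are all hypothetical stand-ins, and the adversarial training mentioned in the summary is left out. It only shows the general pattern of running the expensive model once offline and fitting a cheap model to its scores.

```python
import numpy as np

def slow_teacher(frames: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for an expensive object-centric detector.

    Returns one anomaly score per frame. Here it is a fixed nonlinear
    function of the frame features so the example is reproducible.
    """
    w = np.linspace(-1.0, 1.0, frames.shape[1])
    return np.tanh(0.5 * (frames @ w))

def distill_student(frames, teacher_scores, lr=0.05, steps=500):
    """Fit a linear student to the teacher's scores with an MSE loss."""
    n, d = frames.shape
    weights = np.zeros(d)
    bias = 0.0
    for _ in range(steps):
        err = frames @ weights + bias - teacher_scores
        weights -= lr * (2.0 / n) * (frames.T @ err)  # gradient of MSE w.r.t. weights
        bias -= lr * 2.0 * err.mean()                 # gradient of MSE w.r.t. bias
    return weights, bias

rng = np.random.default_rng(0)
train_frames = rng.normal(size=(512, 8))   # synthetic frame-level features (assumption)
targets = slow_teacher(train_frames)       # teacher is run once, offline
weights, bias = distill_student(train_frames, targets)

test_frames = rng.normal(size=(128, 8))
student_scores = test_frames @ weights + bias   # cheap inference path
teacher_scores = slow_teacher(test_frames)
agreement = np.corrcoef(student_scores, teacher_scores)[0, 1]
print(f"student/teacher correlation: {agreement:.3f}")
```

At inference time only the student runs, which is where the speed-up in a distillation setup of this kind would come from.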

Key Takeaways

  • The VAND 2.0 workshop emphasizes advancements in visual anomaly and novelty detection, featuring diverse research and an open-source initiative.
  • SplatPose offers a novel approach to pose-agnostic 3D anomaly detection by leveraging 3D Gaussian Splatting for robust normality learning and efficient anomaly localization.
  • Anomalib addresses the need for standardized, reproducible, and easily deployable anomaly detection solutions, providing a comprehensive open-source framework for researchers and practitioners.
  • Effective anomaly detection hinges on choosing the right data representation and understanding ‘anomaly with respect to what,’ moving beyond simple outlier detection in pixel space.
  • Normalizing Flows (STG-NF) applied to pose graphs offer a compact and real-time solution for video anomaly detection, demonstrating robustness across various scenarios.
  • The ‘Novel Class Discovery’ problem can be reframed as an ‘anomaly existence’ question, where the goal is to efficiently identify if any novel classes exist within a large dataset, rather than exhaustively classifying all anomalies.
  • The DMR (Disentangling Marginal Representations) method effectively addresses overconfidence in Out-of-Distribution (OOD) detection by synthesizing artificial OOD training data through latent operations and multiple latent mixup, leading to superior performance.
  • Large Language Models (LLMs) like ChatGPT show potential in anomaly detection by providing plausible anomaly scores and explanations for complex scenarios, suggesting a new avenue for research in integrating LLMs with traditional anomaly detection techniques.
  • The TAB model significantly improves anomaly detection and defect classification performance in industrial inspection tasks by using text-image alignment during pre-training.
  • The TAB model addresses domain gaps and global content biases inherent in ImageNet pre-trained models, making it more suitable for detecting subtle local anomalies.
  • BMAD provides a much-needed standardized benchmark for medical anomaly detection, offering diverse datasets and a robust evaluation framework for 15 state-of-the-art algorithms.
  • The BMAD benchmark highlights the current performance gaps in medical anomaly detection, especially for localization, and provides a platform for future research and development in this critical area.
  • Knowledge distillation from high-performing but slow object-centric models can create fast, frame-level anomaly detectors suitable for real-time applications.
  • Self-distillation and motion gradient weighting within masked auto-encoder architectures can significantly improve the efficiency and accuracy of video anomaly detection.
  • Synthetic anomalies and data augmentation techniques are crucial for training robust anomaly detection models, especially when labeled abnormal data is scarce.
  • Real-world anomaly detection challenges require models that are robust to domain shifts (e.g., lighting, camera position, motion blur) and can handle logical/structural defects with few-shot learning.
  • Robust anomaly detection requires models capable of adapting to unknown real-world variations, which can be achieved through synthetic data generation and foreground-aware training.
  • Few-shot anomaly detection benefits from leveraging pre-trained visual-language models and integrating semantic segmentation for segment-aligned feature extraction.
  • A Mixture of Experts approach, combining different anomaly detection strategies (VLM-based, part-segmentation-based, patch-level), can effectively address both logical and structural anomalies across various granularity levels.
  • The VAND 2.0 Challenge highlights the importance of developing models that are robust to domain shifts and capable of precise pixel-level anomaly localization.
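
The takeaway about synthesizing artificial OOD training data through latent mixup can be made concrete with a small sketch. It is an illustration under stated assumptions, not the DMR pipeline: `multi_latent_mixup`, the Dirichlet-sampled weights, and the synthetic class clusters are hypothetical choices made for this example. The assumed intuition is that a convex blend of latents from several different classes sits away from every single-class cluster and can act as a pseudo-OOD sample.

```python
import numpy as np

def multi_latent_mixup(latents, labels, k=3, alpha=1.0, rng=None):
    """Blend k latents drawn from k distinct classes into one synthetic latent.

    Hypothetical helper: returns the mixed latent, the mixing weights,
    and the classes that were blended.
    """
    rng = np.random.default_rng() if rng is None else rng
    latents = np.asarray(latents, dtype=float)
    labels = np.asarray(labels)
    classes = rng.choice(np.unique(labels), size=k, replace=False)
    picks = [rng.choice(np.flatnonzero(labels == c)) for c in classes]
    weights = rng.dirichlet(alpha * np.ones(k))  # non-negative, sums to 1
    mixed = weights @ latents[picks]             # convex combination, shape (d,)
    return mixed, weights, classes

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 10)             # 4 classes, 10 samples each
centers = 5.0 * np.eye(4)                        # well-separated toy clusters
latents = centers[labels] + 0.1 * rng.normal(size=(40, 4))

mixed, weights, classes = multi_latent_mixup(latents, labels, k=3, rng=rng)
print("blended classes:", classes, "weights:", np.round(weights, 3))
```

In a full pipeline the mixed latent would typically be decoded or fed to the classifier head with an "unknown" target; that step depends on the model and is omitted here.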

Methods / Models / Datasets Mentioned

  • 3D Gaussian Splatting
  • ACET
  • ARNet
  • AST
  • AUC
  • AUPR
  • AUROC
  • Anomalib
  • Anomaly-Text-Aware pre-training strategy
  • AnomalyDINO
  • AnomalyMoE
  • Autoencoder (AE)
  • Avenue dataset
  • BMAD
  • BTAD
  • CADSD
  • CAVGA-R
  • CFA
  • CFLOW-AD
  • CIFAR-10
  • CIFAR-100
  • CLIP
  • CS-Flow
  • ChatGPT
  • ComAD
  • Convolutional Transformer block
  • Coupled-Hypersphere-based Feature Adaptation
  • Cross-entropy loss
  • Cube-level models
  • CutPaste
  • DINO-ViT
  • DINOv2
  • DMR
  • DN2
  • DOTA-v2.0
  • DRAEM
  • DTD
  • DeAOT
  • Deep SVDD
  • DenseNet-121
  • Discriminator
  • EfficientAD
  • Entropy
  • F1Max score
  • Frame-level models
  • GAN
  • GANomaly
  • GEOM
  • GOAD
  • Gaussian Noise
  • HR-STC
  • IDPA (Industrial Domain Prompt Association)
  • Image-level AUROC
  • ImageNet
  • KIRBY
  • KSDD2
  • Kinetics-250
  • LSUN-crop
  • MAD dataset
  • MHRot
  • MIM
  • MKD
  • MSP
  • MVTec AD
  • Mahalanobis
  • Masked Auto-Encoder (MAE)
  • Max pooling
  • MaxLogit
  • Mean Squared Error (MSE) reconstruction loss
  • MemSeg
  • MixedWM38
  • Motion gradient weighting
  • Multi-head attention
  • Multiple Latent Mixup (MLM)
  • NTU-RGB+D
  • OC-SVM
  • ODIN
  • Object-centric models
  • OmniAD
  • PNI Ensemble
  • PRO
  • PSAD
  • PaDiM
  • PatchCore
  • Perlin Noise Generator
  • Pixel-level AUROC
  • Places-365
  • RBDC
  • RD4AD
  • RealNet
  • RegAD
  • ResNet-18
  • ResNet-50
  • SAM
  • SDAS
  • SPADE
  • ST-GCAE
  • ST-GCN
  • STC
  • STG-NF
  • STPM
  • SVHN
  • Self-Supervised Predictive Convolutional Attentive Block
  • Self-distillation
  • ShanghaiTech
  • SimpleNet
  • SplatPose
  • Student-Teacher Networks
  • Synthetic anomalies
  • TAB (Text-Align Anomaly Backbone)
  • TBDC
  • Textures
  • Tiny-ImageNet
  • UBnormal
  • UCSD Ped2 dataset
  • UTRAD
  • VIM
  • VisA
  • WideResNet-40-2
  • WinCLIP
  • f-AnoGAN
  • iNeRF
  • t-SNE
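
Several entries in the list above are evaluation metrics (AUROC, Image-level AUROC, Pixel-level AUROC). As a reminder of what they measure, here is a minimal sketch that computes AUROC as the probability that an anomalous sample scores higher than a normal one, with ties counted as one half. The helper names `auroc` and `image_score`, and the choice of the maximum pixel value as the image-level score, are assumptions for illustration; the talks may use other reductions, and benchmark code normally relies on a library implementation.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC as P(score of anomaly > score of normal); ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one anomalous and one normal sample")
    diff = pos[:, None] - neg[None, :]          # all anomaly/normal pairs
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def image_score(anomaly_map):
    """Assumed reduction: image-level score is the maximum pixel score."""
    return float(np.max(anomaly_map))

# Toy data: two normal images and one anomalous image, each 2x2 pixels.
maps = [np.array([[0.1, 0.2], [0.1, 0.1]]),
        np.array([[0.2, 0.3], [0.2, 0.1]]),
        np.array([[0.1, 0.9], [0.2, 0.1]])]
masks = [np.zeros((2, 2)), np.zeros((2, 2)), np.array([[0, 1], [0, 0]])]
image_labels = [0, 0, 1]

image_auroc = auroc([image_score(m) for m in maps], image_labels)
pixel_auroc = auroc(np.concatenate([m.ravel() for m in maps]),
                    np.concatenate([m.ravel() for m in masks]))
print(image_auroc, pixel_auroc)
```

The pairwise formulation is quadratic in the number of samples, so it suits small examples; for full pixel-level evaluation a rank-based implementation is the usual choice.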

Topics

3D Gaussian Splatting · Aerial Imagery · Anomalib · Anomaly Detection · Anomaly Existence · Benchmark · Challenge Design · ChatGPT · Data Augmentation · Deep Learning · Defect Classification · Disentangling Marginal Representations · Domain Shift · Few-shot Learning · Graph Embedding · Industrial Inspection · Knowledge Distillation · Logical Anomalies · Masked Autoencoders · Medical Anomaly Detection · Mixture of Experts · Multiple Latent Mixup · Normalizing Flows · Novel Class Discovery · Open-Source Tools · Out-of-Distribution Detection · Pose Graphs · Pose-Agnostic 3D Anomaly Detection · Pre-training Framework · Real-Time Processing · Real-world Variations · Reproducibility · Robustness · Self-Supervised Learning · Semantic Segmentation · Structural Anomalies · Text-Image Alignment · Video Anomaly Detection · Workshop Introduction

