The 20th Embedded Vision Workshop (EVW2024)

Event: CVPR Embedded Vision Workshop 2024 · Duration: 189 min

Abstract

The 20th Embedded Vision Workshop (EVW2024) at CVPR 2024 featured a series of invited talks and oral presentations covering recent advances and challenges in embedded vision. Topics ranged from neuromorphic computing and efficient neural network design for lightweight instance segmentation to real-time urban streetscape video analysis and hardware-aware scaling for on-device continual learning. Discussions also included novel approaches for video stabilization, object detection on ultra-low-power systems, and the application of AI in companion robots, emphasizing the importance of latency, energy efficiency, and robust design methodologies for real-world deployment.

Speakers

  • Branislav Kisačanin — Nvidia and Institute for AI R&D of Serbia
  • Yusuke Sakemi — Chiba Institute of Technology, Japan
  • Tse-Wei Chen — Canon Inc., Japan
  • Carlos Victorino Padeiro — ETH Zurich
  • Sam Leroux — Qualcomm
  • Zoran Kostic — Columbia University, US
  • Lukas Frickenstein — BMW Group
  • Manon Damphoffer — UGA (Université Grenoble Alpes)
  • Cevahir Cigla — Aselsan Inc.
  • Mario E. Munich — Embodied, Inc., US
  • Marilyn Wolf — University of Nebraska - Lincoln, US
  • Jamie Menjay Lin — ETH Zurich
  • Francesco Paissan — ETH Zurich
  • Laurie Nicholas Bose — Visionchip Ltd.
  • Luca Bompani — ETH Zurich
  • Elishai Ezra Tsur — Technion - Israel Institute of Technology
  • Anamika Jha — Texas Instruments
  • Parakh Agarwal — Purdue University
  • Omkar Prabhune — Purdue University
  • Jan Ernst — Latent AI, USA

Talks (20)

  • 00:00:00 — Branislav Kisačanin: Welcome and Workshop Overview
    • Opening remarks for the 20th Embedded Vision Workshop (EVW2024) at CVPR 2024, outlining the agenda, invited speakers, and logistical details for presentations and poster sessions.
  • 01:04:00 — Yusuke Sakemi: Physical Modeling Approach for Simple and Energy-Efficient Analog Neuromorphic Computers
    • This talk presents a physical modeling approach for in-memory computing circuits, interpreting them as spiking neural networks to address non-idealities and improve energy efficiency for neuromorphic applications.
  • 01:07:20 — Tse-Wei Chen: Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation (Paper #01)
    • This presentation introduces a dedicated inference engine for binary-weight neural networks designed to reduce hardware costs and improve accuracy for lightweight instance segmentation tasks on embedded platforms.
  • 01:10:15 — Carlos Victorino Padeiro: Lightweight Maize Disease Detection Through Post-Training Quantization with Similarity Preservation (Paper #03)
    • This talk explores a post-training quantization method that preserves similarity for lightweight maize disease detection models, enabling efficient deployment on embedded systems without significant accuracy loss.
  • 01:12:55 — Sam Leroux: Multi-bit, Black-box Watermarking of Deep Neural Networks in Embedded Applications (Paper #06)
    • This presentation introduces a multi-bit, black-box watermarking technique for deep neural networks, designed to protect intellectual property in embedded applications by embedding a unique fingerprint into the model.
  • 01:15:55 — Zoran Kostic: Real Time for Urban Streetscape Video-Based Applications
    • This invited talk discusses the challenges and solutions for achieving real-time performance in urban streetscape video-based applications, focusing on low-latency processing and high-bandwidth communication for embedded vision systems.
  • 01:22:05 — Lukas Frickenstein: Pruning as a Binarization Technique (Paper #09)
    • This presentation introduces a novel pruning technique that leverages binarization to achieve high-accuracy and efficient deep neural networks, focusing on reducing computational complexity for embedded applications.
  • 01:25:15 — Manon Damphoffer: Neuromorphic Lip-Reading with Signed Spiking Gated Recurrent Units (Paper #10)
    • This talk presents a neuromorphic lip-reading pipeline using event cameras and signed spiking gated recurrent units, demonstrating high accuracy and energy efficiency for on-device processing.
  • 01:28:15 — Cevahir Cigla: Efficient Video Stabilization via Partial Block Phase Correlation on Edge GPUs (Paper #11)
    • This presentation introduces an efficient video stabilization method utilizing partial block phase correlation on edge GPUs, designed to provide low-latency and high-performance stabilization for embedded vision applications.
  • 01:31:15 — Mario E. Munich: Building Moxie - an embedded AI companion
    • This invited talk discusses the development of Moxie, an embedded AI companion robot designed for children, focusing on the challenges of building a consumer robot with advanced AI capabilities while maintaining cost-effectiveness and emotional intelligence.
  • 01:38:00 — Marilyn Wolf: Perception/Control Co-Design for Autonomous Vehicles
    • This invited talk explores the critical relationship between perception latency and control accuracy in autonomous vehicles, proposing a Markovian error model to improve tracking performance and enable predictive design space exploration for robust embedded vision systems.
  • 01:46:15 — Jamie Menjay Lin: SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations (Paper #12)
    • This presentation introduces SciFlow, a lightweight optical flow model empowered by self-cleaning iterations and a regression focal loss, designed to mitigate ambiguities and errors in optical flow estimation for embedded applications.
  • 01:49:15 — Francesco Paissan: Structured Sparse Back-propagation for Lightweight On-Device Continual Learning on Microcontroller Units (Paper #13)
    • This talk presents a structured sparse back-propagation method for lightweight on-device continual learning on microcontroller units, enabling efficient model updates and significant memory savings for embedded vision tasks.
  • 01:52:15 — Laurie Nicholas Bose: Demonstration of SCAMP-7 Pixel Processor Array (Demo #04)
    • This demonstration showcases the SCAMP-7 Pixel Processor Array, an image sensor where every pixel is a processor, enabling high-speed, low-power in-pixel computation for various embedded vision applications like object tracking, gesture detection, and CNN inference.
  • 01:55:15 — Luca Bompani: Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems (Paper #15)
    • This presentation introduces a multi-resolution rescored ByteTrack method for video object detection on ultra-low-power embedded systems, improving accuracy and throughput by leveraging temporal consistency and selectively processing frames at different resolutions.
  • 01:58:15 — Elishai Ezra Tsur: ED-DCFNet: an unsupervised encoder-decoder neural model for event-driven feature extraction and object tracking (Paper #17)
    • This talk introduces ED-DCFNet, an unsupervised encoder-decoder neural model designed for event-driven feature extraction and object tracking, leveraging event camera data for efficient and robust performance in embedded vision systems.
  • 02:01:15 — Anamika Jha: RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks (Paper #18)
    • This presentation introduces RAVN, a reinforcement-aided adaptive vector quantization method for deep neural networks, designed to optimize quantization for efficient deployment on embedded systems while maintaining accuracy.
  • 02:04:15 — Parakh Agarwal: Prune Efficiently by Soft Pruning (Paper #21)
    • This talk presents a soft pruning technique for neural networks that efficiently reduces model complexity by adjusting weights across epochs, leading to smaller and more efficient networks suitable for embedded applications.
  • 02:07:15 — Omkar Prabhune: Content-aware Input Scaling & Deep Learning Computation Offloading for Low-Latency Embedded Vision (Paper #22)
    • This presentation introduces a content-aware input scaling and computation offloading method for low-latency embedded vision, optimizing the trade-off between latency and accuracy by selectively processing high-resolution regions of interest on edge devices and offloading background processing to a server.
  • 02:10:15 — Jan Ernst: Design Space Exploration for ML System Design with Hardware-in-the-Loop
    • This invited talk introduces a holistic and evidence-driven approach to ML system design using hardware-in-the-loop, leveraging semantic abstraction and automation to navigate complex design tradeoffs and predict performance across various hardware targets and applications.
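
Phase correlation, the building block named in Cevahir Cigla's stabilization talk, can be illustrated with a short NumPy sketch. This is plain whole-frame phase correlation for integer, circular shifts, not the partial block variant presented in the paper, whose details are not given here. The function name `phase_correlation_shift` and the synthetic random frame are assumptions made purely for illustration.

```python
import numpy as np


def phase_correlation_shift(reference, moved):
    """Estimate the integer (dy, dx) circular shift that maps `reference` onto `moved`.

    Hypothetical helper for illustration; assumes both inputs are 2-D arrays
    of the same shape and that the motion is a pure translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moved)

    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12  # avoid division by zero

    # The inverse transform peaks at the translation between the two frames.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    shift = [p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape)]
    return tuple(int(v) for v in shift)


# Synthetic check: shift a random frame by a known amount and recover it.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
shifted = np.roll(frame, (3, -5), axis=(0, 1))
print(phase_correlation_shift(frame, shifted))  # prints (3, -5)
```

A stabilizer would apply the negated shift to each incoming frame. Real footage also needs windowing and sub-pixel peak interpolation, which this sketch omits.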

Key Takeaways

  • Embedded vision systems require careful co-design of perception and control, considering factors like latency, energy efficiency, and hardware constraints for optimal real-world performance.
  • Novel approaches in neuromorphic computing and sparse back-propagation enable efficient on-device continual learning and reduce computational demands for ultra-low-power embedded systems.
  • Addressing non-idealities in analog computing and leveraging techniques like self-cleaning iterations and content-aware scaling are crucial for developing robust and accurate embedded AI models.
  • The development of AI companion robots like Moxie highlights the need for advanced multimodal input processing, emotional intelligence, and robust hardware design within strict cost and privacy considerations.
  • Design space exploration with hardware-in-the-loop and Markovian error modeling are essential for predictive performance analysis and navigating complex tradeoffs in ML system design for safety-critical applications like autonomous vehicles.
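
To make the idea of Markovian error modeling concrete, the sketch below uses a generic two-state Markov chain in which a perception module is either tracking correctly or in error on each frame. It is not the model from Marilyn Wolf's talk, whose details are not given here. The function names and the transition probabilities `p_enter` and `p_exit` are hypothetical values chosen for illustration.

```python
import random


def stationary_error_rate(p_enter, p_exit):
    """Long-run fraction of frames spent in the error state.

    p_enter: probability of moving from 'correct' to 'error' on a frame.
    p_exit:  probability of moving from 'error' back to 'correct'.
    """
    return p_enter / (p_enter + p_exit)


def mean_error_burst_length(p_exit):
    """Expected number of consecutive error frames once an error starts."""
    return 1.0 / p_exit


def simulate_error_rate(p_enter, p_exit, frames, seed=0):
    """Monte Carlo estimate of the error rate, to cross-check the closed form."""
    rng = random.Random(seed)
    in_error = False
    error_frames = 0
    for _ in range(frames):
        if in_error:
            if rng.random() < p_exit:
                in_error = False
        elif rng.random() < p_enter:
            in_error = True
        error_frames += in_error
    return error_frames / frames


# Assumed example values: errors start rarely but persist for a few frames.
print(stationary_error_rate(0.05, 0.45))
print(mean_error_burst_length(0.45))
print(simulate_error_rate(0.05, 0.45, frames=200_000))
```

Unlike an independent per-frame error rate, this kind of model captures bursts of consecutive errors, which is the property a controller consuming the perception output is most sensitive to.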

Methods / Models / Datasets Mentioned

  • YOLOv5
  • YOLOv8
  • SSD-MobileNetV2
  • MobileNetV2-FPNLite
  • EfficientDet-D1
  • FasterRCNN-ResNet50
  • ByteTrack
  • SCAMP-7
  • ED-DCFNet
  • RAVN
  • PhiNet
  • SciFlow
  • YOLOv8n
  • YOLOv8s
  • YOLOv8m
  • YOLOv8l
  • YOLOv8x
  • YOLOv1
  • YOLOv8x6
  • TensorRT
  • DeepStream
  • GStreamer
  • Tiny YOLOv3
  • ResNetV1
  • MobileNetV2
  • PixelRNN
  • IMX334
  • NileCAM01
  • QDrop
  • BRECQ
  • SPQ
  • CORE50
  • PULP platform
  • RISC-V
  • COSMOS
  • DataCity
  • Mobility Intelligence
  • HMM (Hidden Markov Model)
  • NanoDet-Plus
  • YOLOX-Nano

Topics

Embedded Vision · Neuromorphic Computing · Efficient AI · Optical Flow · Continual Learning · Microcontroller Units · Video Stabilization · Object Detection · AI Companions · Hardware-in-the-Loop · Low-Latency Systems · Energy Efficiency · Deep Learning Optimization


Notes

Open for commentary — connections to other work, critiques, follow-up reading.