Bio-Inspired Embedded Event-based Visual Processing

Event: Event-Based Vision Workshop · Duration: 27 min

Abstract

This presentation explores bio-inspired embedded event-based visual processing, highlighting the advantages of event-based sensors and spiking neural networks for power-efficient, real-time applications. The speaker details the development of HFIRST, a spiking neural network inspired by HMAX, for character and digit recognition, implemented on an FPGA and on SpiNNaker. The talk also covers the challenges of small datasets and the move to Deep Learning with IBM TrueNorth in order to scale up. Finally, it turns to motion estimation using both delay-based and delay-free spiking neural networks, demonstrating their capabilities on simulated and real-world data, including quadcopter recordings.

Speakers

Garrick Orchard — Temasek Laboratories at NUS (TL@NUS), Singapore Institute for Neurotechnology (SINAPSE@NUS), National University of Singapore

Talks (34)

00:00:00 — Garrick Orchard: Bio-Inspired Embedded Event-based Visual Processing
- The speaker introduces his work on bio-inspired embedded event-based visual processing, focusing on the advantages of event-based sensors and spiking neural networks for power efficiency and real-world navigation.
00:40:59 — Garrick Orchard: What is an Event-Based Vision Sensor?
- The speaker briefly touches upon the definition of event-based vision sensors, acknowledging the audience’s familiarity with the topic.
00:50:00 — Garrick Orchard: A Bio-Inspired Sensory Processing System
- A diagram illustrates a bio-inspired sensory processing system comprising an ATIS sensor, which outputs spikes, feeding into a Spiking Neural Network for processing and spike output.
02:14:00 — Garrick Orchard: Tasks
- The talk will cover two main tasks using Spiking Neural Networks: Recognition and Optical Flow. Other non-neural techniques for optical flow, depth from stereo, and localization are also mentioned.
02:43:00 — Garrick Orchard: HFIRST (INSPIRED BY HMAX)
- The speaker describes HFIRST, a spiking neural network model inspired by the frame-based HMAX model, detailing its stages: Orientation Extraction (Gabor Filters), Local Max Pooling, Template Matching, and Classifier.
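The orientation extraction stage can be illustrated with a minimal sketch. It is not the HFIRST implementation: it simply sums a Gabor kernel over the events that fall in a patch, with no spiking dynamics, leak, or refractory period. The function names, the kernel parameters (`wavelength`, `sigma`, `gamma`), and the choice of four orientations are assumptions made for illustration.

```python
import math

def gabor_kernel(size, theta, wavelength=4.0, sigma=2.0, gamma=0.5):
    """Return a size x size Gabor kernel (list of rows) for orientation theta.

    Assumption: parameter values are illustrative, not taken from the talk.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so that xr runs across the preferred edge.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def orientation_responses(events, center, size=7, n_orientations=4):
    """Sum each oriented kernel over the (x, y) events inside a patch around center."""
    cx, cy = center
    half = size // 2
    responses = []
    for k in range(n_orientations):
        kernel = gabor_kernel(size, k * math.pi / n_orientations)
        total = 0.0
        for x, y in events:
            dx, dy = x - cx, y - cy
            if abs(dx) <= half and abs(dy) <= half:
                total += kernel[dy + half][dx + half]
        responses.append(total)
    return responses

def dominant_orientation(responses):
    """Index of the orientation channel with the largest summed response."""
    return max(range(len(responses)), key=responses.__getitem__)

# Events along a vertical line through the patch center.
vertical_edge = [(10, y) for y in range(7, 14)]
print(dominant_orientation(orientation_responses(vertical_edge, (10, 10))))
```

With this kernel convention, channel 0 (theta = 0) responds most strongly to a vertical line of events and channel 2 (theta = pi/2) to a horizontal one.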
04:41:00 — Garrick Orchard: ORIENTATION EXTRACTION
- A video demonstrates orientation extraction using Gabor filters on event-based data, showing different colors representing various edge orientations.
05:05:00 — Garrick Orchard: DATA FOR TRAINING
- The speaker presents data collected from a DVS128 sensor, showing characters printed on a rotating barrel, used for training the recognition system.
05:27:00 — Garrick Orchard: DATA PROCESSING
- A video illustrates the data processing steps, including tracking a character, stabilizing its view, and applying orientation extraction filters to generate a color-encoded spike response.
06:02:00 — Garrick Orchard: HFIRST (SNN ON FPGA)
- The HFIRST model was implemented on an FPGA, with each template neuron firing when it sees a match. It achieved 97.5% accuracy on Card Pip Recognition (4 classes) and 84.9% on Digits and Characters (36 classes).
07:35:00 — Garrick Orchard: HFIRST (SNN ON SPINNAKER)
- A live demo of HFIRST running on a SpiNNaker board is shown, where an ATIS sensor captures characters and the SpiNNaker board processes them to recognize the digits in real time.
08:35:00 — Garrick Orchard: SMALL DATASET PROBLEM
- The speaker discusses the limitations of small datasets in event-based vision, highlighting how initial high accuracy on simple tasks like card pip recognition quickly drops when moving to more complex problems like facial recognition with limited samples.
09:26:00 — Garrick Orchard: DATASET CREATION
- To address the small dataset problem, the team created larger datasets by converting existing computer vision datasets such as Caltech101 and MNIST into neuromorphic events: the images are presented in real time while a sensor moved by a pan-tilt mechanism records them.
11:42:00 — Garrick Orchard: SCALING UP
- Recognizing that hand-tuning is ineffective for large datasets, the team shifted to Deep Learning with tools from IBM TrueNorth, leveraging its development platform for training and deployment.
12:17:00 — Garrick Orchard: USING TRUENORTH
- A video demonstrates a live character recognition demo using an ATIS sensor directly interfaced with an IBM TrueNorth chip, showcasing a fully embedded system without a PC in the loop for processing.
13:28:00 — Garrick Orchard: TrueNorth System Training
- The training setup for the TrueNorth system involves an ATIS sensor and FPGA communicating preprocessed spikes to a PC, which uses IBM’s Deep Learning tools to train a model that is then deployed to the TrueNorth chip.
14:14:00 — Garrick Orchard: TrueNorth System Runtime
- During runtime, the ATIS sensor and FPGA directly communicate spikes to the TrueNorth chip, where a preprocessing module and Deep Neural Network are implemented, with a PC only needed for output interpretation (spike counting).
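The output interpretation step (spike counting) can be sketched as follows. This is a minimal illustration that assumes the output arrives as `(timestamp, class_id)` pairs; the actual TrueNorth output format is not described in this summary, and the function name is hypothetical.

```python
from collections import Counter

def decode_by_spike_count(output_spikes):
    """Return the class whose output neurons spiked most often, or None if silent.

    Assumption: output_spikes is an iterable of (timestamp, class_id) pairs.
    """
    counts = Counter(class_id for _, class_id in output_spikes)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Class 3 spikes twice, class 7 once.
print(decode_by_spike_count([(0.1, 3), (0.2, 7), (0.3, 3)]))
```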
14:41:00 — Garrick Orchard: Direct Interface ATIS-TrueNorth
- A video shows a direct interface between the ATIS sensor and TrueNorth, with spikes from the ATIS being directly injected into the TrueNorth chip, demonstrating a fully spiking system from beginning to end.
15:17:00 — Garrick Orchard: NMNIST
- The speaker presents results on the N-MNIST dataset, comparing different preprocessing methods (constant event number, constant time binary image, constant time event count, constant time surfaces) and reporting digit recognition accuracies of up to 98.7%.
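The preprocessing variants named above can be sketched as follows, under stated assumptions: events are `(x, y, t)` tuples sorted by time, and the time surface is taken to be an exponential decay of the time since the most recent event at each pixel. The summary does not give the exact definitions used in the talk, so the function names and the decay form are illustrative only.

```python
import math

def windows_by_count(events, n):
    """Split time-ordered events into consecutive windows of exactly n events."""
    return [events[i:i + n] for i in range(0, len(events) - n + 1, n)]

def windows_by_time(events, dt):
    """Split time-ordered events into windows of duration dt (empty windows are skipped)."""
    if not events:
        return []
    t0 = events[0][2]
    windows = {}
    for x, y, t in events:
        windows.setdefault(int((t - t0) // dt), []).append((x, y, t))
    return [windows[k] for k in sorted(windows)]

def count_image(window, width, height):
    """Number of events per pixel within one window."""
    img = [[0] * width for _ in range(height)]
    for x, y, _ in window:
        img[y][x] += 1
    return img

def binary_image(window, width, height):
    """1 where at least one event occurred in the window, else 0."""
    return [[1 if c > 0 else 0 for c in row]
            for row in count_image(window, width, height)]

def time_surface(window, width, height, tau):
    """Assumed form: exp(-(t_end - t_last) / tau) per pixel, 0 where no event occurred."""
    t_end = window[-1][2]
    last = {}
    for x, y, t in window:
        last[(x, y)] = t  # later events overwrite earlier ones
    img = [[0.0] * width for _ in range(height)]
    for (x, y), t in last.items():
        img[y][x] = math.exp(-(t_end - t) / tau)
    return img

events = [(0, 0, 0), (1, 0, 10), (0, 0, 20), (1, 1, 30)]
print(count_image(events, 2, 2))
print(binary_image(events, 2, 2))
```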
16:52:00 — Garrick Orchard: WHERE TO NEXT
- Future work involves tackling more challenging problems such as N-Caltech101 and recognizing objects in clutter from a moving platform, moving beyond simple black-on-white scenarios.
17:26:00 — Garrick Orchard: MOTION ESTIMATION
- The speaker transitions to motion estimation, emphasizing its importance for mobile and embedded systems and focusing on bio-inspired approaches.
17:38:00 — Garrick Orchard: Using delays to create a motion sensitive neuron
- A diagram illustrates how synapse delays can be used to create a neuron sensitive to motion at a fixed speed and direction by making spatially and temporally spread spikes arrive coincidentally at the neuron.
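The delay idea can be sketched in one dimension. This is a simplified illustration, not the speaker's model: it assumes a single row of pixels, an edge that triggers exactly one spike per pixel, and a neuron that fires only when all delayed spikes arrive within a small tolerance. All names and parameter values are hypothetical.

```python
def delays_for_velocity(n_pixels, speed):
    """Synaptic delays (ms) tuned to an edge moving rightwards at `speed` pixels/ms.

    Pixels the edge reaches first get the longest delay, so that all
    delayed spikes arrive at the neuron at the same time.
    """
    return [(n_pixels - 1 - i) / speed for i in range(n_pixels)]

def edge_spike_times(n_pixels, speed):
    """Spike time at each pixel as an edge crosses; negative speed moves leftwards."""
    if speed > 0:
        return [i / speed for i in range(n_pixels)]
    return [(n_pixels - 1 - i) / -speed for i in range(n_pixels)]

def neuron_fires(spike_times, delays, tolerance=0.1, threshold=None):
    """Fire if at least `threshold` delayed spikes arrive within `tolerance` ms."""
    arrivals = sorted(t + d for t, d in zip(spike_times, delays))
    threshold = len(arrivals) if threshold is None else threshold
    best = 0
    start = 0
    # Slide a window of width `tolerance` over the sorted arrival times.
    for end in range(len(arrivals)):
        while arrivals[end] - arrivals[start] > tolerance:
            start += 1
        best = max(best, end - start + 1)
    return best >= threshold

delays = delays_for_velocity(5, 2.0)
print(neuron_fires(edge_spike_times(5, 2.0), delays))   # tuned speed and direction
print(neuron_fires(edge_spike_times(5, -2.0), delays))  # opposite direction
```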
18:38:00 — Garrick Orchard: ROTATING BAR EXAMPLE
- A video demonstrates motion estimation on a rotating bar, showing how the system outputs speed (color-coded) and direction (arrow) of the moving bar, with increasing speeds indicated by brighter colors.
19:13:00 — Garrick Orchard: REAL WORLD DATA
- The motion estimation model is applied to real-world traffic data captured by an ATIS sensor, showing different directions of motion encoded by colors and speeds indicated by lightness, even with some faster motion present.
19:41:00 — Garrick Orchard: A delay based learning rule
- The speaker introduces a delay-based learning rule called STDPP (Spike-Time Dependent Delay Plasticity), which adjusts synaptic delays to make input spikes arrive coincidentally at the output neuron, effectively tuning the neuron to specific spatio-temporal patterns.
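The general idea of adjusting delays until spikes arrive together can be sketched with a toy update. This is explicitly not the rule from the talk, whose update equations are not given in this summary: the sketch assumes each delay is nudged toward the mean arrival time by a fixed `learning_rate`, and both function names are hypothetical.

```python
def adapt_delays(spike_times, delays, learning_rate=0.2):
    """One illustrative update: move each arrival time toward the mean arrival time.

    Assumption: this toy rule stands in for the delay plasticity rule in the
    talk. Delays are clamped at zero because a synapse cannot advance a spike.
    """
    arrivals = [t + d for t, d in zip(spike_times, delays)]
    target = sum(arrivals) / len(arrivals)
    return [max(0.0, d + learning_rate * (target - a))
            for d, a in zip(delays, arrivals)]

def arrival_spread(spike_times, delays):
    """Difference between the latest and earliest delayed arrival."""
    arrivals = [t + d for t, d in zip(spike_times, delays)]
    return max(arrivals) - min(arrivals)

spike_times = [0.0, 0.5, 1.0, 1.5]   # the same stimulus repeated on every iteration
delays = [1.0, 1.0, 1.0, 1.0]        # untuned starting point
for _ in range(30):
    delays = adapt_delays(spike_times, delays)
print(arrival_spread(spike_times, delays))
```

In this toy setting the spread of arrival times shrinks by a constant factor on each presentation, so the delays settle on values that compensate for the stimulus timing.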
21:11:00 — Garrick Orchard: LEARNING TO PERCEIVE MOTION
- The STDPP learning rule is applied to a rotating bar stimulus, showing that over training iterations, the neurons become reliably tuned to the motion, with the percentage of stimuli eliciting a response increasing significantly.
22:42:00 — Garrick Orchard: SPEEDS LEARNT
- Histograms show that the neurons successfully learn to tune to the specific speeds present in various simulated and real-world motion stimuli (rotating bar, rotating spiral), with the learned speeds matching the stimulus speeds.
23:02:00 — Garrick Orchard: DIRECTIONS LEARNT
- Similar to speeds, the neurons also learn to tune to the specific directions of motion present in the scene, with histograms showing clear peaks corresponding to the directions of motion in the rotating bar and spiral examples.
23:41:00 — Garrick Orchard: Can we compute motion without delays? (Barlow-Levick)
- The speaker explores an alternative approach to motion estimation without explicit delays, inspired by the Barlow-Levick model, where a neuron’s membrane potential is excited by one part of the receptive field and inhibited by another, allowing it to detect motion based on the sequence of excitation and inhibition.
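The excitation/inhibition ordering described here can be sketched with a single leaky neuron. This is a minimal illustration under assumptions, not the TrueNorth implementation: the weights, time constant, threshold, and reset-to-zero behavior are chosen for the example only.

```python
import math

def direction_selective_response(events, tau=20.0, w_exc=1.0, w_inh=-2.0,
                                 threshold=0.8):
    """Return output spike times for a leaky neuron with two input regions.

    events: time-ordered list of (t_ms, source), source is "exc" or "inh".
    Assumption: all parameter values are illustrative.
    """
    v = 0.0
    t_prev = None
    spikes = []
    for t, source in events:
        if t_prev is not None:
            # Membrane potential decays toward zero between input events.
            v *= math.exp(-(t - t_prev) / tau)
        t_prev = t
        v += w_exc if source == "exc" else w_inh
        if v >= threshold:
            spikes.append(t)
            v = 0.0  # reset after an output spike
    return spikes

# Preferred direction: the edge reaches the excitatory region first.
print(direction_selective_response([(0.0, "exc"), (5.0, "inh")]))
# Null direction: inhibition arrives first and suppresses the later excitation.
print(direction_selective_response([(0.0, "inh"), (5.0, "exc")]))
```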
24:42:00 — Garrick Orchard: MOTION ESTIMATION ON TRUENORTH
- This delay-free motion estimation model is implemented on IBM TrueNorth, demonstrating its ability to detect the direction of motion of a rotating bar, with different colors representing different directions.
25:07:00 — Garrick Orchard: Can we compute speed too?
- The speaker explains how the delay-free model can also compute speed by measuring the time between excitation and inhibition, with faster motion resulting in shorter time differences and thus higher firing rates.
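The inverse relation between the excitation-to-inhibition interval and speed can be written out as simple arithmetic. The sketch below only captures that relation and does not model how the interval is turned into a firing rate; the function name and the `pixel_distance` parameter are assumptions.

```python
def speed_from_latency(t_excitation, t_inhibition, pixel_distance=1.0):
    """Speed estimate in pixels per time unit; None if the ordering is not preferred.

    Assumption: the excitatory and inhibitory regions are `pixel_distance` apart.
    """
    dt = t_inhibition - t_excitation
    if dt <= 0:
        return None
    return pixel_distance / dt

print(speed_from_latency(0.0, 5.0))   # slower motion, longer interval
print(speed_from_latency(0.0, 2.5))   # faster motion, shorter interval
```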
25:46:00 — Garrick Orchard: MOTION ESTIMATION NON-NEURAL APPROACH
- The speaker presents a non-neural approach to motion estimation based on local plane fitting, which is implemented on an FPGA and can process 100 million events per second, providing high throughput for optical flow estimation.
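Local plane fitting can be sketched as a least-squares fit of event timestamps over pixel coordinates. This is a plain software illustration, not the FPGA design from the talk: it assumes events are `(x, y, t)` tuples from one small neighborhood, omits outlier rejection, and uses hypothetical function names.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(events):
    """Least-squares fit of t = a*x + b*y + c; returns (a, b, c) or None if degenerate."""
    n = len(events)
    sx = sum(x for x, _, _ in events)
    sy = sum(y for _, y, _ in events)
    st = sum(t for _, _, t in events)
    sxx = sum(x * x for x, _, _ in events)
    syy = sum(y * y for _, y, _ in events)
    sxy = sum(x * y for x, y, _ in events)
    sxt = sum(x * t for x, _, t in events)
    syt = sum(y * t for _, y, t in events)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxt, syt, st]
    d = det3(normal)
    if abs(d) < 1e-12:
        return None  # e.g. all events lie on one line in the image plane
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        m = [row[:] for row in normal]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d)
    return tuple(solution)

def flow_from_plane(a, b):
    """Convert time gradients (time per pixel) into a velocity (pixels per time)."""
    g2 = a * a + b * b
    if g2 == 0:
        return None  # flat plane: no measurable motion
    return (a / g2, b / g2)

# An edge moving in +x at 2 pixels per time unit: t = x / 2.
events = [(x, y, 0.5 * x) for x in range(3) for y in range(3)]
a, b, c = fit_plane(events)
print(flow_from_plane(a, b))
```

The slope of the fitted plane is the time the edge takes to cross one pixel, so the velocity is the gradient divided by its squared magnitude.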
26:12:00 — Garrick Orchard: QUADCOPTER RECORDINGS
- The motion estimation techniques are applied to real-world quadcopter recordings, demonstrating their ability to estimate motion in complex, textured environments.
26:16:00 — Garrick Orchard: MOTION ESTIMATION ON FPGA
- A video shows the FPGA implementation of the non-neural motion estimation approach processing quadcopter data in real-time, correctly estimating the direction of motion in a highly textured environment.
26:48:00 — Garrick Orchard: LAB TEAM
- The speaker thanks his lab team members, acknowledging their contributions to the presented work.

Key Takeaways

Event-based sensors and spiking neural networks offer significant advantages in power efficiency and real-time processing for embedded visual applications.
Bio-inspired models like HFIRST can be implemented effectively on embedded hardware (FPGA, SpiNNaker) for tasks like character and digit recognition.
The transition to Deep Learning with platforms like IBM TrueNorth is crucial for scaling up event-based vision systems to handle larger and more complex datasets.
Spiking neural networks can be designed to perform motion estimation by leveraging synaptic delays or by using delay-free mechanisms inspired by biological vision.
Developing larger, more realistic neuromorphic datasets is essential for advancing the field of event-based vision, similar to how large datasets propelled traditional computer vision.

Methods / Models / Datasets Mentioned

ATIS
Spiking Neural Network
HFIRST
HMAX
Gabor Filters
Local Max Pooling
Template Matching
DVS128
FPGA
SpiNNaker
Deep Learning
IBM TrueNorth
N-MNIST
Caltech101
MNIST
STDPP (Spike-Time Dependent Delay Plasticity)
Barlow-Levick Model
Local Plane Fitting

Topics

Event-based Vision · Spiking Neural Networks (SNN) · Bio-inspired Processing · Embedded Systems · Neuromorphic Computing · IBM TrueNorth · FPGA Implementation · Object Recognition · Motion Estimation · STDPP (Spike-Time Dependent Delay Plasticity) · Barlow-Levick Model
