Low-latency localization by active LED markers tracking using a dynamic vision sensor. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 891–898. Tokyo, Japan, November 2013.
Abstract: At the current state of the art, the agility of an autonomous flying robot is limited by the speed of its sensing pipeline, as the relatively high latency and low sampling frequency limit the aggressiveness of the control strategies that can be implemented. To obtain more agile robots, we need faster sensors. A Dynamic Vision Sensor (DVS) encodes changes in the perceived brightness using an address-event representation. The latency of such sensors can be measured in microseconds, thus offering the theoretical possibility of creating a sensing pipeline whose latency is negligible compared to the dynamics of the platform. However, to use these sensors we must rethink the way we interpret visual data. We present an approach to low-latency pose tracking using a DVS and Active LED Markers (ALMs), which are LEDs blinking at high frequency (>1 kHz). The DVS time resolution is able to distinguish different frequencies, thus avoiding the need for data association. We compare the DVS approach to traditional tracking using a CMOS camera, and we show that the DVS performance is not affected by fast motion, unlike the CMOS camera, which suffers from motion blur.
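To make the idea concrete, here is a minimal Python sketch of frequency-based marker identification from a DVS event stream: the interval between consecutive ON events at a pixel gives a blink frequency, which is matched against the known marker frequencies, so no explicit data association is needed. The marker frequencies, tolerance, and event layout below are illustrative assumptions, not the implementation used in the paper (see the C++ code linked under "source code" for that).

```python
from collections import namedtuple

# A DVS event: pixel coordinates, timestamp in microseconds, and polarity
# (+1 for a brightness increase, -1 for a decrease).
Event = namedtuple("Event", ["x", "y", "t_us", "polarity"])

# Hypothetical marker frequencies in Hz; the paper uses LEDs blinking
# above 1 kHz, each at a distinct frequency.
MARKER_FREQS_HZ = {0: 1000.0, 1: 2000.0, 2: 4000.0, 3: 8000.0}
TOLERANCE = 0.2  # accept a +/-20% deviation from the nominal frequency


def classify_events(events, marker_freqs=MARKER_FREQS_HZ, tol=TOLERANCE):
    """For each pixel, measure the interval between consecutive ON events
    and assign the event to the marker whose blink frequency matches.

    Returns a list of (event, marker_id) pairs for events that could be
    attributed to a marker.
    """
    last_on = {}   # (x, y) -> timestamp of the previous ON event at that pixel
    labelled = []
    for ev in events:
        if ev.polarity <= 0:
            continue  # only ON transitions carry the blink period here
        key = (ev.x, ev.y)
        if key in last_on:
            period_us = ev.t_us - last_on[key]
            if period_us > 0:
                freq = 1e6 / period_us
                for marker_id, nominal in marker_freqs.items():
                    if abs(freq - nominal) <= tol * nominal:
                        labelled.append((ev, marker_id))
                        break
        last_on[key] = ev.t_us
    return labelled


if __name__ == "__main__":
    # Synthetic check: one pixel toggling with a 500 us period (~2 kHz)
    # should be attributed to marker 1.
    evs = [Event(10, 20, t, +1) for t in range(0, 5000, 500)]
    print(classify_events(evs))
```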
Additional materials
- Slides:
- Datasets:
  - Available at http://andrea.caltech.edu/pub/1212-DVS-data/
- Source code:
  - C++ implementation used in the experiments: https://github.com/ailab/dvs_tracking
  - Python implementation: https://github.com/AndreaCensi/env_dvs
What the DVS data looks like
These videos show a representation of the output of a dynamic vision sensor (DVS) developed by Tobi Delbruck's group at the Institute of Neuroinformatics at ETH Zurich/University of Zurich. The data was collected at the RPG lab at the University of Zurich with assistance from Jonas Strubel.
A DVS returns its data as a stream of events rather than frames. Each event corresponds to a change in the luminance of one pixel. For visualization only, the events are accumulated into histograms over a given time slice (below, the interval is 1/30, 1/1000, and 1/3000 of a second); a minimal sketch of this rendering follows the references. See these references for an introduction to DVS technology:
- S.-C. Liu and T. Delbruck, "Neuromorphic sensory systems", DOI: 10.1016/j.conb.2010.03.00
- P. Lichtsteiner, C. Posch, and T. Delbruck, "A 128×128 120 dB 15 μs Latency Asynchronous Temporal Contrast Vision Sensor", DOI: 10.1109/JSSC.2007.914337
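The following is a minimal sketch, not the code used to produce the videos, of how such a histogram rendering can be done: events are grouped into consecutive time slices and each slice is accumulated into a signed 2D histogram. The 128×128 resolution matches the DVS128 sensor described in the reference above; the function name and the synthetic event stream are illustrative assumptions.

```python
import numpy as np

# Sensor resolution of the DVS128 used in these recordings.
WIDTH, HEIGHT = 128, 128


def event_histograms(events, slice_s):
    """Group events into consecutive time slices of `slice_s` seconds and
    accumulate each slice into a 2D histogram (one image per slice).

    `events` is an iterable of (x, y, t_s, polarity) tuples, with the
    timestamp t_s in seconds and polarity +1 or -1.
    """
    frames = {}
    for x, y, t_s, polarity in events:
        frame_index = int(t_s / slice_s)
        if frame_index not in frames:
            frames[frame_index] = np.zeros((HEIGHT, WIDTH), dtype=np.int32)
        # Signed accumulation: ON and OFF events pull the pixel in opposite
        # directions, which is one common way to render DVS data.
        frames[frame_index][y, x] += polarity
    return [frames[i] for i in sorted(frames)]


if __name__ == "__main__":
    # Render the same synthetic stream at 1/30 s and at 1 ms per frame.
    rng = np.random.default_rng(0)
    stream = [(rng.integers(0, WIDTH), rng.integers(0, HEIGHT), t * 1e-5, 1)
              for t in range(10000)]  # 0.1 s of uniformly spread events
    print(len(event_histograms(stream, 1 / 30)))   # ~3 frames
    print(len(event_histograms(stream, 1e-3)))     # ~100 frames
```

Note how the same event stream yields very different renderings depending on the slice duration, which is exactly what the videos below show.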
Real-time video
In these videos, a Parrot quadcopter is seen from below by a DVS, with 4 blinking LEDs mounted near the rotors. This is the data rendered in real time (1 frame = 1/30 of a second).
1 frame = 1 millisecond
This is the data rendered such that each frame shows the events that arrived within 1 millisecond. Note that the rotation of the rotors can now be distinguished.
1 frame = 0.05 milliseconds
As the histogram interval gets smaller, fewer and fewer events are plotted in each frame.