Spectral Stereo

Ortiz Ullery

Aug 5, 2024, 9:52:00 AM

to apvoibubi

We propose Gated RCCB Stereo, a novel stereo-depth estimation method that exploits multi-modal, multi-view depth cues from RCCB and gated stereo images. We combine complementary cross-spectral features within an iterative stereo matching network using our proposed cross-spectral matching layer, which relies on intermediate depth estimates for accurate feature registration. The proposed method achieves accurate depth at long ranges, outperforming the next best existing method by 39% in MAE for ranges of 100 to 220 m, evaluated on accumulated LiDAR ground truth.
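As a rough illustration of the metric quoted above, the sketch below computes the mean absolute error (MAE) only over pixels whose ground-truth depth falls inside a distance bin, and expresses an improvement relative to a baseline. The function names, the use of `None` for pixels without a LiDAR return, and the sample numbers are assumptions made for this example, not the authors' evaluation code.

```python
def mae_in_range(pred, gt, lo=100.0, hi=220.0):
    """MAE over samples whose ground-truth depth lies in [lo, hi] metres.

    `gt` may contain None where no LiDAR return exists (assumed convention).
    Returns None if no sample falls inside the bin.
    """
    errors = [
        abs(p - g)
        for p, g in zip(pred, gt)
        if g is not None and lo <= g <= hi
    ]
    if not errors:
        return None
    return sum(errors) / len(errors)


def relative_improvement(ours, baseline):
    """Fraction by which `ours` is lower than `baseline` (0.39 means 39%)."""
    return (baseline - ours) / baseline


# Made-up depths in metres; the third and fourth samples are ignored
# (one is outside the bin, one has no ground truth).
pred = [105.0, 150.0, 50.0, 130.0]
gt = [100.0, 160.0, 40.0, None]
bin_mae = mae_in_range(pred, gt)
```

Restricting the error to a distance bin matters because errors at long range are typically much larger than at short range, so a single global MAE would hide the long-range behaviour.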

Below, example images of both image modalities (left) are shown next to the sensor setup used for collecting the data. RCCB cameras (left top row) capture 8 Mpix passive RGB images. Gated cameras (left bottom row) record Time-of-Flight data of a scene by combining active flash illumination and analog gated readout. Both sensors are complementary, with distinct strengths depending on the scenario. RCCB cameras excel in daylight (a) with high dynamic range, resolution and color. At night (b, c), gated images (gated slices here RGB-color coded by mapping each slice to one RGB color) provide strong depth cues and maintain consistent scene illumination through active illumination. This work integrates both modalities to estimate depth accurately in all ambient illumination conditions.

In the following, qualitative results of the proposed Gated RCCB Stereo are compared to existing methods that take gated images (Gated2Gated, Gated Stereo) or stereo RCCB images (CREStereo) as input. Gated2Gated uses three active gated slices and demonstrates effective exploitation of implicit depth cues in gated images. Gated Stereo uses the same gated stereo images as our method, combining gated and multi-view cues. CREStereo (RCCB) is the next best stereo depth estimation method that takes high-resolution RCCB data as input.

The proposed cross-spectral stereo architecture estimates depth from stereo RCCB and stereo gated images. Intermediate depth estimates are used for iterative fusion within the Cross-Spectral Matching (CSM) layer along the depth estimation process. The network is trained with self-supervision (Left-Right consistency for RCCB and gated images, Gated Reconstruction) and LiDAR supervision.

The Cross-Spectral Matching (CSM) layer fuses encoded features from RCCB ($F^c_l$) and gated ($F^g_l$) images. In the coarse registration step, RCCB features are aligned with gated features based on calibrated poses $X_{c \to g}$. Registration is refined based on residual pose $\hat{X}_c$ estimated from coarse aligned images and measured time delta with PoseNet. Registered images are fused with attention-based fusion retaining complementary information in $\hat{F}$.
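To make the coarse registration step more concrete, here is a minimal geometric sketch of how a pixel can be moved from one camera to another once a depth estimate and a calibrated pose are available. It assumes an ideal pinhole model; the intrinsics tuples `(fx, fy, cx, cy)`, the rotation `R`, the translation `t`, and all numeric values are hypothetical and only stand in for the calibrated pose $X_{c \to g}$. It is not the authors' implementation, which operates on feature maps inside the network.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with known depth to a 3D point in the camera frame."""
    return ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)


def transform(point, R, t):
    """Apply the rigid transform p' = R p + t (R is a 3x3 nested list)."""
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Project a 3D point onto the image plane of a pinhole camera."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def register_pixel(u, v, depth, K_src, K_dst, R, t):
    """Map a source-camera pixel into the destination camera."""
    p_src = backproject(u, v, depth, *K_src)
    return project(transform(p_src, R, t), *K_dst)


# Hypothetical calibration: identical intrinsics, 10 cm sideways offset.
K = (1000.0, 1000.0, 640.0, 360.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.1, 0.0, 0.0)

near = register_pixel(640.0, 360.0, 10.0, K, K, R, t)   # point 10 m away
far = register_pixel(640.0, 360.0, 100.0, K, K, R, t)   # point 100 m away
```

The pixel shift depends on depth (about 10 px at 10 m but only about 1 px at 100 m in this toy setup), which is why a registration based on calibrated poses needs a depth estimate and can improve as the intermediate depth estimates improve.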

[1] Stefanie Walz, Mario Bijelic, Andrea Ramazzina, Amanpreet Walia, Fahim Mannan and Felix Heide. Gated Stereo: Joint Depth Estimation from Gated and Wide-Baseline Active Stereo Cues. IEEE/CVF International Conference on Computer Vision (ICCV), 2023.

[2] Amanpreet Walia, Stefanie Walz, Mario Bijelic, Fahim Mannan, Frank Julca-Aguilar, Michael Langer, Werner Ritter and Felix Heide. Gated2Gated: Self-Supervised Depth Estimation from Gated Images. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[3] Jiankun Li, Peisen Wang, Pengfei Xiong, Tao Cai, Ziwei Yan, Lei Yang, Jiangyu Liu, Haoqiang Fan and Shuaicheng Liu. Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

[4] Tixiao Shan, Brendan Englot, Drew Meyers, Wei Wang, Carlo Ratti and Daniela Rus. LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping. The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020.

Model overview. The disparity prediction network (DPN) predicts left-right disparity for an RGB-NIR stereo input. The spectral translation network (STN) converts the left RGB image into a pseudo-NIR image. The two networks are trained simultaneously with reprojection error.

Comparison of smoothing with and without confidence. Without confidence, smoothing lets the unreliable matches on glass mislead the reliable matches around the sides of the car, so the predicted disparity (orange) comes out smaller than the correct one (red). Introducing confidence addresses this issue.
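The effect described in this caption can be imitated with a toy one-dimensional example. The sketch below uses a confidence-weighted neighbourhood average, which is a deliberately simplified stand-in and not the loss used by the authors; the function name `smooth`, the window radius, and the disparity and confidence values are all assumptions for illustration.

```python
def smooth(disparity, confidence=None, radius=1):
    """Average each value with its neighbours, weighted by confidence.

    With confidence=None every sample has weight 1, i.e. plain smoothing.
    """
    n = len(disparity)
    if confidence is None:
        confidence = [1.0] * n
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        weights = confidence[lo:hi]
        total = sum(weights)
        if total == 0:
            # No trustworthy neighbour at all: keep the original value.
            out.append(disparity[i])
        else:
            weighted = sum(w * d for w, d in zip(weights, disparity[lo:hi]))
            out.append(weighted / total)
    return out


# Reliable matches on the car body (20 px) and one bad match on glass (5 px).
disparity = [20.0, 20.0, 5.0, 20.0, 20.0]
confidence = [1.0, 1.0, 0.0, 1.0, 1.0]  # the glass pixel is not trusted

plain = smooth(disparity)
weighted = smooth(disparity, confidence)
```

In the plain result the reliable neighbours of the glass pixel are pulled down to 15, while in the confidence-weighted result they stay at 20 and the glass pixel itself is filled in from its trusted neighbours.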

August 25, 2020: v5.1.1 firmware available (minor bug-fix)
Version 5.1.1 zip file
Version 5.1.1 wav file [LOUD!]
Manual 1.1.1

The Spectral Multiband Resonator from 4ms Company is an innovative resonant filter which can process audio like a classic filter bank, ring like a marimba when struck, vocode, re-mix tracks, harmonize, output spectral data, quantize audio to scales, and much more...

A gorgeous ring of colored lights displays the frequency of each filter, as well as the levels and current scale selection(s).

At a glance, the SMR works like a normal six-band graphic EQ: six frequency band-pass filters are mixed together using sliders. Resonance/Q is variable, which changes the "ringy-ness" or width of the bands.

But here's where the comparison stops. The frequency of each channel is treated like a note in a scale, and the six bands form a chord. Spin the Rotate knob and the "notes" circle around the scale, rotating back to the bottom once they've reached the top. Adjust Spread and the distance (interval) between each note changes. Triggers for up/down motion and CV inputs for sequencing and scale selection allow for flexible control with external modules. Morph, which automatically cross-fades between frequencies, together with variable Slew, allows rhythmic clocks to drive the SMR as a variable-speed evolving resonant filter.
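One way to picture Rotate and Spread is as index arithmetic on a ring of scale slots. The sketch below assumes 20 slots per scale and six channels, as stated in the feature list, and simply wraps around at the top; the function name and the exact mapping are guesses for illustration, and the actual firmware may handle details (such as channels landing on the same slot) differently.

```python
SLOTS_PER_SCALE = 20  # filter positions shown on the light ring
NUM_CHANNELS = 6      # filter channels


def band_positions(rotate, spread):
    """Return the scale slot of each channel.

    `rotate` is the slot of the first channel, `spread` the gap in slots
    between neighbouring channels. Slots wrap around the ring.
    """
    return [
        (rotate + channel * spread) % SLOTS_PER_SCALE
        for channel in range(NUM_CHANNELS)
    ]


tight_chord = band_positions(rotate=0, spread=1)
wrapped_chord = band_positions(rotate=18, spread=2)
```

Here `tight_chord` occupies slots 0 to 5, while `wrapped_chord` starts near the top of the ring and wraps back to the bottom: 18, 0, 2, 4, 6, 8.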

At maximum Resonance/Q, the SMR can be struck like a gong or marimba by inputting clocks or triggers. The frequency of each channel is quantized to a scale: beautiful chords, ethereal tones and eerie ambiance flow easily. With lower resonance, the SMR can pull out particular frequency bands and sweep them across the spectrum.
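The link between Q, band width and ringing can be estimated with textbook formulas for an idealized second-order resonator: the bandwidth is roughly the centre frequency divided by Q, and after an impulse the envelope decays as exp(-pi * f0 * t / Q). The sketch below applies those formulas; it makes no claim about the SMR's actual filter design, and the function names and example values are chosen only for illustration.

```python
import math


def bandwidth_hz(center_hz, q):
    """Approximate -3 dB bandwidth of a band-pass filter with quality factor q."""
    return center_hz / q


def ring_time_s(center_hz, q, decay_db=60.0):
    """Time for an ideal second-order resonator to decay by `decay_db`.

    The envelope follows exp(-pi * f0 * t / q), so
    t = ln(10 ** (decay_db / 20)) * q / (pi * f0).
    """
    amplitude_ratio = 10.0 ** (decay_db / 20.0)
    return math.log(amplitude_ratio) * q / (math.pi * center_hz)


wide_band = bandwidth_hz(1000.0, 2.0)    # low Q: broad, filter-bank-like band
narrow_band = bandwidth_hz(1000.0, 100.0)  # high Q: narrow, "ringy" band
ring = ring_time_s(440.0, 100.0)           # roughly half a second at 440 Hz
```

Under this model, raising Q narrows the band and lengthens the ring in direct proportion, which matches the described move from classic band-pass behaviour to struck, marimba-like tones.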

Save your settings into one of six storage banks, and recall them on the fly. On startup, the SMR instantly jumps to the last saved settings. The color scheme of the lights can be adjusted, so if you prefer all white lights on your modular, you can have that (or all red, or rainbows, pastels, etc.)!

Basic Features:
- Six filter channels, selected from twenty active filters displayed on the light ring
- Variable "Q" (Resonance) ranges from classic band-pass to ultra-resonant ringing
- Ring of 20 full-color lights displays the filter frequencies
- Filter frequencies move about the scale using Rotation and Spread
- Rotation "spins" the frequencies around the scale
- Spread controls the gap between neighboring bands
- Morph creates a variable speed cross-fade when the filter frequencies change
- Lock buttons prevent each channel from changing in frequency (resonance can also be locked)
- Freq Nudge knobs for creating de-tuning effects, or homing in on an exact frequency
- Frequency CV input (1V/octave) for even and odd bands (flip switch selects single or multiple channel control)
- Stereo ins and outs stagger the bands into evens/odds for an immersive stereo field
- Spectral content outputs for each channel (Env Out) allow for vocoding (spectral transfer)
- Fast or Slow tracking speeds for Envelope CV outputs, or a selectable Trigger output mode that is useful for extracting a beat from music
- Sliders and CV jacks control each filter's level
- Slew switch applies slew-limiting to the CV level jacks, which prevents clicking from clocks or triggers
- Each bank has 11 scales of 20 frequencies/notes each
- User-programmable scales (select octave, chromatic note, and microtonal adjustments for each note in the scale)
- Scale banks for Western, Indian, Chromatic, Micro-tonal, Equal tempered, Just intonation tunings
- Rotation/Spread moves about an entire bank, or can be limited to a single scale
- White noise is normalized to the inputs, so the SMR can be used without an external signal

Advanced features:
- Program your own scale. The frequency of each of the 20 notes can be assigned by setting the octave, the semi-tone, and coarse and fine micro-tone. Up to 11 scales can be saved permanently in the user bank.
- Adjust color scheme of the LEDs. Pick a pre-programmed color scheme or create your own using the sliders to set Red/Green/Blue values. Custom color schemes can be saved permanently.
- Save your settings in one of six parameter banks. Note position, scale and bank selection, Q value, Lock settings and color scheme can be saved and recalled on the fly. On startup, the SMR loads settings from the last saved bank.
- Optional alternative filter type for a more exponential decay when plucked and different timbral qualities. Freq jacks no longer track 1V/oct in this mode.
- Slider LEDs can be assigned to display level for each channel (combination of slider position and CV on the jack), or clipping for each channel.
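Two items in the lists above involve simple pitch arithmetic: the 1V/octave tracking of the Freq jacks, and building a note from an octave, a semitone and a microtonal offset. The sketch below shows the standard formulas. The equal-tempered tuning, the A4 = 440 Hz reference, the octave numbering and the use of cents for the microtonal offset are assumptions for this example; the module's own scale editor may use different units and references.

```python
def cv_to_freq(base_hz, volts):
    """1V/octave: each additional volt doubles the frequency."""
    return base_hz * 2.0 ** volts


def note_to_freq(octave, semitone, cents=0.0, a4_hz=440.0):
    """Equal-tempered frequency of a note.

    `semitone` counts up from C (0) to B (11) within `octave`;
    A4 is octave 4, semitone 9. `cents` is a microtonal offset.
    """
    octaves_from_a4 = (octave - 4) + (semitone - 9) / 12.0 + cents / 1200.0
    return a4_hz * 2.0 ** octaves_from_a4


one_octave_up = cv_to_freq(440.0, 1.0)         # 880 Hz
one_semitone_up = cv_to_freq(440.0, 1.0 / 12)  # about 466.16 Hz
a5 = note_to_freq(5, 9)                        # 880 Hz
quarter_tone = note_to_freq(4, 9, cents=50.0)  # halfway between A4 and A#4
```

On this scheme a semitone corresponds to 1/12 V (about 83 mV).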

Soniformer is a spectral mastering dynamics processor, available as an AAX, AudioUnit, and VST plugin for professional music production applications. During operation, Soniformer splits the incoming sound signal into multiple spectral bands. This makes Soniformer a powerful and precise tool for mastering and sound restoration purposes.

Continuing the retro trend of 'new analogue monosynths', Swiss company Spectral Audio enter the fray with their ProTone, a 2U module generously laden with controls and ready to do your bidding. The first thing you'll notice is its colour. The review model was a cherry red, and because Spectral Audio are industry striplings, they're still able to offer purchasers the personal touch, and supply the ProTone in the colour of your choice.

The unit is compact, a mere six inches deep, and a look at the back panel reveals MIDI In and Thru, but no Out. This is because there is no processing between the synth's controls and the sound generator, and so MIDI codes can be neither generated nor received by the controls. Stereo outputs, CV and Gate sockets, and external VCO and LFO inputs complete the picture. Happily, the power supply is internal.