CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes
Authors:
Danial Qashqai,
Emad Mousavian,
Shahriar Baradaran Shokouhi,
Sattar Mirzakuchaki
Abstract:
Semantic segmentation, as a crucial component of complex visual interpretation, plays a fundamental role in autonomous vehicle vision systems. Recent studies have significantly improved the accuracy of semantic segmentation by exploiting complementary information and developing multimodal methods. Despite the gains in accuracy, multimodal semantic segmentation methods suffer from high computational complexity and low inference speed. Therefore, it is a challenging task to implement multimodal methods in driving applications. To address this problem, we propose the Cosine Similarity Fusion Network (CSFNet) as a real-time RGB-X semantic segmentation model. Specifically, we design a Cosine Similarity Attention Fusion Module (CS-AFM) that effectively rectifies and fuses features of two modalities. The CS-AFM module leverages cross-modal similarity to achieve high generalization ability. By enhancing the fusion of cross-modal features at lower levels, CS-AFM paves the way for the use of a single-branch network at higher levels. Therefore, we use dual and single-branch architectures in an encoder, along with an efficient context module and a lightweight decoder for fast and accurate predictions. To verify the effectiveness of CSFNet, we use the Cityscapes, MFNet, and ZJU datasets for the RGB-D/T/P semantic segmentation. According to the results, CSFNet has competitive accuracy with state-of-the-art methods while being state-of-the-art in terms of speed among multimodal semantic segmentation models. It also achieves high efficiency due to its low parameter count and computational complexity. The source code for CSFNet will be available at https://github.com/Danial-Qashqai/CSFNet.
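The abstract does not describe the internals of CS-AFM, so the following is only a rough illustration of the general idea of using cross-modal cosine similarity to weight a fusion, not the authors' module. The function names, the per-channel treatment, and the mapping of similarity to a weight in [0, 1] are all assumptions made for this sketch; a real model would use a tensor library rather than plain lists.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity of two equal-length vectors (eps avoids division by zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def similarity_weighted_fusion(rgb_channels, x_channels):
    """Hypothetical fusion: add each X channel to its RGB channel,
    scaled by how similar the two channels are.

    Each argument is a list of channels; each channel is a flat list of
    feature values. This is an illustrative assumption, not CS-AFM itself.
    """
    fused = []
    for rgb, x in zip(rgb_channels, x_channels):
        sim = cosine_similarity(rgb, x)   # in [-1, 1]
        weight = (sim + 1.0) / 2.0        # assumed mapping to [0, 1]
        fused.append([r + weight * xv for r, xv in zip(rgb, x)])
    return fused


if __name__ == "__main__":
    rgb = [[1.0, 0.0], [1.0, 2.0]]
    depth = [[0.0, 1.0], [2.0, 4.0]]
    print(similarity_weighted_fusion(rgb, depth))
```

In this toy version, channels whose two modalities agree contribute more of the auxiliary signal, while dissimilar channels are damped.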
Submitted 1 July, 2024;
originally announced July 2024.
Structural Properties of Search Trees with 2-way Comparisons
Authors:
Sunny Atalig,
Marek Chrobak,
Erfan Mousavian,
Jiri Sgall,
Pavel Vesely
Abstract:
Optimal 3-way comparison search trees (3WCST's) can be computed using standard dynamic programming in time O(n^3), and this can be further improved to O(n^2) by taking advantage of the Monge property. In contrast, the fastest algorithm in the literature for computing optimal 2-way comparison search trees (2WCST's) runs in time O(n^4). To shed light on this discrepancy, we study structural properties of 2WCST's. On the one hand, we show some new threshold bounds involving key weights that can be helpful in deciding which type of comparison should be at the root of the optimal tree. On the other hand, we also show that the standard techniques for speeding up dynamic programming (the Monge property / quadrangle inequality) do not apply to 2WCST's.
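For orientation, the O(n^3) dynamic program mentioned for the 3-way case can be sketched as the textbook optimal binary search tree recurrence. This is a minimal sketch under a simplifying assumption: only successful queries are counted (each key has one access weight and there are no gap weights), which may differ from the model used in the paper. The function name `optimal_3wcst_cost` is hypothetical.

```python
def optimal_3wcst_cost(weights):
    """Minimum total weighted depth of a search tree over keys 0..n-1.

    weights[i] is the access weight of key i; a key at depth d (root = 1)
    costs weights[i] * d. Assumes successful queries only.
    """
    n = len(weights)

    # prefix[k] = weights[0] + ... + weights[k-1], for O(1) interval sums.
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    # cost[i][j] = optimal cost for the half-open key interval [i, j);
    # empty intervals cost 0.
    cost = [[0] * (n + 1) for _ in range(n + 1)]

    for length in range(1, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            total = prefix[j] - prefix[i]
            # Try every key r in [i, j) as the root: O(n) choices per interval,
            # O(n^2) intervals, hence O(n^3) overall.
            cost[i][j] = total + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )

    return cost[0][n]


if __name__ == "__main__":
    print(optimal_3wcst_cost([1, 1, 1]))
```

The well-known O(n^2) speedup for this recurrence narrows the range of candidate roots tried for each interval, and its correctness rests on the quadrangle inequality; the abstract's negative result says that this kind of argument is not available for 2WCST's.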
Submitted 3 November, 2023;
originally announced November 2023.