WAN Jun, ZHOU Kai, HE Wenlei. Lightweight Multisource Object Detection Based on Group Feature Extraction[J]. Infrared Technology, 2025, 47(3): 307-315.

Lightweight Multisource Object Detection Based on Group Feature Extraction

More Information
  • Received Date: May 21, 2023
  • Revised Date: July 02, 2023
  • Abstract: To balance accuracy and efficiency in multisource object detection networks, a lightweight infrared and visible-light object detection model was designed by applying group convolution to multimodal object features, together with a multiscale attention structure and an improved bounding-box filtering strategy. First, multiple feature dimensionality-reduction strategies were adopted to sample the input image and reduce the impact of noise and redundant information. Next, the features were grouped according to the modality of each feature channel, and depthwise separable convolution was used to extract infrared, visible, and fused features, enhancing the diversity and efficiency of the extracted multisource feature structures. Then, an improved attention mechanism was used to enhance key multimodal features along several dimensions and was combined with a neighborhood multiscale fusion structure to ensure the scale invariance of the network. Finally, an optimized non-maximum suppression algorithm was used to merge the predictions for objects at various scales so that each object is detected accurately. Experimental results on the public KAIST, FLIR, and RGBT thermal datasets show that the proposed model improves object detection performance compared with multisource object detection methods of the same type.
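
The abstract attributes the model's light weight to group convolution and separable convolution. As a rough illustration of why these operations reduce model size, the sketch below counts convolution weights using the standard formulas. The layer sizes (96 input and output channels, 3 × 3 kernels) and the choice of three groups (one each for infrared, visible, and fused features) are hypothetical values picked for illustration; they are not taken from the paper.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given number of groups (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return k * k * (c_in // groups) * c_out


def separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (groups == c_in) followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


# Hypothetical layer: 96 -> 96 channels, 3 x 3 kernel (assumed, not from the paper).
C_IN, C_OUT, K = 96, 96, 3

standard = conv_params(C_IN, C_OUT, K)             # ordinary convolution
grouped = conv_params(C_IN, C_OUT, K, groups=3)    # one group per assumed modality
separable = separable_params(C_IN, C_OUT, K)       # depthwise + pointwise

print(standard, grouped, separable)  # prints "82944 27648 10080"
```

Under these assumed sizes, splitting the channels into three groups cuts the weight count to one third, and the depthwise-plus-pointwise factorization cuts it to roughly one eighth. The actual savings in the proposed network depend on its real layer configuration, which the abstract does not give.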

