基于孪生网络的无人机目标多模态融合检测

韩自强, 岳明凯, 张骢, 高棋

韩自强, 岳明凯, 张骢, 高棋. 基于孪生网络的无人机目标多模态融合检测[J]. 红外技术, 2023, 45(7): 739-745.
HAN Ziqiang, YUE Mingkai, ZHANG Cong, GAO Qi. Multimodal Fusion Detection of UAV Target Based on Siamese Network[J]. Infrared Technology, 2023, 45(7): 739-745.


Funding: 

General Program of Basic Scientific Research of the Department of Education of Liaoning Province (LJKMZ20220605)

    About the authors:

    HAN Ziqiang (1999-), male, master's degree candidate. His research interests include detection, control, and information countermeasure technology. E-mail: hanzq@sylu.edu.cn

    Corresponding author:

    YUE Mingkai (1971-), male, professor, Ph.D. His research interests include weapon system safety control and detection, control, and damage technology. E-mail: 13032486996@163.com

  • CLC number: TN219

Multimodal Fusion Detection of UAV Target Based on Siamese Network

  • 摘要: 为解决小型无人机“黑飞”对公共领域的威胁问题,基于无人机目标多模态图像信息,文中提出一种轻量化多模态自适应融合孪生网络(Multimodal adaptive fusion Siamese network,MAFS)。设计一种全新的自适应融合策略,该融合模块通过定义两个模型训练参数赋予不同模态权重以实现自适应融合;本文在Ghost PAN基础上进行结构重建,构建一种更适合无人机目标检测的金字塔融合结构。消融实验结果表明,本文算法各个模块对无人机目标检测精度均有提升;多算法对比实验结果表明,本文算法鲁棒性更强,与Nanodet Plus-m相比,在检测时间基本不变的情况下mAP提升9%。
    Abstract: To address the threat that unauthorized ("black") flights of small UAVs pose to public areas, a lightweight multimodal adaptive fusion Siamese network (MAFS) is proposed in this paper, based on multimodal image information of the unmanned aerial vehicle (UAV) target. A new adaptive fusion strategy is designed, in which the fusion module assigns different weights to the two modalities through two learnable training parameters to achieve adaptive fusion. In addition, the Ghost PAN is structurally rebuilt into a pyramid fusion structure better suited to UAV target detection. Ablation experiments show that every module of the proposed algorithm improves UAV detection accuracy, and comparison experiments with multiple algorithms show that the proposed algorithm is more robust: compared with Nanodet Plus-m, mAP increases by 9% while the detection time remains essentially unchanged.
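    The adaptive fusion described above amounts to a weighted sum of the two modality feature maps, with the two training parameters acting as the weighting factors μ and λ of Table 5. A minimal PyTorch sketch is given below; the module and attribute names (AdaptiveFusion, mu, lam) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Weighted sum of two modality feature maps with two learnable scalars.

    Illustrative sketch only: the parameters mirror the mu/lambda weighting
    factors of Table 5, but this is not the authors' implementation.
    """
    def __init__(self):
        super().__init__()
        self.mu = nn.Parameter(torch.tensor(1.0))   # weight for the infrared branch
        self.lam = nn.Parameter(torch.tensor(1.0))  # weight for the visible branch

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        # Both feature maps must have the same shape; the two weights are
        # trained jointly with the rest of the network.
        return self.mu * feat_ir + self.lam * feat_vis


if __name__ == "__main__":
    fusion = AdaptiveFusion()
    ir = torch.randn(1, 116, 52, 52)   # e.g. Stage2 output of the infrared branch
    vis = torch.randn(1, 116, 52, 52)  # matching output of the visible branch
    print(fusion(ir, vis).shape)       # torch.Size([1, 116, 52, 52])
```

    Because the two scalars are learned together with the detector, the network can shift weight toward whichever modality is more informative for the current data.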
  • 图  1   MAFS网络模型

    Figure  1.   MAFS network model

    图  2   多模态融合策略

    Figure  2.   Multimodal fusion strategies

    图  3   改进的Ghost PAN

    Figure  3.   Improved Ghost PAN

    图  4   部分测试集数据

    Figure  4.   Part of the test set data

    图  5   多模态特征融合偏差

    Figure  5.   Multimodal feature fusion deviation

    图  6   不同算法检测结果

    Figure  6.   Detection results of different algorithms

    表  1   编码器组成结构

    Table  1   Encoder composition

    Layer Output size Channel
    Image 416×416 3
    Conv1 208×208 24
    MaxPool 104×104
    Stage2 52×52 116
    Stage3 26×26 232
    Stage4 13×13 464
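    The layer layout in Table 1 matches the stage configuration of ShuffleNetV2 1.0× (reference [16]). As a quick shape check, a randomly initialized torchvision ShuffleNetV2 reproduces the output sizes listed above for a 416×416 input; this sketch is only a verification of the feature-map sizes, not the authors' encoder.

```python
import torch
from torchvision.models import shufflenet_v2_x1_0

# Randomly initialized ShuffleNetV2 1.0x from torchvision, used only to
# confirm the spatial sizes and channel counts listed in Table 1.
model = shufflenet_v2_x1_0()

x = torch.randn(1, 3, 416, 416)  # Image: 416x416, 3 channels
x = model.conv1(x)               # Conv1   -> (1, 24, 208, 208)
x = model.maxpool(x)             # MaxPool -> (1, 24, 104, 104)
for name in ("stage2", "stage3", "stage4"):
    x = getattr(model, name)(x)
    print(name, tuple(x.shape))
# stage2 (1, 116, 52, 52)
# stage3 (1, 232, 26, 26)
# stage4 (1, 464, 13, 13)
```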

    表  2   实验配置

    Table  2   Experimental configuration

    Parameters Configuration
    Operating system Ubuntu 20.04
    RAM 32 GB
    CPU Intel Core i5-12400
    GPU GeForce RTX 3060
    GPU acceleration environment CUDA 11.3
    Training framework PyTorch

    表  3   算法实现的具体参数配置

    Table  3   Specific parameter configuration for algorithm implementation

    Parameters Configuration
    Model MAFSnet
    Training rounds 100
    Batch size 32
    Optimizer SGD
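    Table 3 maps directly onto a PyTorch optimizer setup. The sketch below reflects the listed values (SGD, 100 training rounds, batch size 32); the stand-in model, learning rate, and momentum are illustrative assumptions not given in the table.

```python
import torch
from torch import nn

# Stand-in module so the optimizer has parameters to manage (not MAFSnet itself).
model = nn.Conv2d(3, 24, kernel_size=3, stride=2, padding=1)

# Values from Table 3; lr and momentum are illustrative assumptions.
EPOCHS = 100
BATCH_SIZE = 32
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
```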

    表  4   多模态融合策略

    Table  4   Multimodal fusion strategy

    Fusion Strategy AP50/% mAP/%
    Add 83.00 38.26
    Mul 75.17 31.86
    Cat 66.32 25.91
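    The three fusion baselines compared in Table 4 are element-wise addition, element-wise multiplication, and channel concatenation. The short sketch below (with illustrative tensor shapes) makes the difference explicit; the adaptive scheme sketched after the abstract weights the Add variant, which Table 4 shows to be the strongest of the three baselines.

```python
import torch

# Two same-shape feature maps standing in for the infrared and visible branches.
ir = torch.randn(1, 116, 52, 52)
vis = torch.randn(1, 116, 52, 52)

fused_add = ir + vis                     # "Add": element-wise sum, channels unchanged
fused_mul = ir * vis                     # "Mul": element-wise product, channels unchanged
fused_cat = torch.cat([ir, vis], dim=1)  # "Cat": channel concatenation, channels doubled

print(fused_add.shape, fused_mul.shape, fused_cat.shape)
# torch.Size([1, 116, 52, 52]) torch.Size([1, 116, 52, 52]) torch.Size([1, 232, 52, 52])
```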

    表  5   多模态融合的权重因子

    Table  5   Weighting factors for multimodal fusion

    μ λ AP50/% mAP/%
    80.60 36.09
    80.71 36.69
    81.69 37.59
    83.00 38.26

    表  6   损失函数

    Table  6   Loss function

    GFLoss GIoULoss AP50/% mAP/%
    80.83 37.24
    78.17 36.87
    83.00 38.26
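    Of the two loss terms ablated in Table 6, the GIoU loss has a compact closed form: IoU minus the fraction of the smallest enclosing box not covered by the union. The sketch below is a generic implementation for boxes in (x1, y1, x2, y2) format, not the code used in the paper; GFLoss (reference [17]) is omitted here.

```python
import torch

def giou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Generic GIoU loss for boxes in (x1, y1, x2, y2) format, averaged over boxes."""
    # Intersection rectangle
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)

    # Union and IoU
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Smallest enclosing box
    ex1 = torch.min(pred[..., 0], target[..., 0])
    ey1 = torch.min(pred[..., 1], target[..., 1])
    ex2 = torch.max(pred[..., 2], target[..., 2])
    ey2 = torch.max(pred[..., 3], target[..., 3])
    enclose = (ex2 - ex1) * (ey2 - ey1)

    giou = iou - (enclose - union) / (enclose + eps)
    return (1.0 - giou).mean()

print(giou_loss(torch.tensor([[50., 50., 150., 150.]]),
                torch.tensor([[60., 60., 160., 160.]])))  # ~0.336
```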

    表  7   改进Ghost PAN的消融实验结果

    Table  7   Ablation experiment results of the improved Ghost PAN

    Model AP50/% mAP/% Flops/G Params/M Latency/s
    Nanodet Plus-Ghost PAN 79.10 35.14 0.757 4.164 0.0082
    MAFSnet-Ghost PAN 79.66 36.26 0.757 4.164 0.0082
    MAFSnet-our PAN 80.35 37.79 0.823 4.201 0.0086
    MAFSnet-our PAN* 83.00 38.26 0.895 4.397 0.0090

    表  8   不同算法结果对比

    Table  8   Comparison of the results of different algorithms

    Model Input shape AP50/% mAP/% Flops/G Params/M Latency/s
    YOLOv6-n 640×640 82.44 38.00 2.379 4.63 0.0068
    YOLOX-tiny 416×416 82.70 36.52 3.199 5.033 0.0055
    YOLOv8-n 640×640 81.99 37.52 1.72 3.011 0.0062
    Nanodet Plus-m 416×416 79.1 35.1 0.757 4.164 0.0082
    Ours 416×416 83.00 38.26 0.895 4.397 0.0090
  • [1] 张辰, 赵红颖, 钱旭. 面向无人机影像的目标特征跟踪方法研究[J]. 红外技术, 2015, 37(3): 224-228, 239. http://hwjs.nvir.cn/article/id/hwjs201503010

    ZHANG Chen, ZHAO Hongying, QIAN Xu. Research on Target Feature Tracking Method for UAV Images[J]. Infrared Technology, 2015, 37(3): 224-228, 239. http://hwjs.nvir.cn/article/id/hwjs201503010

    [2] 王宁, 李哲, 梁晓龙, 等. 无人机单载荷目标检测及定位联合实现方法[J]. 电光与控制, 2021, 28(11): 94-100. https://www.cnki.com.cn/Article/CJFDTOTAL-DGKQ202111021.htm

    WANG Ning, LI Zhe, LIANG Xiaolong, et al. Joint realization method of single payload target detection and positioning of UAV[J]. Electro-optic and Control, 2021, 28(11): 94-100. https://www.cnki.com.cn/Article/CJFDTOTAL-DGKQ202111021.htm

    [3] 杨欣, 王刚, 李椋, 等. 基于深度卷积神经网络的小型民用无人机检测研究进展[J]. 红外技术, 2022, 44(11): 1119-1131. http://hwjs.nvir.cn/article/id/f016be54-e981-4314-b634-7c05912eb61e

    YANG Xin, WANG Gang, LI Liang, et al. Research progress in detection of small civilian UAVs based on deep convolutional neural networks [J]. Infrared Technology, 2022, 44(11): 1119-1131. http://hwjs.nvir.cn/article/id/f016be54-e981-4314-b634-7c05912eb61e

    [4] 粟宇路, 苏俊波, 范益红, 等. 红外中长波图像彩色融合方法研究[J]. 红外技术, 2019, 41(4): 335-340. http://hwjs.nvir.cn/article/id/hwjs201904007

    SU Yulu, SU Junbo, FAN Yihong, et al. Research on color fusion method of infrared medium and long wave images [J]. Infrared Technology, 2019, 41(4): 335-340. http://hwjs.nvir.cn/article/id/hwjs201904007

    [5] 陈旭, 彭冬亮, 谷雨. 基于改进YOLOv5s的无人机图像实时目标检测[J]. 光电工程, 2022, 49(3): 69-81. https://www.cnki.com.cn/Article/CJFDTOTAL-GDGC202203006.htm

    CHEN Xu, PENG Dongliang, GU Yu. Real-time target detection of UAV images based on improved YOLOv5s [J]. Optoelectronic Engineering, 2022, 49(3): 69-81. https://www.cnki.com.cn/Article/CJFDTOTAL-GDGC202203006.htm

    [6] Redmon J, Farhadi A. YOLO9000: Better, Faster, Stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 6517-6525.

    [7] Bertinetto L, Valmadre J, Henriques J F, et al. Fully-convolutional Siamese networks for object tracking[J/OL]. Computer Vision and Pattern Recognition, 2016. https://arxiv.org/abs/1606.09549

    [8] 闫号, 戴佳佳, 龚小溪, 等. 基于多源图像融合的光伏面板缺陷检测[J]. 红外技术, 2023, 45(5): 488-497. http://hwjs.nvir.cn/article/id/9de7d764-d0af-4af8-9eb1-a1b94186c243

    YAN Hao, DAI Jiajia, GONG Xiaoxi, et al. Photovoltaic panel defect detection based on multi-source image fusion [J]. Infrared Technology, 2023, 45(5): 488-497. http://hwjs.nvir.cn/article/id/9de7d764-d0af-4af8-9eb1-a1b94186c243

    [9] MA Jiayi, MA Yong, LI Chang. Infrared and visible image fusion methods and applications: A survey[J]. Information Fusion, 2019, 45: 153-178.

    [10] 白玉, 侯志强, 刘晓义, 等. 基于可见光图像和红外图像决策级融合的目标检测算法[J]. 空军工程大学学报: 自然科学版, 2020, 21(6): 53-59, 100. https://www.cnki.com.cn/Article/CJFDTOTAL-KJGC202006009.htm

    BAI Yu, HOU Zhiqiang, LIU Xiaoyi, et al. Target detection algorithm based on decision-level fusion of visible light images and infrared images [J]. Journal of Air Force Engineering University: Natural Science Edition, 2020, 21(6): 53-59, 100. https://www.cnki.com.cn/Article/CJFDTOTAL-KJGC202006009.htm

    [11] 宁大海, 郑晟. 可见光和红外图像决策级融合目标检测算法[J]. 红外技术, 2023, 45(3): 282-291. http://hwjs.nvir.cn/article/id/5340b616-c317-4372-9776-a7c81ca2c729

    NING Dahai, ZHENG Sheng. Decision-level Fusion Object Detection Algorithm for Visible and Infrared Images[J]. Infrared Technology, 2023, 45(3): 282-291. http://hwjs.nvir.cn/article/id/5340b616-c317-4372-9776-a7c81ca2c729

    [12] 马野, 吴振宇, 姜徐. 基于红外图像与可见光图像特征融合的目标检测算法[J]. 导弹与航天运载技术, 2022(5): 83-87. https://www.cnki.com.cn/Article/CJFDTOTAL-DDYH202205016.htm

    MA Ye, WU Zhenyu, JIANG Xu. Target detection algorithm based on feature fusion of infrared image and visible light image [J]. Missile and Space Vehicle Technology, 2022(5): 83-87. https://www.cnki.com.cn/Article/CJFDTOTAL-DDYH202205016.htm

    [13] 刘建华, 尹国富, 黄道杰. 基于特征融合的可见光与红外图像目标检测[J]. 激光与红外, 2023, 53(3): 394-401. https://www.cnki.com.cn/Article/CJFDTOTAL-JGHW202303010.htm

    LIU Jianhua, YIN Guofu, HUANG Daojie. Object detection in visible and infrared images based on feature fusion [J]. Laser and Infrared, 2023, 53(3): 394-401. https://www.cnki.com.cn/Article/CJFDTOTAL-JGHW202303010.htm

    [14] 解宇敏, 张浪文, 余孝源, 等. 可见光-红外特征交互与融合的YOLOv5目标检测算法[J/OL]. 控制理论与应用, http://kns.cnki.net/kcms/detail/44.1240.TP.20230511.1643.024.html.

    XIE Yumin, ZHANG Langwen, YU Xiaoyuan, et al. YOLOv5 target detection algorithm based on interaction and fusion of visible light-infrared features[J/OL]. Control Theory & Applications, http://kns.cnki.net/kcms/detail/44.1240.TP.20230511.1643.024.html.

    [15] RangiLyu. NanoDet-Plus: Super fast and high accuracy lightweight anchor-free object detection model[EB/OL]. https://github.com/RangiLyu/nanodet, 2021.

    [16] MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: Practical guidelines for efficient CNN architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 116-131.

    [17] LI X, WANG W, WU L, et al. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection[J]. Advances in Neural Information Processing Systems, 2020, 33: 21002-21012.

    [18] JIANG Nan, WANG Kuiran, PENG Xiaoke. Anti-UAV: A large-scale benchmark for vision-based UAV tracking[J]. IEEE Transactions on Multimedia, 2023, 25: 486-500. DOI: 10.1109/TMM.2021.3128047.

    [19] ZHAO J, WANG G, LI J, et al. The 2nd Anti-UAV workshop & challenge: Methods and results[J/OL]. arXiv preprint arXiv: 2108.09909, 2021.

    [20] LI C, LI L, JIANG H, et al. YOLOv6: A single-stage object detection framework for industrial applications[J/OL]. arXiv preprint arXiv: 2209.02976, 2022.

    [21] GE Z, LIU S, WANG F, et al. YOLOX: Exceeding YOLO series in 2021[J/OL]. arXiv preprint arXiv: 2107.08430, 2021.

    [22] Github. YOLOv5[EB/OL]. https://github.com/ultralytics/yolov5, 2021.

    [23] LI B, XIAO C, WANG L, et al. Dense nested attention network for infrared small target detection[J]. IEEE Transactions on Image Processing, 2022, 32: 1745-1758.

Publication history
  • Received: 2023-05-31
  • Revised: 2023-06-20
  • Published: 2023-07-19
