YANG Peilong, CHEN Shuyue, YANG Shangyu, WANG Jiahong. Two-Stream Residual Dilation Network Algorithm for Crowd Counting Based on RGB-T Images[J]. Infrared Technology, 2023, 45(11): 1177-1186.

Two-Stream Residual Dilation Network Algorithm for Crowd Counting Based on RGB-T Images

More Information
  • Received Date: July 12, 2022
  • Revised Date: September 12, 2022
  • We propose a multimodal crowd counting algorithm based on RGB-Thermal (RGB-T) images, the two-stream residual dilation network, to address scale variation, uneven pedestrian distribution, and poor imaging conditions at night. The network consists of a front-end feature extraction network, a multi-scale residual dilated convolution module, and a global attention module. The front-end network extracts RGB and thermal features, the dilated convolution module further extracts pedestrian features at different scales, and the global attention module establishes dependencies between global features. We also introduce a new multi-scale dissimilarity loss to improve the counting performance of the network, and we evaluate the method through comparative experiments on the RGBT crowd counting (RGBT-CC) and DroneRGBT datasets. On the RGBT-CC dataset, the grid average mean absolute error (GAME(0)) and root mean squared error (RMSE) of the proposed algorithm are 0.8 and 3.49 lower, respectively, than those of the cross-modal collaborative representation learning (CMCRL) algorithm. On the DroneRGBT dataset, the corresponding errors are 0.34 and 0.17 lower, respectively, than those of the multimodal crowd counting network (MMCCN) algorithm, indicating better counting performance.
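
The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of GAME(L), in which each density map is split into 2^L × 2^L non-overlapping cells and the absolute count errors of the cells are summed, so that GAME(0) reduces to the absolute error of the total count. The function names (`region_count`, `game`, `evaluate`) and the toy density maps are hypothetical and are not taken from the paper.

```python
import math


def region_count(density, r0, r1, c0, c1):
    """Sum a density map (list of rows) over rows r0..r1-1 and columns c0..c1-1."""
    return sum(sum(row[c0:c1]) for row in density[r0:r1])


def total_count(density):
    """Total count of a density map: the sum over all of its cells."""
    return region_count(density, 0, len(density), 0, len(density[0]))


def game(pred, gt, level=0):
    """GAME(level) for a single image (assumed standard definition).

    The map is split into 2**level x 2**level non-overlapping cells and the
    absolute count errors of the cells are summed. With level=0 there is a
    single cell, so the result is the absolute error of the total count.
    """
    h, w = len(gt), len(gt[0])
    n = 2 ** level
    total = 0.0
    for i in range(n):
        for j in range(n):
            r0, r1 = i * h // n, (i + 1) * h // n
            c0, c1 = j * w // n, (j + 1) * w // n
            total += abs(region_count(pred, r0, r1, c0, c1)
                         - region_count(gt, r0, r1, c0, c1))
    return total


def evaluate(preds, gts, level=0):
    """Return (mean GAME(level), RMSE of total counts) over a list of images."""
    n = len(gts)
    mean_game = sum(game(p, g, level) for p, g in zip(preds, gts)) / n
    squared = sum((total_count(p) - total_count(g)) ** 2
                  for p, g in zip(preds, gts))
    return mean_game, math.sqrt(squared / n)


# Toy 2x2 density maps (hypothetical data, integer counts for clarity).
gt_a, pred_a = [[1, 0], [0, 1]], [[0, 1], [1, 0]]   # same total, wrong places
gt_b, pred_b = [[2, 0], [0, 0]], [[1, 0], [0, 0]]   # total off by one

print(game(pred_a, gt_a, level=0))   # error of the total count only
print(game(pred_a, gt_a, level=1))   # also penalizes misplaced counts
print(evaluate([pred_a, pred_b], [gt_a, gt_b], level=0))
```

In the first toy pair the total count is correct, so GAME(0) is zero, while GAME(1) is nonzero because every cell is wrong. Higher levels therefore measure localization as well as counting, which is why GAME(0) alone is read as a pure counting error.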
