Review of Lightweight Target Detection Algorithms
-
Abstract: Traditional deep-learning-based target detection algorithms usually require extensive computing resources and long training times, which cannot meet industrial needs. Lightweight target detection networks sacrifice some detection accuracy in exchange for faster inference and lighter models; they are well suited to edge-computing devices and have therefore received widespread attention. This study introduces the lightweight techniques commonly used to compress and accelerate models, classifies and analyzes the structural principles of lightweight backbone networks, and evaluates their practical impact on YOLOv5s. Finally, the prospects and challenges of lightweight target detection algorithms are discussed.
-
Keywords:
- deep learning /
- target detection /
- lightweight technology /
- backbone network
-
0. Introduction
Organic light-emitting devices (OLEDs) offer high luminance, short response times, wide viewing angles, and mechanical flexibility. Often called a "dream display," OLED is regarded as the mainstream display technology to follow liquid crystal displays and has already seen initial use in decorative and indoor lighting [1-6]. In recent years, high-performance top-emitting devices have become a research hotspot, and work toward them proceeds mainly along two lines: the development of new materials, such as novel organic light-emitting molecules [7], and the development of new structures, such as ultrathin structures [8], quantum-well structures [9], and microcavity structures [10]. For microcavity structures, the organic layer thicknesses are adjusted on the basis of theoretical calculation to tune the optical cavity length and obtain cavities of different mode orders; placing the device in different cavity-enhancement regions then improves its performance.
An optical microcavity is a miniature optical resonator whose dimensions are on the order of the optical wavelength. The first organic microcavity electroluminescent device was realized at Kyushu University, Japan, in 1993 [11]. Most current research on organic microcavity emission aims to improve device efficiency [12-14], whereas the color purity and stability of top-emitting devices with microcavity effects remain insufficiently studied. Building on existing device research, this work therefore introduces a second-order microcavity structure [15-16], fabricates a series of top-emitting microcavity devices, examines their optoelectronic performance over the second-order cavity-length range, and finally obtains an optimized, stable green top-emitting device that realizes standard green display.
1. Experiment
The top-emitting devices fabricated in this work adopt a simple FP (Fabry-Perot) microcavity structure [17-19]: Ag serves as the fully reflective bottom electrode, and a semi-transparent Mg/Ag metal cathode on the light-emitting top side serves as the semi-reflective mirror. All layers were deposited sequentially by vacuum evaporation; the main layers and materials are listed in Table 1. The anode is ITO; the hole injection layer (HIL) consists of the organic materials F16CuPc and NPB, with F16CuPc as the dopant; the hole transport layer (HTL) is NPB; the electron blocking layer (EBL) is TCTA; the emitting layer (EML) consists of mCP and Ir(ppy)3, with mCP as the green-emitting host and Ir(ppy)3 as the dopant; the electron transport layer (ETL) consists of Bphen and Liq, with Liq as the dopant; and the light out-coupling capping layer (CPL) is Alq3. The molecular structures of the organic materials used in the devices are shown in Fig. 1.
Table 1 Layers and materials of the device

Layer | Material
Anode | ITO
HIL | Copper(II) 1,2,3,4,8,9,10,11,15,16,17,18,22,23,24,25-hexadecafluoro-29H,31H-phthalocyanine (F16CuPc); N,N'-di-[(1-naphthyl)-N,N'-diphenyl]-1,1'-biphenyl)-4,4'-diamine (NPB)
HTL | N,N'-di-[(1-naphthyl)-N,N'-diphenyl]-1,1'-biphenyl)-4,4'-diamine (NPB)
EBL | 4,4',4''-tris(carbazol-9-yl)-triphenylamine (TCTA)
EML | 1,3-bis(9-carbazolyl)benzene (mCP); Iridium, tris[2-(2-pyridinyl-kN)phenyl-kC] (Ir(ppy)3)
ETL | 4,7-diphenyl-1,10-phenanthroline (Bphen); 8-hydroxyquinoline lithium (Liq)
Cathode | Mg/Ag
CPL | 8-hydroxyquinoline aluminum salt (Alq3)

The devices were built on silicon-based CMOS substrates developed by Yunnan North OLiGHTEK Opto-Electronic Technology Co., Ltd. The organic layers were evaporated in sequence at a deposition rate of 0.1 nm/s under a vacuum of 2×10⁻⁴ Pa. Luminance and spectra were measured with a PR-655 spectroradiometer, and current and voltage were measured with a test system equipped with a Keithley 2400 source meter.
2. Results and discussion
2.1 Effect of microcavity length on color purity
In general, top-emitting devices exhibit a microcavity effect, and the emitted spectral intensity I(λ) is given by Eq. (1) [20]:
$$ I(\lambda)=\frac{(1+R_{\mathrm{h}})\left[1+R_{\mathrm{f}}+2\sqrt{R_{\mathrm{f}}}\cos\!\left(\frac{4\pi Z}{\lambda}\right)\right]}{1+R_{\mathrm{f}}R_{\mathrm{h}}-2\sqrt{R_{\mathrm{f}}R_{\mathrm{h}}}\cos\!\left(\frac{4\pi L}{\lambda}\right)}\,I_{0}(\lambda) $$ (1)

where Rf is the reflectivity of the fully reflective mirror, Rh is the reflectivity of the semi-transparent mirror, I0(λ) is the free-space spectral intensity, L is the optical length of the microcavity, and Z is the distance between the fully reflective mirror and the emitting layer. The optical cavity length L is given by:

$$ L=\sum_{\mathrm{m}} n_{\mathrm{m}}d_{\mathrm{m}}+n_{\mathrm{ITO}}d_{\mathrm{ITO}}+\left|\frac{\lambda_{q}}{4\pi}\sum_{i}\phi_{i}(\lambda)\right|=q\frac{\lambda_{q}}{2} $$ (2)

where nm and dm are the refractive index and thickness of each organic layer; nITO and dITO are the refractive index and thickness of the ITO layer; q (= 1, 2, 3, 4, ...) is the order of the emission mode; λq is the resonant emission wavelength of mode q; ϕi(λ) is the phase shift at the organic/metal mirror interface; and i denotes the anode/organic or cathode/organic interface. Equations (1) and (2) show that adjusting the organic layer thicknesses changes the cavity length and shifts the position of cavity mode q, thereby shifting the emission wavelength of the microcavity device. To match the cavity resonance wavelength to the peak wavelength of the emitting layer's electroluminescence spectrum and obtain gain, Eq. (2) gives a total organic layer thickness of about 100 nm for the first-order cavity and about 250 nm for the second-order cavity.
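To make Eq. (1) concrete, the following is a minimal numerical sketch of the cavity output spectrum; the reflectivities R_f and R_h, the antinode distance Z, the optical length L, and the flat free-space spectrum are illustrative assumptions, not measured values for devices A-E.

```python
import numpy as np

# Minimal sketch of the Fabry-Perot output spectrum of Eq. (1). All parameter
# values below are illustrative placeholders chosen so that the q = 2 resonance
# falls near 520 nm; they are not the measured parameters of devices A-E.
def cavity_spectrum(wavelength_nm, R_f=0.95, R_h=0.5, Z_nm=260.0, L_nm=520.0, I0=1.0):
    """Spectral intensity I(lambda) of a simple FP microcavity, Eq. (1)."""
    lam = np.asarray(wavelength_nm, dtype=float)
    num = (1 + R_h) * (1 + R_f + 2 * np.sqrt(R_f) * np.cos(4 * np.pi * Z_nm / lam))
    den = 1 + R_f * R_h - 2 * np.sqrt(R_f * R_h) * np.cos(4 * np.pi * L_nm / lam)
    return num / den * I0

wl = np.linspace(450, 650, 401)                     # visible band, nm
spec = cavity_spectrum(wl)
print("resonance near", wl[np.argmax(spec)], "nm")  # ~520 nm, i.e. L = q*lambda/2 with q = 2
```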
By adjusting the thicknesses of the hole transport layer and the electron blocking layer, five devices A-E with different microcavity lengths were fabricated, as shown in Fig. 2. Their structure is Si substrate/Ag/ITO/NPB:F16CuPc (10 nm, 3%)/NPB (x nm)/TCTA (y nm)/mCP:Ir(ppy)3 (40 nm, 6%)/Bphen:Liq (30 nm, 40%)/Mg/Ag (12 nm)/Alq3 (35 nm), where x is the thickness of the hole transport layer (NPB) and y is the thickness of the electron blocking layer (TCTA). For devices A-E, x is 30, 30, 60, 20, and 120 nm and y is 20, 15, 20, 15, and 40 nm, respectively, giving total organic layer thicknesses of 130 nm, 125 nm, 160 nm, 115 nm, and 240 nm (e.g., for device E: 10 + 120 + 40 + 40 + 30 = 240 nm).
Figure 3 shows the EL spectra of the devices with different cavity lengths. Devices A, B, C, and D each show a strong peak at 524 nm and a weak shoulder at 556 nm, 552 nm, 560 nm, and 560 nm, respectively, whereas device E shows only a single peak at 520 nm. In the order C→A→B→D→E, the long-wavelength side of the spectrum narrows noticeably, the spectrum shifts toward shorter wavelengths (a blue shift), and the shoulder near 560 nm weakens until it disappears. This behavior is caused by the microcavity effect: according to cavity quantum electrodynamics, the mode density of the optical field inside the cavity is modulated, being enhanced at the resonance wavelength and suppressed at other wavelengths, so the spectrum is narrowed [21]. The strength of the microcavity effect is commonly characterized by the full width at half maximum (FWHM); the calculated FWHM decreases from 84 nm to 33 nm in the order C→A→B→D→E, indicating a progressively stronger microcavity effect.
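A minimal sketch of how the FWHM figure used here can be computed from a sampled spectrum; the synthetic Gaussian peak below merely stands in for a measured EL spectrum (e.g., from the PR-655).

```python
import numpy as np

# Minimal FWHM estimate for a single-peaked spectrum sampled on a wavelength grid.
def fwhm(wavelengths, intensity):
    """Full width at half maximum, in the same unit as `wavelengths`."""
    half = intensity.max() / 2.0
    above = wavelengths[intensity >= half]   # samples where the spectrum exceeds half the peak
    return above.max() - above.min()

wl = np.linspace(450, 650, 2001)
peak = np.exp(-0.5 * ((wl - 520) / 14.0) ** 2)    # synthetic 520 nm peak, sigma = 14 nm
print(f"FWHM = {fwhm(wl, peak):.1f} nm")          # ~2.355 * 14 = 33 nm, like device E
```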
The luminescent performance of the devices with different cavity lengths is listed in Table 2. Among devices A-E, D performs best in luminance, current efficiency, and external quantum efficiency, followed by B, while C performs worst; E shows the smallest color-coordinate shift. This is mainly because D lies in the first-order enhancement region, E lies in the second-order enhancement region, and C lies far from any enhancement region. When the cavity length lies in the first-order enhancement region, the optoelectronic efficiency of the device is enhanced; in the second-order enhancement region the efficiency is lower than in the first-order region [22-23], but the color purity is clearly higher, showing that operating in the second-order enhancement region markedly improves the color purity of the device.
Table 2 Optoelectronic performance of devices with different cavity lengths

Device | Luminance/(cd/m²) | Current efficiency/(cd/A) | Peak wavelength/nm | FWHM/nm | External quantum efficiency/% | CIE (x, y) | Color shift (CIE 1931)
A | 6330 | 33.80 | 524 | 73 | 9.19 | (0.3713, 0.6019) | (0.1613, 0.1081)
B | 7439 | 39.73 | 524 | 70 | 10.59 | (0.3601, 0.6110) | (0.1501, 0.0990)
C | 2198 | 11.74 | 524 | 84 | 3.39 | (0.3959, 0.5821) | (0.1859, 0.1279)
D | 9123 | 48.72 | 524 | 66 | 12.75 | (0.3436, 0.6243) | (0.1336, 0.0857)
E | 5477 | 29.25 | 520 | 33 | 7.67 | (0.2092, 0.7167) | (0.0008, 0.0067)

Further measurements show that all the fabricated devices have very stable color coordinates, as shown in Fig. 4. For devices A-E, CIEx and CIEy rise briefly at low voltage and then remain steady once the voltage reaches 2.8 V. Over the whole range, device E stands clearly apart from the others: its CIEx drops to about 0.2 and its CIEy rises to about 0.71. The reason is that devices A-D each have a weak shoulder at 556 nm, 552 nm, 560 nm, and 560 nm, respectively, which pulls the color coordinates away from green so that the emission appears yellow-green, whereas device E has a single peak and, once turned on, emits light very close to standard green (0.21, 0.71), as shown in Fig. 4(c). This result again demonstrates that a cavity length in the second-order enhancement region clearly improves the color purity of the device.
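As a hedged check, the Color shift column in Table 2 appears to be the absolute CIE 1931 offset of each device from the standard green point (0.21, 0.71); the tabulated values can be reproduced directly from the CIE coordinates.

```python
# Hedged check of the "Color shift" column in Table 2: absolute CIE 1931 offset
# from the standard green point (0.21, 0.71).
STANDARD_GREEN = (0.21, 0.71)
cie = {"A": (0.3713, 0.6019), "B": (0.3601, 0.6110), "C": (0.3959, 0.5821),
       "D": (0.3436, 0.6243), "E": (0.2092, 0.7167)}
for device, (x, y) in cie.items():
    dx, dy = abs(x - STANDARD_GREEN[0]), abs(y - STANDARD_GREEN[1])
    print(f"device {device}: color shift = ({dx:.4f}, {dy:.4f})")   # matches Table 2
```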
2.2 Effect of hole transport layer and electron blocking layer thickness on cavity length
The results above show that the color purity improves markedly when the microcavity length lies in the second-order enhancement region. To verify whether the hole transport layer and the electron blocking layer contribute equally to the cavity length when the device operates in the second-order enhancement region, device E1 was fabricated with a 40 nm hole transport layer and a 120 nm electron blocking layer, all other conditions unchanged. Table 3 shows that E and E1 perform almost identically in luminance, current efficiency, and external quantum efficiency. The EL spectra (Fig. 5) and color coordinates (Fig. 6) also show that the two spectra essentially overlap and that CIEx and CIEy change little. These results indicate that the hole transport layer and the electron blocking layer play equivalent roles in setting the microcavity length, and either can be used to tune the color purity effectively.
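A small illustrative check of this equivalence: assuming NPB and TCTA have similar refractive indices (n ≈ 1.8 is an assumed value, not reported in this work), exchanging thickness between the HTL and the EBL leaves their optical contribution to the cavity length of Eq. (2) unchanged.

```python
# Hedged check of why E and E1 behave almost identically: with similar (assumed)
# refractive indices for NPB and TCTA, swapping thickness between the HTL and the
# EBL keeps the optical contribution n*d to Eq. (2) constant.
n_NPB, n_TCTA = 1.8, 1.8                          # assumed indices, for illustration only
thicknesses = {"E": (120, 40), "E1": (40, 120)}   # (HTL, EBL) thickness in nm, from the text
for device, (d_htl, d_ebl) in thicknesses.items():
    print(device, "optical n*d of HTL + EBL =", n_NPB * d_htl + n_TCTA * d_ebl, "nm")
```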
Table 3 Optoelectronic performance of devices with different HTL & EBL thicknesses

Device | Luminance/(cd/m²) | Current efficiency/(cd/A) | Peak wavelength/nm | FWHM/nm | External quantum efficiency/% | CIE (x, y) | Color shift (CIE 1931)
E | 5477 | 29.25 | 520 | 33 | 7.67 | (0.2092, 0.7167) | (0.0008, 0.0067)
E1 | 5261 | 28.09 | 520 | 32 | 7.58 | (0.2079, 0.7173) | (0.0021, 0.0073)

3. Conclusions
For top-emitting green devices with the structure Si substrate/Ag/ITO/NPB:F16CuPc (10 nm, 3%)/NPB (x nm)/TCTA (y nm)/mCP:Ir(ppy)3 (40 nm, 6%)/Bphen:Liq (30 nm, 40%)/Mg/Ag (12 nm)/Alq3 (35 nm), adjusting the thicknesses of the hole transport layer and the electron blocking layer so that the device operates in the second-order microcavity enhancement region clearly narrows the spectrum and greatly improves the color purity. Further study shows that the hole transport layer and the electron blocking layer play equivalent roles in setting the cavity length, and either can be used to tune the color purity effectively. With a cavity length of 240 nm, the device achieves a stable, high-color-purity green display: the color coordinates of the forward-emitted green light reach (0.2092, 0.7167), close to standard green (0.21, 0.71). These results provide a useful reference for the application of second-order-cavity green devices.
-
Table 1 Convolution operator inference test

Name | All time/s | Mean time/ms | FPS | GFLOPs | Params/k
Conv | 16.3 | 5.44 | 184 | 77.6 | 147.8
Depth Conv | 21.2 | 7.07 | 141 | 9.7 | 18.3
Group Conv | 12.5 | 4.17 | 240 | 9.9 | 18.8
Dilated Conv | 16.3 | 5.43 | 184 | 77.6 | 147.8
Ghost-GhostNet | 12 | 4.01 | 250 | 4.9 | 9.2
PConv-FasterNet | 9.2 | 3.05 | 327.5 | 5.1 | 9.5
DSConv | 14.4 | 4.8 | 208 | 0.27 | 0.26
GSConv | 18.4 | 6.15 | 163 | 39.8 | 75.7
DCNv2 | 65.9 | 21.97 | 45.5 | 16.6 | 31.4
DCNv3 | 44.2 | 14.74 | 67.8 | 20.1 | 38.2
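As a rough illustration of how such operator timings can be collected, the following PyTorch sketch times a standard 3×3 convolution against a depthwise-separable pair; the tensor shape, channel count, and run counts are illustrative assumptions, not the exact benchmark settings behind Table 1.

```python
import time
import torch
import torch.nn as nn

# Minimal sketch of a convolution-operator timing test: mean forward latency
# of an operator on a fixed input, after a short warm-up.
def mean_time_ms(op: nn.Module, x: torch.Tensor, runs: int = 300) -> float:
    op.eval()
    with torch.no_grad():
        for _ in range(10):                 # warm-up
            op(x)
        start = time.perf_counter()
        for _ in range(runs):
            op(x)
    return (time.perf_counter() - start) / runs * 1e3

x = torch.randn(1, 128, 80, 80)             # illustrative feature-map shape
standard = nn.Conv2d(128, 128, 3, padding=1)
separable = nn.Sequential(
    nn.Conv2d(128, 128, 3, padding=1, groups=128),  # depthwise 3x3
    nn.Conv2d(128, 128, 1),                         # pointwise 1x1
)
print(f"standard conv       : {mean_time_ms(standard, x):.2f} ms")
print(f"depthwise separable : {mean_time_ms(separable, x):.2f} ms")
```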
Table 2 Performance comparison experiment of different backbone networks

Model | Layers | Params/M | GFLOPs | mAP/% | CPU/ms | GPU/ms
YOLOv5s | 157 | 7.06 | 15.9 | 73.7 | 35.1 | 8.9
MobileNetv3 | 294 | 4.65 (66%) | 7.2 (45%) | 61 | 29 | 5.3
ShuffleNetv2 | 179 | 3.28 (46%) | 6.0 (38%) | 53.8 | 17.2 | 3.8
GhostNet | 385 | 4.09 (58%) | 7.8 (49%) | 65.9 | 35.0 | 6.3
LCNet | 242 | 4.7 (67%) | 8.8 (55%) | 68.6 | 41 | 4.1
MobileOne | 258 | 4.48 (63%) | 11 (69%) | 64.9 | 26.9 | 3.7
FasterNet | 223 | 5.59 (79%) | 11.4 (72%) | 68.4 | 24.6 | 4.9
FBNetV3 | 597 | 9.21 (130%) | 14.3 (90%) | 71.9 | 97 | 8.5
MobileViT | 492 | 4.3 (61%) | 10.4 (65%) | 69.0 | 78.6 | 11.8
EdgeNeXt | 259 | 4.32 (61%) | 8.8 (55%) | 65.3 | 21.1 | 6.5
EfficientViT | 286 | 3.79 (54%) | 7.0 (44%) | 63.4 | 42.3 | 6.7
FastViT | 544 | 6.65 (88%) | 14.7 (92%) | 68.0 | 50.5 | 7.4
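A minimal sketch of how the Params and CPU/GPU latency columns of Table 2 can be measured for a backbone-swapped YOLOv5s; the model loader, input resolution, and run counts are assumptions for illustration, not the exact protocol used for the table.

```python
import time
import torch

# Minimal sketch: parameter count and mean forward latency of a detection model.
# `model` is assumed to be a YOLOv5s variant (with the backbone under test) loaded elsewhere.
def profile(model: torch.nn.Module, device: str = "cpu", runs: int = 100):
    model = model.to(device).eval()
    x = torch.randn(1, 3, 640, 640, device=device)
    params_m = sum(p.numel() for p in model.parameters()) / 1e6
    with torch.no_grad():
        for _ in range(10):                          # warm-up
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        latency_ms = (time.perf_counter() - start) / runs * 1e3
    return params_m, latency_ms

# usage: params, cpu_ms = profile(model, "cpu"); _, gpu_ms = profile(model, "cuda")
```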
Table 3 Experimental results of lightweight object detection networks

Model | Platform | Dataset | Params/M | mAP/% | Latency/ms | GFLOPs
YOLOv1-tiny | Nvidia Titan X | VOC07 | - | 52.7 | 6.45 | -
YOLOv2-tiny | - | VOC07 | - | 57.1 | - | -
YOLOv3-tiny | - | COCO | 6.06 | 16.6 | - | 6.96
YOLOv4-tiny | Kirin 990 | COCO | 8.86 | 21.7 | 55.44 | 5.62
YOLOv5n | Nvidia V100 | COCO | 1.9 | 28 | 6.3 | 4.5
YOLOv6lite-S | sm8350 | COCO | 0.55 | 22.4 | 7.99 | 0.56
YOLOv7-tiny | Nvidia V100 | COCO | 6.2 | 38.7 | 3.5 | 13.8
YOLOv8n | Nvidia A100 | COCO | 3.2 | 37.3 | 0.99 | 8.7
YOLO-Fastest | RK3568 (A55) | COCO | 0.35 | 24.4 (0.5) | 26.6 | 0.252
YOLO-Fastestv2 | RK3568 (A55) | COCO | 0.25 | 24.1 (0.5) | 23.8 | 0.212
FastestDet | RK3568 (A55) | COCO | 0.24 | 13 | 23.51 | -
MobileNetv1-SSD | NanoPi 2 | VOC07 | - | 72.0 | 885 | -
Tiny-SSD | - | VOC07 | 1.13 | 61.3 | - | -
Tiny-DSOD | Nvidia Titan X | VOC07 | 0.95 | 72.1 | 9.5 | -
Micro-YOLO | Nvidia 2080ti | COCO | 1.92 | 29.3 (0.5) | 2.8 | 2.15
CSL-YOLO | Nvidia 1080ti | COCO | 3.2 | 24.5 | - | 1.47
YOLOX-Nano | ARM (4×A76) | COCO | 0.91 | 25.8 | 23.08 | 1.08
Tiny LC-YOLO | Nvidia 3090 | UCAS-AOD | 1.83 | 94.17 | 14.4 | 4.6
Gold-YOLO | Tesla T4 | COCO | 5.6 | 39.9 | 1.7 | 12.1
NanoDet | ARM (4×A76) | COCO | 0.95 | 20.6 | 10.23 | 0.72
NanoDet-Plus | ARM (4×A76) | COCO | 1.17 | 27.0 | 11.97 | 0.9
PP-PicoDet | ARM (4×A77) | COCO | 1.18 | 29.1 | 4.8 | 0.97
DPNet | Nvidia 2080Ti | COCO | 2.5 | 29.6 | 6 | 1.04
FemtoDet | ARM (4×A77) | VOC | 0.0688 | 46.31 | 15.5 | -
-
[1] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014: 580-587.
[2] CHEN Dong, LIU Ning. Model compression technology in deep learning[J]. Artificial Intelligence, 2023(3): 40-51. DOI: 10.16453/j.2096-5036.2023.03.004.
[3] HAN S, Pool J, Tran J, et al. Learning both weights and connections for efficient neural networks[J]. arXiv preprint arXiv: 1506.02626, 2015.
[4] Alizadeh M, Tailor S A, Zintgraf L M, et al. Prospect pruning: Finding trainable weights at initialization using meta-gradients[J]. arXiv preprint arXiv: 2202.08132, 2022.
[5] HE Y, KANG G, DONG X, et al. Soft filter pruning for accelerating deep convolutional neural networks[J]. arXiv preprint arXiv: 1808.06866, 2018.
[6] HE Y, LIU P, WANG Z, et al. Filter pruning via geometric median for deep convolutional neural networks acceleration[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 4340-4349.
[7] FANG G, MA X, SONG M, et al. Depgraph: towards any structural pruning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 16091-16101.
[8] GONG Y, LIU L, YANG M, et al. Compressing deep convolutional networks using vector quantization[J]. arXiv preprint arXiv: 1412.6115, 2014.
[9] Courbariaux M, Hubara I, Soudry D, et al. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1[J]. arXiv preprint arXiv: 1602.02830, 2016.
[10] LI F, LIU B, WANG X, et al. Ternary weight networks[J]. arXiv preprint arXiv: 1605.04711, 2016.
[11] Lebedev V, Ganin Y, Rakhuba M, et al. Speeding-up convolutional neural networks using fine-tuned cp-decomposition[J]. arXiv preprint arXiv: 1412.6553, 2014.
[12] Kim Y D, Park E, Yoo S, et al. Compression of deep convolutional neural networks for fast and low power mobile applications[J]. arXiv preprint arXiv: 1511.06530, 2015.
[13] Novikov A, Podoprikhin D, Osokin A, et al. Tensorizing Neural Networks[J]. arXiv preprint arXiv: 1509.06569, 2015.
[14] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv: 1503.02531, 2015.
[15] CHEN Y, WANG N, ZHANG Z. DarkRank: accelerating deep metric learning via cross sample similarities transfer[J]. arXiv preprint arXiv: 1707.01220, 2017.
[16] Sifre L, Mallat S. Rigid-motion scattering for texture classification[J]. arXiv preprint arXiv: 1403.1687, 2014.
[17] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Advances in Neural Information Processing Systems, 2012, 25: 1097-1105.
[18] YU F, Koltun V. Multi-scale context aggregation by dilated convolutions[J]. arXiv preprint arXiv: 1511.07122, 2015.
[19] DAI J, QI H, XIONG Y, et al. Deformable convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 764-773.
[20] ZHU X, HU H, LIN S, et al. Deformable convnets v2: More deformable, better results[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 9308-9316.
[21] WANG W, DAI J, CHEN Z, et al. Internimage: Exploring large-scale vision foundation models with deformable convolutions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 14408-14419.
[22] Gennari M, Fawcett R, Prisacariu V A. DSConv: Efficient Convolution Operator[J]. arXiv preprint arXiv: 1901.01928, 2019.
[23] LI H, LI J, WEI H, et al. Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles[J]. arXiv preprint arXiv: 2206.02424, 2022.
[24] HE X, ZHAO K, CHU X. AutoML: A survey of the state-of-the-art[J]. Knowledge-Based Systems, 2021, 212: 106622.
[25] Zoph B, Le Q V. Neural architecture search with reinforcement learning[J]. arXiv preprint arXiv: 1611.01578, 2016.
[26] SHAO Yanhua, ZHANG Duo, CHU Hongyu, et al. A review of YOLO object detection based on deep learning[J]. Journal of Electronics & Information Technology, 2022, 44(10): 3697-3708. DOI: 10.11999/JEIT210790.
[27] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size[J]. arXiv preprint arXiv: 1602.07360, 2016.
[28] Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv: 1704.04861, 2017.
[29] Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 4510-4520.
[30] Howard A, Sandler M, CHU G, et al. Searching for mobilenetv3[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 1314-1324.
[31] ZHANG X, ZHOU X, LIN M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 6848-6856.
[32] MA N, ZHANG X, ZHENG H T, et al. Shufflenet v2: Practical guidelines for efficient CNN architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV), 2018: 116-131.
[33] HAN K, WANG Y, TIAN Q, et al. Ghostnet: More features from cheap operations[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 1580-1589.
[34] CUI C, GAO T, WEI S, et al. PP-LCNet: A lightweight CPU convolutional neural network[J]. arXiv preprint arXiv: 2109.15099, 2021.
[35] Vasu P K A, Gabriel J, Zhu J, et al. MobileOne: an improved one millisecond mobile backbone[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 7907-7917.
[36] CHEN J, KAO S, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 12021-12031.
[37] Mehta S, Rastegari M. Mobilevit: light-weight, general-purpose, and mobile-friendly vision transformer[J]. arXiv preprint arXiv: 2110.02178, 2021.
[38] Maaz M, Shaker A, Cholakkal H, et al. Edgenext: efficiently amalgamated CNN-transformer architecture for mobile vision applications[C]//European Conference on Computer Vision, 2022: 3-20.
[39] CAI H, GAN C, HAN S. Efficientvit: Enhanced linear attention for high-resolution low-computation visual recognition[J]. arXiv preprint arXiv: 2205.14756, 2022.
[40] Vasu P K A, Gabriel J, ZHU J, et al. FastViT: A fast hybrid vision transformer using structural reparameterization[J]. arXiv preprint arXiv: 2303.14189, 2023.
[41] WU B, DAI X, ZHANG P, et al. Fbnet: Hardware-aware efficient convnet design via differentiable neural architecture search[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 10734-10742.
[42] WAN A, DAI X, ZHANG P, et al. Fbnetv2: Differentiable neural architecture search for spatial and channel dimensions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 12965-12974.
[43] DAI X, WAN A, ZHANG P, et al. Fbnetv3: Joint architecture-recipe search using predictor pretraining[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 16276-16285.
[44] XIONG Y, LIU H, Gupta S, et al. Mobiledets: Searching for object detection architectures for mobile accelerators[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 3825-3834.
[45] Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 779-788.
[46] Redmon J, Farhadi A. YOLO9000: better, faster, stronger[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 7263-7271.
[47] Redmon J, Farhadi A. Yolov3: An incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.
[48] Bochkovskiy A, WANG C Y, LIAO H Y M. Yolov4: Optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.
[49] LI C, LI L, JIANG H, et al. YOLOv6: A single-stage object detection framework for industrial applications[J]. arXiv preprint arXiv: 2209.02976, 2022.
[50] WANG C Y, Bochkovskiy A, LIAO H Y M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 7464-7475.
[51] MA Xuehao. Yolo-Fastest: super-fast open source ARM real-time object detection algorithm[J/OL]. Wandering Vision, 2020 [2023-10-20]. https://zhuanlan.zhihu.com/p/234506503.
[52] ZHANG Y, BI S, DONG M, et al. The implementation of CNN-based object detector on ARM embedded platforms[C]//2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech). IEEE, 2018: 379-382.
[53] Wong A, Shafiee M J, LI F, et al. Tiny SSD: A tiny single-shot detection deep convolutional neural network for real-time embedded object detection[C]//2018 15th Conference on Computer and Robot Vision (CRV). IEEE, 2018: 95-101.
[54] LI Y, LI J, LIN W, et al. Tiny-DSOD: Lightweight object detection for resource-restricted usages[J]. arXiv preprint arXiv: 1807.11013, 2018.
[55] HU L, LI Y. Micro-YOLO: Exploring Efficient Methods to Compress CNN based Object Detection Model[C]//ICAART (2). 2021: 151-158.
[56] ZHANG Y M, LEE C C, Hsieh J W, et al. CSL-YOLO: A new lightweight object detection system for edge computing[J]. arXiv preprint arXiv: 2107.04829, 2021.
[57] GE Z, LIU S, WANG F, et al. YOLOx: Exceeding YOLO series in 2021[J]. arXiv preprint arXiv: 2107.08430, 2021.
[58] CUI M, GONG G, CHEN G, et al. LC-YOLO: a lightweight model with efficient utilization of limited detail features for small object detection[J]. Applied Sciences, 2023, 13(5): 3174.
[59] WANG C, HE W, NIE Y, et al. Gold-YOLO: efficient object detector via gather-and-distribute mechanism[J]. arXiv preprint arXiv: 2309.11331, 2023.
[60] RangiLyu. Another option besides YOLO: NanoDet, an anchor-free object detection model running at 97 FPS on mobile phones, is now open source[Z/OL]. I Love Computer Vision, 2020 [2023-10-20]. https://zhuanlan.zhihu.com/p/306530300.
[61] RangiLyu. A super simple auxiliary module accelerates training convergence and greatly improves accuracy: NanoDet-Plus, a real-time mobile NanoDet upgrade, is here![Z/OL]. CVer Computer Vision, 2022 [2023-10-20]. https://zhuanlan.zhihu.com/p/449912627.
[62] YU G, CHANG Q, LV W, et al. PP-PicoDet: A better real-time object detector on mobile devices[J]. arXiv preprint arXiv: 2111.00902, 2021.
[63] ZHOU Q, SHI H, XIANG W, et al. DPNet: Dual-Path network for real-time object detection with lightweight attention[J]. arXiv preprint arXiv: 2209.13933, 2022.
[64] TU P, XIE X, LING M, et al. FemtoDet: an object detection baseline for energy versus performance tradeoffs[J]. arXiv preprint arXiv: 2301.06719, 2023.
-