Energy-Aware Multi-Exit Vision Transformer with Early Exits for Efficient Inference on Energy-Harvesting Edge Devices

Article Type: Research Paper

Authors

1 Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran

2 Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran

3 Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran

Abstract

Despite their high accuracy, deep neural networks face serious challenges on energy-constrained edge devices because of their complex structure and heavy computational requirements. To address this challenge, the present study proposes a novel energy-aware multi-exit Vision Transformer architecture for use on energy-harvesting edge devices. By employing an early-exit mechanism, the architecture allows inference to terminate at intermediate layers of the network according to the currently available energy. To train the intermediate and final outputs jointly, a weighted loss function is designed that assigns exponential coefficients to optimize the accuracy of the intermediate exits. Experimental results show that the method substantially reduces the computational load without a noticeable drop in final accuracy; in particular, exiting at the fourth layer yields a 28.5% reduction in computational operations with negligible accuracy loss. These findings confirm the effectiveness of the proposed approach in balancing accuracy, speed, and energy consumption in dynamic, resource-constrained inference scenarios.
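The abstract does not give the exact form of the weighted loss; a minimal formulation consistent with the description, assuming N exits with per-exit cross-entropy losses and an exponential base γ > 1 so that earlier exits receive proportionally larger weights, would be

\mathcal{L}_{\text{total}} = \sum_{i=1}^{N} w_i \, \mathcal{L}_{\mathrm{CE}}\big(\hat{y}_i, y\big), \qquad w_i = \frac{\gamma^{\,N-i}}{\sum_{j=1}^{N} \gamma^{\,N-j}},

where \hat{y}_i is the prediction of exit i and y is the ground-truth label; the base γ, the normalization, and the use of cross-entropy are illustrative assumptions, not details taken from the paper.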

Article Title [English]

Energy-Aware Multi-Exit Vision Transformer with Early Exits for Efficient Inference on Energy-Harvesting Edge Devices

Authors [English]

  • Nastaran Eslami 1
  • Morteza Mohajjel 2
  • Abdolreza Rasouli Kenari 1
  • Saeed Abbasi 3
1 Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran
2 Faculty of Electrical and Computer Engineering, Qom University of Technology, Qom, Iran
3 Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran
Abstract [English]

Although deep neural networks (DNNs) achieve remarkable accuracy, their deployment on energy-constrained edge devices remains a significant challenge due to their intricate architectures and high computational complexity. To mitigate these limitations, this study introduces a novel energy-aware multi-exit Vision Transformer (ViT) architecture tailored for energy-harvesting edge environments. The proposed model incorporates an early-exit mechanism, enabling dynamic termination of inference at intermediate layers depending on the current energy availability. A weighted loss function is developed to facilitate joint optimization of both intermediate and final outputs, wherein exponential weighting is employed to enhance the performance of earlier exits. Experimental evaluations reveal that the proposed method substantially reduces computational overhead while maintaining competitive final accuracy. Specifically, utilizing the fourth exit results in a 28.5% reduction in FLOPs with minimal degradation in accuracy. These results underscore the potential of the proposed architecture to achieve an effective trade-off among accuracy, latency, and energy efficiency in dynamic inference scenarios on resource-limited edge platforms.
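As a concrete sketch of the energy-aware exit selection described above, the following Python snippet chooses the deepest exit whose cumulative inference cost fits within the currently harvested energy budget. The cost table, the energy-per-FLOP constant, and the function name select_exit are hypothetical placeholders, not figures from the paper.

# Cumulative FLOPs required to reach each exit (illustrative values only).
EXIT_FLOPS = [1.2e9, 2.1e9, 3.0e9, 3.9e9, 5.4e9]

# Assumed energy cost per FLOP on the target device (hypothetical).
JOULES_PER_FLOP = 1.5e-9

def select_exit(available_joules: float) -> int:
    """Return the index of the deepest exit whose cumulative
    cost fits within the currently available harvested energy."""
    budget_flops = available_joules / JOULES_PER_FLOP
    best = 0  # fall back to the earliest exit when energy is scarce
    for i, cost in enumerate(EXIT_FLOPS):
        if cost <= budget_flops:
            best = i
    return best

# Example: with a 6 J budget this policy selects exit index 3 (the fourth exit).
print(select_exit(6.0))

A runtime policy of this kind is what lets the model trade accuracy for energy on the fly: as the harvested budget shrinks, inference simply stops at an earlier exit instead of failing outright.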

Keywords [English]

  • Dynamic inference
  • Vision Transformer
  • Multi-exit neural network
  • Energy optimization
  • Energy harvesting