Using hyperthreading technology to increase the processing speed of ML algorithms

Authors

  • Alexander V. Vorobyev, Kursk State University
  • Daniil I. Raspopin, South-West State University

DOI:

https://doi.org/10.52575/2687-0932-2021-48-4-764-770

Keywords:

machine learning, ensemble algorithms, hyperthreading, single-threaded application

Abstract

The paper analyzes the data processing speed of machine learning algorithms as a function of the available CPU computing resources and the data set size. Tests were conducted on synthesized test suites of increasing dimensionality, from 100 observations and 100 predictors up to 2000 observations and 2000 predictors, using a modern ensemble algorithm. The research shows that, when the CPU is the only computational resource in use, a disproportionately large increase in computing power is required to increase the training speed of an ML algorithm; an example numerical proportion, valid for a specific task, is given. Hyperthreading technology is considered as a tool for increasing CPU performance. The experiments show that running machine learning algorithms from a single-threaded application, such as the Python language environment, does not prevent hyperthreading from taking effect; on the contrary, using this technology can increase the processing speed of ML algorithms.
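As an illustration of the kind of measurement the abstract describes, the sketch below times the training of a gradient-boosting ensemble on synthetic square data sets while varying the number of worker threads. It is a minimal sketch, not the authors' published code: the choice of XGBoost, the data generator, and the size and thread grids are assumptions made for illustration.

    # Minimal benchmark sketch (assumed setup, not the authors' code):
    # time ensemble training on synthetic n x n data sets for several
    # thread counts, e.g. physical cores vs. logical (hyperthreaded) cores.
    import time

    from sklearn.datasets import make_regression
    from xgboost import XGBRegressor

    SIZES = [100, 500, 1000, 2000]   # n observations and n predictors, as in the paper
    THREADS = [1, 2, 4, 8]           # illustrative thread counts to compare

    for n in SIZES:
        # Synthetic test suite: n observations, n predictors.
        X, y = make_regression(n_samples=n, n_features=n, random_state=0)
        for n_jobs in THREADS:
            model = XGBRegressor(n_estimators=100, n_jobs=n_jobs, random_state=0)
            start = time.perf_counter()
            model.fit(X, y)
            elapsed = time.perf_counter() - start
            print(f"{n}x{n} data set, {n_jobs} thread(s): {elapsed:.2f} s")

The single-threaded Python interpreter is not an obstacle in such a setup: XGBoost trains in natively multi-threaded C++ code that releases the GIL, so one Python process can still occupy all logical cores, including those provided by hyperthreading.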

Author Biographies

Alexander V. Vorobyev, Kursk State University

Postgraduate student, Department of SISA, Kursk State University, Kursk, Russia

Daniil I. Raspopin, South-West State University

Student, Department of Customs and Global Economy, South-West State University, Kursk, Russia

References

Vorobyev A.V. 2021. A method of machine learning model selection based on predictor stability using the Shapley value. Economics. Information Technologies, 48(2): 350–359. DOI: 10.52575/2687-0932-2021-48-2-350-359. (in Russian)

Chen T., Guestrin C. 2016. XGBoost: A Scalable Tree Boosting System. arXiv:1603.02754. DOI: 10.1145/2939672.2939785.

Gordienko Y. et al. 2015. IMP Science Gateway: from the Portal to the Hub of Virtual Experimental Labs in e-Science and Multiscale Courses in e-Learning. Concurrency and Computation: Practice and Experience, 27(16): 4451–4464.

Ke G., Meng Q., Finley T., Wang T., Chen W., Ma W., Ye Q., Liu T.-Y. 2017. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Advances in Neural Information Processing Systems 30 (NIPS 2017).

Hamotskyi S., Rojbi A., Stirenko S., Gordienko Y. 2017. Automatized generation of alphabets of symbols for multimodal human-computer interfaces. In: Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS 2017), Prague, Czech Republic.

Holm H.H., Brodtkorb A.R., Sætra M.L. 2020. GPU Computing with Python: Performance, Energy Efficiency and Usability. Computation, 8(1). https://doi.org/10.3390/computation8010004.

Kochura Y., Stirenko S., Alienin O., Novotarskiy M., Gordienko Y. 2018. Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-threaded Modes. Advances in Intelligent Systems and Computing II: 243–256. DOI: 10.1007/978-3-319-70581-1_17.

Larsen E., McAllister D. 2001. Fast matrix multiplies using graphics hardware. In: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing (SC'01), Denver, CO, USA, 10–16 November 2001.

VMware, Inc. 2012. Performance Best Practices for VMware vSphere® 5.1. Revision 20120910. Palo Alto, CA. https://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.1.pdf

Ying R., He R., Chen K., Eksombatchai P., Hamilton W.L., Leskovec J. 2018. Graph Convolutional Neural Networks for Web-Scale Recommender Systems. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '18), London, United Kingdom. ACM, New York, NY, USA: 974–983. https://doi.org/10.1145/3219819.3219890.

Munjal S., Singla N., Sinha N. 2014. Hyper-Threading Technology in Microprocessor. International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2(X).

Min S.W., Wu K., Huang S., Hidayetoğlu M., Xiong J., Ebrahimi E., Chen D., Hwu W.-m. 2021. Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture. arXiv:2103.03330.

Hassanein W.M., Rashid L.K., Hammad M.A. 2008. Analyzing the Effects of Hyperthreading on the Performance of Data Management Systems. International Journal of Parallel Programming, 36: 206–225. DOI: 10.1007/s10766-007-0066-x.

Witten I.H., Frank E., Hall M.A., Pal C.J. 2016. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

Jia Z., Gao W., Shi Y., McKee S.A., Ji Z., Zhan J., Wang L., Zhang L. 2019. Understanding Processors Design Decisions for Data Analytics in Homogeneous Data Centers. IEEE Transactions on Big Data, 5(1). DOI: 10.1109/TBDATA.2017.2758792.


Published

2022-03-03

How to Cite

Vorobyev, A. V., & Raspopin, D. I. (2022). Using hyperthreading technology to increase the processing speed of ML algorithms. Economics. Information Technologies, 48(4), 764-770. https://doi.org/10.52575/2687-0932-2021-48-4-764-770

Issue

Section

COMPUTER SIMULATION HISTORY