Hierarchical Cluster Analysis in R for Production and Economic Indicators of the Penitentiary System

Authors

  • Dmitry S. Ponomarev, Branch (Izhevsk), Federal State Institution Research Institute of the Federal Penitentiary Service

DOI:

https://doi.org/10.52575/2687-0932-2023-50-3-655-668

Keywords:

hierarchical cluster analysis, production and economic indicators, penitentiary system, R, systems analysis, exploratory data analysis, machine learning

Abstract

According to official information from the penitentiary system of the Russian Federation, its production sector comprises 652 institutions. In 2021, the volume of goods and services produced amounted to 36.8 billion rubles. More than 131 thousand convicts are employed in the institutions of the penitentiary system. The divisions of the Federal Penitentiary Service of Russia are actively working to secure orders for the manufacture of products. Research on this manufacturing sector using modern scientific methods is therefore relevant not only to the penitentiary system but also to the Russian Federation as a whole. One such method is hierarchical cluster analysis, a machine learning technique. Its advantages are that production and economic indicators of interest can be examined independently of territory and that the market can be segmented by constructing hierarchies. The purpose of this study is to apply machine learning (hierarchical clustering) to segment the production and economic indicators of the penitentiary system. The main tool for implementing hierarchical clustering is R, a programming language for statistical computing; data processing was carried out in the RStudio environment. The novelty of the work lies in studying the production and economic indicators of the penitentiary system independently of territory and in applying machine learning methods to segment the values of the volume of goods produced into groups. The main scientific results are an algorithm developed for carrying out hierarchical clustering for the penitentiary system and a set of rules and norms for parameter selection, data processing, and the choice of hyperparameters for hierarchical clustering. In addition, new dependencies were identified that allow production and economic indicators to be considered more globally.
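
The kind of workflow summarized in the abstract can be illustrated with a minimal sketch in base R. This is not the author's algorithm. The data are synthetic, and the variable names, the standardization step, the Euclidean distance, the Ward linkage, and the choice of three groups are all assumptions made for the example.

```r
# Minimal sketch of hierarchical clustering in base R (stats package).
# ASSUMPTION: synthetic production volumes for 12 hypothetical institutions;
# these are not figures from the study.
set.seed(42)
volume <- c(rnorm(4, mean = 20, sd = 2),
            rnorm(4, mean = 60, sd = 3),
            rnorm(4, mean = 120, sd = 5))
names(volume) <- paste0("inst_", seq_along(volume))

# Standardize the indicator so that its scale does not dominate the distance.
scaled <- scale(volume)

# Pairwise Euclidean distances between institutions (assumed metric).
d <- dist(scaled, method = "euclidean")

# Agglomerative clustering with Ward linkage (assumed hyperparameter).
fit <- hclust(d, method = "ward.D2")

# Cut the dendrogram into k = 3 groups (assumed number of segments).
groups <- cutree(fit, k = 3)
print(table(groups))

# plot(fit) would draw the dendrogram for visual inspection.
```

In practice, the distance metric, the linkage method, and the number of groups are the hyperparameters that would need to be chosen and justified for real data.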

Author Biography

Dmitry S. Ponomarev, Branch (Izhevsk) Federal State Institution Research Institute of the Federal Penitentiary Service

Candidate of Technical Sciences, Leading Researcher at the Branch (Izhevsk) of the Federal State Institution Research Institute of the Federal Penitentiary Service of the Russian Federation; Associate Professor at Kalashnikov Izhevsk State Technical University, Izhevsk, Russian Federation

Published

2023-09-30

How to Cite

Ponomarev, D. S. (2023). Hierarchical Cluster Analysis in R for Production and Economic Indicators of the Penitentiary System. Economics. Information Technologies, 50(3), 655-668. https://doi.org/10.52575/2687-0932-2023-50-3-655-668

Section

SYSTEM ANALYSIS AND PROCESSING OF KNOWLEDGE