The Method of Extracting Typical Requirements from Job Descriptions by Grapho-Semantic Analysis

Authors

  • Ilia V. Loginov, Academy of the Federal Guard Service of the Russian Federation
  • Aleksey A. Sherbin, RANEPA branch in Orel

DOI:

https://doi.org/10.52575/2687-0932-2025-52-4-928-945

Keywords:

semantic analysis, graphs, vacancies, generalization, natural language processing, linguistic expressions, fuzzy sets

Abstract

The paper addresses the problem of extracting and generalizing competency requirements for specialists from sets of textual job descriptions. The aim of the study is to make the extracted and generalized data more valid by taking into account additional information about the level of each requirement. To this end, grapho-semantic analysis of the text is combined with a fuzzy linguistic model of the "desirability" level. On this basis, the authors propose a formal model of job descriptions and a methodology for extracting generalized professional skill requirements from textual job descriptions. Applying the approach to a corpus of vacancies for business analysts, developers and DevOps engineers with the purpose-built VectorCognitive software tool showed that a fuzzy linguistic model and grapho-semantic analysis can be combined to form generalized requirements for specialists.
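
The abstract does not give the formal model, so the following Python sketch is only an illustration of the general idea of pairing a skill graph with a fuzzy "desirability" level. It is not the authors' method and not the VectorCognitive tool. The marker vocabulary, the membership values in `DESIRABILITY`, the mean-based aggregation, the `min_score` threshold and the function name `generalize` are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical mapping from linguistic markers to a desirability degree in [0, 1].
# Both the markers and the values are illustrative assumptions.
DESIRABILITY = {"required": 1.0, "desirable": 0.6, "a plus": 0.3}


def generalize(vacancies, min_score=0.5):
    """Aggregate skill desirability over vacancies and count skill co-occurrence.

    vacancies: list of dicts mapping a skill name to a linguistic marker.
    Returns (requirements, graph). requirements maps a skill to its mean
    desirability over all vacancies (a vacancy that omits the skill counts
    as 0) and keeps only skills scoring at least min_score. graph maps a
    sorted pair of skills to the number of vacancies mentioning both.
    """
    if not vacancies:
        return {}, {}
    totals = defaultdict(float)
    graph = defaultdict(int)
    for vacancy in vacancies:
        for skill, marker in vacancy.items():
            totals[skill] += DESIRABILITY[marker]
        # Edge weight: how many vacancies mention both skills together.
        for a, b in combinations(sorted(vacancy), 2):
            graph[(a, b)] += 1
    n = len(vacancies)
    requirements = {s: t / n for s, t in totals.items() if t / n >= min_score}
    return requirements, dict(graph)


# Toy corpus of three already-parsed vacancies (made-up data).
vacancies = [
    {"SQL": "required", "BPMN": "desirable"},
    {"SQL": "required", "UML": "a plus"},
    {"SQL": "desirable", "BPMN": "required"},
]
requirements, graph = generalize(vacancies)
print(requirements)
print(graph)
```

In this toy corpus the skill that is rarely mentioned and only marked "a plus" falls below the threshold, while the frequently required skills are kept. A real pipeline would also need the text-parsing step that turns raw vacancy text into skill and marker pairs, which is omitted here.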

Author Biographies

Ilia V. Loginov, Academy of the Federal Guard Service of the Russian Federation

Doctor of Technical Sciences, employee, Orel, Russia
E-mail: loginov_iv@bk.ru

Aleksey A. Sherbin, RANEPA branch in Orel

Student of Mid-Russia Institute of Management, Orel, Russia  
E-mail: alex.sherbin1@mail.ru

Published

2025-12-30

How to Cite

Loginov, I. V., & Sherbin, A. A. (2025). The Method of Extracting Typical Requirements from Job Descriptions by Grapho-Semantic Analysis. Economics. Information Technologies, 52(4), 928-945. https://doi.org/10.52575/2687-0932-2025-52-4-928-945

Section

SYSTEM ANALYSIS AND PROCESSING OF KNOWLEDGE