Application of Large Language Models and the RAG in Intelligent Educational Ecosystems
DOI:
https://doi.org/10.52575/2687-0932-2024-51-3-699-709
Keywords:
RAG, LLM, intelligent educational ecosystem, large language models, python, Langchain
Abstract
The article discusses the use of Retrieval-Augmented Generation (RAG) and large language models (LLMs) in intelligent educational ecosystems. The authors show that large language models can improve the representation of educational resources, vacancies, and user preferences in recommender systems. They also consider how RAG supplements a model's knowledge with new data without additional training. An example implementation in an intelligent educational ecosystem uses the Langchain library, the GigaChat large language model, and the Qdrant vector database, which stores descriptions of jobs and educational resources, to generate a user-friendly description of the labor market that matches the user's request.
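The retrieve-then-generate flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the three sample documents, the bag-of-words `embed` function, and the helper names `retrieve` and `build_prompt` are all hypothetical stand-ins. In the system described above, the in-memory list would be a Qdrant collection, the toy embedding would be a learned embedding model, and the assembled prompt would be sent to GigaChat through Langchain.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for the job and course descriptions
# that the article stores in a vector database.
DOCUMENTS = [
    "Vacancy: data engineer. Skills: Python, SQL, Airflow.",
    "Vacancy: frontend developer. Skills: JavaScript, React, CSS.",
    "Course: introduction to SQL and relational databases.",
]


def embed(text):
    """Toy embedding (assumption): a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, context):
    """Place the retrieved passages in the prompt, so the model sees new
    data at inference time without any additional training."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}"
    )


query = "Which vacancies require Python and SQL skills?"
context = retrieve(query, DOCUMENTS, k=2)
prompt = build_prompt(query, context)
print(prompt)  # in a real pipeline this string would be sent to the LLM
```

The key design point is that the model's weights never change: new vacancies or courses only need to be added to the document store, and they become available to the generator through the retrieved context.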
Copyright (c) 2024 Economics. Information Technologies
This work is licensed under a Creative Commons Attribution 4.0 International License.