Algorithm for Constructing and Analyzing Spectrograms of Audio Signals

Authors

  • Aleksei V. Boldyshev, PJSC Rostelecom Belgorod branch
  • Aleksandra A. Medvedeva, Belgorod State National Research University
  • Ekaterina I. Prokhorenko, Belgorod State National Research University
  • Diana I. Gaivoronskaya, Belgorod State National Research University

DOI:

https://doi.org/10.52575/2712-746X-2024-51-1-250-260

Keywords:

spectrogram, energy share, sound processing, speech signals, subband representations, subband matrix

Abstract

The paper addresses one area of sound signal research: sound analysis using spectrograms, which visualize dynamic changes in the intensity of a signal's frequency components. Because sound, and speech messages in particular, remains the most natural form of information exchange, this area is in demand across technologies that process audio data. Recording studios use spectrograms to remove noise from musical works recorded on old analog media. In speech recognition, spectrograms are a promising source of data for analyzing the formant composition of speech sounds with neural networks designed for image analysis. Obtaining a high-clarity, high-contrast image that allows stable identification of formants, in both music and speech, is therefore a pressing task. Known algorithms for constructing spectrograms rely on the discrete Fourier transform, chiefly because the fast Fourier transform (FFT) significantly reduces its computational cost. The paper points out shortcomings of the FFT-based approach that may arise when studying the properties of speech signals and presents a new method for constructing spectrograms based on subband representations, which uses subband matrices. The work demonstrates the effectiveness of the proposed approach: compared to known methods, it displays more clearly the areas where the energy of the analyzed sound signal is concentrated.
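
As an illustration of the idea of an energy-share spectrogram built from subband matrices, the following minimal sketch may help. It is not the authors' algorithm, whose details the abstract does not give. It assumes the standard sinc-kernel form of a subband matrix, for which the quadratic form x^T A x equals the energy of a segment x within a band of normalized angular frequencies [v1, v2]. The function names, the equal-width bands, and the segment length, hop, and band count are all hypothetical choices made for this example.

```python
import numpy as np


def subband_matrix(n, v1, v2):
    """Assumed sinc-kernel subband matrix for a length-n segment.

    For a real segment x, x @ A @ x is the part of the segment's energy
    that lies in the band [v1, v2], with 0 <= v1 < v2 <= pi (rad/sample).
    """
    m = np.arange(n)[:, None] - np.arange(n)[None, :]  # index differences i - k
    safe_m = np.where(m == 0, 1, m)                     # avoid division by zero
    off_diag = (np.sin(v2 * m) - np.sin(v1 * m)) / (np.pi * safe_m)
    return np.where(m == 0, (v2 - v1) / np.pi, off_diag)


def energy_share_spectrogram(signal, seg_len=64, hop=32, n_bands=16):
    """Rows are subbands, columns are segments, values are energy shares."""
    edges = np.linspace(0.0, np.pi, n_bands + 1)        # equal-width bands
    mats = np.stack([subband_matrix(seg_len, lo, hi)
                     for lo, hi in zip(edges[:-1], edges[1:])])
    starts = range(0, len(signal) - seg_len + 1, hop)
    shares = np.zeros((n_bands, len(starts)))
    for col, s in enumerate(starts):
        x = np.asarray(signal[s:s + seg_len], dtype=float)
        total = x @ x                                   # full segment energy
        if total > 0:
            # einsum evaluates x^T A_r x for every band r at once
            shares[:, col] = np.einsum("i,rij,j->r", x, mats, x) / total
    return shares


# Usage: a 1100 Hz tone sampled at 8 kHz (illustrative values only)
fs = 8000
t = np.arange(1024) / fs
tone = np.sin(2 * np.pi * 1100 * t)
shares = energy_share_spectrogram(tone)
print(shares.shape)
print(shares.argmax(axis=0))
```

Because the bands here tile the whole interval [0, π], the shares in each column sum to one, so every column can be read as the distribution of the segment's energy over frequency. Plotting `shares` as an image (for example with a heat map) gives a spectrogram-like picture in which the energy-concentration regions are the brightest cells.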

Author Biographies

Aleksei V. Boldyshev, PJSC Rostelecom Belgorod branch

Candidate of Technical Sciences, Leading Engineer of the Operation Center of the Belgorod Branch of PJSC Rostelecom,
Belgorod, Russia.

Aleksandra A. Medvedeva, Belgorod State National Research University

Candidate of Technical Sciences, Associate Professor of the Department of Information and Telecommunication Systems and Technologies, Institute of Engineering and Digital Technologies of Belgorod State National Research University,
Belgorod, Russia.

Ekaterina I. Prokhorenko, Belgorod State National Research University

Candidate of Technical Sciences, Associate Professor, Associate Professor of the Department of Information and Telecommunication Systems and Technologies, Institute of Engineering and Digital Technologies of Belgorod State National Research University,
Belgorod, Russia.

Diana I. Gaivoronskaya, Belgorod State National Research University

Candidate of Technical Sciences, Associate Professor of the Department of Information and Telecommunication Systems and Technologies, Institute of Engineering and Digital Technologies of Belgorod State National Research University,
Belgorod, Russia.

Published

2024-03-30

How to Cite

Boldyshev, A. V., Medvedeva, A. A., Prokhorenko, E. I., & Gaivoronskaya, D. I. (2024). Algorithm for Constructing and Analyzing Spectrograms of Audio Signals. Economics. Information Technologies, 51(1), 250-260. https://doi.org/10.52575/2712-746X-2024-51-1-250-260

Section

INFOCOMMUNICATION TECHNOLOGIES
