Automatic Detection of Anger and Aggression in Speech Signals

Authors

  • Tatiana N. Balabanova Belgorod State National Research University
  • Kirill V. Abramov Moscow Technical University of Communications and Informatics
  • Alexey V. Boldyshev Belgorod branch of PJSC Rostelecom
  • Dmitry M. Dolbin Belgorod University of Cooperation, Economics and Law

DOI:

https://doi.org/10.52575/2712-746X-2023-50-4-944-954

Keywords:

speech data, speech databases, classification, classification methods, low-level descriptors, anger recognition, aggression recognition

Abstract

The article addresses the detection of anger and aggression in a speech signal. The fundamental differences between anger and aggression are discussed, and published solutions for recognizing destructive behavior in the form of anger and aggression from the speech signal are reviewed. The main classification methods used for speech emotion recognition are considered. Russian-language and non-Russian-language speech databases used to train emotion recognition models are analyzed, and the main problems of using such databases are formulated. The choice of speech signal parameters used to classify emotions in general, and destructive behavior in particular, is examined. Anger recognition is implemented on the Russian-language Dusha database using two approaches and three classification methods.
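As a rough illustration of the pipeline outlined in the abstract, the sketch below extracts frame-level low-level descriptors (MFCC, pitch, energy), aggregates them into utterance-level statistics, and trains a binary anger/neutral classifier. The file names, label mapping, feature set, and SVM classifier are illustrative assumptions, not the exact configuration reported in the article.

```python
# Minimal sketch: low-level descriptors + utterance-level functionals + classifier.
# Paths, labels, and the classifier choice are placeholders for illustration only.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

def utterance_features(path, sr=16000):
    """Frame-level low-level descriptors (MFCC, pitch, energy) summarized
    by mean and standard deviation over the whole utterance."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # (13, frames)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)        # (frames,)
    rms = librosa.feature.rms(y=y)                        # (1, frames)
    descriptors = [mfcc, f0[np.newaxis, :], rms]
    stats = [np.hstack([d.mean(axis=1), d.std(axis=1)]) for d in descriptors]
    return np.hstack(stats)

# Hypothetical list of (wav_path, label) pairs, e.g. utterances drawn from the
# Dusha corpus with labels mapped to {0: neutral, 1: anger}.
dataset = [("utt_0001.wav", 1), ("utt_0002.wav", 0)]  # placeholder entries

X = np.vstack([utterance_features(p) for p, _ in dataset])
y = np.array([label for _, label in dataset])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                           stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", class_weight="balanced"))
clf.fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
```

The same utterance-level feature matrix can be fed to other classifiers (e.g. a multilayer perceptron) by swapping the final pipeline stage, which is how different classification methods are typically compared on a fixed feature set.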


Author Biographies

Tatiana N. Balabanova, Belgorod State National Research University

Candidate of Technical Sciences, Associate Professor of the Department of Information and Telecommunication Systems and Technologies, Belgorod State National Research University,
Belgorod, Russia

Kirill V. Abramov, Moscow Technical University of Communications and Informatics

4th year student of the Faculty of Information Technologies of the Moscow Technical University of Communications and Informatics,
Moscow, Russia

Alexey V. Boldyshev, Belgorod branch of PJSC Rostelecom

Candidate of Technical Sciences, Leading Telecommunications Engineer of the Belgorod branch of PJSC Rostelecom,
Belgorod, Russia

Dmitry M. Dolbin, Belgorod University of Cooperation, Economics and Law

2nd year Master's student of the Faculty of Customs Affairs and Information Technologies, Belgorod University of Cooperation, Economics and Law,
Belgorod, Russia


Published

2023-12-29

How to Cite

Balabanova, T. N., Abramov, K. V., Boldyshev, A. V., & Dolbin, D. M. (2023). Automatic Detection of Anger and Aggression in Speech Signals. Economics. Information Technologies, 50(4), 944-954. https://doi.org/10.52575/2712-746X-2023-50-4-944-954

Section

INFOCOMMUNICATION TECHNOLOGIES
