About Detecting Incorrect Use of Terminology in Russian Texts Based on Domain Ontology
semantic analysis of text, domain ontology, automatic text processing
Abstract
Semantic analysis of text is an actively explored direction in computational linguistics. Its progress is visible in intelligent systems that process natural-language text. One type of such system is aimed at finding and correcting errors in text. However, these systems cannot handle the specialized lexicon of scientific texts. The task of detecting lexico-semantic errors related to the incorrect use of terminology therefore cannot yet be delegated to an intelligent system, because the corresponding theoretical and software solutions are lacking. Consequently, this research aims to formalize the task of detecting incorrect usage of terminology in Russian text within a specific subject area. The problem statement follows from the practical application and the subject area. The proposed mathematical model of the task is described using the conceptual apparatus of set theory. The model is then mapped to a mathematical model of the ontology and its logical description, which allows a subject-area ontology for detecting incorrect terminology usage in text to be developed further in the OWL 2 ontology language. As part of the formalization, which also relies on the designed ontology, a mechanism is suggested for comparing the context of a term in the processed text with its context in the ontology. Type I and Type II errors were assessed, and an algorithm was formed for deciding whether a connection exists between the analyzed term and the subject area of the ontology. Thus, the result of the research is a formalized approach to detecting the incorrect use of terminology in Russian scientific texts based on the ontology of the subject area.
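The context-comparison idea can be illustrated with a minimal sketch in set-theoretic terms. Everything in it is an assumption made for illustration and is not taken from the paper: the toy ontology, the names `TOY_ONTOLOGY`, `jaccard` and `check_term`, the use of Jaccard overlap as the comparison measure, and the threshold value are all hypothetical. A real system would read term contexts from an OWL 2 ontology and from lemmatized Russian text.

```python
from typing import AbstractSet, Dict, FrozenSet

# Hypothetical toy ontology (an assumption, not the paper's ontology):
# each term maps to the set of concepts it is related to in the subject area.
TOY_ONTOLOGY: Dict[str, FrozenSet[str]] = {
    "онтология": frozenset({"класс", "свойство", "аксиома", "понятие"}),
    "токен": frozenset({"лексема", "текст", "сегментация"}),
}


def jaccard(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def check_term(
    term: str,
    text_context: AbstractSet[str],
    ontology: Dict[str, FrozenSet[str]],
    threshold: float = 0.2,  # assumed value, chosen only for the example
) -> str:
    """Compare a term's context in the text with its context in the ontology.

    Returns "unknown" if the term is absent from the ontology,
    "consistent" if the overlap reaches the threshold, else "suspect".
    """
    if term not in ontology:
        return "unknown"
    score = jaccard(text_context, ontology[term])
    return "consistent" if score >= threshold else "suspect"


if __name__ == "__main__":
    # Context lemmas found near the term in a processed sentence (made up).
    print(check_term("онтология", {"класс", "аксиома", "язык"}, TOY_ONTOLOGY))
    print(check_term("онтология", {"погода", "дождь"}, TOY_ONTOLOGY))
```

In a sketch of this kind, the threshold governs the trade-off between false alarms and missed errors: raising it flags more terms as suspect, and lowering it lets more questionable usages pass.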
