12.2014 - volume 18 issue 4 (PDF)
This article will discuss solving the problem of recognizing illegal content in ads on the Avito website [1], published on the popular service for solving machine learning problems Kaggle [2]. We considered an algorithm based on transforming text fields using natural language processing methods [3] and finding rules from words and phrases that identify an ad as spam. The paper examined the main options for processing text information, and also proved on the basis of experiments that complex processing, such as building bigrams and stemming, does not lead to increased accuracy.
Keywords: big data, text analysis, natural language processing, data mining, association rules.
To solve a number of problems of medical diagnostics, possibilistic methods for modeling medical objects and the process of making a diagnosis, based on the modern theory of possibilities [1], have been developed and studied. The article is devoted to the application of probabilistic and possibilistic algorithms and programs for learning and recognition in the conditions of fuzzy description of medical objects and variability of their probabilistic characteristics over time. The software package for diagnosing diseases allows for the training and diagnosis procedure to be carried out in a dialog mode with the participation of a doctor and is used in this work to solve the problem of diagnosing acute appendicitis.
Keywords: pattern recognition, software package, identification problem, probabilistic diagnostic model, possibilistic diagnostic model, granulation, Kohr's classification algorithm, acute appendicitis.
This work considers the possibility of using a multi-agent approach to calculate the energy balance of the city of Kaliningrad and its neighboring countries. The stages of constructing a multi-agent model in the Anylogic simulation environment are shown, and the results of modeling are presented. The advantages and limitations of this approach and the possibilities of its application to solve other problems are discussed.
Keywords: energy, agent, model, smart grid.
For the first time, a fundamental methodology of intelligent systems of full intelligence with intuition for a wide practical purpose, capable of dynamically and effectively self-learning, self-organizing and making decisions quickly, is presented. Technical solutions for the practical implementation of an intelligent system with full intelligence and intuition are presented.
Keywords: intelligent system, full intelligence, intuition, learning, recognition of information patterns.
A system of control of the operator's condition of a complex technical system is proposed, oriented on the analysis of his speech response. To enable work in strong noise, additional channels of information input are provided, as well as a special scenario of the dialogue of the operator with the computer.
Keywords: speech recognition, strong noise, self-comparison.
A variant of the theory of possibilities is considered, which allows describing the agreement of a research group on their common interpretation of some values of possibility and necessity. This makes it possible to clarify the criterion of optimality of the decision rule in the identification problem. For the case of modeling a stochastic object, an algorithm for restoring the possibility taking values in a scale with fixed points, and these points themselves, is described.
Keywords: possibility, necessity, agreement, optimal decision rule.
This paper introduces iterative propositional calculi, which are finite sets of propositional formulas together with the modus ponens operation and the superposition operation defined by the set of Mal'tsev operations. For such calculi, the question of decidability of the expressibility problem is studied. In particular, it will be shown that there are undecidable iterative calculi. An approach to describing decidable iterative calculi will be proposed, based on defining such calculi by clones of k-valued logics. In addition, a lattice of clones of three-valued logic generating iterative calculi will be described, and the continuity of the set of decidable iterative calculi will be proved.
Keywords: iterative propositional calculus, interpretation, finite model, clones of three-valued logic.
The information monitoring technology was developed to analyze complex, weakly formalized problems (processes) based on all available information, to build forecasts of their development and to develop recommendations for managing their development. In this paper, the information monitoring technology is formalized using the classical apparatus of discrete mathematics — functional element schemes and functions of k-valued logic. This formalization solves two key problems of the information monitoring technology — checking the stability of the model and the problem of optimal resource allocation.
Keywords: information monitoring, optimal resource allocation, stability of discrete systems.
The article discusses the problems of generating specifications, characteristics and diagrams of ventilation units in dxf format, as well as generating assembled units in 3D format. This functionality should be implemented using web technologies, since it will be embedded in a web application for calculating ventilation equipment. In particular, the ability to rotate the 3D model of the unit in the browser should be implemented.
Keywords: drawing generation, AutoCAD, openjscad, 3D, dxf, ventilation units.
The article considers the task of developing an industrial ventilation equipment configurator in the form of a web application. We implement a web interface that allows you to assemble any supply, exhaust and supply and exhaust unit in the browser by simply dragging sections with a mouse or finger. This solution is ready for embedding in a web application for calculating ventilation units.
Keywords: JavaScript, HTML, CSS, configurator, industrial ventilation.
A method for solving the problem of "exponential explosion" of the number of states of a finite automaton recognizing a set of regular languages defined by the union of regular expressions of the form . ∗R1.∗R2.∗, where R1 and R2 are arbitrary regular expressions is considered. An extension of this method is proposed for the case of a union of an arbitrary number of regular expressions of this type. Estimates are given for the number of states of the automaton with such a change in the case of an alphabet consisting of at least three symbols. It is shown that the relative decrease in the number of states can be arbitrary. The practical efficiency of the proposed method is analyzed as applied to regular expressions of the Snort system.
Keywords: finite automata, regular expressions, intrusion detection systems.
This paper studies the functioning of the lungs of a smoker in a clean environment. The dependence of the deterioration of the efficiency of cilia on the time during which nicotine was in the lungs was revealed. A time estimate of the complexity of the process of self-cleaning of the lungs from nicotine was also obtained.
Keywords: lungs, self-cleaning process, clean environment, nicotine.
In this paper, we will consider propositional calculuses whose formulas are formed by logical connectives containing classical implication, and whose inference rules are the modus ponens and substitution operations. It is known that in the general case, the problem of recognizing the expressibility of some calculuses through others is algorithmically unsolvable. In this paper, we will consider special cases of this problem: recognition of axiomatization, recognition of extension, and recognition of completeness. In particular, it will be shown that the problem of recognizing an extension is algorithmically undecidable for any calculus, and the problems of recognizing axiomatization and completeness are algorithmically undecidable for any calculus from which the formula x→(y→x) is derivable.
Keywords: classical and intuitionistic propositional calculus, implicative calculus, decidability, recognizing axiomatization, extension, and completeness, tag system.
The concept of stellar height was first introduced by Eggan in 1963 along with the concept of cyclic complexity of a finite automaton. The cyclic complexity of an automaton is related to the stellar height of the language it accepts: the stellar height of a regular language is equal to the minimum cyclic complexity among automata that accept this language. However, the problem of finding a minimal (with respect to cyclic complexity) automaton still remains open. This paper considers the relationship between the cyclic complexity of a minimal automaton of a regular language and its stellar height.
Keywords: finite automata, regular languages, stellar height, cyclic complexity, minimal automaton.
The problem of finding the chromatic number of graphs is one of the most attractive and challenging problems in graph theory. It is known that for biplanar graphs (graphs realized without intersections of edges on two sides of the plane), the chromatic number is not less than 9 and not more than 12. In this paper, we show that the chromatic number of biplanar graphs without triangles is not less than 5 and not more than 8.
Keywords: biplanar graph, chromatic number, triangle-free graph, graph of thickness 2.
The paper continues the study of issues of algorithmic recognition of the property of reversibility for cellular automata. A class of two-dimensional binary cellular automata with a fixed local transition function with 91 variables is constructed, in which the problem of reversibility property recognition is algorithmically undecidable.
Keywords: cellular automata, reversible cellular automata, Turing machines, halting problem.
The article presents a new proof of Gödel's theorem on the incompleteness of formal logical systems, based on concepts of functional programming.
Keywords: mathematical logic, incompleteness of formal systems, Gödel's theorem, functional programming.
09.2014 volume 18 issue 3 (PDF)
By analyzing the basic definitions of intelligence from various sources, the question is raised about the admissibility of applying the term intelligence to constructions and materials.
Keywords: intelligence, cognition, reason (reason), creativity, knowledge, abstraction, cybernetics, adaptation.
Based on the analysis of the history of scientific discoveries, as well as data from cognitive psychology and the theory of artificial intelligence, it was established that the main factors that exclude the algorithmic nature of creative activity in science are the probabilistic nature of inductive inference, the trial and error method (the method of sequential enumeration), the factor of chance in scientific discovery, and Gödel's incompleteness theorem. This allows us to solve S. Smale's 18th problem: what are the limits of intelligence, both artificial and human?
Keywords: creative activity, artificial intelligence, inductive inference, the factor of chance in scientific discovery, Gödel's incompleteness theorem.
The paper considers the problem of restoring a three-dimensional image from the codes of its flat projections. An algorithm is described that works significantly faster than the one previously proposed in [3]. The results of computer experiments are presented to compare the speed of these algorithms: the old and the new.
Keywords: pattern recognition, affine transformations, geometric recognition, stereo vision.
The article considers computer modeling of probabilistic and possibility models of measuring and computing converters (MCC) based on two versions of the possibility theory and the dependence of the quality of the MCC as a measuring instrument on the quality of the measuring converter. A comparison of the quality of the MCC for the probabilistic and possibility models of measurement is carried out, and the problem of measurement reduction for the measurement model described by the second version of the possibility theory is solved.
Keywords: measuring and computing systems, reduction, sensor quality, second version of the possibility theory.
The problem of forecasting the formation of ice jams on northern rivers is considered. In the case when the locations of jams are known, it is necessary to forecast the power of the phenomenon. The combinatorial-logical approach of pattern recognition theory was used for the solution.
Keywords: pattern recognition, tests.
In the traditional formulation, the problem of semantic image segmentation uses a training sample of images labeled pixel by pixel. Obtaining such labeling requires significant human effort. A method for learning semantic segmentation is proposed that allows using less detailed information, the acquisition of which in practice requires less effort, for example, dense frames around objects in the image or a set of unique image labels.
Keywords: machine learning, structural support vector machine, loss function, semantic image segmentation.
The paper studies algebras of closed terms whose operations are defined by first-order formulas with a single equality predicate. For such algebras, it is shown that in the general case, the problem of derivability of terms is algorithmically undecidable. When considering special cases of the derivability problem, a Galois correspondence is constructed between operations and finite automata over terms. On its basis, sufficient conditions for the algorithmic decidability of special cases of the problem of derivability of terms and the problem of expressibility of operations on terms are proved.
Keywords: algebras of terms, derivability problem, expressibility problem, automata over terms, Galois correspondence.
The paper studies search algorithms used in the background mode and proposes a mathematical model of these algorithms based on the concept of an information graph. The paper also proposes a background algorithm for solving a two-dimensional dominance problem with linear memory costs and a constant average search time. For comparison, we note that a non-background algorithm, whose average search time is equal to a constant plus the average time to enumerate the answer, requires quadratic memory costs.x
Keywords: information graph model, dominance problem, background search.
The paper presents the derivation of an estimate of the number of ways to partition the vertices of a unit cube by a polynomially defined surface.
Keywords: Boolean function, polynomial separability.
The task of sentiment analysis plays an important role in natural language processing. The problem of classifying Russian-language text into two classes depending on its emotional coloring is considered: positive and negative. The naive Bayesian classifier is used as a classifier. Various methods for feature selection are used, the obtained results are compared with the results of classification of English-language text. The accuracy of 78.4% is achieved on the given test data set.
Keywords: sentiment analysis, naive Bayesian classifier, feature selection.
The paper considers the problem of constructing a signal routing tree with given values of signal delays to the tree leaves. An algorithm is proposed that, for a given multiset of natural numbers, constructs a tree in linear time with delays to the tree leaves that coincide with the elements of the given multiset, or says that such a tree cannot be constructed.
Keywords: synthesis of large integrated circuits, signal routing, buffer tree.
The completeness problem for the class of linear-automaton functions over the field GF(2) with composition operations was previously solved. In the present paper, this result is generalized to the class of linear-automaton functions over GF(p) for a prime p. All precomplete subclasses are found in the class under consideration.
Keywords: finite automaton, linear automaton, linear-automaton function, composition operations, completeness problem, completeness criterion, precomplete classes, adder, delay.
The general problem of constructing various types of circuit control systems and some functionals of their complexity are considered. Particular attention is paid to the complexity of circuit modeling on a computer. A general method for modeling circuits that implement Boolean functions is proposed.
Keywords: circuits from functional elements, contact circuits, complexity of control systems, modeling.
06.2014 - Volume 18 Issue 2 (PDF)
The article describes the MaTIS department, its main areas of activity, tasks and achievements.
Keywords: mathematical theory of intelligent systems.
Methods for modeling incomplete and unreliable knowledge of the model M(x) of an object, depending on an unknown x ∈ X, expressed in the form of subjective judgments of the researcher are considered. The mathematical model of subjective judgments is defined as an uncertain element xe, canonical for the space (X,ρ(X),Pl˜x) with measures of likelihood and confidence, which characterizes the subjective judgments of the researcher about the truth of each value x ∈ X by the values of likelihood Pl˜x of the equality ˜x=x and confidence Bel˜x=x of the inequality ˜x≠x. Examples of studying uncertain models of a measuring and computing converter, ionospheric radio sounding, and Maxwell's pendulum are given.
Keywords: subjective judgments, likelihood, confidence, integral, measure, random uncertain element, intelligent dialogue, measuring and computing systems.
This paper sets the task of analyzing the security of an information network. An automata-theoretic approach is used to model the behavior of an intruder, a partial order relation is introduced on a set of states - vectors, the value of the coordinates of which reflects the level of access rights to the corresponding services on the analyzed nodes. The concept of an attack path is considered and the corresponding quantitative characteristics are studied. The problem of detecting network vulnerabilities is reformulated in terms of finding states reachable from a certain initial state and describing possible paths to them. Based on the constructed mathematical model, algorithms are proposed that solve the problems and their possible modifications. Approaches to reducing the constructed automaton are also considered.
Keywords: computer security, security assessment, finite state machines, attack paths.
The paper shows how a flat environment "generates" images, and how pattern recognition is represented in this case. Some statements about the properties of image codes are proved.
Keywords: pattern recognition, visual images, mathematical modeling in biology, robotic vision.
A system of relations for knowledge formalisms, which constitute a fragment of the family of recognized approaches to knowledge description formats in intelligent systems, is proved.
Keywords: knowledge representation formalism, semantic structure, algebraic structure, semantic embedding.
The paper considers the problem of automatic recognition of texts of laws related to filling out reporting forms. This task is as follows: to create a computer program that would accept texts of laws as input and would output another program that would fill out a form in accordance with these laws. A method for solving this problem is proposed, based on the technology of computer modeling of logical processes.
Keywords: semantic analysis of texts, normative legal acts, computer modeling of logical processes.
A number of papers study the relationship between such important, from the point of view of theoretical cryptography, properties of Boolean functions as nonlinearities of various orders (the distance of a function to polynomials of degree no higher than a fixed one) and algebraic immunity. One of the main areas of research is the issue of obtaining lower bounds for nonlinearity of a fixed order via the value of algebraic immunity of a function. In particular, exact bounds for nonlinearities of the first and second orders were obtained. In this paper, we prove a new bound for nonlinearity of an arbitrary order via the value of algebraic immunity, which is stronger than previous similar results by different authors.
Keywords: Boolean function, algebraic immunity, Boolean function degree, nonlinearity, high-order nonlinearity, Reede-Muller code.
This paper considers the problem of approximating time series by shortest-length broken lines with a given accuracy. A time series is interpreted as a sequence of plane points, and the problem of approximating this sequence by a broken line leads to finding the shortest path between two segments. An algorithm is obtained that solves the problem using the construction of a visibility graph and Dijkstra's algorithm.
Keywords: plane point approximation, visibility graph, plane with polygonal obstacles, shortest path, Dijkstra's algorithm.
The article discusses the implementation of a CRM system for the Android platform. The system allows you to manage clients, contacts, and contacts. The article describes the interface of this system and its implementation, as well as the implementation of data synchronization with the server part.
Keywords: Android, CRM, synchronization.
The problem of synchronizing data and code between servers physically located in different places is considered. A database replication system is implemented, a version control system is configured, as well as a directory content replication system.
Keywords: version control system, database replication, git, unison.
The work is devoted to the presentation of the results of computer modeling of logical processes in the logical system "Iskra". The issues related to the training of computer problem solvers and automatic synthesis of techniques were studied.
Keywords: intelligent system, self-learning, logical system, computer problem solver, automatic synthesis of techniques.
Given a natural number N, a scheme for checking the affine equivalence of two flat N-point images is constructed using a basis of linear functions, a logarithm, an exponential, a signum function, and a frame root extraction function. The nonlinear depth of the schemes does not depend on N. Thus, the processing time of images by the obtained schemes does not depend on the size of the problem.
Keywords: keypoints of images, keypoint dynamics, video sequence, affine transformation of a plane, checking the affine equivalence of images, nonlinear depth scheme.
The article considers the power of flat circuits simulating VLSI. A flat circuit is a laying out of a circuit of functional elements on a two-dimensional integer grid. All power estimates for flat circuits are also true for more complex VLSI models. The considered power measure models the electrical power allocated both to logical elements and to circuit wires. It is assumed that the electrical power consumed by a wire of unit length is commensurate with the power consumed by one logical element. A lower bound (m√|D|)/(√min(m,log|D|) for the average cardinality of almost all partial Boolean operators with domain D and m outputs is obtained. Lower bounds for the cardinality under some constraints on the arrangement of the circuit outputs are also obtained.
Keywords: circuits from functional elements, VLSI model, flat circuits, circuit cardinality, Shannon function, lower bounds, partial Boolean operators.
This paper generalizes the construction of a parametric family of Latin squares over a direct product of Abelian groups, previously proposed in work [2], to the multidimensional case in order to increase the number of parameters. A criterion for the feasibility of this design is given. The issue of the relationship between correctness and cycles in the essential dependence graph for functions over residue rings is also considered.
Keywords: correct family of functions, Latin squares, parametric family of Latin squares, criterion for a multidimensional Latin square, essential dependence graph of a family of functions, relationship between correctness and cycles in the essential dependence graph.
In the author's dissertation "On the Optimization of Structural Implementation of Neural Networks", neural circuits were considered over the field of real numbers. Most of the results related to McCulloch-Pitts circuits are literally transferred to neural circuits over the field of rational numbers, but the concept of a transcendental number was significantly used to prove one of the theorems. Nevertheless, the result of the theorem for neural circuits over the field of rational numbers is preserved. A new proof is given in this paper.
Keywords: neural circuits, Shannon function.
This paper studies quasi-cyclic low-density codes. An upper bound for the minimum distance is given, which depends on the minimum and maximum number of ones in the columns of the parity-check matrix of the code.
Keywords: minimum distance, quasi-cyclic low-density code.
03.2014 - Volume 18 Issue 1 (PDF)
The article presents a methodology for constructing a phenomenological dictionary of the theory of intelligent systems based on the post-non-classical information-evolutionary approach to systems analysis and modeling of objective reality, the attributive-ingredient concept of information and the concept of controlled evolution of natural language. The concepts of "information", "evolution" and "modeling" are defined as the terminological basis of the proposed methodology.
Keywords: information, modeling, thinking, knowledge, theory of intelligent systems, phenomenology, phylogenesis, evolution.
The article discusses the main methods for implementing search using regular expressions used in network intrusion detection systems to examine the contents of network packets. Both the simplest non-deterministic and deterministic finite automata and their various modifications that demonstrate higher performance and/or use less RAM are presented.
Keywords: finite automata, regular expressions, network intrusion detection systems, examination of the contents of network packets.
The paper investigates the properties of the operation of multiplication with a parameter, which is significantly used in the cryptographic standards of the Republic of Uzbekistan. The structure of the group of invertible elements is studied; the complexity of the discrete logarithm operation is established.x
Keywords: multiplicative group, cryptography, electronic signature, discrete logarithm.
The problem of developing a modern online support system for scientific and educational processes DvinemNauku (http://dvinemnauku.ru) is considered. The system allows to support scientific and educational processes with online displays of these processes. The system is divided into two parts: server and client. The server part is used to store, process and output data. The client part is used to present and collect data. Within the framework of the work, a subsystem for managing processes, events and comments was implemented, the system architecture and database structure were designed, and auxiliary modules were implemented.
Keywords: scientific and educational process, database design, scientific event, DvinemNauku, web application.
The task of developing a website for familiarization with the materials of the seminars held by the MSU Rector V.A. Sadovnichy is considered. The website http://sa.msu.ru was created, allowing the results of the seminars to be posted to the public: articles, reports and photographs. The architecture of the site implies the use of a ready-made online support system for scientific and educational processes DvinemNauku http://dvinemnauku.ru for storing and maintaining events related to seminars.
Keywords: site, DvinemNauku, database, program interface, seminar materials.
This paper considers a mathematical model of dynamic databases that processes three types of queries: search, insertion and deletion. It is based on the interaction of an information graph [1] and a finite deterministic automaton [2]. The model allows solving dynamic search problems and assessing the complexity of their solution. As an example illustrating the operation of the model, a well-known dynamic problem of searching for identical objects is considered and solved.
Keywords: dynamic databases, mathematical modeling, information graph.
The paper shows that the correctness of a family of functions is equivalent to the correctness of families arising on strong components and blocks of components of the essential dependence graph of the family of functions. The problem of simplifying a family whose essential dependence graph contains a "trivial" path is solved. With this simplification of the family, the property of correctness does not change. The results obtained are valid for families of functions of k-valued logic, where k≥2.
Keywords: regular families of functions, Latin squares.
The paper considers the problem of constructing moving images by a cellular automaton on a screen that is a finite or infinite strip. Algorithms for constructing moving images for some classes of images are found. It is shown that for any infinite screen there is a law of motion that is not implemented on this screen.
Keywords: cellular automaton, infinite screen, universal screen, moving image construction, autonomous motion, motion speed.
Within the framework of the discrete geometric approach to pattern recognition, an algorithm is presented that calculates all classes of a’-equivalent images with a given code. The algorithm is applicable to images of arbitrary dimension n≥2, and its time complexity depends linearly on the number of points in the image.
Keywords: pattern recognition, image code, image restoration algorithm, time complexity.
Natural languages have the property of constant frequency of occurrence of letters and letter pairs. In the article, regular languages with this property are studied.
Keywords: natural language, regular language, Markov chain, Markov language.
The paper studies spectrum of the class of thin languages. A classification of languages by their spectral properties is given. Two concepts of dimension are defined for the spectrum of a thin language. It turns out that for an arbitrary natural number n, one can give an example of a spectrum in which both the first and second dimensions are equal to n. The question is considered of what valuesof these pairs of dimensions in the aggregate are found in the spectrum of thin languages. For each pair m, n of natural numbers such that m=n=1 or 2≤m≤m, it is possible to construct a thin language with a spectrum in which the first dimension is equal to m and the second is equal to n. The technique used is non-trivial and allows one to pose and solve several number-theoretical problems at once, which in themselves are of some interest.
Keywords: thin languages, regular expressions, alphabetic coding.
The paper proves the property of preserving periodic sequences by pushdown automata and studies estimates for the length of the period of the outgoing sequence depending on the period of the incoming sequence and the characteristics of the automaton.
Keywords: pushdown automaton, deterministic function, periodic sequences.
All A-precomplete subclasses in classes of linear automata functions over finite fields are found. Algorithmic decidability of checking the A-completeness of finite subsets in the classes under consideration is proven.
Keywords: finite automaton, linear automaton, linear automaton function, A-closure operator, A-completeness problem, completeness criterion, A-precomplete class, adder, delay.
The article refines the upper bound for the stratification of arbitrary complete systems of Boolean functions. It also answers the question of whether the stratification of any complete system in the class of k-valued logic functions is finite. The concept of stratification of closed classes of k-valued logic functions is introduced and stratification estimates for all closed classes of Boolean functions are given.
Keywords: Boolean function, k-valued logic function, complete system, complexity, stratification, closed class.