2019 year, volume 23, issue 4 (PDF)
Interpretability, linear increase in complexity with data growth, scalability made topic modeling one of the most popular tools for statistical text analysis. However, there are a number of disadvantages caused by the dependence of the solution on the initialization. It is known that the building of a topic model is reduced to solving an ill- posed problem of the non-negative matrix factorization. The set of its solutions in the general case is infinite. Every time the model finds a local extremum. Repeated training of the model for the same collection can lead to detection of more and more new topics. In practice, it is often necessary to define all the topics of the corpus. To solve this problem, the article proposed and investigated a new algorithm for finding the complete set of topics based on the construction of a convex hull. It was shown experimentally that it is possible to construct a basis for the finite number of models. The likelihood of the basis is higher than for a single model with a similar number of topics. Compare of the basis of LDA models (latent Dirichlet allocation) and ARTM models (additive regularization for topic modeling) suggests that the topics of the sets coincide with high accuracy.
Keywords: probabilistic topic modeling, stability of topic models, complete set of topics of topic models, latent Dirichlet allocation, LDA, regularization, ARTM, BigARTM.
It is well known that recognition of a language specified by a regular expression of the form \(\cup_{i=1}^n . * \alpha_i . * \beta_i . * \), where \(\alpha_i\), \(\beta_i\) are words over some alphabet, in the general case requires a deterministic automaton with the number of states which is exponential in n. We propose a design of a structural automaton of a polynomial spatial complexity that recognizes languages from a given class.
Keywords: DFA, structural automaton, regular language, exponential blowup.
Automated proof search in single-conclusion sequential variant of classical predicate calculus is considered. In this algorithm, metavariables and partial skolemization are used. Theorems of soundness and completeness for the considered algorithm are proved.
Keywords: automated theorem proving, mechanical proof search, first order language, predicate calculus, sequent calculus, production system, artificial intelligence.
This article investigates the completeness problem for polynomial functions with rational coefficients, as well as problems of a functional nature generated by its solution, namely: studying the structure of closed and precomplete classes, the problem of bases of complete systems. In particular, it was proved that 1) the system of functions is complete if and only if it is not wholly contained in any precomplete class; 2) the cardinality of the set of all precomplete classes is equal to the continuum; 3) there is a complete system that does not have a basis.
Keywords: polynomial, functional system, problem completeness, rational function, closed classes.
We consider the problem of exact attribute-efficient learning functions of Post’s closed classes with the help of comparation queries. Here, we show that the complexity of learning by comparation queries is not worse than by membership queries. Particularly, for some classes, if we use comparation queries, we get better value of complexity function.
Keywords: exact learning, attribute-efficient learning, membership queries, comparation queries, Post’s closed classes.
A brief description of results of A-completeness problem for linear finite state automata, functioning over the ring of dyadic rationals is given. Conditions of A-completeness for systems, containing all one- valued linear automata with finite additive and finite systems with a summing automaton are stated.
Keywords: finite state automata, linear automata, dyadic rationals, A-completeness, maximum subclasses.
Русский