Ordinal Aggregation of Word Embedding Sets
Received: 15 May 2026; Revised: 19 May 2026; Accepted: 19 May 2026
Published: 2026, vol. 30, issue 2, pp. 10–26
Abstract
This paper addresses the problem of merging sets of word embeddings produced by different pre-trained models. Because direct alignment of vector spaces is computationally challenging and prone to local minima, the paper proposes using ordinal aggregation methods instead. It is proven that the individual ordinal matrices of pairwise distance comparisons contain no cycles, which guarantees that the Kemeny and Copeland algorithms are equivalent in this case. For aggregated matrices, where the Condorcet paradox can occur, Copeland’s algorithm serves as an efficient heuristic. The Borda count, which does not require constructing tournament graphs, is also considered. Experiments were conducted on 10 models and 3 datasets (WordSim-353, MEN, SimLex-999). The results show that ordinal methods can identify optimal model combinations (triples, quadruples, and quintuples) that outperform individual models in Spearman correlation, with minimal difference between the Borda and Kemeny/Copeland methods.
Keywords: word embeddings, model aggregation, Borda count, Kemeny rule, Copeland algorithm, tournament graphs, ordinal matrix.
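To make the two voting rules named in the abstract concrete, the following is a minimal sketch and not the author's implementation. It assumes each model is reduced to a list of similarity scores over the same word pairs, so that the "candidates" being ranked are word pairs and each model acts as one voter. The function names `borda_scores` and `copeland_scores` and the toy numbers are hypothetical. The Kemeny rule is omitted because it requires a search over permutations.

```python
from typing import List


def borda_scores(model_scores: List[List[float]]) -> List[int]:
    """Borda count: for each item, sum over models the number of
    items that the model scores strictly lower than it."""
    n = len(model_scores[0])
    totals = [0] * n
    for scores in model_scores:
        for i in range(n):
            totals[i] += sum(1 for j in range(n) if scores[i] > scores[j])
    return totals


def copeland_scores(model_scores: List[List[float]]) -> List[int]:
    """Copeland score: pairwise majority wins minus losses.
    No tournament graph is stored; each pairwise duel is resolved on the fly."""
    n = len(model_scores[0])
    result = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Net number of models that place item i above item j.
            margin = sum((s[i] > s[j]) - (s[i] < s[j]) for s in model_scores)
            if margin > 0:
                result[i] += 1
                result[j] -= 1
            elif margin < 0:
                result[j] += 1
                result[i] -= 1
    return result


# Toy data (hypothetical): three models scoring the same four word pairs.
toy_scores = [
    [0.9, 0.7, 0.2, 0.1],
    [0.8, 0.3, 0.6, 0.1],
    [0.5, 0.9, 0.4, 0.2],
]
print(borda_scores(toy_scores))
print(copeland_scores(toy_scores))
```

In this toy case the two rules produce the same ordering of the four word pairs, which is in the spirit of the small difference between methods reported in the abstract, although a four-item example proves nothing about the experiments themselves.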
BibTeX
@article{IS-Kolosov2026,
author = {Kolosov, Alexey Mikhajlovich},
title = {{Ordinal Aggregation of Word Embedding Sets}},
journal = {Intelligent Systems. Theory and Applications},
year = {2026},
volume = {30},
number = {2},
pages = {10--26},
}
AMSBIB
\Bibitem{IS-Kolosov2026}
\by A.\,M.~Kolosov
\paper Ordinal Aggregation of Word Embedding Sets
\jour Intelligent Systems. Theory and Applications
\yr 2026
\vol 30
\issue 2
\pages 10--26
\lang In Russian