Binarization of language models
Received: 25 Mar 2026
Published: 2025, vol. 29, issue 3, pp. 119–145
Abstract
Large language models are widely used in natural language processing. However, despite their high efficiency, deploying large language models is difficult due to their high computational and memory costs. One way to address this problem is neural network quantization, that is, converting the network's weights and activations to a representation with lower bit-width. A special case of quantization is binarization: compressing network parameters to a bit-width of 1 bit. This paper examines the structure of binary neural networks, surveys current methods for binarizing language models, and describes the results obtained.
Keywords: natural language processing, binary neural networks, binarization, quantization, large language models.
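The core idea of binarization mentioned in the abstract can be sketched in a few lines. The snippet below is an illustrative example only (it follows the common XNOR-Net-style scheme, not necessarily the method surveyed in the paper): weights are mapped to {-1, +1} via the sign function, and a single scaling factor equal to the mean absolute weight is kept to reduce the approximation error.

```python
import numpy as np

def binarize(W):
    """Approximate W by alpha * B, where B has entries in {-1, +1}.

    alpha = mean(|W|) is the per-tensor scaling factor that
    minimizes the L2 error of the approximation (XNOR-Net style).
    This is a conceptual sketch, not the paper's exact algorithm.
    """
    alpha = np.mean(np.abs(W))
    B = np.where(W >= 0, 1.0, -1.0)
    return alpha, B

# Example: a 2x2 weight matrix compressed to 1 bit per weight
W = np.array([[0.3, -0.7],
              [0.1, -0.2]])
alpha, B = binarize(W)
```

A full-precision matrix of 32-bit floats is thus replaced by a bitmap plus one scalar, giving roughly a 32x reduction in weight storage at the cost of approximation error, which binarization methods then try to recover during training or fine-tuning.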
BibTeX
@article{IS-Davydova2025,
author = {Davydova, Daria Nikolaevna},
title = {{Binarization of language models}},
journal = {Intelligent Systems. Theory and Applications},
year = {2025},
volume = {29},
number = {3},
pages = {119--145},
}
AMSBIB
\Bibitem{IS-Davydova2025}
\by D.\,N.~Davydova
\paper Binarization of language models
\jour Intelligent Systems. Theory and Applications
\yr 2025
\vol 29
\issue 3
\pages 119--145
\lang In Russian