TY - JOUR
T1 - Fast and Efficient Entropy Coding Architectures for Massive Data Compression
AU - Auli-Llinas, Francesc
N1 - Funding: This research was funded by the Spanish Ministry of Science, Innovation and Universities (MICIU) and by the European Regional Development Fund (FEDER) under Grant PID2021-125258OB-I00 and by the Catalan Government under Grant SGR2021-00643.
PY - 2023/10
Y1 - 2023/10
N2 - The compression of data is fundamental to alleviating the costs of transmitting and storing massive datasets employed in myriad fields of our society. Most compression systems employ an entropy coder in their coding pipeline to remove the redundancy of coded symbols. The entropy-coding stage needs to be efficient, to yield high compression ratios, and fast, to process large amounts of data rapidly. Despite their widespread use, entropy coders are commonly assessed for some particular scenario or coding system. This work provides a general framework to assess and optimize different entropy coders. First, the paper describes three main families of entropy coders, namely those based on variable-to-variable length codes (V2VLC), arithmetic coding (AC), and tabled asymmetric numeral systems (tANS). Then, a low-complexity architecture for the most representative coder(s) of each family is presented—more precisely, a general version of V2VLC, the MQ, M, and a fixed-length version of AC and two different implementations of tANS. These coders are evaluated under different coding conditions in terms of compression efficiency and computational throughput. The results obtained suggest that V2VLC and tANS achieve the highest compression ratios for most coding rates and that the AC coder that uses fixed-length codewords attains the highest throughput. The experimental evaluation discloses the advantages and shortcomings of each entropy-coding scheme, providing insights that may help to select this stage in forthcoming compression systems.
AB - The compression of data is fundamental to alleviating the costs of transmitting and storing massive datasets employed in myriad fields of our society. Most compression systems employ an entropy coder in their coding pipeline to remove the redundancy of coded symbols. The entropy-coding stage needs to be efficient, to yield high compression ratios, and fast, to process large amounts of data rapidly. Despite their widespread use, entropy coders are commonly assessed for some particular scenario or coding system. This work provides a general framework to assess and optimize different entropy coders. First, the paper describes three main families of entropy coders, namely those based on variable-to-variable length codes (V2VLC), arithmetic coding (AC), and tabled asymmetric numeral systems (tANS). Then, a low-complexity architecture for the most representative coder(s) of each family is presented—more precisely, a general version of V2VLC, the MQ, M, and a fixed-length version of AC and two different implementations of tANS. These coders are evaluated under different coding conditions in terms of compression efficiency and computational throughput. The results obtained suggest that V2VLC and tANS achieve the highest compression ratios for most coding rates and that the AC coder that uses fixed-length codewords attains the highest throughput. The experimental evaluation discloses the advantages and shortcomings of each entropy-coding scheme, providing insights that may help to select this stage in forthcoming compression systems.
KW - entropy coding
KW - variable-to-variable length codes
KW - arithmetic coding
KW - asymmetric numeral systems
UR - https://www.mendeley.com/catalogue/1a68f534-b603-3f59-9a52-1a092a2cb249/
U2 - 10.3390/technologies11050132
DO - 10.3390/technologies11050132
M3 - Article
SN - 2227-7080
VL - 11
JO - Technologies
JF - Technologies
IS - 5
M1 - 132
ER -