Word-based block-sorting text compression

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

9 Citations (Scopus)

Abstract

Block sorting is an innovative compression mechanism introduced in by M. Burrows and D.J. Wheeler (1994). It involves three steps: permuting the input one block at a time through the use of the Burrows-Wheeler transform (BWT); applying a move-to-front (MTF) transform to each of the permuted blocks; and then entropy coding the output with a Huffman or arithmetic coder. Until now, block-sorting implementations have assumed that the input message is a sequence of characters. In this paper, we extend the block-sorting mechanism to word-based models. We also consider other transformations as an alternative to MTF, and are able to show improved compression results compared to MTF. For large text files, the combination of word-based modelling, BWT and MTF-like transformations allows excellent compression effectiveness to be attained within reasonable resource costs.

Original languageEnglish
Title of host publicationProceedings - 24th Australasian Computer Science Conference, ACSC 2001
EditorsMichael Oudshoom
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages92-99
Number of pages8
ISBN (Electronic)0769509630, 9780769509631
DOIs
Publication statusPublished - 2001
Event24th Australasian Computer Science Conference, ACSC 2001 - Gold Coast, Australia
Duration: 29 Jan 20012 Feb 2001

Publication series

NameProceedings - 24th Australasian Computer Science Conference, ACSC 2001

Conference

Conference24th Australasian Computer Science Conference, ACSC 2001
Country/TerritoryAustralia
CityGold Coast
Period29/01/012/02/01

Fingerprint

Dive into the research topics of 'Word-based block-sorting text compression'. Together they form a unique fingerprint.

Cite this