Move-to-front transform: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 03:46, 25 February 2020 edit AnomieBOT (talk \| contribs) Bots 6,437,826 edits m Dating maintenance tags: {{Inappropriate person}} ← Previous edit		Latest revision as of 08:01, 9 August 2024 edit undo Cedar101 (talk \| contribs) Extended confirmed users 16,377 edits m →‎The transform: monospace
(27 intermediate revisions by 17 users not shown)
Line 1: {{more citations needed\|date=May 2011}} ~~{{multiple issues\|~~ ~~{{More footnotes\|date=May 2011}}~~ ~~{{refimprove\|date=May 2011}}~~ ~~{{Inappropriate person\|date=February 2020}}~~ }} The '''move-to-front (MTF) transform''' is an [[code\|encoding]] of [[data]] (typically a stream of [[byte]]s) designed to improve the performance of [[entropy encoding]] techniques of [[Data compression\|compression]]. When efficiently implemented, it is fast enough that its benefits usually justify including it as an extra step in data compression [[algorithm]]s. This algorithm was first published by [[:ru:Рябко, Борис Яковлевич\|Boris Ryabko]] under the name of "book stack" in 1980.<ref>{{cite journal \|last=Ryabko \|first=Boris Yakovlevich \|author-link=:ru:Рябко, Борис Яковлевич \|title=Data compression by means of a "book stack” \|journal=Problems of Information Transmission \|year=1980 \|volume=16 \|issue=4 \|pages=265-269 \|url=https://www.mathnet.ru/links/185b89a03327cb0466fb2846e55f6404/ppi1458.pdf \|zbl=0466.94007}}</ref> Subsequently, it was rediscovered by J.K. Bentley et al. in 1986,<ref>{{cite journal\|url=https://www.cs.cmu.edu/~sleator/papers/Adaptive-data-compression.htm\|title=A Locally Adaptive Data Compression Scheme\|first1=Jon Louis \|last1=Bentley \|author-link1=Jon Bentley (computer scientist) \|first2=Daniel Dominic Kaplan \|last2=Sleator \|author-link2=Daniel Sleator \|first3=Robert Endre \|last3=Tarjan \|author-link3= Robert Tarjan \|first4=V. K. \|last4=Wei \|journal=Communications of the ACM\|volume=29\|issue=4\|pages=320–330\|year=1986\|doi=10.1145/5684.5688\|citeseerx=10.1.1.69.807\|s2cid=5854590 }}</ref> as attested in the explanatory note.<ref>{{cite journal \|last1=Ryabko \|first1=Boris Yakovlevich \|author-link1=:ru:Рябко, Борис Яковлевич \|first2=R. Nigel \|last2=Horspool \|author-link2=Nigel Horspool \|first3=Gordon Villy \|last3=Cormack \|author-link3=Gordon Cormack \|title=Comments to: "A locally adaptive data compression scheme" by J. L. Bentley, D. D. Sleator, R. E. Tarjan and V. K. Wei.\|journal=Comm. ACM\|volume=30\|year=1987\|issue=9\|pages=792–794\|doi=10.1145/30401.315747\|s2cid=16138142 \|doi-access=free}}</ref> ==The transform== The main idea is that each symbol in the data is replaced by its index in the stack of “recently used symbols”. For example, long sequences of identical symbols are replaced by as many zeroes, whereas when a symbol that has not been used in a long time appears, it is replaced with a large number. Thus at the end the data is transformed into a sequence of integers; if the data exhibits a lot of local correlations, then these integers tend to be small. This algorithm was published in a paper by Ryabko.<ref>Ryabko, B. Ya Data compression by means of a "book stack”, Problems of Information Transmission, 1980, v. 16: (4), pp. 265–269</ref> The original name of this code is “book stack”.<ref>{{cite journal\|author1=Ryabko, B. Ya. \|author2=Horspool, R. Nigel \|author3=Cormack, Gordon V. \|title=Comments to: "A locally adaptive data compression scheme" by J. L. Bentley, D. D. Sleator, R. E. Tarjan and V. K. Wei.\|journal=Comm. ACM\|volume=30\|year=1987\|issue=9\|pages=792–794\|doi=10.1145/30401.315747}}</ref> Let us give a precise description. Assume for simplicity that the symbols in the data are [[byte]]s. Line 22 ⟶ 18: The b moves to the front of the list, producing (bacdefghijklmnopqrstuvwxyz). The next letter is a, which now appears at index 1. So we add a 1 to the output stream. We have: 1,1 and we move ~~back~~ the letter a back to the top of the list. Continuing this way, we find that the sequence is encoded by: 1,1,13,1,1,1,0,0 {\| class="wikitable" ~~border~~style="1font-family:monospace;" \|- ! Iteration Line 76 ⟶ 72: Details of implementation are important for performance, particularly for decoding. For encoding, no clear advantage is gained by using a [[linked list]], so using an [[Array data structure\|array]] to store the list is acceptable, with worst-case performance [[Big O notation\|O]](<var>n</var><var>k</var>), where <var>n</var> is the length of the data to be encoded and <var>k</var> is the number of values (generally a constant for a given implementation). The typical performance is better because frequently-used symbols are more likely to be at the front and will produce earlier hits. This is also the idea behind a [[Self-~~organizing_list~~organizing list#~~Move_to_Front_Method_~~Move to Front Method (MTF)\|Move-to-front self-organizing list]]. However, for decoding, we can use specialized data structures to greatly improve performance.{{Examples\|date=February 2012}} Line 83 ⟶ 79: This is a possible implementation of the move-to-front algorithm in [[Python (programming language)\|Python]]. <~~source~~syntaxhighlight lang="python"> # """mtfwiki.py""" ~~from typing import List, Tuple, Union~~ ~~# Instead of always transmitting an "original" dictionary, it is simpler to just agree on an initial set.~~ ~~# Here we use the 256 possible values of a byte:~~ ~~common_dictionary = list(range(256))~~ from collections.abc import Generator, Iterable ~~def encode(plain_text: str) -> List[int]:~~ from itertools import chain ~~# Change to bytes for 256.~~ ~~plain_text = plain_text.encode('utf-8')~~ ~~# Changing the common dictionary is a bad idea. Make a copy.~~ ~~dictionary = common_dictionary.copy()~~ class MoveToFront: ~~# Transformation~~ """ ~~compressed_text = list() # Regular array~~ ~~rank~~>>> mtfwiki = 0MoveToFront() >>> list(mtfwiki.encode("Wikipedia")) [87, 105, 107, 1, 112, 104, 104, 3, 102] >>> mtfwiki.decode([87, 105, 107, 1, 112, 104, 104, 3, 102]) 'Wikipedia' >>> list(mtfwiki.encode("wikipedia")) [119, 106, 108, 1, 113, 105, 105, 3, 103] >>> mtfwiki.decode([119, 106, 108, 1, 113, 105, 105, 3, 103]) 'wikipedia' """ def __init__(self, common_dictionary: Iterable[int] = range(256)): ~~# Read in each character~~ ~~for~~ c in ~~plain_text:~~ """ ~~rank = dictionary.index(c) # Find the rank~~Instead of ~~the~~always ~~character~~transmitting inan ~~the~~"original" dictionary ~~[O(k)]~~, it is simpler to just agree on an initial set. ~~compressed_text.append(rank) # Update the encoded text~~ Here we use the 256 possible values of a byte. """ self.common_dictionary = common_dictionary def encode(self, plain_text: str) -> Generator[int]: ~~# Update the dictionary [O(k)]~~ # Changing the common dictionary is a bad idea. Make a copy.~~pop(rank)~~ dictionary~~.insert(0,~~ c= list(self.common_dictionary) # Read in each character ~~return compressed_text # Return the encoded text~~ for c in plain_text.encode("latin-1"): # Change to bytes for 256. ~~</source>~~ # Find the rank of the character in the dictionary [O(k)] rank = dictionary.index(c) # the encoded character yield rank # Update the dictionary [O(1)] ~~The inverse will recover the original text:~~ dictionary.pop(rank) dictionary.insert(0, c) def decode(self, compressed_data: Iterable[int]) -> str: ~~<source lang="python">~~ """ ~~def decode(compressed_data: List[int]) -> str:~~ Inverse function that recover the original text ~~compressed_text = compressed_data~~ """ ~~dictionary = common_dictionary.copy()~~ dictionary = list(self.common_dictionary) ~~plain_text = []~~ plain_text = [] # Read in each rank in the encoded text for rank in ~~compressed_text~~compressed_data: # Read the character of that rank from the dictionary plain_text.append(dictionary[rank]) # Update the dictionary e = dictionary.pop(rank) dictionary.insert(0, e) return bytes(plain_text).decode(~~'utf~~"latin-8'1") # Return original string </syntaxhighlight> ~~</source>~~ ~~Example output:~~ ~~<source lang="pycon">~~ ~~>>> import mtfwiki~~ ~~>>> mtfwiki.encode('Wikipedia')~~ ~~[87, 105, 107, 1, 112, 104, 104, 3, 102]~~ ~~>>> mtfwiki.decode([119, 106, 108, 1, 113, 105, 105, 3, 103])~~ ~~'wikipedia'~~ ~~</source>~~ In this example we can see the MTF code taking advantage of the three repetitive {{code\|i}}'s in the input word. The common dictionary here, however, is less than ideal since it is initialized with more commonly used [[ASCII]] printable characters put after little-used control codes, against the MTF code's design intent of keeping what's commonly used in the front. If one rotates the dictionary to put the more-used characters in earlier places, a better encoding can be obtained: <~~source~~syntaxhighlight lang="~~pycon~~python"> def block32(x): ~~>>> import mtfwiki~~ ~~>>>~~ ~~block32~~ = ~~lambda~~ ~~x :~~return [(x + i for i in range(32)]) ~~>>> # Sort the ASCII blocks: first lowercase, then uppercase, punctuation/number, and finally the control code and the non-ASCII stuff~~ ~~>>> mtfwiki.common_dictionary = block32(0x60) + block32(0x40) + block32(0x20) + block32(0x00) + list(range(128, 256))~~ class MoveToFrontMoreCommon(MoveToFront): ~~>>> mtfwiki.encode('Wikipedia')~~ """ ~~[55, 10, 12, 1, 17, 9, 9, 3, 7]~~ >>> mtfwiki = MoveToFrontMoreCommon() ~~</source>~~ >>> list(mtfwiki.encode("Wikipedia")) [55, 10, 12, 1, 17, 9, 9, 3, 7] """ def __init__(self): super().__init__( chain( # Sort the ASCII blocks: block32(ord("a") - 1), # first lowercase, block32(ord("A") - 1), # then uppercase, block32(ord("!") - 1), # punctuation/number, block32(0), # the control code, range(128, 256), # and finally the non-ASCII stuff ) ) if __name__ == "__main__": import doctest doctest.testmod() </syntaxhighlight> ==Use in practical data compression algorithms== Line 163 ⟶ 182: ===Example=== {{Confusing\|following section\|reason=What is the method used to arrive at these numbers?\|date=February 2011}} As an example, imagine we wish to compress [[To be, or not to be\|Hamlet's soliloquy]] (''To be, or not to be...''). We can calculate the ~~entropy~~size of this message to be 7033 bits. Naively, we might try to apply the MTF transform directly. The result is a message with 7807 bits ~~of entropy~~ (higher than the original). The reason is that English text does not in general exhibit a high level of local frequency correlation. However, if we first apply the Burrows–Wheeler transform, and then the MTF transform, we get a message with 6187 bits ~~of entropy~~. Note that the Burrows–Wheeler transform does not decrease the entropy of the message; it only reorders the bytes in a way that makes the MTF transform more effective. One problem with the basic MTF transform is that it makes the same changes for any character, regardless of frequency, which can result in diminished compression as characters that occur rarely may push frequent characters to higher values. Various alterations and alternatives have been developed for this reason. One common change is to make it so that characters above a certain point can only be moved to a certain threshold. Another is to make some algorithm that runs a count of each character's local frequency and uses these values to choose the characters' order at any point. Many of these transforms still reserve zero for repeat characters, since these are often the most common in data after the Burrows Wheeler Transform. Line 170 ⟶ 189: {{main \| Self-organizing list#Move to Front Method (MTF) }} * The term Move To Front (MTF) is also used in a slightly different context, as a type of a dynamic [[linked list]]. In an MTF list, each element is moved to the front when it is accessed.<ref>{{Cite journal \| doi = 10.1145/359997.360000\| title = On self-organizing sequential search heuristics\| journal = Communications of the ACM\| volume = 19\| issue = 2\| pages = 63–67\| year = 1976\| last1 = Rivest \| first1 = R.Ronald Linn \| author-link1=Ron Rivest \| s2cid = 498886\| doi-access = free}}</ref> This ensures that, over time, the more frequently accessed elements are easier to access. It can be proved that the time it takes to access a sequence of elements in <ref>[http://courses.csail.mit.edu/6.851/spring07/scribe/lec01_2.pdf Lecture notes in advanced data structures], by Prof. Erik Demaine, Scribe: Ray C. He, 2007.</ref> ==References== {{reflist}} *{{cite journal\|url=https://www.cs.cmu.edu/~sleator/papers/Adaptive-data-compression.htm\|title=A Locally Adaptive Data Compression Scheme\|author1=J. L. Bentley \|author2=D. D. Sleator \|author3=R. E. Tarjan \|author4=V. K. Wei \|journal=Communications of the ACM\|volume=29\|issue=4\|pages=320–330\|year=1986\|doi=10.1145/5684.5688\|citeseerx=10.1.1.69.807}} ==External links== Line 182 ⟶ 200: {{DEFAULTSORT:Move-To-Front Transform}} [[Category:Data compression transforms]] [[Category:Lossless compression algorithms]] [[Category:Articles with example Python (programming language) code]] ~~[[Category:Transforms]]~~ ~~[[Category:Articles with example Python code]]~~