Huffman coding and the like use an integer number (k) of bits for each symbol, hence k is never less than 1. Sometimes, e.g., when sending a 1-bit image, compression becomes impossible.

- Idea: Suppose alphabet was
*X*,*Y*prob(X) = 2/3 prob(Y) = 1/3

- If we are only concerned with encoding length 2 messages, then
we can map all possible messages to intervals in the range [0..1]:
- To encode message, just send enough bits of a binary fraction
that uniquely specifies the interval.
- Similarly, we can map all possible length 3 messages to
intervals in the range [0..1]:
- Q: How to encode X Y X X Y X ?
Q: What about an alphabet with 26 symbols, or 256 symbols, ...?

- In general, number of bits is determined by the size of the
interval.
Examples:

- first interval is 8/27, needs 2 bits -> 2/3 bit per symbol (X)
- last interval is 1/27, need 5 bits

- first interval is 8/27, needs 2 bits -> 2/3 bit per symbol (X)
- In general, need bits
to represent interval of size
*p*. Approaches optimal encoding as message length got to infinity. - Problem: how to determine probabilities?
- Simple idea is to use adaptive model: Start with guess of symbol
frequencies. Update frequency with each new symbol.
- Another idea is to take account of intersymbol probabilities, e.g., Prediction by Partial Matching.

- Simple idea is to use adaptive model: Start with guess of symbol
frequencies. Update frequency with each new symbol.
- Implementation Notes: Can be CPU and memory intensive; patented.