		FREEZE / MELT COMPRESSION PROGRAM

This is Alpha version. It is tested under ISC 2.2.

The following preprocessor symbols control the compilation of Freeze
package:

	o BITS                  The size of hash table (default is 18,
				reducing to 14 gives a 50% speeddown).
	o COMPAT                Turns on backwards compatibility
				with Freeze 1.0
	o M_XENIX & M_I286      Makes arrays < 65536 bytes each
	o BSD4_2		Allow long filenames ( > 14 characters) &
				Call setlinebuf(stderr)
	o INT_SIG               signal is int (*)() unstead of void
	o MSDOS                 Turns off some UNIX-dependencies
				and MSDOS' ones - vice versa
				(this version is completely untested)
	o __TURBOC__            For compiling under TURBO C
				(untested)

Other preprocessor symbols (DEBUG, GATHER_STAT) are for internal use
only.

The format of frozen (2.1) file is incompatible with that of frozen (1.0),
but if this package is compiled with -DCOMPAT switch, you will able to
unpack frozen (1.0) files, if you have them.

----

	Format of a frozen file is as follows:
	(version 1.0 had only 2-bytes header)

offset    type      value    comment
  0       byte       037     2 byte magic header
  1       byte       0237    (version 1.0 - 0236)
  2       short       X      (little endian)
  4       byte        Y

X = 0 e e e e e d d d d c c c b b a   \
				       > [a-f] are binary digits
Y = 0 0 f f f f f f                   /

a - number of 1-bit static Huffman codes in the `matching positions'
table (see freeze.1)
bb - number of 2-bit codes,
	etc.

The numbers of 7- and 8-bits codes are evaluated from the
conditions: sum(codes) = 62(dec), max code = 1111111(bin).

The default values are: 0 1 1 1 4 10 27 18, what means:
no 1-bit codes,
one 2-bit, 3-bit and 4-bit codes, etc., so Huffman codes are:
00, 010, 0110, 01110, 01111, 10000, .... , 11111111.

------------------- !!!!!!!!!! -----------------

(If you do not deal with compression algorithms, you may skip
until asterisks.)

General format of frozen file is:

magic header - table description - stream of bits

The stream of bits is considered as a sequence of variable length dynamic
Huffman codes (if their values are in the range of 0-255, they mean single
bytes, special value of 256 means EOF, and further values mean the lengths
of matched string.) If we have the value greater than 256, we get a static
Huffman code from the stream (his value is 6 higher bits of matched
string's position in the buffer), and then we get 7 bits literally.

Because buffer length is 8192 bytes and maximum match length is 256 bytes,
the position of matched string cannot be greater than 8192-256, that's why
there is only (8192-256)/2^7 = 62 static codes.

			*        *       *

The default table is tuned for both C texts and executable files (as in
LHARC). If you will freeze any other files (databases, images, fonts,
etc.) you can calculate the matching positions distribution using the
`statist' program, which calculates and displays the mentioned
distribution for the given file. It is useful for large (100K or more)
files !!!

Though the built-in position table is polyvalent, the tuning can increase
the compression rate up to one additional percent.  (Or even more, if the
matching strings distribution is very bizarre!)

Usage: statist < sample_file ; you can also see the intermediate values
and watch their changes by pressing INTR key when you wish.

Note: If you use "gensample | statist", remember that INTR influence BOTH
processes !!

The sum of numbers which are given by statist can be not equal to 62. This
means the sample was too trivial or random-like.

You may create the /etc/default/freeze file (if you don't like
/etc/default/ directory, choose another - in MS-DOS it is FREEZE.CNF in
the directory of FREEZE.EXE), which has the following format:  name =
``statist's output (8 numbers)'', f.ex.:

---------- cut here -----------
# This is freeze's defaults file
gif =   0 0 0 0 2 60 0 0        # This is NOT! a optimal data
				# for GIF files
doc=0 0 1 2 7 16 36 0           # The sample was gcc.lp
# End of file
---------- cut here -----------

If you find values, which are better THAN DEFAULT both for text (C
programs) and binary (executable) files, please send them to me.

Important note: statist.c is NOT a part of freeze package, it is an
aditional feature.

------------------- LINT ----------------------------

Lint complains about MANY `constant in conditional context', but ALL these
contexts aren't `conditional', because they are unconditional (!)
expressions:

#define BITS 18
. . .
#define LEN0    (BITS/3 + (BITS%3 != 0))
. . .
	... + (key[0] >> LEN0) ....

Do you think about /*CONDITIONAL*/ pseudo-comment for lint?

Other lint's complaints about `used/declared inconsistently' are (in my
case) due to inconsistencies of /usr/include/* and /usr/lib/llib*ln. It
isn't dangerous.

------------------- BUGS ----------------------------

Found bugs descriptions, incompatibilities, etc.  please send to
leo@s514.ipmce.su.  MS-DOS version will not be supported in future
versions !!  (If they will be :-) )

------------ SPEED & COMPRESSION RATE ---------------

When using 18 bits table (about 600K) and gcc, the speed of freeze is more
than the same of ARJ 1.00, but is less than of LHA 2.05.

My aim is not the maximum speed, the same range is enough.

Note: the percents mean 'relatively to compressed size', if you want have
them relatively to original size, divide them to 2-2.5.

Compression rate is *INDEPENDENT* of the hash table size, but may vary
with different static Huffman tables.  It is about 2% worse than the same
of ARJ and LHA.

My aim is not the maximum compression, the same range is enough. :-)

Note: if you see Compress works nearly as Freeze (on some files), this
means the maximum is gained, so LHA and ARJ won't better more than
1-1.5%.

--------------- POSSIBLE IMPROVEMENTS ---------------

The high-level routines (freeze, melt) are almost independent from
low-level routines (Get_Next_Match, Insert/Delete_Node,
Encode/Decode_Char/Position), so if you want the speed and/or compression
rate `a la vogue' you may replace the low-level routines with the homebrew
(f. ex.) ones and enjoy the results.
