Winlink 2000 is a hot topic in the amateur radio community right now. It provides a mechanism to access Internet email over several different transports, including an Internet connection, local VHF/UHF packet radio, and long-distance HF Pactor. The idea is great, and in practice, it works reasonably well. There are a lot of complaints to be made about the design and implementation of the system, but at the end of the day, those guys put in the time, effort, and money and made it work.
One of the (many) complaints I have is their use of the ancient B2F forwarding protocol. It’s fine to use that over slow Pactor links (I suppose), but why aren’t we just using something like POP3 for the Internet hops? Rather silly, I think. Anyway, one of the design points of the B2F protocol is the use of an even more ancient compression algorithm called “lzhuf”.
The algorithm and code for this were written in Japan in 1988 to run on a 16-bit machine. The source, and the many disparate alterations made since then, are sprinkled around the Internet and are easy to find via Google. However, most people use either the command-line LZHUF_1.EXE file that has been around forever, or the DLL-ized version that the Winlink applications deliver and use. This effectively limits its use to Windows machines (and Linux boxes with dosemu installed). When I tried to compile several of the variants under Linux, I found that the code makes a bunch of assumptions about type sizes, and as a result it crashes and/or fails to decode compressed text. In fact, depending on where it fails, it sometimes runs off in an endless loop, writing garbage to the output file until you kill it!
After spending a lot of time trying to find someone who had fixed the code to compile on a modern 32-bit system, I finally found a copy of the source that would compile on my system with g++ and actually run properly. I found the updated source code here and have archived a copy of the source on my system, as well as a static Linux binary in case you don’t want to compile it yourself. If you run the binary with no arguments, it will print a usage message.
Now, you might ask, “Dan, why does Winlink 2000 use this old, unmaintained, fragile, and obscure compression algorithm?” Well, in these days of freely available code, algorithms, and libraries for advanced compression, encoding, and so on, I can assure you that the top-notch Winlink engineers have a good reason. ...Right? I figured that this obscure gem from the golden age of 4MHz PCs must be an undiscovered compression miracle, one that lets the extremely slow Pactor connections transfer data as efficiently as possible. So, I decided to compress some test files with lzhuf, as well as with the freely available and well-regarded gzip and bzip2, and compare the results.
As input, I used the lzhuf source code itself, which is about 19KB in size. That’s a pretty good sized email even with a potential file attachment. Below are the results:
| Method | Size | Reduced size by |
So there you go: with bzip2, you’d get almost a kilobyte less data to transfer than you would using lzhuf. Does a kilobyte really matter? Well, Pactor-I is 200 baud (at most), with very small block sizes. Yes, I think I’d rather save that kilobyte.
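If you want to repeat the gzip and bzip2 side of this comparison on your own files, and put a rough number on what the saved bytes mean on the air, here is a minimal Python sketch. It is not the procedure I used for the numbers above and it does not cover lzhuf at all, since Python's standard library has no lzhuf. The function names and the `compare.py` script name are made up for illustration, and the airtime figure assumes 200 baud works out to 200 bit/s with no protocol overhead whatsoever, which is a best case you will never see on a real Pactor link.

```python
import bz2
import gzip
import sys


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of data as-is and after gzip and bzip2."""
    return {
        "original": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
    }


def airtime_seconds(num_bytes: int, bits_per_second: float = 200.0) -> float:
    """Best-case seconds to send num_bytes.

    Assumption: 8 bits per byte at a flat bit rate, ignoring framing,
    acknowledgements, and retries.
    """
    return num_bytes * 8 / bits_per_second


def report(data: bytes) -> str:
    """Build one line per method: size, percent saved, best-case airtime."""
    sizes = compressed_sizes(data)
    original = sizes["original"]
    lines = []
    for method, size in sizes.items():
        saved = 100.0 * (1 - size / original) if original else 0.0
        lines.append(
            f"{method:<9}{size:>8} bytes  {saved:5.1f}% smaller  "
            f"{airtime_seconds(size):7.1f} s at 200 bit/s"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical usage: python compare.py lzhuf.c
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as f:
            print(report(f.read()))
    else:
        print("usage: compare.py FILE")
```

By that best-case arithmetic, 1000 bytes at 200 bit/s is 40 seconds of transmit time, and a real link will take longer once overhead and retries are counted.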
So, I ask the Winlink 2000 developers: Why not move to bzip2 compression? It’s free. It’s widely available. It’s considered one of the best. You put “2000” in your name to sound like the system is new, fresh, and modern, so why not use a compression algorithm to match?