Another View of Archivers
by Joey Lindstrom
The main reason I decided to write this article was that I felt there were a few "underdog" products available to OS/2 users that I'm quite sure they'd be using regularly, if only they knew about them. As we all know, OS/2 itself is one such "underdog" -- so I felt I'd be speaking to a receptive audience.
Specifically, I'm talking about compression products. We're all familiar with the ubiquitous "zip" files that we see on myriad FTP sites all over the world. It seems that nearly all shareware is distributed as a .ZIP file these days... and I feel it's a shame that we, as enlightened OS/2 users, should blindly follow along with the rest of the lemmings on this. Why do I feel that way? Two reasons:
In contemplating how I'd go about writing this article, I initially decided I'd do a "round up" of all available archivers and put each one through its paces. But then I said to myself, "hey, Joey, there's nothing new and interesting about an archiver round up... others have done this work already". Ignoring for the moment the dangers of talking to myself, I decided the simplest and best approach was to compare these products with the existing standard (PKZip) and highlight the advantages of the made-for-OS/2 alternatives. Hey, it beats working for a living.
As with PKZIP, you use simple, easy-to-remember commands to invoke the program to begin archiving, extract files, create self-extracting files, etc., although with RAR you don't have to remember "it's ZIP for making an archive, and UNZIP to unpack it" -- it's all done with the main RAR.EXE program. But some of us would rather not memorize command-line switches at all. Enter Interactive Mode (GIF, 14.2k).
But there are some who despise front-ends and continue to rely on command-line switches. After all, menus are for wimps, right? But wait: RAR is so feature-packed, you'd never remember all the switches to configure it for, let's say, "maximum compression, solid mode, multimedia compression enabled". No problem. Invoke the RAR menu program once (honest, you only have to do it once!) and set those options as default behaviour. From then on, you can:
RAR A ARCHIVENAME filename filename etc.
to your heart's content, and RAR will behave the way you want it. Every time. And, best of all, all these options can still be overridden by command-line switches, so if you want to disable multimedia compression just this one time, one switch is all you need. Typing "RAR -?" gives you a list of 'em at the drop of a hat: no looking through manual files (which, if you're like me, are stored in the most out-of-the-way directory possible).
OK, how about overall compression? RAR can use five different compression levels, ranging from "fast, minimal compression" (level 1) to "slow, maximal compression" (level 5). When set to level 2, RAR performs essentially on a par with Info-ZIP. When set to any level higher than 2 (I use 5 as my default), compression is slower, but the archives come out much smaller than Info-ZIP can manage. In addition to boasting a better basic compression engine, RAR has two additional features that can greatly increase its ability to reduce archive size.
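The speed-versus-size trade-off between levels is easy to see for yourself. The sketch below is only an illustration: it uses Python's zlib module (the same DEFLATE family that ZIP uses), whose levels run from 1 to 9, not RAR's own engine or its 1-to-5 scale. The word list and the helper names are made up for the demonstration.

```python
import random
import zlib

# Made-up vocabulary; a stand-in for ordinary text, not real test data.
WORDS = ["archive", "compress", "shareware", "volume", "extract",
         "pattern", "switch", "default", "engine", "desktop"]


def sample_text(n_words=20000, seed=1):
    """Build repeatable pseudo-random 'prose' so every run sees the same input."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words)).encode("ascii")


def sizes_by_level(data, levels=(1, 6, 9)):
    """Compress the same data at several zlib levels and report each packed size."""
    return {level: len(zlib.compress(data, level)) for level in levels}


if __name__ == "__main__":
    text = sample_text()
    print("original:", len(text), "bytes")
    for level, size in sizes_by_level(text).items():
        print("level", level, "->", size, "bytes")
```

Higher levels spend more time searching for matches, which is the same bargain RAR offers between level 1 and level 5.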
"SOLID MODE" - When you activate solid mode, RAR operates slightly differently. Assuming you were to create an archive using several different types of files (e.g., some .TXT files, a few .DOCS and .HLPS, and maybe one or two .WAVS), RAR will group files with identical extensions together and then compress them together. I won't get into a long-winded discussion of how file compressors work internally, but basically they're looking for patterns. Files of similar type will usually have similar patterns, and unlike Info-ZIP, RAR can take advantage of those patterns in common between separate files to create smaller archives.
Here's an example: let's say you've got a 23K text file. You compress it with Info-ZIP and it reduces to 10K. Now, make a copy of the uncompressed file and compress them both together into one archive. Because you're now compressing 46K of data, Info-ZIP will yield a 20K archive. But watch what happens when RAR discovers that the two files are identical:
RAR 2.00   Copyright (c) 1993-96 Eugene Roshal   8 May 1996
Shareware version         Type RAR -? for help

Solid archive test2.rar

 Name             Size   Packed  Ratio   Date   Time   Attr   CRC-32   Meth Ver
------------------------------------------------------------------------------
 temp.txt        23553    10077   42%  01-01-97 19:49 .....A 378F052C m5a 2.0
 temp2.txt       23553       54    0%  01-01-97 19:49 .....A 378F052C m5a 2.0
------------------------------------------------------------------------------
    2            47106    10131   21%

The second file takes up a whopping 54 bytes. Now, obviously, you're unlikely to be compressing two identical (but differently named) files together that often, but RAR will note similarities (e.g., common words, phrases, etc.) between successive files and use those patterns to make better reduction possible. The result is that using solid mode will, in real world situations, reduce file sizes from 5% to 50% compared with regular, non-solid compression.
There is a downside to this though. Solid files are not easy to manipulate once built. That is, it's difficult to update them, add new files to them, or even selectively extract individual files from them. Previous versions of RAR, and most other archivers that offer "solid mode", did not have the ability to modify solid archives. However, RAR 2.00 can do so, although you do take a performance hit. So, if you're going to be building archives to actually work with, it's best to disable solid mode (unless you've got a really fast machine). For storage purposes, solid mode is the way to go.
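The two-identical-files experiment above can be reproduced in miniature. This is a rough sketch, not RAR's actual algorithm: it uses Python's zlib module, compressing each file on its own for the "ZIP-style" case and compressing the files joined into one stream for the "solid-style" case. The vocabulary and function names are invented for the illustration.

```python
import random
import zlib

# Made-up vocabulary standing in for the contents of a text file.
WORDS = ["archive", "compress", "solid", "pattern", "extract",
         "volume", "switch", "default", "engine", "shareware"]


def make_text(seed, n_words=2500):
    """Build a repeatable pseudo-random text file of roughly 20K."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words)).encode("ascii")


def separate_size(files):
    """Total packed size when every file is compressed on its own."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_size(files):
    """Packed size when the files are joined and compressed as one stream."""
    return len(zlib.compress(b"".join(files), 9))


if __name__ == "__main__":
    doc = make_text(seed=7)
    twins = [doc, doc]  # the same file under two names
    print("one file alone :", separate_size([doc]))
    print("two, separately:", separate_size(twins))
    print("two, solid     :", solid_size(twins))
```

One caveat: DEFLATE can only look back 32K, so this toy version stops noticing the duplicate once a file grows past that size. It also shows why the downside exists: to pull out the second file, you have to decompress everything that comes before it.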
Another nice compression feature of RAR is that it has not one but two compression engines. The first one handles most real-world files, but the second zooms in on multimedia files, primarily bitmap and sound files. While not the first archiver to sport this feature (those kudos belong to the UC2 archiver for DOS), it's the first one to do it at a reasonable speed. In fact, it's nearly as fast as Info-ZIP in dealing with .WAV files, yet consistently yields savings of 30%-50% compared to Info-ZIP (and that's not counting additional savings if you use solid mode).
Again, there's a trade-off... in using multimedia mode, you take a slight performance hit on every file you archive, since the archiver must check every file and determine which engine to use.
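Why would a separate engine help with sound files? I don't know what RAR's multimedia engine does internally, so take the following as a generic illustration only. A common trick for sampled audio is to store the difference between neighbouring samples instead of the samples themselves, since a smooth waveform changes only a little from one sample to the next. The sketch synthesizes an 8-bit tone (made-up parameters) and compares zlib's result with and without that delta step.

```python
import math
import zlib


def make_tone(n_samples=20000, rate=44100):
    """Synthesize 8-bit unsigned samples: two mixed sine waves around 128."""
    out = bytearray()
    for i in range(n_samples):
        t = i / rate
        value = (100 * math.sin(2 * math.pi * 220 * t)
                 + 20 * math.sin(2 * math.pi * 311 * t))
        out.append(128 + round(value))  # stays within 8..248
    return bytes(out)


def delta_encode(data):
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)


def delta_decode(data):
    """Undo delta_encode by keeping a running sum (mod 256)."""
    prev = 0
    out = bytearray()
    for d in data:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)


def packed_sizes(data):
    """Return (size compressed as-is, size compressed after the delta step)."""
    return (len(zlib.compress(data, 9)),
            len(zlib.compress(delta_encode(data), 9)))


if __name__ == "__main__":
    tone = make_tone()
    plain, filtered = packed_sizes(tone)
    print("raw:", len(tone), "plain:", plain, "with delta:", filtered)
```

The delta step is lossless (delta_decode restores the original bytes), but it only pays off on data shaped like audio or images, which is why an archiver has to inspect each file before deciding which engine to use.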
RAR also comes with a utility called RCVT, which allows you to convert archives from one format to another (RAR to ZIP, ZIP to RAR, pretty much all formats are covered providing you have a copy of the relevant archiver). It will, optionally, invoke a virus-scanner during the conversion process, which makes it very valuable to people who run bulletin boards, maintain FTP sites, etc.
ZipStream, like RAR and Info-ZIP, is an archiver. But not the way you think it is.
ZipStream, like DCF/2 and Stacker, is a disk compression utility. But not the way you think it is.
Based on PKZip (and using code licensed from PKWare), ZipStream combines the best of both worlds. After installation, you can define "zip volumes": literally, a drive letter that points to a particular directory on one of your hard drives. For example, I've got drive W: pointed at G:\ZIPVOL-W. Your OS/2 system will then believe that W: is a networked drive, and will allow you to perform any file operation on any file contained in that drive. But the beauty is: ZipStream will compress and decompress those files on-the-fly, allowing you the ability to simultaneously compress your files and be able to use them without having to unpack them first.
ZipStream doesn't have this drawback. Every file is stored as a separate file, using the original filename, in the "container" directory that you defined. Files can then be transported via floppy disk, modem, or whatever, to another system in compressed form, simply by copying the files from the container directory. In my example, if I wanted to give you a disk of compressed files, I'd just copy those files to drive W:, and then:
COPY G:\ZIPVOL-W\*.* A:

Or, alternately:

ZSATTACH Z: A:\
COPY Z: Q:
ZSATTACH Z: /D

In the second example, I created a new ZipStream volume, drive Z:, and pointed it at drive A:'s root directory. After doing so, the COPY operation caused ZipStream to decompress the files on-the-fly before copying them to drive Q: (at which point they'd be compressed again - but you could just as easily have copied them to a non-compressed drive). Finally, I destroyed the Z: drive (using the /D flag).
ZipStream features a handy status window (GIF, 12k) which minimizes (GIF, 1.6k) to get out of your way quickly.
Today, of course, this beast on my desktop is badly outdated and badly underpowered. Today's software tends to be extremely bloated, requiring 12 megs of dynamic link libraries just to draw a window. If those 12 megs are compressed, and have to be decompressed before being loaded into your system's memory, that's only going to increase loading time.
While this is a concern, ZipStream's decompression engine is extremely efficient. In real-world use, I've found that it has added between 10% and 30% to my program load times. On my machine, powered as it is by a little hamster on a spinning wheel, I barely notice the extra delay. Anyone using a faster machine very definitely won't notice any difference.
The only time I run into problems is when manipulating very large database files in unusual ways (specifically, using the "Squish" mail processor that comes with the Maximus BBS system), which forces ZipStream to load and decompress the entire file into memory. While the developer managed to resolve that particular problem, it's still best to leave large files that will continually be modified on uncompressed drives (or set a low byte ceiling, described below).
ZipStream is a made-for-OS/2 application. It supports both FAT and HPFS drives, including extended attributes. It was designed from the ground up to use multithreading and to take full advantage of the fact that it is operating in a multitasking environment. One big bonus of this attitude is file writing. If you copy 20 files to a ZipStream volume, those 20 files are copied immediately, in uncompressed form. ZipStream will then, one by one, reload each file, compress it, and save it. It does this as a low-priority background task, using up CPU time that would otherwise be unused. The upshot is that there is absolutely no performance hit when writing files to ZipStream volumes. In fact the only way you'd even know the files were being compressed was if you looked up at the CPU meter on the WarpCenter and saw it maxed out at 100%. Even then, because the compression thread runs at low priority, your other applications will continue to run at normal speed. The only performance hit you'll ever see is when reading files (as mentioned earlier), because ZipStream must decompress the files before it gives them to OS/2.
ZipStream is based on the PKZIP v2.04g code, meaning file compression efficiency should be pretty much identical to what you regularly expect from PKZIP.
Because files are compressed on an individual basis, the worst that could happen during a power failure or system crash is that the file currently being written to disk may be corrupted. You face this risk all the time anyways, so in effect ZipStream adds zero risk to you and your data. There's no container file to corrupt. And, if there were several file compressions pending before the crash, you can resume compression after you reboot.
ZipStream doesn't care if the files in a ZipStream volume are compressed or uncompressed: if they're uncompressed, they're simply handed off to the OS/2 file system as-is.
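The behaviour described over the last few paragraphs (write now, compress later, pass plain files straight through) can be sketched in a few lines. This is a toy model under loud assumptions: the class name, the "ZSK1" marker, and the use of zlib are all invented for illustration and have nothing to do with ZipStream's real on-disk format or its PKZIP-derived engine. The real product also runs the compression pass in a low-priority background thread, which this sketch simply calls by hand.

```python
import os
import tempfile
import zlib

MAGIC = b"ZSK1"  # hypothetical marker for "this file is compressed"


class SketchVolume:
    """Toy model of a per-file compressed volume backed by a container directory."""

    def __init__(self, container_dir):
        self.dir = container_dir

    def _path(self, name):
        return os.path.join(self.dir, name)

    def write(self, name, data):
        """Store the file uncompressed right away, so the writer never waits."""
        with open(self._path(name), "wb") as f:
            f.write(data)

    def pending(self):
        """List files not yet compressed; read from disk, so it survives a crash."""
        names = []
        for name in sorted(os.listdir(self.dir)):
            with open(self._path(name), "rb") as f:
                if f.read(len(MAGIC)) != MAGIC:
                    names.append(name)
        return names

    def compress_pending(self):
        """Background pass: reload each plain file, compress it, save it back."""
        for name in self.pending():
            with open(self._path(name), "rb") as f:
                raw = f.read()
            packed = MAGIC + zlib.compress(raw, 9)
            if len(packed) < len(raw):  # leave incompressible files alone
                with open(self._path(name), "wb") as f:
                    f.write(packed)

    def read(self, name):
        """Decompress on the fly; hand plain files back untouched."""
        with open(self._path(name), "rb") as f:
            blob = f.read()
        if blob.startswith(MAGIC):
            return zlib.decompress(blob[len(MAGIC):])
        return blob


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as container:
        vol = SketchVolume(container)
        vol.write("readme.txt", b"OS/2 Warp " * 2000)
        print("pending before:", vol.pending())
        vol.compress_pending()
        print("pending after :", vol.pending())
        print("bytes read    :", len(vol.read("readme.txt")))
```

Because each file stands alone in the container directory, a crash in the middle of the compression pass can hurt at most the one file being rewritten, and the next pass picks up whatever is still uncompressed. A plain file that happens to begin with the marker bytes would fool this toy version; a real implementation would need a sturdier way to tell the two apart.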
ZipStream is intelligent. It will:
ZipStream offers no-risk compression of files on an individual basis, completely invisible to the user, requiring nothing more than the user simply copying the files to a ZipStream volume. Decompression is similarly invisible and unobtrusive: you simply access the file as you normally would and decompression is handled automatically. It supports FAT and HPFS and creates files that are transportable to other machines.
The only thing ZipStream needs is a major compression-engine overhaul... maybe we could call the new version "RARStream".
ZipStream v1.20
by Carbon Based Software
download from Carbon Based Software Australia or USA (ZIP, 540k)
MSRP: US$99.95
Copyright © 1997 - Falcon Networking