Snag-o-rama v1.2 Release Notes					April 21, 1996

Changes since v1.1:
- URLs with a "~" in them should work properly now
- a new version number ...

What it is:
-----------

The snag-o-rama package attempts to transfer an entire html source tree
from a remote URL to the local filesystem.  It starts by fetching the
URL given on the command line, downloads all images (and links to
images) it encounters, and recursively follows links to other documents.
Snag will not follow hypertext links to different domain names, nor
will it follow links to "parent" directories of the original URL.
Along the way, all links are made *relative*, so the mirrored tree
does not depend on where the files end up living.
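
To give the flavour of the approach, here is a toy sketch of the crawl;
it is emphatically not the code that ships in snag (it skips the image
handling, the saving of files, and the link rewriting, and it only
notices one href per line):

    #!/bin/sh
    # toy sketch only -- fetch a page, pull out href targets, and
    # recurse on any that stay under the starting URL's directory
    start="$1"                  # e.g. http://www.remote.site/path/start.html
    prefix=`dirname "$start"`   # don't wander above the starting directory
    seen=/tmp/snagsketch.$$     # remember visited URLs between calls
    : > "$seen"
    grab() {
        grep -Fqx "$1" "$seen" && return   # already been here
        echo "$1" >> "$seen"
        lynx -source "$1" |
        sed -n 's/.*href="\([^"]*\)".*/\1/p' |
        while read link; do
            case "$link" in
            "$prefix"*) grab "$link" ;;    # same prefix: follow it
            esac
        done
    }
    grab "$start"
    rm -f "$seen"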

The http transfers are done via lynx (the terminal-based www browser).  
Therefore, you need to find and compile lynx for any of this stuff to 
work.
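
I haven't reproduced snagit here, but the heart of the fetch is
presumably nothing more exotic than lynx's -source dump (the variable
names below are placeholders of mine, not snagit's own):

    # -source makes lynx emit the raw html instead of rendering it
    lynx -source "$url" > "$localfile"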

Perhaps the best way to find out how the program works, and how to apply 
it to your needs, is to try "snag http://www.remote.site/path/start.html" 
and see what happens.  

Documentation:
--------------

The snag-o-rama distribution includes the following files:
snag		shell script that does it all
snagit		helper script; fetches an individual url
chaserefs.c	source of chaserefs; filters incoming html
lynxerror.c	source of lynxerror; guesses if lynx barfed on a url
Makefile	a weak makefile that saves some typing
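
If the makefile gives you trouble, the two C helpers are small enough
to build by hand; assuming a garden-variety cc, something like:

    cc -o chaserefs chaserefs.c
    cc -o lynxerror lynxerror.c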

Usage:
	snag <fqurl> [verbose]

<fqurl> is a fully qualified url that you want to start from.
With the optional verbose flag, snag writes a play-by-play of what
it's doing to stderr.  All output lands in the current directory, so
you will probably want to start in an empty one.
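
A typical session might look like this ("mirror" and "snag.log" are
just names I picked for the example):

    mkdir mirror
    cd mirror
    snag http://www.remote.site/path/start.html verbose 2> snag.log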

lynx, snagit, chaserefs, and lynxerror should all be visible in the path for 
things to work properly.
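
One way to arrange that, assuming you have dropped everything into
~/snag-o-rama (adjust to taste):

    PATH="$PATH:$HOME/snag-o-rama"
    export PATH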

If everything works just right, an html document tree should appear in
the current working directory, replicating approximately what is on the
remote end.  All anchor and inline references are translated to point
relative to the referring file, so you don't have to worry about
absolute path names.  This actually alters the html files; the original,
unmutilated html source is left in a file with the extension ".real"
appended to it.  You can probably "rm `find . -name \*.real -print`"
afterwards if you have no use for these original files.  I use them
for debugging.  The tilde (~) in any URL is changed to an underscore (_)
to avoid problems with your httpd trying to look up home directories
for users who exist only on the remote machine.
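
The tilde substitution itself is nothing mysterious; this isn't lifted
from the scripts, but it amounts to the same thing:

    url=`echo "$url" | sed 's/~/_/g'`   # ~user becomes _user throughout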

Image maps, cgi scripts, and anything else resembling interactivity
on the remote end won't work.  Bummer.

Documentation is conveniently found in chaserefs.c, snag, and snagit.
:-).  If there is sufficient interest, I may spiff up this distribution
and do a manpage.

The binaries have been compiled on Linux and SCO Unix systems; if you 
have problems compiling, then my code probably isn't as portable as I 
thought it was.  Your shell should be able to handle large-ish scripts; 
I've found that bash works quite nicely.
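
If your /bin/sh turns out to be the weak link, running the scripts
under bash explicitly is a cheap workaround:

    bash snag http://www.remote.site/path/start.html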

Please report any successes with this program.  Please also email
reports of things breaking down, so that I can improve it for future
generations...  Please don't comment on the sorry state of the source
code; it is loaded with dead code, counter-intuitive hacks, and
duplicated stuff.  It sometimes works, though.  :-)

Hope you find it useful.

Joseph Clancy
jclancy@wimsey.com / jclancy@vc.bc.ca
