Setting up Searches


One of the design goals of WN is to provide the maintainer with tools to create extensive navigational aids for the server. A variety of search mechanisms provide this capability.

Title searches

In response to the URL
     <http://host/dir/search=title>

the server will provide an HTML form (automatically generated or prepared by the maintainer) asking for a regular expression search term. When a term is supplied, the server will search the index.cache files in /dir and designated subdirectories for items whose titles contain a match for the search term. An HTML document with a menu of these items is returned. Subdirectories are designated for recursive searching by an entry in the directory record of the index file like

     Subdir=dir1,dir2,dir3

You can customize the message requesting a search term by creating your own HTML form whose ACTION is the URL http://host/dir/search=title and which uses the GET method to return the search term with NAME=query.
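
The request that such a form sends can be sketched in Python. This is only an illustration under stated assumptions: the helper name search_url is hypothetical, host and dir are the placeholders used above, and the query string is assumed to follow ordinary GET form encoding of a field named query.

```python
from urllib.parse import urlencode


def search_url(base, search_type, term):
    """Build the URL a GET form with a NAME=query field would request.

    base is the directory URL, e.g. "http://host/dir" (a placeholder).
    search_type is the search name, e.g. "title" or "keyword".
    """
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"query": term})
    return f"{base.rstrip('/')}/search={search_type}?{query}"


if __name__ == "__main__":
    print(search_url("http://host/dir", "title", "release notes"))
```

The same helper covers the other directory searches described below by changing the search_type argument.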

Keyword searches

These are like title searches, except that matches are sought in keywords instead of titles. Keywords for HTML documents are automatically obtained from headers. For other documents (or for HTML documents as well) they can be supplied manually in the index file. This is done by including a line like
     Keywords=keyword1, keyword2, etc.

in the relevant document's record in the index file. The URL that invokes this search is

     <http://host/dir/search=keyword>

Title/keyword search

This is like the searches above, except that the match can be in either the keywords or the title. The URL to use as the ACTION in a form or simply to invoke the search is
     <http://host/dir/search=titlekey>

Fielded searches for user supplied fields

The maintainer can supply up to 20 additional field values associated with a document. These are used for searching in the same way that keywords are. This is intended to provide some additional "keyword-like" fields, for example, document author or document id number. It works exactly like keywords, except that these values are not extracted from HTML files but must be created with a line like
     Field3=any text here

in the index file. The '3' in this example can be replaced with any single digit. The URL to use as the ACTION in a form or simply to invoke the search in the example above is

     <http://host/dir/search=field3>

As with keyword and title searches, the search term for a fielded search can be any grep-like regular expression.

Context searches

Unlike the title, keyword, and field searches, this is a full text search of all text/* documents in one directory (not subdirectories). The returned HTML document contains a list of titles of documents containing a match, each with a sublist of the lines from that document containing the match. This provides one line of context for each match. For HTML documents, selecting the matched expression in one of these lines takes you to the document with your viewer focused on the matching location. The primary intent of this feature is to provide full text searching for an HTML "document" which might consist of a substantial number of files.

The URL to use as the ACTION in a form or simply to invoke the search is

     <http://host/dir/search=context>

It is possible to mark HTML documents with comments so that only part of them is searched. This is done with lines consisting of the comment <!-- #WN_search_off --> which turns off searching until the line consisting of <!-- #WN_search_on --> is encountered.
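
The effect of these markers can be sketched in Python. This is not the server's code, only a minimal illustration of the rule stated above; the function name searchable_lines is hypothetical, and it assumes a marker counts only when it is the sole content of its line.

```python
SEARCH_OFF = "<!-- #WN_search_off -->"
SEARCH_ON = "<!-- #WN_search_on -->"


def searchable_lines(lines):
    """Yield only the lines outside search_off/search_on regions."""
    searching = True
    for line in lines:
        stripped = line.strip()
        if stripped == SEARCH_OFF:
            searching = False      # skip everything until search_on
        elif stripped == SEARCH_ON:
            searching = True       # resume searching after this line
        elif searching:
            yield line


if __name__ == "__main__":
    page = [
        "<p>Searched text</p>",
        "<!-- #WN_search_off -->",
        "<p>Navigation bar, not searched</p>",
        "<!-- #WN_search_on -->",
        "<p>Searched again</p>",
    ]
    for line in searchable_lines(page):
        print(line)
```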

Grep searches

A grep search is just like a context search, except that only a list of anchors pointing to files containing a match is returned. There are no lines of context showing the match. To do a grep search on the files in directory dir use
     <http://host/dir/search=grep>
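
The difference between the results of a context search and a grep search can be sketched as follows. This is an illustration only, not the server's implementation: the docs dictionary (file name mapped to file text) and both function names are hypothetical, and Python's re syntax stands in for the server's grep-like regular expressions.

```python
import re


def context_search(docs, pattern):
    """Map each matching document name to the lines that matched."""
    regex = re.compile(pattern)
    hits = {}
    for name, text in docs.items():
        matched = [line for line in text.splitlines() if regex.search(line)]
        if matched:
            hits[name] = matched
    return hits


def grep_search(docs, pattern):
    """Return only the names of documents containing a match."""
    return list(context_search(docs, pattern))


if __name__ == "__main__":
    docs = {
        "intro.txt": "Welcome to the server\nSearching is built in",
        "notes.txt": "Nothing relevant here",
    }
    print(context_search(docs, "[Ss]earch"))
    print(grep_search(docs, "[Ss]earch"))
```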

File context and grep searches

A file context search is just like a context search, except that it is limited to a single file. The file grep search returns a text/plain document consisting of the lines in the file matching the regular expression. The URLs to invoke these searches on file foo are
     <http://host/dir/foo;search=context>
     <http://host/dir/foo;search=grep>

List searches

The server will search an HTML document looking for an unordered list of anchors linking to WWW objects. The contents of each anchor will be searched for a match to the supplied regular expression. The search returns an HTML document containing an unordered list of those anchors with a match. This is quite useful with the digest utility, which creates HTML documents to be searched in this way from files with internal structure such as mail or news digests, mailing lists, etc.

The URL to invoke this search on file foo is

     <http://host/dir/foo;search=list>
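
The idea behind a list search can be sketched in Python with the standard html.parser module. This is a rough illustration under stated assumptions, not the server's code: the names AnchorCollector and list_search are hypothetical, anchors without an href are ignored, and Python's re syntax stands in for the server's regular expressions.

```python
import re
from html import escape
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, text) pairs for anchors found inside <ul> lists."""

    def __init__(self):
        super().__init__()
        self.ul_depth = 0    # how many <ul> elements are currently open
        self.href = None     # href of the anchor being read, if any
        self.text = []       # text pieces seen inside that anchor
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.ul_depth += 1
        elif tag == "a" and self.ul_depth > 0:
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "ul" and self.ul_depth > 0:
            self.ul_depth -= 1
        elif tag == "a" and self.href is not None:
            self.anchors.append((self.href, "".join(self.text)))
            self.href = None


def list_search(page, pattern):
    """Return an HTML unordered list of the anchors whose text matches."""
    regex = re.compile(pattern)
    parser = AnchorCollector()
    parser.feed(page)
    parser.close()
    items = "\n".join(
        f'<li><a href="{escape(href)}">{escape(text)}</a></li>'
        for href, text in parser.anchors
        if regex.search(text)
    )
    return f"<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    digest = (
        '<ul><li><a href="msg1.html">Re: search forms</a></li>'
        '<li><a href="msg2.html">Install question</a></li></ul>'
    )
    print(list_search(digest, "search"))
```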

Index searches

Not yet implemented. A collection of files can be indexed using WAISindex (or maybe glimpse) and the server will produce an HTML document listing all files containing a match. The form in which search terms are placed can be automatically generated or supplied by the maintainer.

All of the searching methods listed above except the index searches are built into the server and require no additional effort from the maintainer. They are simply referenced with URLs like <http://host/dir/search=context>, where /dir is any directory containing files to be served and an index.cache listing them. Of course, search permission can be denied for any directory or for any file contained in that directory.


WN -- for those who think the Web should be more than a user friendly version of ftp

John Franks <john@math.nwu.edu>