Re: htdig: New deployment of Dig Question

Lionel Siau (
Sat, 02 Jan 1999 14:05:33 +0000

Geoff Hutchison wrote:

> On Thu, 31 Dec 1998, Lionel Siau wrote:
> > (i) It is 'suggested' that my database would contain files made up of
> > text records ie each line would be 1 record with several fields.
> > However, most search engines claim they can search text and index it.
> > BUT can they return that SPECIFIC record/line instead of the entire
> > text/file. Can Dig do this or can any search engine do this?
> In your case, you're probably better off writing a simple perl script to
> match the record. HotWired's WebMonkey site, among others has nice
> tutorials on lots of topics, including this one.
> The problem is that a website search engine essentially considers the
> entire text/file as a record. A custom script would consider the
> lines as records, whether they're URLs (as someone asked recently) or
> something else.

I considered doing that, but I'm afraid of the 'delay' involved if I have
database text files of several thousand records. Any idea what the
performance issues are, especially for 'grep' under Linux? Although it is
only a prototype system, it should be scalable. I don't want it to look good
for 100-200 records and then tend to conk out when it reaches several thousand.

A normal DBM of several thousand records would probably fall over and
die if I tried to query its 'Title' field for a specific word.
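
For concreteness, the line-as-record search discussed above might look
roughly like the following. This is only a minimal sketch in Python rather
than Perl, and it rests on assumptions that are not part of the actual data:
that fields are separated by '|', that 'Title' is the first field, and that a
whole-word, case-insensitive match is wanted. The function name
search_records and the sample records are made up for illustration.

```python
import io
import re

def search_records(lines, word, field_index=0, delimiter="|"):
    """Yield (line_number, fields) for each record whose chosen field contains word.

    Assumptions (hypothetical, for illustration only): one record per line,
    fields separated by `delimiter`, and the 'Title' field at `field_index`.
    """
    # Whole-word, case-insensitive match; re.escape guards against regex metacharacters.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        if field_index < len(fields) and pattern.search(fields[field_index]):
            yield line_number, fields

# Made-up sample data standing in for a real record file.
sample = io.StringIO(
    "Intro to Linux|A. Author|1998\n"
    "Perl Basics|B. Writer|1997\n"
    "Linux Networking|C. Person|1998\n"
)
matches = list(search_records(sample, "linux"))
for line_number, fields in matches:
    print(line_number, fields)
```

This reads the file sequentially, one line at a time, so memory use stays
flat as the file grows, but every query still touches every record. Whether
that is fast enough at several thousand records is something I would have to
measure rather than guess.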

Lionel Siau

Imperial College of Science, Technology and Medicine, Department of Computing

