Re: [htdig] Multiple merging

Nathaniel Irons
Wed, 3 Nov 1999 10:32:37 -0500

On 11/3/99 at 8:55 AM, Gilles Detillieux wrote:

> but it seems it would be more efficient to merge 2 into 1, 4 into 3, 6
> into 5, 8 into 7, 3 into 1, 7 into 5, and finally 5 into 1. I'm
> guessing though. I don't know that anyone ever benchmarked it.

I haven't benchmarked it, but I now do my incremental digs this way, and
it's a big win.
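
For anyone wanting to script that ordering, here is a minimal sketch of how the pairwise schedule Gilles describes could be generated. It only plans the order of merges and does not invoke any ht://Dig tool; the function name `pairwise_merge_plan` and the idea of labelling databases with plain numbers are assumptions made for illustration.

```python
def pairwise_merge_plan(dbs):
    """Return a list of (source, target) merges in pairwise order.

    Hypothetical helper: each round merges every second database into
    its left-hand neighbour, then repeats on the survivors until only
    one database is left.
    """
    plan = []
    survivors = list(dbs)
    while len(survivors) > 1:
        next_round = []
        for i in range(0, len(survivors), 2):
            target = survivors[i]
            if i + 1 < len(survivors):
                # Merge the right-hand neighbour into the target.
                plan.append((survivors[i + 1], target))
            # An unpaired database simply carries over to the next round.
            next_round.append(target)
        survivors = next_round
    return plan


if __name__ == "__main__":
    for source, target in pairwise_merge_plan(range(1, 9)):
        print(f"merge {source} into {target}")
```

For eight databases this yields 2 into 1, 4 into 3, 6 into 5, 8 into 7, 3 into 1, 7 into 5, and finally 5 into 1, the same order as in the quoted message. Each pair would then be handed to whatever merge command your setup uses.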

I host about a dozen mailing list archives, and to keep the load on the
server even, I index in month-sized blocks. I'm just now finalizing the
automated procedure, where the current month's files for each list are
dug early every morning, and merged with each other before finally being
merged with the master DB.

It's dramatically faster (and easier on the server) than the old way,
where each list's current-month htdig output was laboriously merged
straight into the main archive, forcing a huge sort every time even if
the new material was relatively puny. That was un-automatable on a
shared server.
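
To put rough numbers on why this helps, here is a back-of-the-envelope sketch, not a benchmark. It assumes the cost of a merge is proportional to the size of the merged result (because the whole result gets re-sorted), which is my guess at the behaviour rather than anything measured. The sizes are made-up units and the function names are hypothetical.

```python
def cost_old_way(master, monthly):
    """Merge each monthly DB straight into the master, one at a time."""
    total = 0
    size = master
    for m in monthly:
        size += m
        total += size  # assumption: every merge re-sorts the whole result
    return total


def cost_new_way(master, monthly):
    """Merge the monthly DBs pairwise first, then merge once into the master."""
    sizes = list(monthly)
    if not sizes:
        return 0
    total = 0
    while len(sizes) > 1:
        next_round = []
        for i in range(0, len(sizes), 2):
            if i + 1 < len(sizes):
                merged = sizes[i] + sizes[i + 1]
                total += merged  # small merges only touch small data
                next_round.append(merged)
            else:
                next_round.append(sizes[i])
        sizes = next_round
    # One final merge of the combined month into the master.
    return total + master + sizes[0]


if __name__ == "__main__":
    master = 1000        # illustrative size of the master DB
    monthly = [1] * 12   # a dozen lists, one small monthly DB each
    print("old way:", cost_old_way(master, monthly))
    print("new way:", cost_new_way(master, monthly))
```

Under that assumption the master is sorted once instead of a dozen times, so the total work falls by roughly the number of lists. Real timings will depend on the actual database sizes and on how the merge is implemented.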


