[htdig3-dev] [htdig] Indexing huge websites (really Linux 2G max file size problem)


loic@ceic.com
Thu, 5 Aug 1999 10:44:23 +0200 (MEST)


marc@nozell.com writes:
>
> I maintain a very large archive (archiver.rootsweb.com) of almost
> 9,000 mailing lists with more being added every day. Currently we
> have 900,000 messages (7.5G) which is also growing rapidly.

 Transparent compression might be a solution:

http://www.netspace.net.au/~reiter/e2compr/

 It seems to work well, although I have no personal experience with it.

 There is also large file ("bigfile") support for Linux, which requires
glibc-2.1. I don't have a URL or an opinion on it but would like both. Any
pointers are appreciated.
 Running FreeBSD with UFS is another option.
 Pressing the SGI people to release XFS *now* (http://oss.sgi.com/projects/xfs/)
is more hazardous ;-)
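
 To illustrate the 2G limit named in the subject line: a signed 32-bit
file offset tops out at 2**31 - 1 bytes, so any single file larger than
that cannot be addressed without large file support. Below is a minimal
Python sketch, not part of htdig, that checks sizes against that limit.
The names MAX_32BIT_OFFSET, fits_in_32bit_offset and files_near_limit
are hypothetical helpers invented for this example, and the 0.9 threshold
is an arbitrary assumption.

```python
import os

# Largest offset a signed 32-bit off_t can address: 2**31 - 1 bytes.
MAX_32BIT_OFFSET = 2**31 - 1


def fits_in_32bit_offset(size_bytes):
    """Return True if a file of this size is addressable with a 32-bit offset."""
    return 0 <= size_bytes <= MAX_32BIT_OFFSET


def files_near_limit(root, threshold=0.9):
    """Yield (path, size) for files under root at or above threshold * limit.

    Hypothetical helper: the threshold is an assumed safety margin.
    """
    cutoff = int(MAX_32BIT_OFFSET * threshold)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= cutoff:
                yield path, size
```

 A 7.5G archive is only a problem where it, or an index built from it,
ends up in one file: fits_in_32bit_offset(int(7.5 * 2**30)) is False,
while 900,000 separate message files of a few KB each are all fine.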
 
 Is it possible to get a copy of your current dataset? I have 12GB of
data to run the tests on htdig-3.2 but would like to have 8GB more. I
can rsync it, even if it takes a full week to complete.

    Thanks in advance,

-- 
		Loic Dachary

ECILA
100 av. du Gal Leclerc
93500 Pantin - France
Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
e-mail: Loic@Dachary.org
URL: http://www.senga.org/




This archive was generated by hypermail 2.0b3 on Thu Aug 05 1999 - 01:34:09 PDT