Subject: Re: [htdig3-dev] Creating a SQL backend...
From: Terry Luedtke (LuedtkT@mail.nlm.nih.gov)
Date: Wed Jun 28 2000 - 05:53:26 PDT
>>> "tomi" <tomi@galactica.it> 28-Jun-00 07:57:40 >>>
<SNIP>
> There were about 20,000,000 HTML
> and hypertext documents across all the servers (I would assume
> there are many more, because a great deal of the unreached ones were in .ps,
> .pdf, or other formats that cannot be followed).
> I did not test ht://Dig indexing this great an amount of data, but
> everything leads me to think that BerkeleyDB is not the appropriate way to do it.
BerkeleyDB can handle databases of up to 281E12 bytes (2^48 bytes, or 256 terabytes). See http://www.sleepycat.com/featdetail.html#big_dbs. Of course, you would need a fairly healthy machine to do that (lots of memory, a large RAID array, etc.).
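As a quick sanity check on that figure, a few lines of Python (not part of ht://Dig) confirm that 2^48 bytes is roughly 281E12 bytes, or 256 TiB. The per-document number assumes the 20,000,000 documents quoted above and is only a rough ceiling on the space available per document, not a measurement of what an index actually needs. The function name limit_summary is made up for this illustration.

```python
# Sanity check of the 2^48-byte database size limit quoted above.
# The default document count is the figure from the quoted message;
# it is used here only for a back-of-the-envelope division.

MAX_DB_BYTES = 2 ** 48


def limit_summary(num_docs: int = 20_000_000) -> dict:
    """Return the limit in bytes, in TiB, and as a per-document share."""
    return {
        "bytes": MAX_DB_BYTES,
        "tib": MAX_DB_BYTES / 2 ** 40,
        "bytes_per_doc": MAX_DB_BYTES // num_docs,
    }


if __name__ == "__main__":
    summary = limit_summary()
    print(f"limit: {summary['bytes']:.3e} bytes ({summary['tib']:.0f} TiB)")
    print(f"share per document: {summary['bytes_per_doc']} bytes")
```

Even split across 20,000,000 documents, the limit leaves about 14 MB per document, so the size limit itself is unlikely to be the first bottleneck.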
There are other ways to improve speed while still using BerkeleyDB (or any other database, for that matter). One is the ability to run concurrent digs into the same database. Another is an htsearch that stays resident in memory, similar to FastCGI programs.
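To illustrate the resident-search idea, here is a minimal sketch of a process that opens its word index once and then answers many queries without reopening it. It is not htsearch and does not use the BerkeleyDB API: Python's standard dbm.dumb module stands in for the database, and the names ResidentSearcher and build_toy_index, along with the comma-separated postings format, are invented for this example.

```python
import dbm.dumb
import os
import tempfile


def build_toy_index(path, postings):
    """Write a tiny word -> document-list index (hypothetical format)."""
    with dbm.dumb.open(path, "n") as db:
        for word, docs in postings.items():
            db[word.encode("utf-8")] = ",".join(docs).encode("utf-8")


class ResidentSearcher:
    """Long-lived searcher: the database is opened once, in the
    constructor, instead of once per query as a plain CGI would do."""

    def __init__(self, path):
        self._db = dbm.dumb.open(path, "r")

    def lookup(self, word):
        """Return the list of documents for a word, or [] if absent."""
        raw = self._db.get(word.encode("utf-8"))
        return raw.decode("utf-8").split(",") if raw else []

    def close(self):
        self._db.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index_path = os.path.join(tmp, "words")
        build_toy_index(index_path, {"berkeley": ["doc1", "doc3"],
                                     "raid": ["doc2"]})
        searcher = ResidentSearcher(index_path)  # open cost paid once
        for query in ("berkeley", "raid", "missing"):
            print(query, searcher.lookup(query))
        searcher.close()
```

In a real deployment the loop would be replaced by whatever request mechanism the server uses (FastCGI, for instance); the point is only that the cost of opening the database is paid once per process rather than once per search.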
Terry Luedtke
National Library of Medicine