Re: htdig: MS Office files -- help indexing them, please!


Frank Guangxin Liu (frank@ctcqnx4.ctc.cummins.com)
Tue, 1 Dec 1998 10:06:27 -0500 (EST)


>
> $TMPDIR is set to a partition w/ ~10 GB of free space. I think I may have
> figured out the problem (and a nasty work-around): it seems that GNU sort is
> interpreting the sort column of db.wordlist as a command line argument; i.e.
> if the line in db.wordlist begins "-something" sort is freaking out! I

sort -- -something
should stop sort from interpreting -something as an option; anything
after the -- is treated as a file name.
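
As a rough illustration of the `--` convention, here is a minimal sketch.
The temporary directory, the file name `-something`, and its three lines are
made up for the demo; none of them come from htdig itself, and this only
shows how sort handles a file *name* that starts with a hyphen.

```shell
# Hypothetical demo data: a file whose name begins with "-".
workdir=$(mktemp -d)
printf 'pear\n-apple\nfig\n' > "$workdir/-something"

# "--" marks the end of options, so "-something" is read as a file name
# instead of being parsed as a string of option letters.
# LC_ALL=C forces plain byte ordering so the result is predictable.
sorted=$(cd "$workdir" && LC_ALL=C sort -- -something)
printf '%s\n' "$sorted"

rm -rf "$workdir"
```

Writing the name as `./-something` should work equally well, since the
argument then no longer begins with a hyphen.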

> simply did a `sed 's?^-??'` on db.wordlist and then moved the new file into
> place. htmerge ran successfully. This being said, there's got to be a
> better way!!
>
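
For anyone wondering what that sed expression actually does, here is a small
sketch. The sample lines are invented for illustration and the real
db.wordlist format may well differ. Note that it removes only the first
leading hyphen on each line, so the stored words themselves are changed.

```shell
# Hypothetical sample lines; the real db.wordlist layout may differ.
sample=$(printf '%s\n' '-something 1 2' 'word 3 4' '--double 5 6')

# 's?^-??' uses "?" as the delimiter: match one "-" at the start of the
# line and replace it with nothing.
cleaned=$(printf '%s\n' "$sample" | sed 's?^-??')
printf '%s\n' "$cleaned"
```

A line that starts with two hyphens still starts with one afterwards, so
the workaround would need repeating (or a pattern such as `^-*`) to cover
that case.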
> On a different note, I am having other problems. I know that this is not an
> htdig limitation, but the Solaris 2.5.1 machine I'm running this on has a 2 GB
> file size limitation. Is there any way to have htdig split its databases into
> multiple 2 GB files? I know that I can manually limit things to the point where
> the various db's are <2 GB, but that's not really a solution either. I need a
> dynamic db! I guess I could move to Solaris 2.6, which doesn't have the 2 GB
> limitation. I'd like to hear how other folks have dealt with this problem.
> As you can see, I'm indexing a *huge* number of documents...
>
> Thanks!
>
> Tyson
>
> On 01-Dec-98 Geoff Hutchison wrote:
> > At 4:56 PM -0500 11/24/98, Tyson Bigler wrote:
> >>I dl'd the latest snapshot (htdig-3.1.0b3-112298) and I'm using GNU sort, but
> >>I still get the same 'invalid argument' error from sort... Maybe I need to
> >>rebuild the index because I'm trying to merge the same index every time...
> >
> > This shouldn't be a problem. What is the environment variable TMPDIR? (If
> > you're using rundig, it should be set in there somewhere.)
> >
> >
> > -Geoff Hutchison
> > Williams Students Online
> > http://wso.williams.edu/
> >
>
>
>
>
> ---
> M. Tyson Bigler SEPTCo Computing Solutions Group
> Infrastructure Support Bellaire Technology Center
> bigler@shellus.com 3737 Bellaire Blvd., Room 1007B
> 713-245-7476 Houston, TX 77025
>
>
> ----------------------------------------------------------------------
> To unsubscribe from the htdig mailing list, send a message to
> htdig-request@sdsu.edu containing the single word "unsubscribe" in
> the body of the message.
>




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:29:44 PST