Re: htdig: MS Office files -- help indexing them, please!


Tyson Bigler (bigler@shellus.com)
Tue, 24 Nov 1998 15:56:57 -0600 (CST)


I downloaded the latest snapshot (htdig-3.1.0b3-112298) and I'm using GNU
sort, but I still get the same 'invalid argument' error from sort. Maybe I
need to rebuild the index, because I'm trying to merge the same index every
time...

Tyson

On 24-Nov-98 Geoff Hutchison wrote:
> At 11:40 AM -0500 11/24/98, Tyson Bigler wrote:
>>powerpoint). Does anyone have an external parser for me??!! My peers keep
>>telling me that AltaVista has all of these "filters" (aka parsers), but I
>>haven't seen/used them...
>
> You never know--you might be able to "steal" the AltaVista filters. I don't
> know the details of their filters, but if they use external programs too,
> you can use those (at least as examples). If not, I suggest looking in
> something like Yahoo for a PowerPoint -> HTML converter.
>
>>I am also having difficulty with htmerge on a fairly large (and it will
>>only grow larger) index. The specific error seems to be coming from the sort
>>command. When using the standard sort included with Solaris 2.5.1 I get:
>
> Use GNU sort. The sort program on Solaris seems to have some nasty bugs
> like this.
>
>>sort: can't create /home/atlantis8/bigler/stmAAAa00598/a: Not a directory
>>htmerge: Word sort failed
>
>># htmerge -c conf/unix.conf -v -s
>>htmerge: Sorting...
>>/home/atlantis3/bigler/opt/bin/sort: read error: Invalid argument
>>htmerge: Word sort failed
>
> This may be a bug from me. Try the latest snapshot from
> http://www.htdig.org/files/snapshots/ (if there's a great need, I'll
> "bless" something soon as a beta, even though there are some unresolved
> bugs.)
>
>>Any help would be *greatly* appreciated. I would rather not go the other
>>direction and be forced into AltaVista.... ;-D And I'd like to deliver a
>>solution way ahead of the "other guy". ;-D
>
> Sounds good to me. :-)
>
>
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>

---
M. Tyson Bigler                  SEPTCo Computing Solutions Group
Infrastructure Support           Bellaire Technology Center
bigler@shellus.com               3737 Bellaire Blvd., Room 1007B
    713-245-7476                 Houston, TX 77025




This archive was generated by hypermail 2.0b3 on Sat Jan 02 1999 - 16:28:52 PST