[htdig] Antwort: Re: [htdig] url_part_alias with single slash as from


Subject: [htdig] Antwort: Re: [htdig] url_part_alias with single slash as from
From: kai.krebber@syseca.de
Date: Thu Mar 09 2000 - 07:19:14 PST


>>Do I need to escape/quote the slash in any way? How?

>I would certainly quote it. The quotes should be removed automatically.

No, they're not removed. Now I get back URLs like:
subdir1'/'subdir2/somepage.html
I tried single quotes (') and, in desperation, backticks (`) too. Neither worked.

Maybe if I describe my original problem, somebody will have a different idea:

1st Problem: htdig stores URLs as absolute URLs, but I need URLs without the
host part:
instead of http://intra1/index.htm it should return just /index.htm

2nd Problem: The site sometimes uses framesets, and all "body_something.htm"
hits should be converted to "something.htm", because something.htm is the
frame container. I can't modify the pages themselves (to load the appropriate
frameset with some fancy JavaScript function), and in fact I'm not willing to
solve the problem elsewhere, e.g. on the server with some Perl script.
It shouldn't be a big deal for htdig / htsearch to replace "/body_" with "/" or
"body_" with nothing at all, or am I wrong?
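To make the two requirements concrete, here is a minimal Python sketch of the URL mapping described above. It is not htdig code; the function name desired_url and the prefix parameter are made up for illustration, and it assumes "body_" only ever appears at the start of the file name.

```python
from urllib.parse import urlsplit
import posixpath


def desired_url(absolute_url, prefix="body_"):
    """Return the host-less URL, with a leading `prefix` stripped from the file name.

    Hypothetical helper: it only illustrates the wanted result,
    not how htdig or htsearch would produce it.
    """
    # Problem 1: keep only the path, dropping scheme and host.
    path = urlsplit(absolute_url).path or "/"
    # Problem 2: drop the frame-body prefix from the last path component.
    directory, filename = posixpath.split(path)
    if filename.startswith(prefix):
        filename = filename[len(prefix):]
    return posixpath.join(directory, filename)


print(desired_url("http://intra1/index.htm"))
print(desired_url("http://intra1/subdir1/body_something.htm"))
```

The first call covers the host-part case and the second the frameset case from the two problems above.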

I tried out:

url_part_aliases: http://intra1 *1 \
                  body_ *2

for the digging and it worked fine, but

url_part_aliases: /. *1 \
                  / *2

for the searching gives me the hit count with no URLs at all, or only the URLs
of pages without frames.
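The idea behind using two different alias lists can be sketched in Python as follows. This is a simplified model that assumes plain substring replacement in list order: each "from" string becomes its placeholder when a URL is stored, and each placeholder becomes the search-time "from" string when the URL is read back. htdig's real matching may well differ, so the functions encode and decode are illustrative names only.

```python
# Alias lists copied from the two configurations above, as (from, to) pairs.
DIG_ALIASES = [("http://intra1", "*1"), ("body_", "*2")]
SEARCH_ALIASES = [("/.", "*1"), ("/", "*2")]


def encode(url, aliases):
    """Dig time (assumed behaviour): replace each 'from' string with its placeholder."""
    for src, placeholder in aliases:
        url = url.replace(src, placeholder)
    return url


def decode(stored, aliases):
    """Search time (assumed behaviour): replace each placeholder with its 'from' string."""
    for src, placeholder in aliases:
        stored = stored.replace(placeholder, src)
    return stored


stored = encode("http://intra1/body_news.htm", DIG_ALIASES)
print(stored)                          # *1/*2news.htm
print(decode(stored, SEARCH_ALIASES))  # /.//news.htm
```

Under this model the round trip does drop the host and the "body_" prefix, but it leaves extra path characters behind ("/.//news.htm" rather than "/news.htm"), because neither placeholder can be mapped to an empty string here.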

>BTW, I would strongly suggest upgrading from 3.1.3 to 3.1.5.
OK. I did that today, hoping my problem would go away with the upgrade. It did
not.

If anybody finds the right url_part_aliases entry for replacing *2 with a slash,
please mail it to me.

TIA,
     Kai



