Re: [htdig] local_url and spaces in docs url


Subject: Re: [htdig] local_url and spaces in docs url
From: Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Date: Fri Feb 04 2000 - 09:03:43 PST


According to GOMEZ Henri:
> All documents are local to the htdig system so I
> use local_url tag in conf file.
...
> <a href="http://mysys/mydocs/Informations%2008092000.pdf">
> /home/mydocs/Informations%2008092000.pdf</a><BR>
> </body></html>
>
> I've got a problem since some documents contain spaces
> in the file name, so the indexer perl script encodes the URL (i.e. replaces
> ' ' with %20)
>
> But now htdig can't find files like this one (i.e.
> /home/mydocs/Informations%2008092000.pdf)
> Is there anything I can do to force htdig to decode the URL (i.e. convert
> %20 to ' ')?

Nothing to do but patch the code, I guess. The GetLocal function should
hex-decode the URL or filename at some point. I'm not sure which is
best, though - to decode the URL before comparing against local_urls,
or to decode the filename after the comparison. Given that local_urls
and local_user_urls are string lists, and not quoted string lists, it
might make sense to allow hex encoding there too. Maybe if we decode
everything, then we're most likely to get the match we want. Come to
think of it, this would also help with your earlier problem of how to
get an "=" into the URL-portion of a local_urls entry. You could do
that by hex encoding it, if the decoding is added. Try this patch...

--- htdig/Retriever.cc.nodecodelcl Tue Feb 1 09:16:04 2000
+++ htdig/Retriever.cc Fri Feb 4 11:01:00 2000
@@ -783,12 +783,21 @@ Retriever::GetLocal(char *url)
                    continue;
             }
                *path++ = '\0';
-            prefixes->Add(p);
-            paths->Add(path);
+            String *pre = new String(p);
+            decodeURL(*pre);
+            prefixes->Add(pre);
+            String *pat = new String(path);
+            decodeURL(*pat);
+            paths->Add(pat);
             p = strtok(0, " \t");
         }
     }
 
+    // Begin by hex-decoding URL...
+    String hexurl = url;
+    decodeURL(hexurl);
+    url = hexurl.get();
+
     // Check first for local user...
     if (strchr(url, '~'))
     {
@@ -862,9 +871,15 @@ Retriever::GetLocalUser(char *url)
                 continue;
             }
             *dir++ = '\0';
-            prefixes->Add(p);
-            paths->Add(path);
-            dirs->Add(dir);
+            String *pre = new String(p);
+            decodeURL(*pre);
+            prefixes->Add(pre);
+            String *pat = new String(path);
+            decodeURL(*pat);
+            paths->Add(pat);
+            String *ptd = new String(dir);
+            decodeURL(*ptd);
+            dirs->Add(ptd);
             p = strtok(0, " \t");
         }
     }
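
To show the intended effect of the patch in isolation, here is a minimal
standalone sketch. It is not ht://Dig code: percentDecode and mapToLocal are
hypothetical helpers written with std::string, standing in for the
decodeURL() calls and the prefix lookup in GetLocal, and the mapping of
http://mysys/mydocs/ to /home/mydocs/ is only assumed from the example
quoted above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, NOT ht://Dig's decodeURL(): replaces each valid %XX
// escape with the byte it encodes and copies everything else unchanged.
std::string percentDecode(const std::string &in)
{
    // Only called on characters already checked with isxdigit().
    auto hexValue = [](unsigned char c) -> int {
        if (std::isdigit(c))
            return c - '0';
        return std::tolower(c) - 'a' + 10;
    };

    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()
            && std::isxdigit(static_cast<unsigned char>(in[i + 1]))
            && std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16
                                     + hexValue(in[i + 2]));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += in[i];  // ordinary character, or a malformed escape
        }
    }
    return out;
}

// Hypothetical stand-in for the local_urls lookup: each entry pairs a URL
// prefix with a directory. The URL, the prefix and the path are all decoded
// before comparing, as the patch does. Returns "" if no prefix matches.
std::string mapToLocal(
    const std::string &url,
    const std::vector<std::pair<std::string, std::string>> &localUrls)
{
    const std::string decodedUrl = percentDecode(url);
    for (const auto &entry : localUrls) {
        const std::string prefix = percentDecode(entry.first);
        const std::string path = percentDecode(entry.second);
        if (decodedUrl.compare(0, prefix.size(), prefix) == 0)
            return path + decodedUrl.substr(prefix.size());
    }
    return "";
}
```

With the assumed entry, calling mapToLocal on
http://mysys/mydocs/Informations%2008092000.pdf gives
/home/mydocs/Informations 08092000.pdf, which is the name the file has on
disk. Because the prefix is decoded too, an entry written with %3D would be
compared as "=".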

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930



