htdig: SIGCHLD bug in vixie-cron under Linux


Gilles Detillieux (grdetil@scrc.umanitoba.ca)
Mon, 2 Nov 1998 14:55:33 -0600 (CST)


Hello, Erik. Several users of the ht://Dig web search indexing package
have reported problems running the htmerge utility from within a cron job
under Red Hat Linux 5.0 and 5.1. The pclose() call in htmerge fails with
errno 10 (ECHILD), which produces the infamous "htmerge: Word sort failed"
error message. The problem disappears if htmerge is run from tcsh or ash,
but it fails when run by bash (the default shell).

Further testing revealed that this isn't a problem under Red Hat Linux
4.2, nor does htmerge fail if I replace 5.1's /usr/sbin/crond with the
crond from 4.2. Digging into the vixie-cron source RPM for 5.1 revealed
this note in the .spec file:

* Thu Oct 23 1997 Erik Troan <ewt@redhat.com>
- force it to use SIGCHLD instead of defunct SIGCLD

Unfortunately, the patch for this (vixie-cron-3.0.1-sigchld.patch)
does more than just change the symbol used for the signal. It enables
#ifdef'ed code that sets this signal action to SIG_IGN in the child
process, rather than SIG_DFL as it does for the defunct SIGCLD. This is
the heart of the problem, and can potentially interfere with any program
using popen() and pclose() (or anything else that uses SIGCHLD) which is
launched from bash in a cron job. I imagine that ash and tcsh reset the
SIGCHLD action to SIG_DFL and bash doesn't, which would explain the
different behaviour I noted. Maybe bash should be patched to reset it as
well, but I don't see any point in crond setting it to SIG_IGN to begin
with, since cron works fine with it set to SIG_DFL instead.
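
As an illustration of the failure described above, here is a minimal C
sketch. It is not taken from htmerge or cron, and the helper name
pclose_errno_with is made up for this example. It assumes a Linux system
where children of a process that ignores SIGCHLD are reaped automatically,
so the wait inside pclose() finds no child left and fails with ECHILD.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen()/pclose() under strict -std */
#include <errno.h>
#include <signal.h>
#include <stdio.h>

/*
 * Hypothetical helper: run a trivial command through popen()/pclose()
 * while SIGCHLD is set to `disposition` (SIG_DFL or SIG_IGN).
 * Returns 0 if pclose() succeeded, otherwise the errno it failed with.
 */
static int pclose_errno_with(void (*disposition)(int))
{
    /* Install the disposition under test, remembering the old one. */
    void (*saved)(int) = signal(SIGCHLD, disposition);
    int err = 0;

    FILE *fp = popen("true", "r");
    if (fp == NULL) {
        err = errno;
    } else {
        char buf[64];

        /* Drain the pipe until EOF so the child has finished writing. */
        while (fgets(buf, sizeof buf, fp) != NULL)
            ;

        /* pclose() waits for the child; -1 means the wait itself failed. */
        if (pclose(fp) == -1)
            err = errno;
    }

    /* Restore the previous disposition before returning. */
    (void) signal(SIGCHLD, saved);
    return err;
}
```

Under the assumption stated above, one would expect
pclose_errno_with(SIG_IGN) to return ECHILD and pclose_errno_with(SIG_DFL)
to return 0, which mirrors the difference between the 5.1 and 4.2 crond.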

My fix was to install vixie-cron-3.0.1-24.src.rpm, and replace the
vixie-cron-3.0.1-sigchld.patch in /usr/src/redhat/SOURCES with the one
below, then rebuild and reinstall cron. For the benefit of ht://Dig
users, whom I'm cc'ing, the commands to rebuild and reinstall were:

        rpm -ba /usr/src/redhat/SPECS/vixie-cron-3.0.1.spec
        rpm -Uvh --force /usr/src/redhat/RPMS/i386/vixie-cron-3.0.1-24.i386.rpm

I didn't bother to bump up the release number in the .spec file, to avoid
confusion with future updates from Red Hat, which I hope will include
this fix. Erik, if you're no longer maintaining the vixie-cron RPM,
please forward this to the person who is. Thanks.

Regards,
Gilles Detillieux

---------- begin vixie-cron-3.0.1-sigchld.patch ----------
--- vixie-cron-3.0.1/compat.h.sigchld Wed May 31 16:37:20 1995
+++ vixie-cron-3.0.1/compat.h Mon Nov 2 13:58:09 1998
@@ -110,7 +110,7 @@
 # define HAVE_SAVED_UIDS
 #endif
 
-#if !defined(ATT) && !defined(__linux) && !defined(IRIX) && !defined(UNICOS)
+#if !defined(ATT) && !defined(IRIX) && !defined(UNICOS)
 # define USE_SIGCHLD
 #endif
 
--- vixie-cron-3.0.1/do_command.c.sigchld Mon Nov 2 13:50:09 1998
+++ vixie-cron-3.0.1/do_command.c Mon Nov 2 14:03:33 1998
@@ -101,7 +101,11 @@
          * use wait() explictly. so we have to disable the signal (which
          * was inherited from the parent).
          */
+#ifdef linux
+        (void) signal(SIGCHLD, SIG_DFL); /* SIG_IGN bad for bash jobs */
+#else
         (void) signal(SIGCHLD, SIG_IGN);
+#endif
 #else
         /* on system-V systems, we are ignoring SIGCLD. we have to stop
          * ignoring it now or the wait() in cron_pclose() won't work.
---------- end vixie-cron-3.0.1-sigchld.patch ----------

-- 
Gilles R. Detillieux              E-mail: <grdetil@scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930