Re: /etc/init.d/opendkim fails to start opendkim when run over ssh with a pseudo-tty

From: Murray S. Kucherawy <msk_at_blackops.org>
Date: Sun, 17 Apr 2011 13:51:54 -0700 (PDT)

On Sun, 17 Apr 2011, Sam Umbach wrote:
>> Does anything get logged during the failure to start?
>
> When the daemon fails to start, nothing is logged or written to STDERR
> or STDOUT.

What about the syslog?

> Based on my tests today, I strongly suspect there is a race condition
> when the daemon is started from a pty. This affects both opendkim and
> dk-filter, and I have seen the same results on Ubuntu 10.04 (lucid),
> 10.10 (maverick), and 11.04 beta 2 (natty). I'm not sure whether the
> issue lies in the opendkim and dk-filter executables, start-stop-daemon,
> or elsewhere.

But what's the race condition? My guess is the shell script allocates a
pty and (presumably) assigns it to descriptors 0, 1 and 2. Then it forks
and execs opendkim, waiting for a return status. The child process,
opendkim, operating in background and/or autorestart mode, almost
immediately forks again, and in the child it closes 0, 1 and 2, replaces
them with newly opened descriptors that are read-write to /dev/null, and
calls setsid() to create a new session, detaching it from its
controlling terminal. The pty is thus closed without ever being used. In
autorestart *and* background mode, the fork/reopen process is repeated.
Since all of the processes I'm talking about either wait on each other
or exit immediately, there's no race condition involved that I can think
of. The only thing I can imagine is that the shell script has some
expectation having to do with the pty that opendkim isn't satisfying. If
that's the case, I'd like to know what that is, because it's something
I've never heard of before.
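
To make the sequence described above concrete, here is a minimal sketch
in Python of the same steps: fork, then in the child point descriptors
0, 1 and 2 at /dev/null and call setsid(). This is not opendkim's actual
code (opendkim is written in C), and the function names
detach_from_terminal and start_daemon are made up for illustration.

```python
import os


def detach_from_terminal():
    """Replace descriptors 0, 1 and 2 with /dev/null, then start a new session."""
    devnull = os.open(os.devnull, os.O_RDWR)   # one read-write descriptor
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)                   # closes the old fd (e.g. the pty) first
    if devnull > 2:
        os.close(devnull)
    os.setsid()                                # new session, no controlling terminal


def start_daemon(work):
    """Fork once; the child detaches and runs work(), the parent gets the child's pid.

    Hypothetical helper: a real daemon would typically have the parent exit
    (or wait) here, and may fork a second time as described in the text.
    """
    pid = os.fork()
    if pid > 0:
        return pid
    try:
        detach_from_terminal()
        work()
    finally:
        os._exit(0)                            # never fall back into the caller's code
```

Between the fork and the setsid() call the child still belongs to the
session that owns the pty. The sketch only shows the order of the steps
and makes no claim about where the observed failure comes from.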

The only time the pty might get used is if during startup there's an
attempt to print some error condition, but that should be the exception
and not the rule.

> maverick, root user, no sudo, no pty
> result: SUCCESS, started opendkim daemon on 10 of 10 attempts
> ssh -T root@maverick '/etc/init.d/opendkim stop'
> ssh -T root@maverick 'pidof opendkim'
> ssh -T root@maverick '/etc/init.d/opendkim start'
> ssh -T root@maverick 'pidof opendkim'
>
> maverick, non-root user w/ passwordless sudo, pty
> result: FAIL, started opendkim daemon on 5 of 10 attempts
> ssh -tt ubuntu@maverick 'sudo /etc/init.d/opendkim stop'
> ssh -tt ubuntu@maverick 'pidof opendkim'
> ssh -tt ubuntu@maverick 'sudo /etc/init.d/opendkim start'
> ssh -tt ubuntu@maverick 'pidof opendkim'
>
> maverick, root user, no sudo, pty
> result: FAIL, started opendkim daemon on 7 of 10 attempts
> ssh -tt root@maverick '/etc/init.d/opendkim stop'
> ssh -tt root@maverick 'pidof opendkim'
> ssh -tt root@maverick '/etc/init.d/opendkim start'
> ssh -tt root@maverick 'pidof opendkim'
>
> Mixed results (inconsistent behavior starting the daemon over ssh w/ a
> pty) in the following environments:
> * opendkim on lucid (FAILED 10 of 10 attempts)
> * dk-filter on maverick (FAILED 1 of 10 attempts)
> * dk-filter on lucid (FAILED 10 of 10 attempts)
> * opendkim on natty (using `sudo service opendkim start`, FAILED 10 of
> 10 attempts)
> * dk-filter on natty (using `sudo service dk-filter start`, FAILED 10
> of 10 attempts)
>
> Interestingly, dk-filter on Ubuntu 9.10 karmic succeeded on 10 of 10
> attempts. This is obviously a timing issue, but it is possible this
> issue was introduced between karmic and lucid. I'm going to raise
> this with the Ubuntu maintainers and try to determine whether opendkim
> and dk-filter are "at fault" or if this is an issue with the Ubuntu or
> Debian distro.
>
> As a workaround, I've added '&& sleep 1' to my start commands. This
> works consistently for opendkim and dk-filter on all four Ubuntu
> releases tested.
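
The ten-attempt tests quoted above could be scripted roughly as follows.
This is only a sketch in Python: the helper names ssh and count_starts
are invented here, the commands are the ones from the quoted tests, and
it assumes pidof exits with status 0 when it finds a matching process
and that ssh passes that status through.

```python
import subprocess


def ssh(host, command, tty=True):
    """Run command on host over ssh; -tt forces a pty, -T disables one."""
    flag = "-tt" if tty else "-T"
    return subprocess.run(["ssh", flag, host, command],
                          capture_output=True, text=True)


def count_starts(run, attempts=10,
                 stop="/etc/init.d/opendkim stop",
                 start="/etc/init.d/opendkim start",
                 check="pidof opendkim"):
    """Stop, start and check the daemon `attempts` times; return how often it was running.

    `run` is any callable taking a command string and returning an object
    with a `returncode` attribute (for example a wrapper around ssh()).
    """
    started = 0
    for _ in range(attempts):
        run(stop)
        run(start)
        if run(check).returncode == 0:   # pidof found a running process
            started += 1
    return started
```

For example, count_starts(lambda cmd: ssh("root@maverick", cmd)) would
repeat the "root user, no sudo, pty" case, and passing tty=False to
ssh() would repeat the no-pty case.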

This makes me think the shell script is imposing some dependency on that
pty, but I'm at a loss to understand what that might be. This is all very
puzzling for a script that's starting a daemon.

-MSK
Received on Sun Apr 17 2011 - 20:52:14 PST
