Re: 2.9.1 Beta1 available

From: Scott Kitterman <ietf-dkim_at_kitterman.com>
Date: Wed, 22 Jan 2014 13:50:42 -0500

On Tuesday, January 21, 2014 13:22:13 Murray S. Kucherawy wrote:
> On Tue, 21 Jan 2014, Scott Kitterman wrote:
> > Libmilter is whichever version is shipped with Sendmail 8.14.4.
> >
> > I have a suspicion that the problem is related to the amount of time it
> > takes for the milter to shut down as fiddling with
> > MILTERTEST_RETRY_SPEED_FACTOR seems to help and the number of failing
> > tests varies per architecture in the Debian archive. It also,
> > sometimes, varies when I retry with the same package on the same
> > architecture.
>
> The issue is that the configuration object (of which there should be only
> one) appears to have a reference count that isn't zero when something
> tries to destroy the object. The reference count increases when a new
> client connects, and decreases when a client disconnects. For a test
> where there's only ever one client, the test should really never fail.
>
> Given what you said above, it looks like there's a race condition between
> a client connecting (and incrementing the refcount) and miltertest giving
> up, closing the connection, and making a new attempt. But this would
> suggest there's possibly an issue with libmilter; it shouldn't be possible
> to complete connection setup without going through connection shutdown.
> I'll see if there's anything like this known.
>
> > If you can give me the command line to invoke one of those tests to get
> > a core, I'll do it by hand and see what I get.
>
> After a full build, you'd have to do this:
>
> - add these to the .lua file for the failing test, right before
> mt.connect():
>
> mt.set_timeout(300)
> mt.sleep(15)
>
> - add "-v -v -v" to the miltertest command in the test shell script; this
> will cause the pid of the child opendkim to be shown when it's created
>
> - run the script; the child will be created, but no traffic will be sent
> to it for 15 seconds (the mt.sleep())
>
> - attach gdb to the child opendkim (the pid will be shown in the script
> output)
>
> - miltertest will resume and you should see the failure resulting in
> SIGABRT; the mt.set_timeout() call will keep miltertest waiting for five
> minutes while you poke around
>
> - use whatever gdb commands you like to poke around, e.g. "thread apply
> all bt"
>
> - "quit" when you're done; miltertest will complete (but fail, of course)
>
> -MSK

I can't seem to get it to fail when running the tests one by one, so I think
it's some kind of slow to stop problem and then a race condition.

Here's the output:

*** simple/simple verify revoked key
lt-miltertest: argv[0] =
`/tmp/buildd/opendkim-2.9.1~Beta1/opendkim/tests/../opendkim'
lt-miltertest: argv[1] = `-x'
lt-miltertest: argv[2] = `t-verify-revoked.conf'
lt-miltertest: argv[3] = `-p'
lt-miltertest: argv[4] =
`unix:/tmp/buildd/opendkim-2.9.1~Beta1/opendkim/tests/t-verify-revoked.sock'
lt-miltertest: '/tmp/buildd/opendkim-2.9.1~Beta1/opendkim/tests/../opendkim'
started in process 23349
lt-miltertest: pausing for 15.000000 seconds
lt-miltertest: connected to
'unix:/tmp/buildd/opendkim-2.9.1~Beta1/opendkim/tests/t-verify-revoked.sock',
fd 3
lt-miltertest: mt_milter_write(3): cmd O, len 12
lt-miltertest: mt_milter_read(3): cmd O, len 12
lt-miltertest: mt_milter_write(3): cmd C, len 23
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: connection details sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd D, len 20
lt-miltertest: 1 'M' macro(s) sent on fd 3
lt-miltertest: mt_milter_write(3): cmd M, len 17
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: MAIL sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd R, len 24
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: mt_milter_write(3): cmd L, len 442
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: header sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd L, len 23
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: header sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd L, len 38
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: header sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd L, len 22
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: header sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd N, len 0
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: EOH sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd B, len 17
lt-miltertest: mt_milter_read(3): cmd c, len 0
lt-miltertest: 17 byte(s) of body sent on fd 3, reply 'c'
lt-miltertest: mt_milter_write(3): cmd E, len 0
lt-miltertest: mt_milter_read(3): cmd i, len 151
lt-miltertest: mt_milter_read(3): cmd a, len 0
lt-miltertest: EOM sent on fd 3, reply 'a'
lt-miltertest: mt_milter_write(3): cmd Q, len 0
lt-miltertest: disconnected fd 3
lt-miltertest: waiting for process 23349

There are several seconds of waiting for process 23349. I'll give the next
beta a try and see if it's any different.

Scott K
Received on Wed Jan 22 2014 - 18:50:53 PST

This archive was generated by hypermail 2.3.0 : Wed Jan 22 2014 - 18:54:01 PST