RE: thread leakage

From: Murray S. Kucherawy <msk_at_cloudmark.com>
Date: Mon, 14 Mar 2011 14:15:01 -0700

> -----Original Message-----
> From: opendkim-dev-bounce_at_lists.opendkim.org [mailto:opendkim-dev-
> bounce_at_lists.opendkim.org] On Behalf Of Daniel Black
> Sent: Monday, March 14, 2011 1:30 PM
> To: opendkim-dev_at_lists.opendkim.org
> Subject: Re: thread leakage
>
> Thread 16 (Thread 0xb7901b70 (LWP 21329)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7b5116d in do_sigwait () from /lib/i686/cmov/libpthread.so.0
> #2 0xb7b51210 in sigwait () from /lib/i686/cmov/libpthread.so.0
> #3 0x0804d185 in dkimf_reloader (vp=0x0) at opendkim.c:4533
> #4 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #5 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

The opendkim reloader signal handler. It sits around waiting for SIGUSR1, which causes a configuration reload.

> Thread 15 (Thread 0xb7100b70 (LWP 21330)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7b5116d in do_sigwait () from /lib/i686/cmov/libpthread.so.0
> #2 0xb7b51210 in sigwait () from /lib/i686/cmov/libpthread.so.0
> #3 0xb7f33508 in mi_signal_thread (name=0x9f81750) at signal.c:110
> #4 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #5 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

The libmilter signal handler. It sits around waiting for SIGHUP, SIGINT or SIGTERM, any of which causes shutdown.

> Thread 14 (Thread 0xb68ffb70 (LWP 21331)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7abb696 in poll () from /lib/i686/cmov/libc.so.6
> #2 0xb7f35b94 in mi_pool_controller (arg=0x0) at worker.c:458
> #3 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #4 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

Something having to do with libmilter's worker pools. Not familiar with this code.

> Thread 13 (Thread 0xb4effb70 (LWP 21454)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7ac96f6 in epoll_wait () from /lib/i686/cmov/libc.so.6
> #2 0xb79883cc in ?? () from /usr/lib/libev.so.3
> #3 0xb798a9d4 in ev_loop () from /usr/lib/libev.so.3
> #4 0xb798c874 in event_base_loop () from /usr/lib/libev.so.3
> #5 0xb798c8a5 in event_base_dispatch () from /usr/lib/libev.so.3
> #6 0xb7b9ba56 in ?? () from /usr/lib/libunbound.so.2
> #7 0xb7b714d1 in ?? () from /usr/lib/libunbound.so.2
> #8 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #9 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

The libunbound master thread, I'm guessing. It appears to be waiting for a reply from a nameserver, or possibly waiting for a new request from libopendkim.

> Thread 12 (Thread 0xb46feb70 (LWP 21847)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7b4d482 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/i686/cmov/libpthread.so.0
> #2 0x080622a0 in dkimf_unbound_wait (srv=0x9f81768, qh=0xa231498,
> to=0xb46f88f0, bytes=0xb46f890c, error=0x0, dnssec=0x0)
> at opendkim-dns.c:196
> #3 dkimf_ub_waitreply (srv=0x9f81768, qh=0xa231498, to=0xb46f88f0,
> bytes=0xb46f890c, error=0x0, dnssec=0x0) at opendkim-dns.c:417
> #4 0xb7f4234f in dkim_get_policy_dns_excheck (dkim=0xa230cd0, query=0xa221780
> "mtnl.net.in", qstatus=0xb46fb7ec) at dkim-policy.c:276
> #5 0xb7f42b99 in dkim_get_policy_dns (dkim=0xa230cd0, query=0xa221780
> "mtnl.net.in", excheck=true, buf=0xb46fb3eb "", buflen=1025,
> qstatus=0xb46fb7ec) at dkim-policy.c:425
> #6 0xb7f4adee in dkim_get_policy (dkim=0xa230cd0, query=0x2223c <Address 0x2223c out of bounds>, excheck=252, qstatus=0xb46fb94c,
> policy=0xb46fb944, pflags=0xb46fb948) at dkim.c:2504
> #7 0xb7f4b04c in dkim_policy (dkim=0xa230cd0, pcode=0xb4f1f00c, pstate=0x0) at dkim.c:4860
> #8 0x08050b21 in mlfi_eom (ctx=0xa2312c8) at opendkim.c:11355
> #9 0xb7f313ef in st_bodyend (g=0xb46fe2e4) at engine.c:1614
> #10 0xb7f30f65 in mi_engine (ctx=0xa2312c8) at engine.c:405
> #11 0xb7f354bd in mi_worker (arg=0xa20f9b0) at worker.c:652
> #12 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #13 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

This and the seven others like it are sitting in opendkim functions awaiting a reply to a DNS query that has been submitted to unbound. They are in pthread_cond_timedwait(), which means they are not deadlocked; they are waiting for a condition that has not been signaled yet. There is some timeout in effect that hasn't been reached. I can't tell what it is from here; try "up 2" and "print timeout" to find out.

Thread 8 is slightly different. It has volunteered to be the one that actually waits on a descriptor to see if unbound has an answer to return, and will then notify the eight threads above if/when their replies arrive. It's the one sitting in dkimf_wait_fd(). There should always be exactly one of those except when the system is idle, and that's the case here. It too has a timeout that has not yet been reached, though we can't see what that timeout is (try "up 2" and "print *until").
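
The descriptor wait itself can be sketched roughly as below. This is an assumption about the general shape only, not the body of dkimf_wait_fd(); wait_fd_readable is a hypothetical helper, and it uses poll() with a relative timeout in milliseconds.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
   Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_fd_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;              /* includes EINTR; the caller may retry */
    if (rc == 0)
        return 0;               /* timeout reached, nothing to read */

    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

On a return of 1, the waiting thread would collect the answers and signal the condition that the other threads are blocked on.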

> Thread 3 (Thread 0xb02feb70 (LWP 13413)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7b4d482 in pthread_cond_timedwait@@GLIBC_2.3.2 () from
> /lib/i686/cmov/libpthread.so.0
> #2 0xb7f355c7 in mi_worker (arg=0xa2049b0) at worker.c:724
> #3 0xb7b48955 in start_thread () from /lib/i686/cmov/libpthread.so.0
> #4 0xb7ac8e7e in clone () from /lib/i686/cmov/libc.so.6

This and the next one are milter workers that are currently idle.

> Thread 1 (Thread 0xb79786c0 (LWP 21328)):
> #0 0xb7f5d424 in __kernel_vsyscall ()
> #1 0xb7abb696 in poll () from /lib/i686/cmov/libc.so.6
> #2 0xb7f324cd in mi_listener (conn=0x9f7e9e8 "inet:8891", dbg=0,
> smfi=0x9f81708, timeout=7210, backlog=128) at listener.c:766
> #3 0xb7f32d57 in smfi_main () at main.c:242
> #4 0x0805d617 in main (argc=7, argv=0xbfe79e93) at opendkim.c:14485

This is libmilter's master thread, listening for new connections from MTAs.

So this looks pretty normal, unless you would expect the DNS to be answering fast enough that this many threads should not be waiting for answers all at once. It might be interesting to see what's in the timeout variables for each of the waiting threads, just in case they somehow hold absurdly high values.
Received on Mon Mar 14 2011 - 21:15:09 PST
