Re: Now opendkim-2.3.2 on a busy e-mail server

From: Gary Mills <mills_at_cc.umanitoba.ca>
Date: Mon, 25 Apr 2011 11:37:53 -0500

On Sun, Apr 24, 2011 at 09:28:04PM -0700, Murray S. Kucherawy wrote:
> On Sun, 24 Apr 2011, Gary Mills wrote:
> >I did notice a couple of peculiarities in error messages. The oddest
> >one was this one:
> >
> > Apr 22 12:16:48 electra opendkim[7846]: [ID 861613 mail.info]
> >p3MHGmT6025602: can't parse From: header value ' fido <www.fido.ca>'
>
> Those were the entire contents of the header field. The part inside angle
> brackets, which is supposed to be 'local-part_at_domain', is clearly
> malformed.

Ah, I never noticed the missing `_at_'! I suppose sendmail would just
qualify that name with the default domain and carry on.

> >I also saw quite a few errors indicating that DNS queries timed out.
> >Here are a couple of examples:
> >
> > Apr 22 19:57:42 electra opendkim[7846]: [ID 467235 mail.error]
> >p3N0vFvh004842: key retrieval failed (s=201006181024,
> >d=facebookmail.com): '201006181024._domainkey.facebookmail.com' query
> >timed out
> >
> > Apr 23 20:56:18 electra opendkim[7846]: [ID 467235 mail.error]
> >p3O1uDZl024176: key retrieval failed (s=beta, d=google.com):
> >'beta._domainkey.google.com' query timed out
> >
> >The curious thing was that when I did the query almost immediately
> >against the same DNS server, it always found the TXT record:
>
> Can you paste the entire log excerpt for one or both of those queue IDs?
> That way we can verify that it did indeed wait for a reply rather than
> reporting this immediately.

Sure:

    Apr 22 19:57:15 electra sm-mta[4842]: [ID 801593 mail.info] p3N0vFvh004842: from=<notification+o4ststs9_at_facebookmail.com>, size=8511, class=0, nrcpts=1, msgid=<3d47d6c3ccf18534e7b964cac73f1fc7_at_async.facebook.com>, proto=ESMTP, daemon=MTA, relay=outmail018.snc4.facebook.com [66.220.144.150]
    Apr 22 19:57:42 electra opendkim[7846]: [ID 467235 mail.error] p3N0vFvh004842: key retrieval failed (s=201006181024, d=facebookmail.com): '201006181024._domainkey.facebookmail.com' query timed out
    Apr 22 19:57:42 electra sm-mta[4842]: [ID 801593 mail.info] p3N0vFvh004842: Milter add: header: X-DCC-UofM-Metrics: electra; whitelist
    Apr 22 19:57:42 electra sm-mta[4939]: [ID 801593 mail.info] p3N0vFvh004842: to=<XXXX_at_cc.umanitoba.ca>, delay=00:00:27, xdelay=00:00:00, mailer=local, pri=38774, relay=localhost, dsn=2.0.0, stat=Sent
    
    Apr 23 20:56:13 electra sm-mta[24176]: [ID 801593 mail.info] p3O1uDZl024176: from=<3F3azTREKB44569w73G+xwwv796FGy66y3w.u64C4GCs5Luu.C4s50B6ts.us_at_feedburner.bounces.google.com>, size=81819, class=-60, nrcpts=1, msgid=<20cf30564123b46c9d04a19f9c4c_at_google.com>, proto=ESMTP, daemon=MTA, relay=mail-iy0-f199.google.com [209.85.210.199]
    Apr 23 20:56:18 electra opendkim[7846]: [ID 467235 mail.error] p3O1uDZl024176: key retrieval failed (s=beta, d=google.com): 'beta._domainkey.google.com' query timed out
    Apr 23 20:56:18 electra sm-mta[24176]: [ID 801593 mail.info] p3O1uDZl024176: Milter: data, reject=452 4.2.1 id p3O1uDZl024176 from 209.85.210.199 temporary embargo. See http://mail.cc.umanitoba.ca/grey/
    Apr 23 20:56:18 electra sm-mta[24176]: [ID 801593 mail.info] p3O1uDZl024176: to=<XXXX_at_cc.umanitoba.ca>, delay=00:00:05, pri=219819, stat=id p3O1uDZl024176 from 209.85.210.199 temporary embargo. See http://mail.cc.umanitoba.ca/grey/

The first was 27 seconds, but the second only 5 seconds.

> Also of interest might be the ResolverTracing output around the time of
> those if you have that enabled and are using libar.

No, I don't have that enabled. The volume is too high for production.

> >Was it looking for a different type? Was the timeout so short that it
> >didn't get the reply?
>
> It was looking for the same query you did. The timeout in the library
> defaults to ten seconds, though if you have DNSTimeout set it will use
> that value instead.
>
> At a guess, your nameserver was just busy at the moment it tried and
> dropped or lost the query. libar will attempt to resend queries at
> intervals matching the default for the system resolver, which is five
> seconds, meaning one retry before the timeout expires.

Our nameserver was unlikely to be busy then, but I suppose that's
possible. Queries could be dropped externally, of course. We even
had a silly firewall configuration once that did rate limiting on DNS
queries from our nameservers. I don't trust firewalls at all!

-- 
-Gary Mills-        -Unix Group-        -Computer and Network Services-
Received on Mon Apr 25 2011 - 16:38:07 PST

This archive was generated by hypermail 2.3.0 : Mon Oct 29 2012 - 23:33:09 PST