Re: br-msk-stats-update from Alessandro Vesely on 2010-08-29 (OpenDKIM Developers mailing list)

From: Alessandro Vesely <vesely_at_tana.it>
Date: Sun, 29 Aug 2010 21:21:56 +0200

On 29/Aug/10 20:12, Murray S. Kucherawy wrote:
> Your patch adds the job ID to the signature lines, but we've found
> that this by itself is incomplete as, unlike sendmail, postfix
> recycles its job IDs. For the messages table we now have a unique
> constraint across the job ID and the message time, so to accomodate
> this we'd need to include that field in signature lines as well.

That is only relevant to processes that don't lock before writing the
file. But then the plain stat files can be sorted _before_ feeding
them to opendkim-importstats. The chances to get conflicting job IDs
are totally negligible at that point.

On importing the plain stat file, a check that the sorting has been
successful may still be worth, perhaps. Occasionally, a signature may
also get lost if log files are rotated while they are still being
written, and the repeated job ID might ease debugging... Just forget
that patch if it doesn't make much sense to you.

For zdkimfilter, I may have to do file locking so as not to rely on
fprintf buffering behavior. I still have to code the logging function.

>> I'm not familiar with how headers are delivered to the milter's
>> mlfi_header function, so this question may be silly: shouldn't there
>> be some code in opendkim/stats.c to guard against \t, \n, and
>> perhaps spaces possibly occurring in hdr_val?
>
> Why? We do want those because "simple" canonicalization needs to see
> how the header field's contents were wrapped and what spaces were
> where in order to produce the header hash correctly.
>
> mlfi_header() passes the header field's name and contents, including
> all spaces, CRs and LFs, minus the trailing CR and LF characters.

That would disrupt the plain file in case you get, say

Content-Type: text/plain<TAB>; charset: utf8

If that <TAB> makes it to the plain stats file, strtok will choke on
it, and give an invalid M line.

>> One further question: why does stats.c collapse return codes as in
>>
>> fprintf(out, "\t%d", err == DKIM_SIGERROR_VERSION ||
>> err == DKIM_SIGERROR_DOMAIN || ...)
>>
>> rather than printing the raw err? (Hm... that may require logging
>> the library version if you intend to alter codes in dkim.h)
>
> The current stats reporting model doesn't differentiate between
> various kinds of syntax errors. You're right though that it could, at
> no additional expense, and now would be the time to make that change.

I suggest not to miss this occasion.

> The trial stats DB using the new schema has already logged 15 errors
> without the extra detail, so I'll have to figure out what to do with
> those.

One possibility is to add another field and leave it NULL for existing
data, set it to 1 for libopendkim v2, leaving further values for other
packages, if they'll ever come. Generic stats may select counts for
err<>0, so it doesn't matter if it is 1 or 43. Error analysis will
only make sense on recordsets where the added field's values are equal.
Received on Sun Aug 29 2010 - 19:22:07 PST

This archive was generated by hypermail 2.3.0 : Mon Oct 29 2012 - 23:32:53 PST