Re: br-msk-stats-update from Murray S. Kucherawy on 2010-08-29 (OpenDKIM Developers mailing list)

From: Murray S. Kucherawy <msk_at_blackops.org>
Date: Sun, 29 Aug 2010 11:12:41 -0700 (PDT)

Sorry for the delayed reply; was out on vacation this past week.

On Wed, 25 Aug 2010, Alessandro Vesely wrote:
> I've had a look at the source, and I think I'll be able to update
> zdkimfilter so as to write a compatible log file. I'll need a script to
> reorder the log, which may be eased in case you take the patch attached.
> I don't know why you'd want to take it, possibly for catching some rare
> errors, e.g. someone running multiple milters that log on the same stats
> file. If you don't take it, I can use sed to remove the jobid from "S"
> lines, after sorting, in the sending script. I leave it to you to
> decide. Mind that I haven't even tried to compile that stuff, let alone
> test it, it is just an idea...

Your patch adds the job ID to the signature lines, but we've found that
this by itself is incomplete as, unlike sendmail, postfix recycles its job
IDs. For the messages table we now have a unique constraint across the
job ID and the message time, so to accomodate this we'd need to include
that field in signature lines as well.

At the moment the association between a message and its signatures is made
by doing an insertion into the messages table, then getting that message's
ID from the database (auto-generated, retrieved by a specific SELECT),
then using that to create the relation between the signatures that get
inserted next. To break that assumption the message insertion would need
to be followed immediately by a query to get the auto-generated ID, and
then use that for the signature insertions.

There are already examples of how to do this elsewhere in the code if
you'd like to (and have time to) revise your patch. A recent patch added
a check to see that a message being added isn't already there. This logic
could simply be copied.

> I'm not familiar with how headers are delivered to the milter's
> mlfi_header function, so this question may be silly: shouldn't there be
> some code in opendkim/stats.c to guard against \t, \n, and perhaps
> spaces possibly occurring in hdr_val?

Why? We do want those because "simple" canonicalization needs to see how
the header field's contents were wrapped and what spaces were where in
order to produce the header hash correctly.

mlfi_header() passes the header field's name and contents, including all
spaces, CRs and LFs, minus the trailing CR and LF characters.

> One further question: why does stats.c collapse return codes as in
>
> fprintf(out, "\t%d", err == DKIM_SIGERROR_VERSION ||
> err == DKIM_SIGERROR_DOMAIN || ...)
>
> rather than printing the raw err? (Hm... that may require logging the
> library version if you intend to alter codes in dkim.h)

The current stats reporting model doesn't differentiate between various
kinds of syntax errors. You're right though that it could, at no
additional expense, and now would be the time to make that change. The
trial stats DB using the new schema has already logged 15 errors without
the extra detail, so I'll have to figure out what to do with those.

-MSK
Received on Sun Aug 29 2010 - 18:13:01 PST

This archive was generated by hypermail 2.3.0 : Mon Oct 29 2012 - 23:32:53 PST