Re: Another idea of a stat to track

From: Todd Lyons <tlyons_at_ivenue.com>
Date: Thu, 9 Sep 2010 20:21:02 -0700

On Thu, Sep 9, 2010 at 4:52 PM, Murray S. Kucherawy <msk_at_blackops.org> wrote:
> The current stats tables track domains seen.  Mostly this is a compression
> mechanism, mapping domain names (up to 255 bytes) to integers (4 bytes).
>
> We could also include in the domains table a timestamp of the first time the
> domain is seen and reported.  With a long enough data collection period, it
> becomes possible to identify with some accuracy "new" domains that might
> have been recently registered for spamming purposes and filter accordingly,
> with or without correlation to DKIM results and spam scores.
>
> It's easy to do this in the database without affecting the data already
> there, except I'll have to go back and set that column to match the
> timestamps in the messages table.

I think this could be useful in the right hands.
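
To make sure I'm reading the proposal right, here is a rough sketch of the
column addition, the backfill from the messages table, and a "new domain"
lookup. It is only an illustration: the table and column names (domains,
messages, from_domain, msgtime, firstseen) are my guesses, not the actual
stats schema, and SQLite stands in for whatever database is really in use.

```python
import sqlite3

# Hypothetical schema; the real stats tables may use different names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE domains (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE messages (id INTEGER PRIMARY KEY,
                           from_domain INTEGER REFERENCES domains(id),
                           msgtime INTEGER);
    INSERT INTO domains (id, name) VALUES (1, 'example.com');
    INSERT INTO domains (id, name) VALUES (2, 'example.net');
    INSERT INTO messages (from_domain, msgtime) VALUES (1, 1284000000);
    INSERT INTO messages (from_domain, msgtime) VALUES (1, 1283000000);
    INSERT INTO messages (from_domain, msgtime) VALUES (2, 1284090000);
""")

# Add the new column; existing rows are left alone and get NULL.
conn.execute("ALTER TABLE domains ADD COLUMN firstseen INTEGER")

# Backfill: earliest message timestamp recorded for each domain.
conn.execute("""
    UPDATE domains
       SET firstseen = (SELECT MIN(msgtime) FROM messages
                         WHERE messages.from_domain = domains.id)
""")


def new_domains(conn, now, max_age_secs):
    """Return names of domains first seen within the last max_age_secs."""
    rows = conn.execute(
        "SELECT name FROM domains WHERE firstseen >= ? ORDER BY name",
        (now - max_age_secs,))
    return [r[0] for r in rows]
```

Something like new_domains(conn, now, 86400) would then list domains first
seen in the past day, which could be correlated with DKIM results or spam
scores as described above.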

Is this intended to be used by us as end users to try and track these
trends on our own systems? Or is it more your intention to use it on
aggregated data from multiple sources? To date I have not attempted
to store or track any of the stats that the milter is amassing; I'm
just lobbing it all in your direction.

Which leads me to a second question. I have the dev system set to not
anonymize the data as it writes it to the stats file.
Correspondingly, it submits this raw data to you, and the stats page
shows all domains non-anonymized. Is it going to stay this way after
we come out of beta? I'm wondering if it would be possible to
anonymize it as it's being displayed with a simple ternary expression in
the script that generates the html. Or am I worried about nothing?
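
What I have in mind is roughly the following, a minimal sketch only. I don't
know what language the page generator is written in, so this is Python, and
display_domain, the anonymize flag, and the salt are all made-up names. A
salted SHA-1 prefix is just one possible way to mask a domain, not
necessarily how the milter anonymizes its stats file.

```python
import hashlib
import html


def display_domain(domain, anonymize, salt="change-me"):
    """Return HTML-safe text for one domain cell on the stats page."""
    # Salted digest of the lowercased name, so the same domain always
    # maps to the same token without being shown in the clear.
    digest = hashlib.sha1((salt + domain.lower()).encode("utf-8")).hexdigest()
    # Conditional (ternary) expression: masked token or the raw name.
    shown = digest[:12] if anonymize else domain
    return html.escape(shown)
```

The stored data stays raw this way; only the display changes, and flipping
the flag is a one-line decision in the generator.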

-- 
Regards...      Todd
I seek the truth...it is only persistence in self-delusion and
ignorance that does harm.  -- Marcus Aurelius
Received on Fri Sep 10 2010 - 03:21:25 PST
