SpamHash

From ThorxWiki
Revision as of 11:46, 12 March 2009 by Nemo (Talk | contribs)


aka SMTP, aka rfc2822

  • Some problems associated with email...
    • spam.

Solution to spam problem

  • spamsum

How does this solve it?

A spampot delivery point is known (assumed) to receive 100% spam. All incoming messages to this mailbox are spamsum'd, and the resulting scores are stored.

Then, as a separate filter, ALL incoming messages to anyone are spamsum-checked against the database of spamsum scores generated from the spampot messages. If a new message rates as "too similar", it is also assumed to be spam and dropped. Bye bye.
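The two-step flow above (learn from the spampot, then compare everything else) can be sketched roughly as follows. This is only an illustration under loud assumptions: it does not call spamsum at all. The `signature` and `similarity` functions are hypothetical stand-ins built on Python's `difflib`, and the class name `SpamHashFilter` and the threshold of 80 are invented for the example. In a real setup both functions would be replaced by spamsum's hash and its 0-100 match score.

```python
from difflib import SequenceMatcher


def signature(message: str) -> str:
    # ASSUMPTION: stand-in for a spamsum hash. Here the "signature" is just
    # the lowercased, whitespace-normalised message text.
    return " ".join(message.lower().split())


def similarity(sig_a: str, sig_b: str) -> int:
    # ASSUMPTION: stand-in for spamsum's comparison, scaled to 0-100.
    return round(100 * SequenceMatcher(None, sig_a, sig_b).ratio())


class SpamHashFilter:
    """Hypothetical filter: remembers spampot signatures, flags look-alikes."""

    def __init__(self, threshold: int = 80):
        self.threshold = threshold   # "too similar" cut-off (illustrative value)
        self.known = []              # the score db, oldest entry first

    def learn_from_spampot(self, message: str) -> None:
        # Everything delivered to the spampot is assumed to be spam.
        self.known.append(signature(message))

    def is_spam(self, message: str) -> bool:
        # A message is treated as spam if it is too similar to any known entry.
        sig = signature(message)
        return any(similarity(sig, k) >= self.threshold for k in self.known)
```

Usage would look something like `f = SpamHashFilter()`, then `f.learn_from_spampot(msg)` for each spampot delivery and `f.is_spam(msg)` for every other incoming message. The comparison is linear in the number of stored entries, which is one reason to cap the size of the db.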

Notes

  • The spamsum score db should be rotated. New scores are appended at the end, and roughly a week of scores should be kept. Note, however, that this would be a flat text file with no internal dates (would spamsum mind if dates were munged in as "incorrectly formatted spamsum scores"?), so rotation would likely have to be done by number of lines: say, keep 10,000 scores and rotate daily.
  • filtered messages should be kept for sanity checking
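The "number of lines" rotation described in the notes could be done along these lines. This is a minimal sketch, assuming the score db is a plain text file with one score per line and the newest scores at the end; the function name `rotate_scores` and the file layout are assumptions, not something spamsum itself provides.

```python
import os
import tempfile
from collections import deque


def rotate_scores(path: str, keep: int = 10_000) -> int:
    """Trim the file at `path` to its last `keep` lines; return lines kept."""
    # A bounded deque retains only the newest `keep` lines while reading.
    with open(path, "r", encoding="utf-8") as src:
        tail = deque(src, maxlen=keep)

    # Write to a temporary file in the same directory, then swap it in,
    # so a filter reading the db never sees a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", dir=directory, delete=False
    ) as tmp:
        tmp.writelines(tail)
    os.replace(tmp.name, path)
    return len(tail)
```

Run daily (from cron, for instance), this keeps the db at a fixed size without needing dates inside the file. Scores appended between the read and the swap would be lost, so a real deployment would need to lock the file or pause delivery briefly.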

Pros

  • spamsum is quick, and saves messages from having to pass through heavier filters (Bayesian, etc.)
  • dynamically reacts to new spam, so long as the spampot is sufficiently knowledgeable

Cons

  • spampot address requires accepting messages we consider to be known spam
  • requires the entire message to be accepted before it can be checked