John Aycock and Nathan Friess from the University of Calgary have a new
paper on next-generation spam generation. Their big idea is to
have spam zombies mine the corpus of e-mails sent and received by the
user of the infected machine. The zombie then uses that information to craft
spam that emulates that user's style, which in theory
makes it harder to filter. I've never seen this exact
idea before--though there are obviously similar ones--and it's
kind of interesting, but
it's not entirely clear that it works. The authors show that
they can extract stylistic features, but that's well known.
Unfortunately, they don't benchmark their technique against
existing anti-spam measures, so it's not clear whether it's
any better than existing techniques for fooling spam filters.
2 Comments
I haven't read the paper -- it sounded too speculative for my tastes -- but I'd say it's entirely possible to fool a Bayes-style probabilistic classifier driven from the body text that way. Pasting random paragraphs above and below the spam payload would probably work well for that.
Classifiers that use other features -- such as reputation of URLs found in the content, like the SURBL/URIBL rules in SpamAssassin -- would help guard against this danger.
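To make the two points above concrete, here is a rough sketch, not anything taken from the paper or from SpamAssassin itself. It trains a tiny naive Bayes scorer on a made-up four-message corpus, shows that wrapping a spam payload in innocent-looking text pulls the body-text score toward "ham", and then shows that a check on the URL's domain is unaffected by the padding. The training messages, the `pills.example.com` domain, and the `BLOCKLIST` set are all invented for illustration; a real URI blocklist such as SURBL is queried over DNS rather than held in a Python set.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy multinomial naive Bayes over body-text tokens (add-one smoothing)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label, text):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def spam_log_odds(self, text):
        """Return log(P(spam|text) / P(ham|text)); positive means 'looks like spam'."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        size = len(vocab)
        totals = {label: sum(c.values()) for label, c in self.counts.items()}
        score = math.log(self.docs["spam"] / self.docs["ham"])
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # tokens never seen in training carry no evidence
            p_spam = (self.counts["spam"][tok] + 1) / (totals["spam"] + size)
            p_ham = (self.counts["ham"][tok] + 1) / (totals["ham"] + size)
            score += math.log(p_spam / p_ham)
        return score


# Hypothetical blocklist of domains with a bad reputation (illustration only).
BLOCKLIST = {"pills.example.com"}


def flagged_by_url(text):
    """True if any http(s) URL in the text points at a blocklisted domain."""
    domains = re.findall(r"https?://([^/\s]+)", text.lower())
    return any(domain in BLOCKLIST for domain in domains)


# Made-up training corpus.
nb = NaiveBayes()
nb.train("spam", "buy cheap pills online now")
nb.train("spam", "cheap pills discount buy now")
nb.train("ham", "lunch meeting moved to friday see you there")
nb.train("ham", "can you review the draft report before friday")

payload = "buy cheap pills now at http://pills.example.com/offer"
padding = "see you at the friday meeting can you review the draft report"
padded_msg = padding + " " + payload + " " + padding

plain_score = nb.spam_log_odds(payload)
padded_score = nb.spam_log_odds(padded_msg)

print("payload alone :", round(plain_score, 2), flagged_by_url(payload))
print("payload padded:", round(padded_score, 2), flagged_by_url(padded_msg))
```

The padding only has to contribute more "ham" evidence than the payload contributes "spam" evidence, which is why text lifted from the victim's own mail would plausibly work. The URL check never looks at the surrounding words, so the padding does not help against it.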
Finally, I know someone famous - John and I were classmates.