How dumb do we think spammers are?
Why is it that we readily admit that spammers are smart enough to run massive botnets, design custom malware, create rootkits, and adapt to changing protection technologies, yet we still think they’re unable to write a pattern to match “user at domain dot com”?
Kudos to the first person who puts such a pattern in the comments below.
How about:
[a-zA-Z0-9._%+-]+\s+at\s+([a-zA-Z0-9.-]+\s+dot\s)+[a-zA-Z]+
For bonus points, this also works for “foo at bar dot co dot uk” and other domains that are not directly under a TLD 🙂
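For anyone who wants to try it, here is a minimal Python sketch of that pattern in use. The function name `harvest` and the sample strings are made up for illustration; the regex is the one above, and the two substitutions that turn a match back into an address are only my assumption about what a harvester might do next.

```python
import re

# The pattern from the comment above, unchanged.
PATTERN = re.compile(
    r"[a-zA-Z0-9._%+-]+\s+at\s+([a-zA-Z0-9.-]+\s+dot\s)+[a-zA-Z]+"
)


def harvest(text):
    """Return de-obfuscated addresses found in text (hypothetical helper)."""
    found = []
    for match in PATTERN.finditer(text):
        # The local part cannot contain whitespace, so the first " at "
        # in the match is the separator between user and domain.
        address = re.sub(r"\s+at\s+", "@", match.group(0), count=1)
        # Every remaining " dot " separates two domain labels.
        address = re.sub(r"\s+dot\s+", ".", address)
        found.append(address)
    return found


print(harvest("user at domain dot com"))    # ['user@domain.com']
print(harvest("foo at bar dot co dot uk"))  # ['foo@bar.co.uk']
```

Note that the local part of the pattern does not allow spaces, so “jane dot doe at example dot com” comes back as just “doe@example.com”. Even a small irregularity changes what gets harvested.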
You’re thinking about it the wrong way.
Who do spammers want to reach? The nontechnical user, the person with more money than sense, the gullible. These people don’t use “user at domain dot com” when they add their email address; half of the time they may not even realise they’ve exposed an email address at all.
Why would spammers waste their time decoding the odd maybe-address formats, when they could simply catch their target market without bothering?
Because we falsely presume that, because spammers often make blatant errors in their English, they are fundamentally stupid.
I wonder sometimes who is really fundamentally stupid.
However, I’ve done this before without opening the flood gates:
walt dot williams at gmail dot com
and I don’t worry about doing so, since the spammers also have good engines for generating common email aliases such as that one. Keeping my alias out of public forums would just be relying on security through obscurity, and in this case that doesn’t work well. The filters catch 99.9% of it, and because of the poor grammar and so on, I catch the rest.
It’s just like everything else on the Internet. You can’t destroy spammers, you can only hope to contain them. By obfuscating your email address you decrease your chances of being spammed by a very small percentage. On the other hand you really piss off people that want to send you an email and can no longer utilize their copy+paste command (I’m one of those pissed off people sometimes).
When in doubt, you could use irregular patterns
“Mail is Firstname at domain nemersonhoover dot com or org”
(that’s what I do)
Or perhaps turnabout is fair play?
“Male iss Firstname aat nemersonhoover pheriodd comm not bhiger peenis now!”
Is there something in here about just having to outrun the other guy, not the bear?
> Why…
It’s human nature to call your enemy stupid. If he was smart, he would see past your game and defeat you, and we don’t want to believe that. Or, civilised behavior suggests that smart people don’t attack each other … so if the enemy is smart, that would make us the dumb ones.
Cognitive dissonance is a very powerful thing. I believe I know what I’m talking about, and if there is a problem here, it’s because you’re dumb, not me 😉
They are not that smart and most of them are lazy.
Name one cyber-criminal you think is smarter than Ron Rivest. We got more like him on our side and I don’t think they have even one on theirs.
But the reason such simple obfuscation works is the same reason that a cheap burglar alarm works almost as well as a good one: it does not need to be very good to persuade the opposition that it would be easier to steal a different car.
PHB is right. If they can get 90% of the email addresses they crawl past with 10% of the effort, then that’s a much better rate of return.
Incidentally, in the rare cases where I’ve wanted to post my email address to a public place in such a way that it was highly unlikely to get collected I posted this Python:
(lambda f: lambda *a:f(f,*a))(
lambda f,l,i:l[i][1]+f(f,l,l[i][0]) if l[i][0]>0 else "")(
sorted(enumerate('~ooiirnncc@-kk.og'),key=lambda a:a[1]),0)[1:]
Kudos to the first person (other than mordaxus) to identify the origin of the function!
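Spoiler, for readers who would rather see it unrolled: as far as I can tell, the one-liner is an inverse Burrows-Wheeler transform driven by a fixed-point combinator. The sketch below is a readable reconstruction, not the poster's own code. The names `bwt_encode`, `bwt_decode`, `START` and `END` are hypothetical, the marker characters are inferred from the encoded string above, and the encoder is only my assumption about how such a string could be produced.

```python
START, END = "-", "~"  # marker characters inferred from the string above


def bwt_encode(text):
    """Burrows-Wheeler transform of START + text + END (assumed encoder)."""
    # START must sort before every character of the text so that the
    # unrotated string ends up as row 0 of the sorted rotations.
    if any(ch <= START for ch in text):
        raise ValueError("text contains a character that sorts at or below START")
    s = START + text + END
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)


def bwt_decode(encoded):
    """Iterative equivalent of the recursive lambda."""
    # Stable-sorting (index, char) pairs by char gives the first column;
    # the stored index is the row that starts with the next character.
    table = sorted(enumerate(encoded), key=lambda pair: pair[1])
    out = []
    i = 0
    while table[i][0] > 0:       # stop once the walk would return to row 0
        out.append(table[i][1])
        i = table[i][0]
    return "".join(out)[1:]      # drop the START marker


sample = "user@example.com"
print(bwt_decode(bwt_encode(sample)) == sample)
```

The scrambling is only there to defeat pattern matching on the page text. Anyone who actually runs the snippet gets the address back.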
Phill, at the risk of destroying my cred, I don’t associate enough with cyber-criminals to name one of them that is dumber than Ron Rivest, let alone smarter. I can name a few horrid little cyber-punks, but they don’t count in my book.
I understand the point of the economic return. However, this post first occurred to me when I saw a recent talk on the amazing sophistication of today’s cyber-criminals, and then the sign-off page had the email of user at example dot com, and a remark about thwarting spammers. Clearly, this signature had been there for long enough that there was no irony at the time, but in context it detracted from the message. They’re smart, they’re evil, they rob from old ladies, but they can’t write a regexp.
Doesn’t matter anyway. While spammers certainly “harvest” addresses from mail message/web page text, much higher acquisition rates are available to them by harvesting mail headers found on compromised systems. Given the enormous number of compromised systems (on the order of 10^8; Cerf has given the estimate 2.5 × 10^8), any email address that’s actually used will sooner or later end up on one of those in un-obfuscated form.
(The way I’ve put it is this: how many people do you correspond with? What’s the probability that at least one of them is using a compromised system? Or that they might forward a message you’ve sent them to someone who’s using a compromised system? Or that they might be receiving mail through a compromised system? And so on.)
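To put rough numbers on that question, here is a back-of-the-envelope sketch. It assumes each correspondent's system is compromised independently with the same probability, which is a simplification, and the function name and the sample values are purely illustrative rather than measured figures.

```python
def p_at_least_one_compromised(n_correspondents, p_compromised):
    """Chance that at least one of n correspondents uses a compromised system.

    Assumes independence: P = 1 - (1 - p) ** n.
    """
    return 1.0 - (1.0 - p_compromised) ** n_correspondents


# Hypothetical inputs: vary the number of correspondents and the
# per-system compromise rate to see how quickly the odds climb.
for n in (10, 50, 200):
    for p in (0.01, 0.05, 0.10):
        print(n, p, round(p_at_least_one_compromised(n, p), 3))
```

Under these assumptions the probability grows quickly with the number of correspondents, and that is before counting forwards and mail relays.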
Address harvesters (who number spammers among their clients) have long since moved past this and are now busy working out the social graph implied by message headers… because in an age of whitelisting, that social graph has higher value on the open market than ordinary address lists.
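To show how little work “the social graph implied by message headers” takes, here is a minimal sketch using only Python's standard `email` package. The function name `social_graph` and the sample message are hypothetical, and reading just the From, To and Cc headers is my own simplifying assumption, not a description of any real harvesting tool.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import getaddresses


def social_graph(raw_messages):
    """Map each sender address to the set of addresses it wrote to."""
    graph = defaultdict(set)
    for raw in raw_messages:
        msg = message_from_string(raw)
        senders = getaddresses(msg.get_all("From", []))
        recipients = getaddresses(
            msg.get_all("To", []) + msg.get_all("Cc", [])
        )
        for _, sender in senders:
            if not sender:
                continue
            # Each (sender, recipient) pair is one edge of the graph.
            graph[sender.lower()].update(
                addr.lower() for _, addr in recipients if addr
            )
    return dict(graph)


sample = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.org, Carol <carol@example.net>\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
print(social_graph([sample]))
```

Each edge tells a sender which “From” address a recipient is likely to have whitelisted, which is presumably where the market value comes from.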
I’ve long had the address “master AT barefaced DOT cheek” in my Usenet sig. What spammers do try to use as addresses is Message-IDs.