#1 2010-05-05 21:17:30

damaxxed
Member
From: Germany
Registered: 2008-05-16
Posts: 353

Anti Spam in core

Hey

I've been thinking about anti-spam functions for FluxBB. In my opinion, basic spam protection that actually works (i.e. not email confirmation) should be implemented in the core. I haven't followed FluxBB for a long time, so please correct me if I've got something wrong.

Currently

  • email verification
    + makes sure the provided email is real
    - no protection

Proposals

  • hide a field with CSS
    + no annoyance for users
    - a specialized bot can adapt easily (?)

  • CAPTCHA
    + very good protection
    - users have to spend time to do it

  • use blacklists (Akismet, stopforumspam.com)
    + no annoyance for users
    - recognizes only already known spammers
    - chances of false positives

What is your opinion on this? Any more proposals?

#2 2010-05-05 22:01:30

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

I've been experimenting with a few ideas on the forums the past few days.

  • Hidden field with CSS - May have stopped some spam, but didn't stop all.

  • Akismet - Gets quite a few false positives when trying to post links and so on. I am thinking of implementing Akismet here for users with < 10 posts for example.

  • Stopforumspam - Didn't seem to catch much at all

  • DNSBLs - Either didn't catch much, or had too many false positives, depending on which BL was used.

I'm not sure if maybe the best solution is a combination of a few, I'm still trying to decide what the best solution to run here is.

#3 2010-05-05 22:14:55

damaxxed
Member
From: Germany
Registered: 2008-05-16
Posts: 353

Re: Anti Spam in core

Hey Reines

what is your exact implementation of "Hidden field with CSS"?

I'm thinking about something like the following:

  1. the username field is renamed to something new for every user/session (e.g. A1B2)

  2. a new hidden field with the name "username" is added as honeypot for bots

  3. some additional hidden fields, named using the same scheme as the real username field, are added (e.g. A2B2, A2B3) as honeypots

I'm pretty sure no bot is able to pass this test. Any input?
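
The three steps above could be sketched roughly as follows. This is only an illustration in Python, not FluxBB code: the function names, the four-character naming scheme, and the number of decoys are all assumptions made for the example.

```python
import secrets


def make_form_fields(decoys=2):
    """Pick field names for one session: one real field plus several honeypots.

    Hypothetical helper: names are 4 hex characters (e.g. "A1B2").
    """
    names = set()
    while len(names) < decoys + 1:
        names.add(secrets.token_hex(2).upper())
    names = list(names)
    real, honeypots = names[0], names[1:]
    # Step 2: the classic field name "username" becomes a honeypot too.
    honeypots.append("username")
    return {"real": real, "honeypots": honeypots}


def check_submission(fields, posted):
    """Return the submitted username, or None if any honeypot was filled in."""
    for name in fields["honeypots"]:
        if posted.get(name, "").strip():
            return None
    username = posted.get(fields["real"], "").strip()
    return username or None
```

The field names would have to be stored in the session when the form is rendered, so that the same mapping is available when the form is posted back.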

#4 2010-05-05 22:16:45

Franz
Lead developer
From: Germany
Registered: 2008-05-13
Posts: 6,719
Website

Re: Anti Spam in core

Wouldn't they just go for the one field that isn't hidden (not really hard to parse) after they'd figured it out?


fluxbb.de | develoPHP

"As code is more often read than written it's really important to write clean code."

#5 2010-05-05 22:29:53

damaxxed
Member
From: Germany
Registered: 2008-05-16
Posts: 353

Re: Anti Spam in core

True. Nevertheless I'm convinced it's hard for bots to figure out which field is "hidden", because there are many ways of hiding a field:

  • visibility:hidden

  • display:none

  • position:absolute; left:-9999px;

  • .....

#6 2010-05-05 22:41:30

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

Franz wrote:

Wouldn't they just go for the one field that isn't hidden (not really hard to parse) after they'd figured it out?

In theory, but it would require new bots to be written. If we can cut out all currently existing bots, that's at least a start.

damaxxed wrote:

Hey Reines

what is your exact implementation of "Hidden field with CSS"?

I'm thinking about something like the following:

  1. the username field is renamed to something new for every user/session (e.g. A1B2)

  2. a new hidden field with the name "username" is added as honeypot for bots

  3. some additional hidden fields, named using the same scheme as the real username field, are added (e.g. A2B2, A2B3) as honeypots

I'm pretty sure no bot is able to pass this test. Any input?

I'm going to apply a patch to the site implementing that idea and see what happens.

#7 2010-05-05 22:47:28

damaxxed
Member
From: Germany
Registered: 2008-05-16
Posts: 353

Re: Anti Spam in core

Reines wrote:

I'm going to apply a patch to the site implementing that idea and see what happens.

Nice! I'm really interested.

#8 2010-05-05 22:49:56

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

I've applied this patch for now. We currently have 3,697 (2,252 verified) users with the latest user "test". Will check back tomorrow morning and see if it had any effect.

Last edited by Reines (2010-05-05 23:01:28)

#9 2010-05-05 23:01:47

Paul
Developer
From: Wales, UK
Registered: 2008-04-27
Posts: 1,653

Re: Anti Spam in core

Wouldn't it have been an idea to make the registration form/system different from 1.2 and therefore different from PunBB? That wouldn't stop anything in itself, but it would make more work for the bot coders, who are probably still more interested in PunBB installs.


The only thing worse than finding a bug is knowing I created it in the first place.

#10 2010-05-06 08:08:35

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

Quite possibly yes, hindsight is a bitch :P

It looks like either the patch is having some effect, or spam bots don't like Thursday mornings. Annoyingly the data I have isn't great since it's over fairly short time periods so could just be anomalies, but:

  • Before the server move, with 1.3 - we were getting around 100-200 user registrations a day.

  • After the move, with 1.4 and akismet filtering registrations and posts - we were getting around 40 user registrations a day.

  • Yesterday between removing the akismet filtering and adding the honeypot patch, we had around 90 user registrations within 5 hours.

  • In 9 hours since adding the honeypot patch we've had around 10 user registrations.

I think I'll maybe prune some of the old spam users, then leave the patch in place for a bit to see if the trend continues.
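
Put on a common per-hour scale, the four figures above are easier to compare. A quick sketch, assuming each "a day" figure covers 24 hours; the labels are just shorthand for the bullet points, and the short windows are not corrected for time of day:

```python
# (registrations, hours) taken from the figures quoted in the post
periods = {
    "1.3, before the move (low)": (100, 24),
    "1.3, before the move (high)": (200, 24),
    "1.4 with Akismet": (40, 24),
    "no filtering": (90, 5),
    "honeypot patch": (10, 9),
}


def per_hour(registrations, hours):
    """Average registrations per hour over the period."""
    return registrations / hours


rates = {label: round(per_hour(*value), 1) for label, value in periods.items()}
for label, rate in rates.items():
    print(f"{label}: {rate} per hour")
```

That works out to roughly 4.2 to 8.3 per hour on 1.3, 1.7 with Akismet, 18.0 with no filtering, and 1.1 with the honeypot patch.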

Edit: Bots seem to love the timezone -12, I wonder if we can use that to our advantage since according to Wikipedia the only places that use the timezone -12 are Baker Island and Howland Island, and both of them are actually uninhabited.

Last edited by Reines (2010-05-06 08:31:08)

#11 2010-05-06 08:34:13

Rich Pedley
Member
From: Liverpool, UK
Registered: 2008-05-13
Posts: 246
Website

Re: Anti Spam in core

A note on CAPTCHAs: besides being difficult for some users to understand, the majority have already been broken.


my mind is on a permanent tangent

Offline

#12 2010-05-06 08:52:19

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

This is a bit overkill, but it would be interesting to see if some kind of AI (i.e. working the way an email spam filter does) could be made that would recognise spam registrations from the following:

  • username

  • email

  • email_setting

  • timezone

  • dst

  • REMOTE_ADDR

  • HTTP_ACCEPT

  • HTTP_REFERER

  • HTTP_USER_AGENT

From a quick glance I can't seem to see any solutions out there for this, I wonder if that's because it doesn't work, or just no-one has bothered to try it.

I assume Akismet probably works like this behind the scenes, but a big part of their filtering is based on post content, which obviously you don't have during registration.

Last edited by Reines (2010-05-06 10:21:04)

#13 2010-05-06 10:28:26

Paul
Developer
From: Wales, UK
Registered: 2008-04-27
Posts: 1,653

Re: Anti Spam in core

Your figures tie in with mine.  I was pruning zero posters registered more than a week ago and was pruning between 700 and 1000 each time not including unverifieds.


The only thing worse than finding a bug is knowing I created it in the first place.

#14 2010-05-06 11:50:55

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

Okay, at the moment I've got a combination of honeypot and stopforumspam running. It's logging all registrations, along with whether each one was flagged as spam (and hence blocked). I've also got it reporting any flagged as spam to the stopforumspam database. I'll report back with results :)
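
A setup along these lines might look something like the sketch below. It is not the patch running on the site: `is_blacklisted` is a hypothetical stand-in that only checks a local set (a real version would query a blacklist service), the addresses are documentation examples, and the reporting step is left out.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("registrations")


def is_blacklisted(ip, email):
    """Hypothetical placeholder for a blacklist lookup; checks a local set only."""
    known_bad = {"203.0.113.7", "spammer@example.com"}
    return ip in known_bad or email in known_bad


def screen_registration(username, email, ip, honeypot_value, lookup=is_blacklisted):
    """Run both checks, log the outcome, and return (blocked, reasons)."""
    reasons = []
    if honeypot_value.strip():
        reasons.append("honeypot")
    if lookup(ip, email):
        reasons.append("blacklist")
    blocked = bool(reasons)
    log.info(
        "user=%s email=%s ip=%s blocked=%s reasons=%s",
        username, email, ip, blocked, ",".join(reasons) or "-",
    )
    return blocked, reasons
```

Logging the reasons separately makes it possible to see afterwards which of the two checks is doing most of the work.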

#15 2010-05-06 12:28:09

Paul
Developer
From: Wales, UK
Registered: 2008-04-27
Posts: 1,653

Re: Anti Spam in core

damaxxed wrote:

True. Nevertheless I'm convinced it's hard for bots to figure out which field is "hidden", because there are many ways of hiding a field:

  • visibility:hidden

  • display:none

  • position:absolute; left:-9999px;

  • .....

The problem is screenreaders are bots.  Screenreaders treat display: none as "voice none" and I think the same goes for visibility: hidden but those are the only hiding techniques you can use. Of course the field would also appear if css were disabled.  To make certain that the thing is accessible you would really need a label which says "If you are human do not complete this field".
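
As a rough illustration of the labelled approach, the markup could be generated like this. The helper name, the `hp` class, and the exact label wording are assumptions for the example; the class would be hidden by the stylesheet, and the label covers the cases where it is not.

```python
from html import escape


def render_honeypot(field_name, css_class="hp"):
    """Return HTML for a honeypot input with an explicit label for humans."""
    label = "If you are human, leave this field empty"
    name = escape(field_name, quote=True)
    return (
        f'<div class="{escape(css_class, quote=True)}">\n'
        f'  <label for="{name}">{label}</label>\n'
        f'  <input type="text" id="{name}" name="{name}" value="" autocomplete="off">\n'
        f'</div>'
    )
```

With CSS disabled, or in a screenreader that announces the field anyway, the user still gets the instruction not to fill it in.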


The only thing worse than finding a bug is knowing I created it in the first place.

#16 2010-05-06 12:40:32

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

The other way is to use html comments to exclude the field. Assuming the bots are using regex rather than proper xml parsing that should work without any side effects.
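
A small demonstration of the difference, assuming a bot that scans for input tags with a regex. The form markup and field names are made up for the example.

```python
import re
from html.parser import HTMLParser

FORM = """
<form method="post">
  <!-- <input type="text" name="username"> -->
  <input type="text" name="A1B2">
</form>
"""


def regex_field_names(markup):
    """What a naive regex-based bot would see: comments are not skipped."""
    return re.findall(r'<input[^>]*\bname="([^"]+)"', markup)


class FieldCollector(HTMLParser):
    """Collects input names the way a real parser (or a browser) would."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def parsed_field_names(markup):
    collector = FieldCollector()
    collector.feed(markup)
    return collector.names
```

Here `regex_field_names(FORM)` returns both `username` and `A1B2`, while `parsed_field_names(FORM)` returns only `A1B2`, so a browser never submits the commented-out field.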

#17 2010-05-06 13:29:14

MattF
Member
From: South Yorkshire, England
Registered: 2008-05-06
Posts: 1,233
Website

Re: Anti Spam in core

Reines wrote:

The other way is to use html comments to exclude the field. Assuming the bots are using regex rather than proper xml parsing that should work without any side effects.

IMHO, Paul's approach seems best. A simple label prompting people not to fill the field out lest they be marked as a spammer should work fine. It's simple and precise.


Screw the chavs and God save the Queen!

#18 2010-05-06 13:59:56

MattF
Member
From: South Yorkshire, England
Registered: 2008-05-06
Posts: 1,233
Website

Re: Anti Spam in core

Reines wrote:

This is a bit overkill, but it would be interesting to see if some kind of AI (i.e. working the way an email spam filter does) could be made that would recognise spam registrations from the following:

  • username

  • email

  • email_setting

  • timezone

  • dst

  • REMOTE_ADDR

  • HTTP_ACCEPT

  • HTTP_REFERER

  • HTTP_USER_AGENT

From a quick glance I can't seem to see any solutions out there for this, I wonder if that's because it doesn't work, or just no-one has bothered to try it.

I assume Akismet probably works like this behind the scenes, but a big part of their filtering is based on post content, which obviously you don't have during registration.

I've been thinking something similar for a while, namely: why aren't there any DNS-type blocklists (they've been around for donkeys in other areas) to make filtering on some of that information viable? There are online blocklists, like stopforumspam for example, but I can't recall seeing a single DNS-based RBL for this as yet (which would mean less data transfer for the check too).

Last edited by MattF (2010-05-06 14:15:53)


Screw the chavs and God save the Queen!

#19 2010-05-06 14:04:36

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

StopForumSpam contributes to a DNSBL which you can use, but only for IPs obviously. However the DNSBL also seems to include stuff from other places, less related to forum spam, so I didn't try using it.
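
For reference, a DNSBL check on an IPv4 address is just a DNS lookup of the reversed octets under the list's zone. A minimal sketch; `dnsbl.example.org` is a placeholder zone, not a real list:

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """True if the name resolves, which is how a DNSBL signals a listing."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record. Lookup failures also land here and count as "not listed".
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` gives `99.2.0.192.dnsbl.example.org`.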

#20 2010-05-06 14:11:13

MattF
Member
From: South Yorkshire, England
Registered: 2008-05-06
Posts: 1,233
Website

Re: Anti Spam in core

Reines wrote:

StopForumSpam contributes to a DNSBL which you can use, but only for IPs obviously. However the DNSBL also seems to include stuff from other places, less related to forum spam, so I didn't try using it.

Exactly. There's no specific setup merely for forum-related spam. Any that do fringe on the area are usually more biased towards the traditional use of the RBL, i.e. MTAs.

It's another one of those ideas on my (long :D) list to possibly play around with in the future. It's the submission side of things which would be the pain. The DNS part would be a doddle to set up.

Last edited by MattF (2010-05-06 14:13:45)


Screw the chavs and God save the Queen!

#21 2010-05-06 14:15:13

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

Though to be fair, what I was talking about was the idea of a system specific to FluxBB that would be able to determine if a user is spam or not without making use of blacklists. For example, one series of spam bots seems to always set the timezone to -12, and since the only islands with timezone -12 are uninhabited, it's a fairly good indication of spam. What I was wondering was if, with enough training data, an AI-based system could be made which could distinguish between spam and legitimate users without the use of hard-coded rules or blacklists.

#22 2010-05-06 14:20:10

MattF
Member
From: South Yorkshire, England
Registered: 2008-05-06
Posts: 1,233
Website

Re: Anti Spam in core

Reines wrote:

Though to be fair, what I was talking about was the idea of a system specific to FluxBB that would be able to determine if a user is spam or not without making use of blacklists. For example, one series of spam bots seems to always set the timezone to -12, and since the only islands with timezone -12 are uninhabited, it's a fairly good indication of spam. What I was wondering was if, with enough training data, an AI-based system could be made which could distinguish between spam and legitimate users without the use of hard-coded rules or blacklists.

Sort of a trained filter, like they use for email spam filtering, (though obviously not for email). Wouldn't the Bayesian? type filter be adaptable to that use?
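
To make the idea concrete, here is a toy naive Bayes classifier over two of the registration fields mentioned in the thread (timezone and email domain). Everything in it is an assumption for illustration: the class, the feature choice, and especially the six training rows, which are invented and not real registration data.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # label -> (field, value) -> count
        self.values = defaultdict(set)              # field -> values seen in training

    def train(self, features, label):
        self.class_counts[label] += 1
        for field, value in features.items():
            self.feature_counts[label][(field, value)] += 1
            self.values[field].add(value)

    def log_scores(self, features):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for field, value in features.items():
                count = self.feature_counts[label][(field, value)]
                vocab = len(self.values[field]) + 1  # +1 slot for unseen values
                score += math.log((count + 1) / (n + vocab))
            scores[label] = score
        return scores

    def classify(self, features):
        scores = self.log_scores(features)
        return max(scores, key=scores.get)


def extract(timezone, email):
    """Turn raw registration data into categorical features."""
    return {"timezone": timezone, "email_domain": email.rsplit("@", 1)[-1].lower()}


# Invented toy data, purely to show the mechanics.
model = NaiveBayes()
model.train(extract(-12, "a@yahoo.com"), "spam")
model.train(extract(-12, "b@yahoo.com"), "spam")
model.train(extract(-12, "c@mail.ru"), "spam")
model.train(extract(0, "d@example.org"), "ham")
model.train(extract(1, "e@example.net"), "ham")
model.train(extract(0, "f@yahoo.com"), "ham")
```

With this toy data, a -12 timezone combined with a yahoo.com address is classified as spam, while the filter has learned that yahoo.com alone is not decisive. Whether the fields available at registration carry enough signal in practice is exactly the open question.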


Screw the chavs and God save the Queen!

#23 2010-05-06 14:26:36

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

I think so, I was aiming to have a look into it after work today, though I'm not sure if the data available during registration would be enough to make any decisions accurately.

#24 2010-05-06 15:18:21

Paul
Developer
From: Wales, UK
Registered: 2008-04-27
Posts: 1,653

Re: Anti Spam in core

You could also take into account that at least 75% of the email addresses are @yahoo.com, or at least they were when I was deleting them.

Of course if you happen to be camped out on Howland Island and have a yahoo email account you would be screwed.


The only thing worse than finding a bug is knowing I created it in the first place.

#25 2010-05-06 15:21:25

Reines
Administrator
From: Scotland
Registered: 2008-05-11
Posts: 3,197
Website

Re: Anti Spam in core

Well, the point of an AI approach is that it automatically learns patterns like that without anyone needing to define them, though yeah, that is another good example, and I noticed it too. Oddly enough, according to stopforumspam the largest share of spam comes from gmail.com, followed by mail.ru and then yahoo.com.

Paul wrote:

Of course if you happen to be camped out on Howland Island and have a yahoo email account you would be screwed.

rofl
