I've been fighting a lot of spam over at ASCIIBin. In order to minimize it I wrote a class that attempts to detect spam and reject the content before it is submitted. It uses:
- Akismet - Checks against Content, URL, Name, Email
- SURBL - All URLs in content are parsed and checked against this spam database. This database contains URLs submitted via spam emails.
- Spamhaus - Checks IP for known spammers
- SpamCop - Checks IP for known spammers
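For anyone wondering what an IP blacklist check looks like under the hood, here is a rough sketch of how a DNSBL query name is usually built: the IPv4 octets are reversed and the blacklist zone is appended. The helper name dnsblQueryName() is made up for illustration and is not part of Net_DNSBL, which handles this internally.

```php
<?php
// Sketch only: build the DNS name a DNSBL lookup would query for an IPv4 address.
// dnsblQueryName() is a hypothetical helper, not part of Net_DNSBL.
function dnsblQueryName($ip, $zone)
{
    // Reject anything that is not a plain IPv4 address.
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null;
    }
    // 89.28.114.111 becomes 111.114.28.89.<zone>
    $reversed = implode('.', array_reverse(explode('.', $ip)));
    return $reversed . '.' . $zone;
}

echo dnsblQueryName('89.28.114.111', 'bl.spamcop.net'), "\n";
```

If the resulting name resolves to an address in the 127.0.0.x range, the IP is listed; if the lookup fails, it is not.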
Prerequisites
This class is based on two other classes which you'll need to download and install.
- PEAR::Net_DNSBL - Use pear to install (pear install Net_DNSBL)
- PHP5Akismet - Download and extract archive. Rename folder to Akismet
- You will need to obtain a WordPress API key (http://en.wordpress.com/api-keys/)
The PEAR class reads its nameservers from /etc/resolv.conf. If PHP's open_basedir restriction is in effect, put a file named '.resolv.conf' in the directory executing this script instead. On Windows you can create \etc\resolv.conf or place .resolv.conf in the executing directory. resolv.conf contains the list of nameservers that Net_DNSBL needs in order to send its DNS queries.
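If you are not sure your resolv.conf (or .resolv.conf) is usable, a quick sanity check is to list the nameserver entries in it. This is only a sketch: parseNameservers() is a made-up helper, the sample addresses are documentation placeholders, and Net_DNSBL does its own parsing.

```php
<?php
// Sketch only: list the nameservers found in resolv.conf-style text.
// parseNameservers() is a hypothetical helper, not part of Net_DNSBL.
function parseNameservers($contents)
{
    $servers = array();
    foreach (preg_split('/\R/', $contents) as $line) {
        // Entries look like "nameserver 192.0.2.53"; other lines are skipped.
        if (preg_match('/^\s*nameserver\s+(\S+)/', $line, $m)) {
            $servers[] = $m[1];
        }
    }
    return $servers;
}

// Placeholder content; in practice read the real file with file_get_contents().
$sample = "# example resolv.conf\nnameserver 192.0.2.53\nnameserver 192.0.2.54\n";
print_r(parseNameservers($sample));
```

An empty result would suggest the file has no nameserver lines for the resolver to use.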
The Script
Code:
<?php
// {{{ Header
/**
* ASCII Post/Comment Spam Checker
*
* PHP versions 5
*
* LICENSE:
*
* Copyright (c) 2008 CodeCall.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted under the terms of the BSD License.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* @category Copy and Paste
* @author Jordan (CodeCall.net)
* @date 9-30-2008
* @version 1.0
* @link http://www.codecall.net
* @copyright 2008 CodeCall.net
* @uses PEAR::Net_DNSBL
* @uses Akismet (http://www.achingbrain.net/stuff/php/akismet)
*
*
*/
// }}}
// {{{ Includes
/**
* Include our spam checking third-party
* classes.
*
* Akismet is local
* Net/* is PEAR
*
*/
require('Akismet/Akismet.class.php');
require('Net/DNSBL.php');
require('Net/DNSBL/SURBL.php');
// }}}
// {{{ Class
class SpamChecker {
//{{{ Members
/*
* The actual text of the comment or
* submission data
*/
private $__comment;
/*
* The "used" name of the submitter.
* This value may be blank.
*/
private $__name;
/*
* The email used to submit content, if
* included.
*/
private $__email;
/*
* The URL used to submit, if included
*/
private $__url;
/*
* WordPress API key.
* Needed for Akismet, can be
* obtained from http://en.wordpress.com/api-keys/
*/
private $__wordPressApiKey;
/*
* The site running the spam test.
* Your site URL (http://www.you.com)
*/
private $__ownerSiteUrl;
//}}}
// {{{ methods
/**
* Constructor for SpamChecker
*
* @param string $comment
* @param string $name
* @param string $email
* @param string $url
* @param string $wordPressApiKey
* @param string $ownerSiteUrl
* @return SpamChecker
*/
public function __construct($comment, $name="", $email="",
$url="", $wordPressApiKey="", $ownerSiteUrl="" ) {
/*
* Apply our local variables to the class
* members
*/
$this->__comment = $comment;
$this->__name = $name;
$this->__email = $email;
$this->__url = $url;
$this->__wordPressApiKey = $wordPressApiKey;
$this->__ownerSiteUrl = $ownerSiteUrl;
}
/**
* Check for spam using different methods.
* This function is more of a controller
* that executes other, private functions. An
* associative array of bool values is returned,
* one per check plus an overall 'Spam' flag.
*
* true = detected spam
* false = did not detect spam
*
* @return array
*/
public function isSpam() {
/*
* Create generic array
*/
$spamResults = array();
/*
* Check against Akismet
*/
$spamResults['Akismet'] = $this->checkAkismet();
/*
* Check against IP Black
* Lists
*/
$spamResults['BlackLists'] = $this->checkBlackLists();
/*
* Scan content URLs against previously submitted
* URL spam database
*/
$spamResults['SpamURLs'] = $this->scanContentUrls();
/*
* Set global Spam flag
*/
$spamResults['Spam'] = ($spamResults['Akismet'] || $spamResults['BlackLists'] || $spamResults['SpamURLs']) ? true : false;
/*
* Return array
*/
return $spamResults;
}
/**
* Check the comment for spam against
* Akismet. Akismet is the popular wordpress
* blogging comment spam checker. It works extremely
* well but may not work in all circumstances if all
* data is not provided.
*
* @return bool
*/
private function checkAkismet() {
/*
* Create the class and add
* parameters
*/
$akismet = new Akismet($this->__ownerSiteUrl, $this->__wordPressApiKey);
$akismet->setCommentAuthor($this->__name);
$akismet->setCommentAuthorEmail($this->__email);
$akismet->setCommentAuthorURL($this->__url);
$akismet->setCommentContent($this->__comment);
//$akismet->setPermalink('http://www.example.com/blog/alex/someurl/');
/*
* Run the test
*/
if($akismet->isCommentSpam()) {
// Found spam
return true;
} else {
return false;
}
}
/**
* Check the IP address against IP
* black lists. If the IP is found in
* the database, the user has already
* been turned in for email or content
* spam by another user.
*
* Uses two well known services:
* spamcop.net
* spamhaus.org
*
* @uses PEAR::Net_DNSBL
*/
private function checkBlackLists() {
/*
* Create class
*/
$dnsbl = new Net_DNSBL();
/*
* Obtain the IP address of the person
* submitting content
*/
$remoteIp = $_SERVER['REMOTE_ADDR'];
/*
* Set the black lists to check
* against
*/
$dnsbl->setBlacklists(array('sbl-xbl.spamhaus.org', 'bl.spamcop.net'));
/*
* Run the Check
*/
if ($dnsbl->isListed($remoteIp)) {
// Found Spam
return true;
}
/*
* Nothing found, return
* false
*/
return false;
}
/**
* Take the content of the submitted
* comment/data and extract all URLs.
* Send each URL to checkUrlForSpam()
* to receive a response.
*
*/
private function scanContentUrls() {
/*
* Match all URLs
*/
preg_match_all("((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)",
$this->__comment, $urlArray, PREG_SET_ORDER);
/*
* Cycle through and submit
*/
foreach ($urlArray as $url) {
if ($this->checkUrlForSpam(trim($url[0]))) {
// Found spam so exit
return true;
}
}
/*
* No spam links found
*/
return false;
}
/**
* Check a URL against the SPAM
* database to determine if it is
* a SPAM submitted URL
*
* @param string $url
* @return bool
*/
private function checkUrlForSpam($url) {
/*
* Create a new DNS URL class
* and check it against the URL
* database
*/
$surbl = new Net_DNSBL_SURBL();
if ($surbl->isListed($url)) {
// Spam
return true;
}
/*
* Nothing found, return
* false
*/
return false;
}
// }}}
}
// }}}
Example Usage:
All values are of known spammers at the time of posting.
Code:
<?php
// {{{ Header
/**
* ASCII Post/Comment Spam Checker Test
*
* PHP versions 5
*
* LICENSE:
*
* Copyright (c) 2008 CodeCall.net
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted under the terms of the BSD License.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
* FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
* COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
* ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
*
* @category Copy and Paste
* @author Jordan (CodeCall.net)
* @date 9-30-2008
* @version 1.0
* @link http://www.codecall.net
* @copyright 2008 CodeCall.net
* @uses PEAR::Net_DNSBL
* @uses Akismet (http://www.achingbrain.net/stuff/php/akismet)
*
*
*/
error_reporting(E_ALL);
// }}}
// {{{ Implementation
/*
* Include the class file
*/
include "SpamChecker.php";
/*
* Needed data
*/
$wordPressApiKey = 'APIKEY';
$ownerUrl = 'http://www.jordandelozier.com';
/*
* Create known spam Akismet variables for
* testing purposes. We are looking
* for false.
*/
$akismetSpam = array('comment'=>'What charming message http://www.zulucutie.com',
'name' =>'lanellgiz',
'email' =>'latesha@buyclialis.info',
'url' =>'df3gd.com',
'ip' =>'89.28.114.111'
);
/*
* Now we want to make the black lists fail.
* It should have passed above and been blank.
* In order to make the black lists fail, we need
* to override a server setting.
*/
$_SERVER['REMOTE_ADDR'] = '41.110.2.2';
/*
* Create a new instance of the class
*/
$spamChecker = new SpamChecker($akismetSpam['comment'], $akismetSpam['name'],
$akismetSpam['email'], $akismetSpam['url'],
$wordPressApiKey, $ownerUrl);
/*
* Run it and print the results
*/
echo "<pre>";
print_r($spamChecker->isSpam());
echo "</pre>";
// }}}
Output:
The class returns an associative array containing a boolean value for each test. If any test is true, the Spam key will be true.
Array
(
[Akismet] => 1
[BlackLists] => 1
[SpamURLs] => 1
[Spam] => 1
)
See it in Action!
Visit ASCIIBin and submit any content. Before content is submitted this class is executed. If you are a spammer, or submitting spam, the submission will be rejected.
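Acting on the array that isSpam() returns could look something like the sketch below. describeSpamResult() is a made-up helper, not part of the class; it only assumes the four keys the class sets (Akismet, BlackLists, SpamURLs, Spam).

```php
<?php
// Sketch only: turn the array returned by SpamChecker::isSpam() into a message.
// describeSpamResult() is a hypothetical helper, not part of the class.
function describeSpamResult(array $results)
{
    if (empty($results['Spam'])) {
        return 'Accepted';
    }
    // Collect the names of the individual checks that flagged the content.
    $failed = array();
    foreach (array('Akismet', 'BlackLists', 'SpamURLs') as $check) {
        if (!empty($results[$check])) {
            $failed[] = $check;
        }
    }
    return 'Rejected by: ' . implode(', ', $failed);
}

echo describeSpamResult(array(
    'Akismet' => true, 'BlackLists' => false, 'SpamURLs' => true, 'Spam' => true,
)), "\n";
```

Knowing which check fired is handy for logging, since a blacklisted IP and a spammy URL may deserve different handling.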
What kind of spam will it detect? I typed in this and it still allowed it to pass through:
adsklfjadls;kfjals;dkjfal;ksdjfkl;adshgkdfhsjgkl;j dsfl;kjdasl;kjfadkshfl;adksgl;kdfsg - ASCIIBin
Still looks nice so thank-you mate. Might use it for something later *copy**paste**save*.
jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!
At the top he lists 4 things: it checks your IP against 2 different databases and checks any URLs in the post against another database. He said your email and name as well, so I'd just assume it checks for a "valid" email and does some common filters for text.
It might be kinda hard to filter a post though, because if it's code it will have typos, so you can't just filter that. Maybe if you detect spam based on average length of words or something...
Only Akismet checks the content, the other services check URLs and IPs from known spammers.
What kind of content would be considered spam?
Anything previously submitted by WordPress blog posters as spam. Could be anything!
Does it work? How many spams have you blocked?
Ahhh OK, I'll give it another shot later. What do you mean by wordpress blog posters?
Doesn't work.
Thanks for the code. But it really should be simpler to check SURBL and Akismet.
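For what it's worth, a stripped-down SURBL check without the PEAR class might start from something like this sketch. surblQueryName() is a made-up helper, multi.surbl.org is assumed as the combined SURBL zone, and taking the last two host labels is a naive stand-in for proper registered-domain detection (it gets domains like example.co.uk wrong).

```php
<?php
// Sketch only: build the name to query against SURBL for a URL.
// surblQueryName() is a hypothetical helper; the two-label rule is naive.
function surblQueryName($url, $zone = 'multi.surbl.org')
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no scheme/host, nothing to look up
    }
    $labels = explode('.', strtolower($host));
    $domain = implode('.', array_slice($labels, -2));
    return $domain . '.' . $zone;
}

$name = surblQueryName('http://www.example.com/page');
echo $name, "\n";
// A listed domain resolves to a 127.0.0.x address; gethostbyname() returns
// its argument unchanged when the lookup fails:
// $listed = ($name !== null) && (gethostbyname($name) !== $name);
```

The actual lookup is left commented out because it needs network access and a resolver that SURBL will answer.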