Thread: [PHP] Spam Detection Class

  1. #1
    Jordan Guest

    [PHP] Spam Detection Class

    I've been fighting a lot of spam over at ASCIIBin. In order to minimize it I wrote a class that attempts to detect spam and reject the content before it is submitted. It uses:
    • Akismet - Checks against Content, URL, Name, Email
    • SURBL - All URLs in content are parsed and checked against this spam database. This database contains URLs submitted via spam emails.
    • Spamhaus - Checks IP for known spammers
    • SpamCop - Checks IP for known spammers

    Prerequisites
    This class depends on two other classes, which you'll need to download and install, plus an API key.
    1. PEAR::Net_DNSBL - Use pear to install (pear install Net_DNSBL)
    2. PHP5Akismet - Download and extract the archive. Rename the folder to Akismet
    3. A WordPress API key, needed for Akismet - obtain one at http://en.wordpress.com/api-keys/

    The PEAR class reads its nameserver list from /etc/resolv.conf. If PHP's open_basedir restriction is in effect, you'll need to put a file named '.resolv.conf' in the directory executing this script. On Windows you can create \etc\resolv.conf or place .resolv.conf in the executing directory. resolv.conf contains the list of nameservers that Net_DNSBL needs in order to send its DNS queries (TCP/UDP packets).
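    For context on why the nameservers matter: a DNSBL check is an ordinary DNS lookup. The octets of the IP address are reversed, the blacklist zone is appended, and the address counts as listed if the resulting name resolves. The sketch below shows that general mechanism only. The helper names dnsblQueryName() and isListedOn() are made up for illustration, and this is not a description of how Net_DNSBL is implemented internally.

```php
<?php
// Hypothetical helpers for illustration; not part of Net_DNSBL.

/**
 * Build the DNS name that is queried for an IPv4 address on a
 * DNSBL zone: the octets are reversed and the zone is appended.
 */
function dnsblQueryName($ip, $zone) {
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        throw new InvalidArgumentException("Not an IPv4 address: $ip");
    }
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

/**
 * True if the zone answers with an A record for the address.
 * This performs a real DNS lookup through the system resolver.
 */
function isListedOn($ip, $zone) {
    return checkdnsrr(dnsblQueryName($ip, $zone), 'A');
}

echo dnsblQueryName('41.110.2.2', 'bl.spamcop.net'), "\n";
// prints 2.2.110.41.bl.spamcop.net
```

    Only dnsblQueryName() runs without network access; isListedOn() depends on a working resolver, which is the same requirement the resolv.conf note above describes.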

    The Script
    Code:
    <?php
    // {{{ Header
    /**
     * ASCII Post/Comment Spam Checker
     *
     * PHP versions 5
     *
     * LICENSE:
     *
     * Copyright (c) 2008 CodeCall.net
     * All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted under the terms of the BSD License.
     *
     * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
     * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
     * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
     * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
     * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
     * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
     * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
     * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
     * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
     * POSSIBILITY OF SUCH DAMAGE.
     *
     *
     * @category  Copy and Paste
     * @author    Jordan (CodeCall.net)
     * @date      9-30-2008
     * @version   1.0
     * @link      http://www.codecall.net
     * @copyright 2008 CodeCall.net
     * @uses          PEAR::Net_DNSBL
     * @uses          Akismet (http://www.achingbrain.net/stuff/php/akismet)
     * 
     * 
     */

    // }}}
    // {{{ Includes


    /**
     * Include our spam checking third-party 
     * classes. 
     * 
     * Akismet is local
     * Net/* is PEAR
     * 
     */
    require('Akismet/Akismet.class.php');
    require('Net/DNSBL.php');
    require('Net/DNSBL/SURBL.php');

    // }}}
    // {{{ Class
    class SpamChecker {
        
        
        //{{{ Members
        /*
         * The actual text of the comment or
         * submission data
         */
        private $__comment;

        /*
         * The "used" name of the submitter.
         * This value may be blank.
         */
        private $__name;

        /*
         * The email used to submit content, if
         * included.
         */
        private $__email;

        /*
         * The URL used to submit, if included
         */
        private $__url;

        /*
         * WordPress API key
         * Needed for Akismet, can be
         * obtained from http://en.wordpress.com/api-keys/
         */
        private $__wordPressApiKey;

        /*
         * The site running the spam test
         * Your site URL (http://www.you.com)
         */
        private $__ownerSiteUrl;
        //}}}
        
        // {{{ methods
        /**
         * Constructor for SpamChecker
         *
         * Declared as __construct, which works on PHP 5 and later;
         * class-named constructors are no longer run as constructors
         * in PHP 8.
         *
         * @param string $comment
         * @param string $name
         * @param string $email
         * @param string $url
         * @param string $wordPressApiKey
         * @param string $ownerSiteUrl
         */
        public function __construct($comment, $name = "", $email = "",
                                    $url = "", $wordPressApiKey = "", $ownerSiteUrl = "") {
            /*
             * Apply our local variables to the class
             * members
             */
            $this->__comment         = $comment;
            $this->__name            = $name;
            $this->__email           = $email;
            $this->__url             = $url;
            $this->__wordPressApiKey = $wordPressApiKey;
            $this->__ownerSiteUrl    = $ownerSiteUrl;
        }
        
        
        /**
         * Check for spam using different methods.
         * This function is more of a controller
         * that executes other, private functions. An
         * associative array is returned with one bool
         * per check plus a combined 'Spam' flag.
         *
         * true  = detected spam
         * false = did not detect spam
         *
         * @return array
         */
        public function isSpam() {
            /*
             * Create generic array
             */
            $spamResults = array();

            /*
             * Check against Akismet
             */
            $spamResults['Akismet']    = $this->checkAkismet();

            /*
             * Check against IP Black
             * Lists
             */
            $spamResults['BlackLists'] = $this->checkBlackLists();

            /*
             * Scan content URLs against previously submitted
             * URL spam database
             */
            $spamResults['SpamURLs']   = $this->scanContentUrls();

            /*
             * Set global Spam flag
             */
            $spamResults['Spam'] = ($spamResults['Akismet'] || $spamResults['BlackLists'] || $spamResults['SpamURLs']) ? true : false;

            /*
             * Return array
             */
            return $spamResults;
        }
        
        
        /**
         * Check the comment for spam against
         * Akismet. Akismet is the popular WordPress
         * blogging comment spam checker. It works extremely
         * well but may not work in all circumstances if all
         * data is not provided.
         *
         * @return bool
         */
        private function checkAkismet() {
            /*
             * Create the class and add
             * parameters
             */
            $akismet = new Akismet($this->__ownerSiteUrl, $this->__wordPressApiKey);
            $akismet->setCommentAuthor($this->__name);
            $akismet->setCommentAuthorEmail($this->__email);
            $akismet->setCommentAuthorURL($this->__url);
            $akismet->setCommentContent($this->__comment);
            //$akismet->setPermalink('http://www.example.com/blog/alex/someurl/');

            /*
             * Run the test
             */
            if ($akismet->isCommentSpam()) {
                // Found spam
                return true;
            } else {
                return false;
            }
        }
        
        
        /**
         * Check the IP address against IP
         * black lists. If the IP is found in
         * the database, the user has already
         * been turned in for email or content
         * spam by another user.
         *
         * Uses two well known services:
         *    spamcop.net
         *    spamhaus.org
         *
         * @uses PEAR::Net_DNSBL
         * @return bool
         */
        private function checkBlackLists() {
            /*
             * Create class
             */
            $dnsbl = new Net_DNSBL();

            /*
             * Obtain the IP address of the person
             * submitting content
             */
            $remoteIp = $_SERVER['REMOTE_ADDR'];

            /*
             * Set the black lists to check
             * against
             */
            $dnsbl->setBlacklists(array('sbl-xbl.spamhaus.org', 'bl.spamcop.net'));

            /*
             * Run the Check
             */
            if ($dnsbl->isListed($remoteIp)) {
                // Found Spam
                return true;
            }

            /*
             * Nothing found, return
             * false
             */
            return false;
        }
        
        
        /**
         * Take the content of the submitted
         * comment/data and extract all URLs.
         * Send each URL to checkUrlForSpam()
         * to receive a response.
         *
         * @return bool
         */
        private function scanContentUrls() {
            /*
             * Match all URLs
             */
            preg_match_all("((https?|ftp|gopher|telnet|file|notes|ms-help):((//)|(\\\\))+[\w\d:#@%/;$()~_?\+-=\\\.&]*)",
                           $this->__comment, $urlArray, PREG_SET_ORDER);

            /*
             * Cycle through and submit
             */
            foreach ($urlArray as $url) {
                if ($this->checkUrlForSpam(trim($url[0]))) {
                    // Found spam so exit
                    return true;
                }
            }

            /*
             * No spam links found
             */
            return false;
        }
        
        
        /**
         * Check a URL against the SPAM
         * database to determine if it is
         * a SPAM submitted URL
         *
         * @param string $url
         * @return bool
         */
        private function checkUrlForSpam($url) {
            /*
             * Create a new DNS URL class
             * and check it against the URL
             * database
             */
            $surbl = new Net_DNSBL_SURBL();
            if ($surbl->isListed($url)) {
                // Spam
                return true;
            }

            /*
             * Nothing found, return
             * false
             */
            return false;
        }
        
        
        
    // }}}
        
    }
    // }}}
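    To make the URL-scanning step easier to follow, here is a minimal standalone sketch of how preg_match_all() with PREG_SET_ORDER yields one entry per match, with the full matched text at index 0. The function name extractUrls() is hypothetical, and the pattern is a deliberately simplified stand-in (http/https only), not the class's own expression.

```php
<?php
// Hypothetical helper; the pattern is a simplified stand-in for the
// longer regular expression used in scanContentUrls().
function extractUrls($text) {
    preg_match_all('~https?://[^\s<>"\']+~i', $text, $matches, PREG_SET_ORDER);

    $urls = array();
    foreach ($matches as $match) {
        // With PREG_SET_ORDER, $match[0] is the full text of one match.
        $urls[] = trim($match[0]);
    }
    return $urls;
}

// One URL is found in the sample spam comment from the test script.
print_r(extractUrls('What charming message http://www.zulucutie.com'));

// Text without links yields an empty array, so nothing is sent to SURBL.
print_r(extractUrls('no links in this text'));
```

    Content that contains no URLs never reaches the SURBL check at all; only Akismet and the IP blacklists apply to it.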
    Example Usage:
    All values are of known spammers at the time of posting.
    Code:
    <?php
    // {{{ Header
    /**
     * ASCII Post/Comment Spam Checker Test 
     *
     * PHP versions 5
     *
     * LICENSE:
     *
     * Copyright (c) 2008 CodeCall.net
     * All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted under the terms of the BSD License.
     *
     * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
     * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
     * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
     * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
     * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
     * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
     * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
     * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
     * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
     * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
     * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
     * POSSIBILITY OF SUCH DAMAGE.
     *
     *
     * @category  Copy and Paste
     * @author    Jordan (CodeCall.net)
     * @date      9-30-2008
     * @version   1.0
     * @link      http://www.codecall.net
     * @copyright 2008 CodeCall.net
     * @uses          PEAR::Net_DNSBL
     * @uses          Akismet (http://www.achingbrain.net/stuff/php/akismet)
     * 
     * 
     */

    error_reporting(E_ALL);

    // }}}
    // {{{ Implementation

    /*
     * Include the class file
     */
    include "SpamChecker.php";

    /*
     * Needed data
     */
    $wordPressApiKey = 'APIKEY';
    $ownerUrl        = 'http://www.jordandelozier.com';

    /*
     * Create known spam Akismet variables for
     * testing purposes. We are looking
     * for false.
     */
    $akismetSpam = array('comment' => 'What charming message http://www.zulucutie.com',
                         'name'    => 'lanellgiz',
                         'email'   => 'latesha@buyclialis.info',
                         'url'     => 'df3gd.com',
                         'ip'      => '89.28.114.111'
                         );

    /*
     * Now we want to make the black lists fail.
     * It should have passed above and been blank.
     * In order to make the black lists fail, we need
     * to override a server setting.
     */
    $_SERVER['REMOTE_ADDR'] = '41.110.2.2';

    /*
     * Create a new instance of the class
     */
    $spamChecker = new SpamChecker($akismetSpam['comment'], $akismetSpam['name'],
                                   $akismetSpam['email'], $akismetSpam['url'],
                                   $wordPressApiKey, $ownerUrl);

    /*
     * Run it and print the results
     */
    echo "<pre>";
    print_r($spamChecker->isSpam());
    echo "</pre>";



    // }}}
    Output:
    The class returns an associative array containing the boolean result of each test. If any test is true, the 'Spam' key will also be true (print_r displays true as 1).

    Array
    (
    [Akismet] => 1
    [BlackLists] => 1
    [SpamURLs] => 1
    [Spam] => 1
    )
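    As a rough sketch of how a caller might act on this array, the snippet below lists which individual checks flagged a submission. The helper flaggedChecks() is hypothetical (not part of the class), and the sample array is hard-coded in the same shape as the output above so the snippet runs without the class or any network access.

```php
<?php
// Hypothetical helper: list which individual checks flagged the post.
function flaggedChecks(array $results) {
    $flagged = array();
    foreach ($results as $check => $isSpam) {
        // Skip the combined 'Spam' flag; keep only the individual checks.
        if ($check !== 'Spam' && $isSpam) {
            $flagged[] = $check;
        }
    }
    return $flagged;
}

// Sample result in the same shape as the array returned by isSpam().
$results = array(
    'Akismet'    => true,
    'BlackLists' => false,
    'SpamURLs'   => true,
    'Spam'       => true,
);

if ($results['Spam']) {
    echo 'Rejected by: ' . implode(', ', flaggedChecks($results)) . "\n";
}
// prints "Rejected by: Akismet, SpamURLs"
```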
    See it in Action!
    Visit ASCIIBin and submit any content. This class is executed before the content is accepted. If you are a spammer, or are submitting spam, the submission will be rejected.

  3. #2
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: [PHP] Spam Detection Class

    What kind of spam will it detect? I typed in this and it still let it through:
    adsklfjadls;kfjals;dkjfal;ksdjfkl;adshgkdfhsjgkl;j dsfl;kjdasl;kjfadkshfl;adksgl;kdfsg - ASCIIBin

    Still looks nice so thank-you mate. Might use it for something later *copy**paste**save*.
    jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
    Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!

  4. #3
    Join Date
    Apr 2009
    Location
    Trapped in my own little world.
    Posts
    2,487
    Rep Power
    33

    Re: [PHP] Spam Detection Class

    At the top he lists 4 things: it checks your IP against 2 different databases and checks any URLs in the post against another database. He mentioned your email and name as well, so I'd just assume it checks for a "valid" email and runs some common filters on the text.

    It might be kinda hard to filter a post though, because if it's code it will look like typos, so you can't just filter on that. Maybe you could detect spam based on the average length of words or something...

  5. #4
    Jordan Guest

    Re: [PHP] Spam Detection Class

    Only Akismet checks the content, the other services check URLs and IPs from known spammers.

  6. #5
    Join Date
    Apr 2009
    Location
    Trapped in my own little world.
    Posts
    2,487
    Rep Power
    33

    Re: [PHP] Spam Detection Class

    What kind of content would be considered spam?

  7. #6
    Jordan Guest

    Re: [PHP] Spam Detection Class

    Anything that WordPress bloggers have previously reported as spam. Could be anything!

  8. #7
    relapse is offline Programming Expert
    Join Date
    Jul 2009
    Location
    Intrawebs
    Posts
    479
    Blog Entries
    2
    Rep Power
    0

    Re: [PHP] Spam Detection Class

    Does it work? How many spams have you blocked?

  9. #8
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: [PHP] Spam Detection Class

    Ahhh OK, I'll give it another shot later. What do you mean by WordPress blog posters?

  10. #9
    pkiula is offline Newbie
    Join Date
    Apr 2010
    Posts
    1
    Rep Power
    0

    Re: [PHP] Spam Detection Class

    Doesn't work.

    Thanks for the code. But it really should be simpler to check SURBL and Akismet.
