
Thread: RSS Reader - Perl

  1. #1
    Join Date
    Aug 2009
    Location
    ~/
    Posts
    918
    Rep Power
    19

    RSS Reader - Perl

    This script parses the CodeCall RSS 2.0 Feed on the front page.

    I was reading an Announcements post in the General Forum about RSS feeds
    and got the idea to create a Perl script for my member website.
    Then I figured... why not kill two birds with one script and make a tutorial as well?

    Here is the complete script:
    Code:
    #!/usr/bin/perl
    
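    $feed_link="http://forum.codecall.net/external.php?type=RSS2";
    $feed_file="/home/debtboy/cgi-bin/feed.txt";
    
    # List form of system() bypasses the shell, so the URL reaches wget
    # untouched; system() returns 0 when wget succeeds.
    system("wget", "-q", $feed_link, "-O", $feed_file) == 0
      or die "wget failed for $feed_link\n";
    
    open(RSSFILE, "<", $feed_file) or die "Cannot open $feed_file: $!\n";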
    
    $item_detect=0;
    $header_title=0;
    $header_desc=0;
    
    while(<RSSFILE>)
    {
      if($item_detect < 2)   # header <title> and <description> not both seen yet
      {
        if(/<title>/../<\/title>/)
        {
          $whitespace = index $_, "<";
          $title_string = substr $_, $whitespace;
          $title_string =~ s/<title>//g;
          $title_string =~ s/<\/title>//g;
          chomp($title_string);
          print "\n\nRSS FEED:           $title_string\n";
          $header_title=1;
        }
        elsif(/<description>/../<\/description>/)
        {
          $whitespace = index $_, "<";
          $description_string = substr $_, $whitespace;
          $description_string =~ s/<description>//g;
          $description_string =~ s/<\/description>//g;
          chomp($description_string);
          print "FEED DESCRIPTION:   $description_string\n\n\n";
          $header_desc=1;
        }
        $item_detect = ($header_title + $header_desc);
      }
      else
      {
        if(/<item>/../<\/item>/)
        {
          if(/<title>/../<\/title>/)
          {
            $whitespace = index $_, "<";
            $title_string = substr $_, $whitespace;
            $title_string =~ s/<title>//g;
            $title_string =~ s/<\/title>//g;
            chomp($title_string);
            print "Title:     $title_string\n";
          }
          elsif(/<link>/../<\/link>/)
          {
            $whitespace = index $_, "<";
            $link_string = substr $_, $whitespace;
            $link_string =~ s/<link>//g;
            $link_string =~ s/<\/link>//g;
            chomp($link_string);
            print "URL Link:  $link_string\n";
          }
          elsif(/<pubDate>/../<\/pubDate>/)
          {
            $whitespace = index $_, "<";
            $date_string = substr $_, $whitespace;
            $date_string =~ s/<pubDate>//g;
            $date_string =~ s/<\/pubDate>//g;
            chomp($date_string);
            print "Date:      $date_string\n";
            print "Posting:   ";
          }
          elsif(/<description>/../<\/description>/)
          {
            $whitespace = index $_, "<";
            $description_string = substr $_, $whitespace;
            $description_string =~ s/<description>//g;
            $description_string =~ s/<\/description>//g;
            chomp($description_string);
            print "$description_string";
          }
          elsif(/<category /../<\/category>/)
          {
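            # Drop the first 5 characters (indentation plus the opening "<")
            # so the next "<" that index() finds belongs to the closing tag.
            $temp = substr $_, 5;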
            $end_tag = index $temp, "<";
            $begin_tag = index $temp, ">";
            $category_length = $end_tag - ($begin_tag + 1);
            $category_string = substr $temp, ($begin_tag + 1), $category_length; 
            print "\nCategory:  $category_string\n";
          }
          elsif(/<dc:creator>/../<\/dc:creator>/)
          {
            $whitespace = index $_, "<";
            $user_string = substr $_, $whitespace;
            $user_string =~ s/<dc:creator>//g;
            $user_string =~ s/<\/dc:creator>//g;
            chomp($user_string);
            print "User:      $user_string\n\n";
            print "----------------------------------------------\n\n";
          }
        }   
      }
    }

    In this first section I download the feed and save it as feed.txt
    (the absolute paths are the ones used on my member website).

    Notice that I dropped to the system for wget...
    I do much more shell scripting than Perl, so it was easier.

    Here is the First Section:
    Code:
    #!/usr/bin/perl
    
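    $feed_link="http://forum.codecall.net/external.php?type=RSS2";
    $feed_file="/home/debtboy/cgi-bin/feed.txt";
    
    # List form of system() bypasses the shell, so the URL reaches wget
    # untouched; system() returns 0 when wget succeeds.
    system("wget", "-q", $feed_link, "-O", $feed_file) == 0
      or die "wget failed for $feed_link\n";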
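
Since this section drops to the system for wget, here is a minimal sketch of doing that a little more defensively: the list form of system() skips the shell, and the return value is checked so a failed download is not parsed as if it had worked. The helper names run_checked and wget_command are hypothetical and are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run an external command without going through the shell and die if it
# fails. run_checked is a hypothetical helper name, not part of the tutorial.
sub run_checked {
    my @cmd = @_;
    my $rc = system(@cmd);    # list form with several elements: no shell
    if ($rc == -1) {
        die "could not start $cmd[0]: $!\n";
    }
    if ($rc & 127) {
        die sprintf("%s was killed by signal %d\n", $cmd[0], $rc & 127);
    }
    if ($rc != 0) {
        die sprintf("%s exited with status %d\n", $cmd[0], $rc >> 8);
    }
    return 1;
}

# Build the wget argument list used in the tutorial for a URL and a file.
sub wget_command {
    my ($url, $file) = @_;
    return ('wget', '-q', $url, '-O', $file);
}
```

With these in place, the download step becomes run_checked(wget_command($feed_link, $feed_file)); and the script stops with a message if wget cannot fetch the feed.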

    In this second section, I open the downloaded file
    (in read mode) and set up a while loop that looks for
    particular RSS 2.0 tags, matching them with regular expressions
    in if/elsif control structures.
    The outer if structure separates the header from the items
    by flagging when the header's <title> and <description> tags
    have been processed, meaning that we have moved past the
    header area and are now looking for <item> tags.
    The inner if structures do the parsing and print
    the output.
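
The header/item split described above can also be sketched with regex captures instead of index and substr. This is only an illustration under a few assumptions: each element sits on a single line, and the switch from header to items is the <item> tag itself rather than a count of header tags. The names tag_text and parse_feed_lines are hypothetical.

```perl
use strict;
use warnings;

# Return the text between <tag ...> and </tag> when both are on one line,
# or undef when the line does not contain that element.
sub tag_text {
    my ($line, $tag) = @_;
    return $line =~ m{<\Q$tag\E(?:\s[^>]*)?>(.*?)</\Q$tag\E>} ? $1 : undef;
}

# Walk the feed line by line. Anything seen before the first <item> goes
# into the header hash; anything inside <item>...</item> goes into an item.
sub parse_feed_lines {
    my @lines = @_;
    my (%header, @items, $current);
    for my $line (@lines) {
        if ($line =~ /<item>/) {          # start collecting a new item
            $current = {};
            next;
        }
        if ($line =~ m{</item>}) {        # item finished, store it
            push @items, $current if $current;
            undef $current;
            next;
        }
        for my $tag (qw(title link pubDate description category dc:creator)) {
            my $text = tag_text($line, $tag);
            next unless defined $text;
            my $target = $current ? $current : \%header;
            # Keep the first value seen for each tag.
            $target->{$tag} = $text unless exists $target->{$tag};
        }
    }
    return (\%header, \@items);
}
```

Calling my ($header, $items) = parse_feed_lines(<RSSFILE>); would give a hash for the feed header and an array of hashes, one per item, that can then be printed in whatever layout is needed.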


    Here is the Second Section:

    Code:
    open(RSSFILE, "<", $feed_file) or die "Cannot open $feed_file: $!\n";
    
    $item_detect=0;
    $header_title=0;
    $header_desc=0;
    
    while(<RSSFILE>)
    {
      if($item_detect < 2)   # header <title> and <description> not both seen yet
      {
        if(/<title>/../<\/title>/)
        {
          $whitespace = index $_, "<";
          $title_string = substr $_, $whitespace;
          $title_string =~ s/<title>//g;
          $title_string =~ s/<\/title>//g;
          chomp($title_string);
          print "\n\nRSS FEED:           $title_string\n";
          $header_title=1;
        }
        elsif(/<description>/../<\/description>/)
        {
          $whitespace = index $_, "<";
          $description_string = substr $_, $whitespace;
          $description_string =~ s/<description>//g;
          $description_string =~ s/<\/description>//g;
          chomp($description_string);
          print "FEED DESCRIPTION:   $description_string\n\n\n";
          $header_desc=1;
        }
        $item_detect = ($header_title + $header_desc);
      }
      else
      {
        if(/<item>/../<\/item>/)
        {
          if(/<title>/../<\/title>/)
          {
            $whitespace = index $_, "<";
            $title_string = substr $_, $whitespace;
            $title_string =~ s/<title>//g;
            $title_string =~ s/<\/title>//g;
            chomp($title_string);
            print "Title:     $title_string\n";
          }
          elsif(/<link>/../<\/link>/)
          {
            $whitespace = index $_, "<";
            $link_string = substr $_, $whitespace;
            $link_string =~ s/<link>//g;
            $link_string =~ s/<\/link>//g;
            chomp($link_string);
            print "URL Link:  $link_string\n";
          }
          elsif(/<pubDate>/../<\/pubDate>/)
          {
            $whitespace = index $_, "<";
            $date_string = substr $_, $whitespace;
            $date_string =~ s/<pubDate>//g;
            $date_string =~ s/<\/pubDate>//g;
            chomp($date_string);
            print "Date:      $date_string\n";
            print "Posting:   ";
          }
          elsif(/<description>/../<\/description>/)
          {
            $whitespace = index $_, "<";
            $description_string = substr $_, $whitespace;
            $description_string =~ s/<description>//g;
            $description_string =~ s/<\/description>//g;
            chomp($description_string);
            print "$description_string";
          }
          elsif(/<category /../<\/category>/)
          {
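            # Drop the first 5 characters (indentation plus the opening "<")
            # so the next "<" that index() finds belongs to the closing tag.
            $temp = substr $_, 5;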
            $end_tag = index $temp, "<";
            $begin_tag = index $temp, ">";
            $category_length = $end_tag - ($begin_tag + 1);
            $category_string = substr $temp, ($begin_tag + 1), $category_length; 
            print "\nCategory:  $category_string\n";
          }
          elsif(/<dc:creator>/../<\/dc:creator>/)
          {
            $whitespace = index $_, "<";
            $user_string = substr $_, $whitespace;
            $user_string =~ s/<dc:creator>//g;
            $user_string =~ s/<\/dc:creator>//g;
            chomp($user_string);
            print "User:      $user_string\n\n";
            print "----------------------------------------------\n\n";
          }
        }   
      }
    }

    Here are 2 images of the output in my shell, which show only
    a portion of the total output.

    [2 screenshots of the shell output]

    Here is the total output, copied and pasted from my terminal:

    Code:
    RSS FEED:           CodeCall Programming Forum
    FEED DESCRIPTION:   Community for Programmers and Developers with experts in C++, C#, Visual Basic, Java, Javascript, CGI, HTML and More!
    
    
    Title:     Writing Problem
    URL Link:  http://forum.codecall.net/java-help/21502-writing-problem.html
    Date:      Fri, 09 Oct 2009 21:30:05 GMT
    Posting:   im working on a note project where u can write notes down etc. but im stucked because i dont know how to make the whole screen editable. anyone could help me please?
    Category:  Java Help
    User:      Flezria
    
    ----------------------------------------------
    
    Title:     which language is the best match for my envisioned project
    URL Link:  http://forum.codecall.net/general-programming/21501-language-best-match-my-envisioned-project.html
    Date:      Fri, 09 Oct 2009 21:26:20 GMT
    Posting:   <![CDATA[Hello,
    Category:  General Programming
    User:      Codix
    
    ----------------------------------------------
    
    Title:     Multiple insertion with checkbox
    URL Link:  http://forum.codecall.net/php-forum/21500-multiple-insertion-checkbox.html
    Date:      Fri, 09 Oct 2009 20:35:09 GMT
    Posting:   <![CDATA[I have a database table with program details. It has program ID (proid), program name (pronm) and others. I have selected all the row from the program table using a while loop and associated a checkbox with each program. The following is the code.<?php<form method=post action=rating_insert.php>";<table width=450 border=0 cellspacing=0 cellpadding=3>";<tr><td><input name=chck[] type=checkbox value=$rec_row[proid]/></td><td colspan=3>$rec_row[pronm]</td>";<td><input name=chck[] type=checkbox value=$rec_row[proid] /></td><td colspan=3>$rec_row[pronm]</td></tr>";<tr><td></td><td></td><td align=right><input name=submit type=submit value='Submit' /></td></tr>";</table>";</form>";
    Category:  PHP Forum
    User:      liri ayekpam
    
    ----------------------------------------------
    
    Title:     wht to learn after HTML ?
    URL Link:  http://forum.codecall.net/html-programming/21498-wht-learn-after-html.html
    Date:      Fri, 09 Oct 2009 19:31:06 GMT
    Posting:   what i have to learn after html ?
    Category:  HTML Programming
    User:      alexsmith
    
    ----------------------------------------------
    
    Title:     Can you spot the error in this program?
    URL Link:  http://forum.codecall.net/java-help/21497-can-you-spot-error-program.html
    Date:      Fri, 09 Oct 2009 19:29:09 GMT
    Posting:   <![CDATA[Hello,< car_list.length; k++)<carList.length;k++)< carList.length-1; k++)< carList[k+1].getModel())< carList.length-1; k++)< carList[k+1].getMileage())
    Category:  Java Help
    User:      heidi7
    
    ----------------------------------------------
    
    Title:     Why do you program/Why are you learning to program?
    URL Link:  http://forum.codecall.net/lounge/21496-why-do-you-program-why-you-learning-program.html
    Date:      Fri, 09 Oct 2009 18:34:53 GMT
    Posting:   <![CDATA[Hi,
    Category:  The Lounge
    User:      taylerhughes
    
    ----------------------------------------------
    
    Title:     NEED HELP!!!
    URL Link:  http://forum.codecall.net/java-help/21495-need-help.html
    Date:      Fri, 09 Oct 2009 17:37:12 GMT
    Posting:   Hello everyone, can someone help me with this please?
    Category:  Java Help
    User:      JackDaniels
    
    ----------------------------------------------
    
    Title:     how to learn SSL !
    URL Link:  http://forum.codecall.net/general-programming/21494-how-learn-ssl.html
    Date:      Fri, 09 Oct 2009 16:53:41 GMT
    Posting:   hello . 
    Category:  General Programming
    User:      alexsmith
    
    ----------------------------------------------
    
    Title:     what wrong
    URL Link:  http://forum.codecall.net/perl/21492-what-wrong.html
    Date:      Fri, 09 Oct 2009 16:14:04 GMT
    Posting:   <![CDATA[<STDIN>);
    Category:  Perl
    User:      kiddies
    
    ----------------------------------------------
    
    Title:     Four Word Game reader
    URL Link:  http://forum.codecall.net/community-projects/21489-four-word-game-reader.html
    Date:      Fri, 09 Oct 2009 15:20:57 GMT
    Posting:   <![CDATA[I just made a small program I want to share with you :) The program will look for all posts in the new "Four Word Game" and put them together to a whole story. The program is written in python and here's the code :D :<div id="post_message_' + post + '">') <div id="post_message_' + post + '">'):content.index("</div>",Index)]<i>Posted via <a href="http://codecall.mobi" target="_blank">CodeCall Mobile</a></i>',"")
    Category:  Community Projects
    User:      Vswe
    
    ----------------------------------------------
    
    Title:     C# Generate Fixtures
    URL Link:  http://forum.codecall.net/c-programming/21488-c-generate-fixtures.html
    Date:      Fri, 09 Oct 2009 14:43:51 GMT
    Posting:   <![CDATA[Attached is a project that lets you control a sports league.
    Category:  C# Programming
    User:      matio
    
    ----------------------------------------------
    
    Title:     <![CDATA[[C++][WinAPI] Exceptions & callback]]>
    URL Link:  http://forum.codecall.net/c-c/21482-c-winapi-exceptions-callback.html
    Date:      Fri, 09 Oct 2009 13:14:08 GMT
    Posting:   <![CDATA[When i`m trying to throw an exception... :
    Category:  C and C++
    User:      winuser
    
    ----------------------------------------------
    
    Title:     <![CDATA[[Help]WYSIWYG Editor[Help]]]>
    URL Link:  http://forum.codecall.net/website-design/21481-help-wysiwyg-editor-help.html
    Date:      Fri, 09 Oct 2009 13:02:36 GMT
    Posting:   <![CDATA[Okay so i have downloaded a trial version of Adobe Dreamweaver so i can update my beta version of a text based game i am making due to the fact i am trying to make it unique as i am ok at making graphics.
    Category:  Website Design
    User:      Junes
    
    ----------------------------------------------
    
    Title:     Development tools for linux.
    URL Link:  http://forum.codecall.net/software-development-tools/21480-development-tools-linux.html
    Date:      Fri, 09 Oct 2009 12:53:43 GMT
    Posting:   Hi, im new to programing (still on algorithms and flowcharts part), my question is: Is there a version of Raptor flowchart for linux or a tool that resembles this same 1? i also work (study) with Editpad pro wich i found out that can be installed in Linux through Wine.
    Category:  Software Development Tools
    User:      Trigg3r
    
    ----------------------------------------------
    
    Title:     internet speed test in mbps
    URL Link:  http://forum.codecall.net/computer-hardware/21478-internet-speed-test-mbps.html
    Date:      Fri, 09 Oct 2009 11:37:00 GMT
    Posting:   i would like to test my internet speed in mbps,last week i test my internet speed here ip-details.com/internet-speed-test/, it shown in kbps i need the right website to test in mbps,please let me know if there is anything.
    Category:  Computer Hardware
    User:      johntaylor
    
    ----------------------------------------------
    Please keep in mind that this Feed Reader was specifically designed
    to demonstrate parsing of the CodeCall RSS 2.0 Feed on the front page.
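
Several Posting: lines in the output above still start with <![CDATA[ and stop after the first line. As far as I can tell, that is because the script handles each line on its own: a continuation line with no "<" makes index return -1, so only the trailing newline survives. A rough sketch of one way around it, assuming the whole feed has been read into a single string, is shown here; strip_cdata and descriptions are hypothetical names.

```perl
use strict;
use warnings;

# Remove a <![CDATA[ ... ]]> wrapper when the whole value is wrapped in one.
sub strip_cdata {
    my ($text) = @_;
    $text =~ s/\A\s*<!\[CDATA\[(.*)\]\]>\s*\z/$1/s;
    return $text;
}

# Return the text of every <description> element in $xml, including ones
# that span several lines. The /s flag lets "." match newlines.
sub descriptions {
    my ($xml) = @_;
    my @found;
    while ($xml =~ m{<description>(.*?)</description>}gs) {
        my $text = strip_cdata($1);
        $text =~ s/\s*\n\s*/ /g;    # join continuation lines with one space
        push @found, $text;
    }
    return @found;
}
```

The same strip_cdata helper could be applied to titles such as the [C++][WinAPI] one, which also arrive wrapped in CDATA.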

    NOTE:
    There are Perl modules designed to simplify this very process
    (see: cpan.org), but the original script was written as a CGI script
    for an environment I don't control.
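
For comparison, here is a hedged sketch of what the module route might look like with XML::RSS, one of the CPAN modules for this job. It assumes XML::RSS is installed (it is not part of core Perl) and that the feed is already in a string; summarize_feed is a hypothetical name, and the sub simply returns undef when the module is not available.

```perl
use strict;
use warnings;

# Parse an RSS document held in a string with XML::RSS (a CPAN module that
# has to be installed separately). Returns undef when the module is missing.
sub summarize_feed {
    my ($xml) = @_;
    return undef unless eval { require XML::RSS; 1 };

    my $rss = XML::RSS->new;
    $rss->parse($xml);

    my @items;
    for my $item (@{ $rss->{items} }) {
        push @items, {
            title => $item->{title},
            link  => $item->{link},
            date  => $item->{pubDate},
            # Dublin Core fields are expected under {dc} when the feed
            # declares that namespace; this is undef otherwise.
            user  => $item->{dc}{creator},
        };
    }
    return { title => $rss->{channel}{title}, items => \@items };
}
```

The trade-off is the one mentioned in the note: the module deals with CDATA and multi-line elements for you, but it only helps where you are allowed to install it.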

    This was fun; I should do more Perl scripting.

  3. #2
    Jordan Guest

    Re: RSS Reader - Perl

    Excellent tutorial. I completely missed this one. I love the way you demonstrate with screenshots. BTW, any tutorial that works with CodeCall will work with any other vBulletin-based forum. +rep!

  4. #3

    Re: RSS Reader - Perl

    Thanks for the +rep.
    It all started with a post in the Announcements section.
    I noticed that Firefox doesn't pick up the category or user,
    so I decided to pick them up myself.
