#1
09-04-2007, 09:50 AM
joe1986
Data extraction.

Hi there, I'm new to Perl scripting, but I've been told it's quite straightforward to pick up. Anyway, I'm looking to write a small script that can be implemented in Explorer, extract specific info from a page, and export that info into a Word document or Excel file. Any pointers that could get me started?
Cheers
Joe

#2
09-04-2007, 01:16 PM
KevinADC

Extract info from a page on the internet? You don't typically use a browser for more than invoking a Perl script; this would typically be a CGI script. This forum is an example of that kind of server-side script, written in PHP.

The server runs the code; the browser just displays the formatted output from the script.
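
To make that split concrete, here is a minimal sketch of a CGI-style Perl script. It only assumes a web server configured to run Perl scripts; `build_page` is a made-up helper name, not part of any library. Everything happens on the server, and the browser just receives the printed text.

```perl
#!/usr/bin/perl
# Minimal CGI sketch: the web server executes this script, and the browser
# only receives and renders whatever the script prints.
use strict;
use warnings;

# build_page() is a hypothetical helper name used for illustration.
sub build_page {
    my ($title, $message) = @_;
    return "Content-type: text/html\n\n"    # HTTP header, then a blank line
         . "<html><head><title>$title</title></head>\n"
         . "<body><p>$message</p></body></html>\n";
}

print build_page('Hello', 'This text was generated on the server.');
```

Run from a command line, the same script simply prints the header and the HTML, which is a handy way to check it before putting it behind a web server.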

Reading a good Perl tutorial is my best pointer for now:

Beginning Perl - perl.org

Worry about your specific program requirements once you understand some basics.
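
For when you get to that point, here is a rough sketch of the "extract and export" step using only core Perl. The HTML sample, the output file name `extracted.csv`, and the helpers `extract_rows` and `to_csv_line` are all assumptions made up for illustration. It writes CSV, which Excel can open; pulling data out of HTML with regular expressions is fragile and only reasonable for simple, predictable pages.

```perl
use strict;
use warnings;

# Sample page content (invented). In a real script this string would come
# from the network, e.g. via get($url) from the CPAN module LWP::Simple.
my $html = <<'HTML';
<table>
<tr><td>Widget</td><td>4.50</td></tr>
<tr><td>Gadget, large</td><td>12.00</td></tr>
</table>
HTML

# Hypothetical helper: collect the text of every <td> cell in each <tr> row.
sub extract_rows {
    my ($page) = @_;
    my @rows;
    while ($page =~ m{<tr>(.*?)</tr>}gis) {
        my $row   = $1;
        my @cells = $row =~ m{<td>(.*?)</td>}gis;
        push @rows, [@cells];
    }
    return @rows;
}

# Hypothetical helper: quote each field so commas and quotes survive in CSV.
sub to_csv_line {
    my @fields = map { my $f = $_; $f =~ s/"/""/g; qq("$f") } @_;
    return join(',', @fields) . "\n";
}

my $out_file = 'extracted.csv';    # assumed output name
open(my $out, '>', $out_file) or die "Cannot write $out_file: $!";
print {$out} to_csv_line('Name', 'Price');
print {$out} to_csv_line(@$_) for extract_rows($html);
close($out);
```

Writing a real Word or Excel file rather than CSV would need something extra, such as Win32::OLE on a Windows machine with Office installed.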

#3
12-14-2007, 12:39 PM
jonmacpherson

Well... here's a real-world example of a very similar program, which controls Word and combines articles. Your mileage may vary. You may use all, parts, or none of this script.

I learned how to write the following program by looking at Word-controlling programs others had written.


#!c:\perl56\bin\perl
#
# Combine Articles using MS Word, and save them back to their original queue.



BEGIN {

use Cwd;
use CGI::Carp qw(fatalsToBrowser);
use Win32::OLE;
use ANPA;



require "cgi-lib.pl";
require "Gamma.pm";
require "Security2.pm";
require "cn4.lib";
require "Process.pm";
require "SubData.pm";
require "QueueAccess.pm";

$folder = "o:\\combine\\";
$sys_Universal_Prefix = "o:\\";

$InchLengthMacro = "Normal.Module1.GetInches";


}

print 'Content-type: text/html', "\n\n";

$data = new Gamma::Process();

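# Start Word through OLE; on failure, report the OLE error ($1 is a regex capture and would be empty here).
my $wrd = Win32::OLE->new('Word.Application') or die "Cannot start Word: " . Win32::OLE->LastError();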
$wrd->{'Visible'} = 1;

%lingo = (
'article.rec_type' => 0, # not used
'article.category' => 1, #
'article.date' => 2,
'article.add_date' => 3,
'article.add_time' => 4, # Used by autopurger to delete old articles
'article.exp_date' => 5,
'article.owner' => 6,
'article.active' => 7,
'article.title' => 8,
'article.author' => 9,
'article.image' => 10, # In Stone.
'article.photocap' => 11,
'article.template' => 12, #
'article.priority' => 13,
'article.intro' => 14,
'article.story' => 15,
'article.notes' => 16,
'article.relevency' => 17
);


$data->_fill_lev1( { 'lingo_record' => \%lingo } );





chdir ($folder);

opendir (DIR, $folder) || die "cannot open $folder due to $!";

@LIST = readdir DIR;
closedir DIR;


foreach $file (@LIST){

# Only look for .cmb files (case-insensitive); /g is not needed for a simple test.
if ($file =~ m/\.cmb$/i){
print "$file\n";
&combineArticles($file);

}


}


sub combineArticles {

my ($file) = @_;

$fullFile = $folder . $file;
$destQueue = $file;
$destQueue =~ s#-Q-F-.*##igs;
$destFile = $file;
$destFile =~ s#^.*-Q-F-##igs;
$destFile =~ s#\.cmb$##igs;
$slug = $destFile;
$destFile = $sys_Universal_Prefix . $destQueue . "/" . $destFile . ".doc";

$sys_record = $sys_Universal_Prefix . $destQueue . "/" . "records.gamma";


print $fullFile;
open (CMBInstr, "$fullFile")|| die "Cannot open $fullFile due to $!";
@FILENAmes = <CMBInstr>;
close CMBInstr;


my $ToDoc = $wrd->Documents->Add;

foreach $file (@FILENAmes){

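# Skip blank lines in the instruction file.
if ($file =~ m/\w/){

# Strip the line ending first, then build the full path to the .doc file.
$file =~ s#\n##g;
$file = $sys_Universal_Prefix . $destQueue . "/" . $file . ".doc";

# $! is not set by OLE calls; Win32::OLE->LastError() carries the real reason.
my $doc = $wrd->Documents->Open( $file ) || die "Cannot open $file due to " . Win32::OLE->LastError();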
$doc->Content->Copy;
print "Copying Contents of $file \n";
$doc->Close();

print "Removing Temporary File $file \n";
#system (" del \"$file\" ");

print "Pasting Contents of $file into \n \t $destFile \n";
$wrd->Run('Normal.NewMacros1.PasteText');


}

}

$ToDoc->SaveAs($destFile);

$wrd->Run($InchLengthMacro);

$ToDoc->Close();

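# Delete the processed instruction file without shelling out to "del".
unlink($fullFile) || warn "Cannot delete $fullFile due to $!";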

$t = time();


my $InchesDataFile = $destFile . ".count";
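# The .count file is expected to be written by the Word macro; warn if it is missing.
open (INCHFILE, $InchesDataFile) || warn "Cannot open $InchesDataFile due to $!";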
my (@Counttainer) = <INCHFILE>;
close INCHFILE;

unlink($InchesDataFile);

my $InchLenghtIn = join ('', @Counttainer);
$InchLenghtIn =~ s#\n##igs;

{

# Get the current date, and chop out the parts that are unwanted.
# I only want the month day of the month and time
# example: jul 19 14:23:06

my ($DateString) = "" . localtime(time());

my ($dweek, $mon, $dmon, $time, $yr) = split (' ', $DateString );

$DesiredDateString = "$mon $dmon $time";

}
$data->read($sys_record, 'record');

print "Writing changes to $sys_record \n\n";
print "Slug \t\t $slug \n";
print "Date \t\t $DesiredDateString \n";
print "TimeStamp \t $t \n";
print "Inches \t\t $InchLenghtIn Inches \n";
print "Queue \t\t $destQueue \n";

$data->_fill_lev1({'object' => 'record'});
$data->_fill_lev2('files', { 'record' => $sys_record });
$data->_fill_lev2('formdata', { 'record_index' => $slug,
'article.title' => $slug,
'article.date' => $DesiredDateString,
'article.add_date' => $t,
'article.add_time' => $t,
'article.template' => $destQueue,
'record.index' => $slug,
'article.intro' => '',
'article.notes' => $InchLenghtIn . " Inches",
'article.owner' => ""
} );

$data->_auto_save_any();
$data->write( $sys_record, 'record');

}

$wrd->Quit;
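
As a smaller, self-contained illustration of one step above, here is a sketch of how a `.cmb` file name splits into the destination queue and the slug. The `-Q-F-` separator and the `.cmb` extension come from the script; `parse_cmb_name` and the sample file name are made up for illustration. It assumes the separator appears only once in the name and, unlike the script, returns nothing for names that don't follow the convention.

```perl
use strict;
use warnings;

# Hypothetical helper: split "<queue>-Q-F-<slug>.cmb" into its two parts.
sub parse_cmb_name {
    my ($file) = @_;
    if ($file =~ m/^(.*?)-Q-F-(.*)\.cmb$/is) {
        return ($1, $2);    # (queue, slug)
    }
    return;                 # empty list: name does not follow the convention
}

# Invented example file name.
my ($queue, $slug) = parse_cmb_name('sports-Q-F-game-recap.cmb');
print "Queue: $queue\nSlug: $slug\n";
```

This needs neither Word nor the Gamma modules, so the naming convention can be checked on its own before running the full script.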

#4
12-14-2007, 12:57 PM
KevinADC

Jon,

Watch the post dates: this thread is several months old, and the OP posted this one question and never returned. Of course, I only offer that as a suggestion; you are free to post replies in any thread you wish.

--Kevin

#5
12-14-2007, 01:03 PM
jonmacpherson
Thanks

Thanks, Kevin.

Didn't even notice.