Web Spider....

19 replies to this topic

#1
OldMac

    Newbie

  • Members
  • 9 posts
At the prime age of 49 I've decided to start attending night school to get some basic/advanced knowledge of programming.

I want to make a web spider for my website but have no idea where to start. Would anyone be able to help?

The structure I think will work is as follows:

Index page (.php) - just mock up a site home page and put an 'include' for a search bar up top. Make sure it's .php, otherwise the include won't work.
Results page - for the search results.
Spider - for searching through the pages. << (PROBLEM)
Database - for storing all the data collected by the spider.

Any help will be much appreciated. I've exhausted all other possible avenues and have reached a dead end...

#2
John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts
I'm not really sure what you are trying to do. Crawl your own website?

#3
OldMac

    Newbie

  • Members
  • 9 posts
Yeah, just the information on my own web pages.

So I think I want to get the content of my web pages into a MySQL database and then have the spider search that?

#4
zeroradius

    Speaks fluent binary

  • Members
  • 1,406 posts
Making a spider for that would be a waste of time. You can use a MySQL query to achieve the same results with much less effort. Spiders are better suited to sites whose database you do not have access to.
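
To make the query approach concrete, here is a minimal sketch. It assumes a hypothetical `pages` table with `url`, `title` and `body` columns, and the sample rows are made up for the demo. It uses Python's built-in `sqlite3` module so it runs without a database server; the same `LIKE` query should carry over to MySQL, where the PHP search page would run it through a prepared statement.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical `pages` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    # Made-up sample rows, only here so the search has something to find.
    conn.executemany(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)",
        [
            ("/index.php", "Home", "Welcome to my site about old Macintosh computers."),
            ("/repair.php", "Repairs", "How to replace a hard drive in a Macintosh SE."),
            ("/contact.php", "Contact", "Send me an email."),
        ],
    )
    conn.commit()
    return conn


def search_pages(conn, term):
    """Return (url, title) for every page whose title or body contains term.

    The term is passed as a bound parameter, never pasted into the SQL text,
    so input from the search bar cannot inject SQL. Note that % and _ typed
    by the user still act as LIKE wildcards in this simple version.
    """
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT url, title FROM pages "
        "WHERE title LIKE ? OR body LIKE ? "
        "ORDER BY url",
        (pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for url, title in search_pages(db, "macintosh"):
        print(url, title)
```

The results page would then only need to loop over the returned rows and print a link for each one. For more than a handful of pages, a full-text index is usually a better fit than `LIKE`, but that is a later refinement.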

#5
Guest_Jordan_*

  • Guests
How is your site built: statically (every page hand-written) or dynamically, like this forum (each page driven by database results)? If it is the second case (database-driven), Zero's answer is the right one.

Aside from that, building a spider isn't hard. You can use cURL to fetch pages and then extract data from the HTML.
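
For a rough idea of what such a spider involves, here is a minimal sketch in Python using only the standard library (`urllib` in place of cURL). The names `PageParser`, `fetch` and `crawl` are made up for this example. It fetches a page, collects the visible text, and follows links that stay on the same host, up to an assumed limit of 50 pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects link targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def fetch(url):
    """Download one page and return its HTML as a string."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl of one site; returns {url: visible text}."""
    site = urlparse(start_url).netloc
    queue = [start_url]
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        try:
            html = fetch_page(url)
        except (OSError, ValueError):
            continue  # skip pages that fail to load
        parser = PageParser()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        for href in parser.links:
            link = urljoin(url, href).split("#")[0]  # absolute URL, no fragment
            # Stay on the starting host and never queue a page twice.
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The dictionary that `crawl` returns is what would be written to the database for the search page to query. In PHP the fetch step would typically use the cURL extension (`curl_init`, `curl_exec`) and the parsing step something like `DOMDocument`; the overall loop stays the same.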

#6
zeroradius

    Speaks fluent binary

  • Members
  • 1,406 posts
@Jordan - he wants his site dynamic, with a MySQL database.

OldMac said:

So I think I want to get the content of my web pages into a MySQL database and then have the spider search that?


#7
Guest_Jordan_*

  • Guests
Oh, I suppose I should have read his second reply. /* dumbass */

#8
dvelop

    Newbie

  • Members
  • 12 posts
Learning MySQL isn't that difficult.

Complex queries are a little hard to think through in the beginning stages, though.

You will pick it up in no time.

#9
OldMac

    Newbie

  • Members
  • 9 posts
I'm willing to go with whatever the easiest option is. Ideally, though, I would like to make the spider with PHP if that is at all possible.

I'm really at a last resort now, so any help will be much appreciated.

P.S. The web pages are going to be very basic, with just a few pieces of text, headings and a few tables. Nothing fancy at all for the moment, until I get the spider done.

#10
OldMac

    Newbie

  • Members
  • 9 posts
I've found this code....

phpcodesnippets.com

But I want it to track inputted data as opposed to a URL. What modifications could I make to this code, or would I have to revamp all of it?

#11
Guest_Jaan_*

  • Guests
Btw guys, you gave me a good idea for a tutorial :D

#12
OldMac

    Newbie

  • Members
  • 9 posts
Any ideas, guys?

I've been searching the web, and it seems the term "crawler" might be more specific to what I need?