
help with regular expression

2 replies to this topic

#1
Orjan

    Writes binary right handed and hex left handed

  • Moderators
  • 3,299 posts
Hello there. I'm parsing a web page.

I have this sequence occurring several times in a web page I retrieved through curl:

<a class="nameholder" href="/en/citizen/profile/123456">name</a>

What I would like is an associative array consisting of:
$myarr[123456] = "name"
$myarr[3456] = "name2"
etc

The numbers vary and can be anything from 1 into the millions, and the name can be any alphanumeric characters, up to 30 characters long.

What would be the best way to do this? I have the whole HTML in one variable.
The rest of the HTML is unimportant to me and will be discarded after this data collection.

#2
Guest_Jordan_*

  • Guests
Your pattern should be like this:

/<a class="(.*?)" href="(.*?)">(.*?)<\/a>/is

This gives you three captures: the class name, the href, and the anchor text. You can then split the second capture on "/" and take the last segment, using it as the array key.

Note: that regex is untested.
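
As a rough sketch of the split-the-href idea, assuming each capture group is written as (.*?) and the HTML sits in a variable called $html (the sample string and the second link are made up for illustration), something like this should work:

```php
<?php
// Sample input modeled on the snippet in the first post (assumed data).
$html = '<a class="nameholder" href="/en/citizen/profile/123456">name</a>'
      . '<a class="nameholder" href="/en/citizen/profile/3456">name2</a>';

$myarr = [];
$pattern = '/<a class="(.*?)" href="(.*?)">(.*?)<\/a>/is';

// PREG_SET_ORDER groups the captures per match: [full match, class, href, text].
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $parts = explode('/', $m[2]); // split the href on "/"
        $id = end($parts);            // last segment is the profile number
        $myarr[$id] = $m[3];          // numeric string keys become integer keys
    }
}

print_r($myarr);
```

Because this pattern matches any class and any href, it will also pick up other links on the page that have the same attribute order; filtering on the class capture would narrow it down.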

#3
Orjan

I got it working with this regular expression:

#<a class="nameholder" href="/en/citizen/profile/([0-9]+?)">(.+?)</a>#
Then I ran it through preg_match_all, which put the matches into an array.
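
For anyone finding this later, here is a minimal sketch of how that pattern might be turned into the id => name array from the first post. The $html sample and the use of array_combine are assumptions for illustration, not necessarily what was actually run:

```php
<?php
// Sample input modeled on the snippet in the first post (assumed data).
$html = '<a class="nameholder" href="/en/citizen/profile/123456">name</a>'
      . '<a class="nameholder" href="/en/citizen/profile/3456">name2</a>';

$pattern = '#<a class="nameholder" href="/en/citizen/profile/([0-9]+?)">(.+?)</a>#';

$myarr = [];
// With the default PREG_PATTERN_ORDER, $matches[1] holds every id
// and $matches[2] holds every name, in the same order.
if (preg_match_all($pattern, $html, $matches)) {
    // Pair ids with names; numeric string keys become integer keys.
    $myarr = array_combine($matches[1], $matches[2]);
}

print_r($myarr);
```

If the same profile number appears more than once on the page, the last name seen for it is the one that stays in the array.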