Hello there. I'm parsing a web page.
I have this sequence, occurring several times, in a web page I retrieved through curl:
<a class="nameholder" href="/en/citizen/profile/123456">name</a>
What I would like is to get an associative array consisting of:
$myarr[123456] = "name"
$myarr[3456] = "name2"
etc
where the numbers vary and can be anything from 1 to several million, and the name can be any alphanumeric characters, at most 30 characters long.
What would be the best way to do this? I have the whole HTML in one variable.
The rest of the HTML is unimportant to me and will be dropped after this data collection.
help with regular expression
Started by Orjan, Mar 14 2009 06:48 AM
2 replies to this topic
#1
Posted 14 March 2009 - 06:48 AM
#2
Guest_Jordan_*
Posted 14 March 2009 - 07:02 AM
Your pattern should be like this:
/\<a class=\"(.*?)\" href=\"(.*?)\"\>(.*?)\<\/a\>/is
Which will give you 3 vars - classname, href and anchor text value. You can then split the second variable by "/" and take the last value, making it a key to the array.
Note: That regex is untested....
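A minimal sketch of the approach described above, assuming the page is already in a variable. The sample contents of $html and the check on the class name are illustrative assumptions, not part of the original reply, and the pattern is written without the optional backslashes. Note that a generic pattern like this can over-match if an anchor carries extra attributes between class and href.

```php
<?php
// Hypothetical sample input; in practice $html holds the page fetched with curl.
$html = '<a class="nameholder" href="/en/citizen/profile/123456">name</a>'
      . '<p>unrelated markup</p>'
      . '<a class="nameholder" href="/en/citizen/profile/3456">name2</a>';

// Three capture groups: class name, href, anchor text.
$pattern = '/<a class="(.*?)" href="(.*?)">(.*?)<\/a>/is';

$myarr = array();
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    list(, $class, $href, $text) = $m;

    // Assumption: only links with the "nameholder" class are wanted.
    if ($class !== 'nameholder') {
        continue;
    }

    // Split the href on "/" and use the last segment as the array key.
    $parts = explode('/', $href);
    $id = end($parts);

    $myarr[$id] = $text;
}

print_r($myarr);
```

PHP converts numeric string keys such as "123456" to integers, so the result can be read back with $myarr[123456].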
#3
Posted 14 March 2009 - 08:51 AM
I got it working by this regular expression:
#<a class="nameholder" href="/en/citizen/profile/([0-9]+?)">(.+?)</a>#
Then I ran it in preg_match_all and it put it in an array...
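For anyone finding this later, here is a rough sketch of how that expression can be run through preg_match_all to build the array asked for in the first post. The sample $html and the PREG_SET_ORDER loop are my own assumptions about the setup; the post above does not show the exact call that was used.

```php
<?php
// Hypothetical sample input; in practice $html holds the page fetched with curl.
$html = '<a class="nameholder" href="/en/citizen/profile/123456">name</a>'
      . '<div>other content</div>'
      . '<a class="nameholder" href="/en/citizen/profile/3456">name2</a>';

// Group 1 captures the profile number, group 2 the displayed name.
$pattern = '#<a class="nameholder" href="/en/citizen/profile/([0-9]+?)">(.+?)</a>#';

$myarr = array();

// PREG_SET_ORDER returns one array per match: [full match, number, name].
if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $myarr[(int) $m[1]] = $m[2];
    }
}

print_r($myarr);
```

Using # as the delimiter means the slashes in the href do not need escaping.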