Hi,
I am trying to create an internet crawler and am having problems detecting the links on a page.
I just want to match something like this:
<a...>.....</a>
where ... stands for everything in between.
So far I have:
<[aA][ ]*[^>]*>.*</a>
I am having problems matching the text after the > and before the </a>. Sometimes the </a> itself gets matched as part of the .*, and I get everything up to a later </a> further down the page.
What do you think is the right expression?
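To make the problem concrete, here is a minimal sketch that reproduces it and tries one possible variant. It assumes Python's re module; the sample html string and the variable names are made up for illustration. The variant swaps the greedy .* for the lazy .*?, which stops at the first </a> instead of the last one, and requires whitespace after the "a" so that tags like <abbr> are not picked up.

```python
import re

# Made-up sample with two links on one line (assumption, for illustration only).
html = '<a href="one.html">First</a> and <a href="two.html">Second</a>'

# The pattern from the post: the greedy .* runs on to the LAST </a> it can find.
greedy = re.compile(r'<[aA][ ]*[^>]*>.*</a>')

# One possible variant: lazy .*? stops at the FIRST </a>.
# IGNORECASE covers <A ...> and </A>; DOTALL lets the link text span lines.
lazy = re.compile(r'<a\s[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL)

greedy_matches = greedy.findall(html)  # one match swallowing both links
lazy_matches = lazy.findall(html)      # the text of each link separately

print(greedy_matches)
print(lazy_matches)
```

This is only a sketch: a regular expression like this can still trip over a > inside an attribute value or over nested markup, so an HTML parser (for example Python's html.parser) is probably the safer choice for a real crawler.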