| PHP Forum Use this forum to discuss all aspects of PHP Development. PHP is a server-side, cross-platform, HTML embedded scripting language that lets you create dynamic web pages. |
Hi all,
I'm using cURL from within a PHP webpage to display some RSS feeds on my site. I'm displaying comics from the xkcd feed and the Dilbert feed. The xkcd feed works perfectly, but Dilbert does something very strange. At certain times of day Dilbert works fine, but at other times it does a 302 redirect to a different page with an older comic on it. However, only cURL seems to get this 302 redirect. When you paste the URL into Firefox, you get the current comic!
This is really confusing me, as I'm not sure whether the problem is in my code or in the Dilbert RSS feed. It's happened consistently for the last three days, though. It works fine from around 11:00pm (GMT) to around 11:00am (GMT), and does the redirect from 11am until 11pm. (The times are only very rough; I haven't tracked them down precisely.)
The URL I am using for the Dilbert feed is:
Code:
http://feeds.feedburner.com/DilbertDailyStrip?format=xml
Code:
http://feedproxy.feedburner.com/DilbertDailyStrip?format=xml
PHP Code:
__________________
My fun, friendly online games website: Cygnet Games My Squidoo page on Cygnet Games. |
|||||
|
How often are you connecting to it, and how many times are you connecting each time? They may have some custom code that forwards IPs that connect too often to older RSS feeds. You can also try setting:
PHP Code:
PHP Code:
__________________
CodeCall Blog | CodeCall Wiki | Shareware Site | Linux Forum | Write a Blog The CodeCall Wiki is now fully integrated with vBulletin users! Check it out and add some new pages! |
|
|||||
|
I'm connecting once every time the page is refreshed by a client. Is that too much? I was considering caching the feeds on my server and updating the cache if the page is refreshed and the cache is over a day old (or it's past whatever time the feeds are updated with new content, or something...).
It does the same thing with CURLOPT_FOLLOWLOCATION set to 0. That's how I had it set originally - my xml parsing code choked on the 302 page returned by cURL, which was how I discovered it was getting redirected. It may well be that they are redirecting me for connecting too often.
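A rough sketch of the check described above: look at the HTTP status before handing the body to the XML parser, so a 302 page never reaches it. The function names `fetch_feed_response` and `parse_feed_if_ok` are made up for illustration and are not from the original code; the 10-second timeout is also an assumption.

```php
<?php
// Fetch a URL without following redirects and report what came back.
// Hypothetical helper, not the poster's original code.
function fetch_feed_response(string $url): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => false,  // keep the 302 visible instead of following it
        CURLOPT_TIMEOUT        => 10,     // assumed value, tune as needed
    ]);
    $body = curl_exec($ch);

    return [
        'status'   => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'location' => (string) curl_getinfo($ch, CURLINFO_REDIRECT_URL),
        'body'     => $body === false ? '' : $body,
    ];
}

// Parse the body as XML only when the server answered 200 with some content.
// Returns null for redirects, errors, empty bodies, and malformed XML.
function parse_feed_if_ok(int $status, string $body): ?SimpleXMLElement
{
    if ($status !== 200 || $body === '') {
        return null;
    }
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $xml = simplexml_load_string($body);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $xml === false ? null : $xml;
}
```

With something like this, a redirected request shows up as a status of 302 plus a `location` value that can be logged, rather than as a parser error further down the page.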
|
|||||
|
Like I said, I don't see anything wrong with your code, so I assume it is them blocking you. I think once every time someone connects is way too often.
|
|||||
|
I was hoping that you were right about the blocking, but I've just tested it today and it's still happening.
I loaded the page at around 10:30am (GMT) this morning and it worked fine. Then I didn't touch it until just now (4:30pm), and it's redirected to the old comic. It's not as if the old comic is a fixed amount of time before the new one, either. It has been redirecting to the same comic, May 16, for the last few days. I will definitely cache it on my server, though, so as not to connect to their feed too often.
|||||
|
I fixed it!
You were half right, Jordan. They were blocking me, but not because I was connecting too often. It was because I had a blank user agent header. Apparently, some servers don't like that. When I spoofed my user agent as Firefox, it worked perfectly.
Thanks for the help. Here is the additional code for anyone else with a similar problem:
PHP Code:
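For readers who want a starting point, the fix described in this post can be sketched as below. This is a reconstruction under assumptions, not the poster's own listing: the user agent string is just an illustrative Firefox-style value, and the names `FEED_USER_AGENT`, `feed_curl_options`, and `fetch_with_user_agent` are hypothetical.

```php
<?php
// Illustrative Firefox-style value (assumption). The thread only says that a
// Firefox user agent was accepted where a blank one was not.
const FEED_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0';

// Build the cURL options, including a non-empty User-Agent header.
function feed_curl_options(string $userAgent = FEED_USER_AGENT): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_USERAGENT      => $userAgent,  // the part that was missing
        CURLOPT_TIMEOUT        => 10,          // assumed value
    ];
}

// Fetch a URL with the options above; returns null if the request fails.
function fetch_with_user_agent(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, feed_curl_options());
    $body = curl_exec($ch);

    return $body === false ? null : $body;
}
```

Calling `fetch_with_user_agent()` with the feed URL from the first post would then send the request with the header set. Whether a given server accepts a particular user agent value is something to confirm by testing.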
|
|||||
|
Quote:
Posted via CodeCall Mobile |
|
|||||
|
It was partly trial and error and partly random luck!
While looking for something different, I found someone on a forum having a similar problem: they were trying to scrape some content from another site and needed to spoof their user agent before the server would accept them. I wondered if my problem was the same, and it looks like it was.
Here is the finished, working version of the comics page. It pulls the latest comic from both Dilbert and xkcd, caching them every hour on my server. I might make this into a tutorial, if it continues to work for the next few days without showing any more problems! It's a nice, simple example of cURL and degradable Ajax.
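The hourly caching mentioned above might look something like this minimal sketch. It assumes a file-based cache and takes the fetch step as a callable; the names `cache_is_fresh` and `get_cached_feed` and the fall-back-to-stale behaviour are illustrative choices, not details from the finished page.

```php
<?php
// True if the cache file exists and is younger than $maxAgeSeconds.
function cache_is_fresh(string $path, int $maxAgeSeconds, ?int $now = null): bool
{
    clearstatcache(true, $path); // avoid a stale mtime from PHP's stat cache
    if (!is_file($path)) {
        return false;
    }
    $now = $now ?? time();

    return ($now - filemtime($path)) < $maxAgeSeconds;
}

// Return the cached feed if it is fresh; otherwise call $fetch and store the
// result. If the fetch fails, fall back to the stale copy when one exists.
function get_cached_feed(string $cacheFile, callable $fetch, int $maxAgeSeconds = 3600): ?string
{
    if (cache_is_fresh($cacheFile, $maxAgeSeconds)) {
        $cached = file_get_contents($cacheFile);
        if ($cached !== false) {
            return $cached;
        }
    }

    $fresh = $fetch();
    if (is_string($fresh) && $fresh !== '') {
        file_put_contents($cacheFile, $fresh, LOCK_EX); // lock against concurrent writers
        return $fresh;
    }

    $stale = is_file($cacheFile) ? file_get_contents($cacheFile) : false;

    return $stale === false ? null : $stale;
}
```

The callable passed as `$fetch` would be whatever performs the cURL request and returns the feed body as a string, or null on failure. With the default of 3600 seconds, the remote feed is contacted at most about once an hour no matter how often the page is refreshed.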
|
|||||
|
Good job, and a tutorial would be awesome!
__________________
CodeCall Blog | CodeCall Wiki | Shareware | Linux Forum | My Blog Chat with other CodeCall members on IRC; connect to irc.codecall.net and join #codecall |