Whenever I try to fetch a webpage that requires a login, using a wget system call or urlopen in Python, I can't get it, even though I am logged in in, say, Firefox. If I launch Firefox through a system call from Python, I can access pages that require me to be logged in. So if I want to parse an HTML page that requires a login, how can I go about doing that?
GET in Python
Started by s3gf4ult, Nov 07 2010 09:15 AM
1 reply to this topic
#1
Posted 07 November 2010 - 09:15 AM
#2
Posted 07 November 2010 - 12:33 PM
Hi s3gf4ult,
Generally, a website creates a session for a logged-in user by setting a cookie with a particular name. You can check which cookies Firefox holds for a particular site under Tools -> Options -> Privacy -> Remove individual cookies. The browser sends these cookies with every request (GET or POST), so the site is able to recognize you. Your Python script should work the same way: it first has to obtain the cookie (possibly by imitating the user's login) and then send that cookie with each request. You can start with the cookielib library (20.21. cookielib: Cookie handling for HTTP clients, Python v2.7 documentation) and then return to the forum if you need more details.
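To make the cookie approach concrete, here is a minimal sketch of a script that logs in and then fetches a protected page through the same cookie jar. It is written for Python 3, where cookielib was renamed http.cookiejar and urllib2 became urllib.request. The function names make_opener and login_and_fetch are just illustrative, and the URLs and form field names in the usage comment are hypothetical: you would have to read the real ones from the site's login form. It also assumes the site uses a plain form login without extra tokens or JavaScript.

```python
# Python 3; in Python 2.7 the same classes live in cookielib and urllib2.
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login_and_fetch(opener, login_url, form_fields, page_url):
    """POST the login form, then GET a protected page with the same opener."""
    data = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Passing data makes this a POST; any session cookie set by the
    # response is captured by the opener's cookie jar.
    opener.open(login_url, data).close()
    # Cookies in the jar are sent back automatically on this request.
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (hypothetical URLs and field names, replace with the site's own):
# opener, jar = make_opener()
# html = login_and_fetch(
#     opener,
#     "https://example.com/login",
#     {"username": "me", "password": "secret"},
#     "https://example.com/members",
# )
```

The important part is that both requests go through the same opener, because that is what carries the session cookie from the login response over to the later page request. A plain urlopen call, or a separate wget call, starts with no cookies and so the site treats it as logged out.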











