Closed Thread
Results 1 to 10 of 22

Thread: downloading images

  1. #1
    Hot_Milo23 is offline Programmer
    Join Date
    Jun 2009
    Location
    Western Australia
    Posts
    120
    Rep Power
    11

    downloading images

    Hey all,
    I've got a small problem and I'm not really sure where to start with it (I haven't used the HTML libraries before). I quite enjoy the internet comic Ansem Retort (some of you may know of it), and I would like a way to view the strips offline. I started going through each one methodically, copying and pasting, but got bored pretty quickly, so I'm looking for an easier way to do it.

    If this helps, each picture is held on a page that follows this pattern:
    ansemretort.org/ansemretort/index.html?comic=x
    where x is the comic number (up to 529 at the moment).
    Each strip is also named "Comicx.png" (x being the number again).

    If possible, could someone show me a way to accomplish this?
    At first I just wanted it for convenience; now I want to use it as a learning opportunity.

    Thanks in advance!

  3. #2
    Join Date
    Apr 2008
    Posts
    789
    Blog Entries
    5
    Rep Power
    24

    Re: downloading images

    Well, I'm not too familiar with Python, but here's the way I would do it in pseudocode:
    -while there are more comics to fetch:
    --fetch url "blah.com/comic=" + num.to_s into string
    --write string to file "comic" + num.to_s
    -end while.
    If someone proficient in Python could translate this into Python, it should work.
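
A minimal sketch of that pseudocode in Python 3, assuming the images sit at a fixed prefix followed by the comic number and ".png". The function name `download_comics`, the `fetch` parameter and the example URL are made up for illustration; `fetch` defaults to the standard library's `urlretrieve` and only exists so the loop can be tried without touching the network.

```python
from urllib.request import urlretrieve  # Python 3; in Python 2 this lived in urllib


def download_comics(base_url, first, last, fetch=urlretrieve):
    """Fetch base_url + number + '.png' for every number from first to last."""
    saved = []
    for num in range(first, last + 1):
        url = base_url + str(num) + ".png"
        filename = "comic" + str(num) + ".png"
        fetch(url, filename)  # urlretrieve(url, filename) saves the URL to that file
        saved.append(filename)
    return saved
```

Calling something like `download_comics("http://example.invalid/Comic", 1, 529)` would then walk through every number in turn; the real prefix has to be filled in by hand.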

  4. #3
    psam is offline Learning Programmer
    Join Date
    Jun 2009
    Posts
    34
    Rep Power
    10

    Re: downloading images

    I wrote a Python script that will download the comics up to edition 531 (which I believe is the latest one right now) into the folder where you save the script. I can't post the code because it contains the download link and my post count is less than 10, so I attached it to this message.
    Attached Files

  5. #4
    Hot_Milo23 is offline Programmer
    Join Date
    Jun 2009
    Location
    Western Australia
    Posts
    120
    Rep Power
    11

    Re: downloading images

    Haha, wow.
    Thanks, psam.
    You're a legend!
    Appreciate it.

  6. #5
    psam is offline Learning Programmer
    Join Date
    Jun 2009
    Posts
    34
    Rep Power
    10

    Re: downloading images

    No problem.
    Any time you need.

  7. #6
    Hot_Milo23 is offline Programmer
    Join Date
    Jun 2009
    Location
    Western Australia
    Posts
    120
    Rep Power
    11

    Re: downloading images

    I'm sure this thread is very dead by now, but if you are still around, psam, I would like your help again with a similar problem.

    Analyzing the code you used last time, I see you didn't use the web page address at all; you used the address where the pictures were stored. I was just wondering how you knew where this was, and whether you know how to do it again (with the "Pokemon x" comics).

    So if you still frequent this site, psam, I would appreciate your help.
    Or could anyone else have a look into this for me?

    Thanks in advance, guys.

  8. #7
    Davison is offline Newbie
    Join Date
    Sep 2009
    Posts
    9
    Rep Power
    0

    Re: downloading images

    I slightly modified the previous poster's code so that you can define which comic to start downloading from and which to finish with. For example, if you know you have not read 25 or 26, you simply run this from the command prompt in Windows with the command "spam and eggs.py" 25 26 (with 'spam and eggs.py' being the script name).

    Code:
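    from urllib import urlretrieve  # Python 2 location of urlretrieve
    import sys

    # First and last comic numbers come from the command line
    n = int(sys.argv[1])
    finish = int(sys.argv[2])
    while n <= finish:
        url = 'urlurlurlurlurlurlurl' + str(n) + '.png'
        name = 'comictitlehere' + str(n) + '.PNG'
        # urlretrieve creates and writes the local file itself,
        # so no separate open()/close() is needed
        urlretrieve(url, name)
        print('downloading %s NOW' % url)
        n += 1
    *It says urlurl... because I cannot post links.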

    To retrieve the image URL, the simplest possible method is:
    • Right-click the picture
    • Copy the image URL
    • Paste it into your address bar or an empty text file

    If the URL is along the lines of 'comic/500.png' or similar, you are fine.

    However, my code will not work if the comic you use has the date it was posted as the file name, à la Ctrl-Alt-Del.

    I would probably be able to find a solution to this, but it is midnight and I have university tomorrow. I'll try to update tomorrow night with a solution for the date problem.

    Hope I've helped.
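
To illustrate the point about copied image URLs, here is a rough sketch that splits a URL such as 'comic/500.png' into the fixed prefix, the number and the extension, so the prefix can be pasted into the script. The helper name `split_numbered_url` and the example host are invented for this sketch. A date-named file like 20090915.jpg would also parse as one big number, so the result still needs a human look.

```python
import re


def split_numbered_url(image_url):
    """Return (prefix, number, extension), or None if the file name does not end in digits."""
    match = re.match(r"^(.*?)(\d+)(\.\w+)$", image_url)
    if match is None:
        return None
    prefix, digits, extension = match.groups()
    return prefix, int(digits), extension
```

For example, `split_numbered_url("http://example.invalid/comic/500.png")` returns `("http://example.invalid/comic/", 500, ".png")`.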

  9. #8
    Davison is offline Newbie
    Join Date
    Sep 2009
    Posts
    9
    Rep Power
    0

    Re: downloading images

    Code:
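    from urllib import urlretrieve  # Python 2 location of urlretrieve
    import sys, os

    def main():
        n = int(sys.argv[1])  # comic number to start from
        while 1:
            url = 'urlurlurlurlurl' + str(n) + '.png'
            name = 'comic-' + str(n) + '.PNG'
            urlretrieve(url, name)  # creates and writes the local file itself
            # index 6 of os.stat() is the file size in bytes
            if os.stat(name)[6] < 10000:
                print('Updated to Comic ' + str(n - 1))
                break
            print('downloaded %s' % url)
            n += 1
        os.remove(name)  # delete the undersized file that ended the loop

    main()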
    This is a mildly updated script.

    I added the os.stat function.

    This means that you now enter your start comic number on the command line, and the program will get every subsequent comic until it stores an image of less than 10000 bytes (this can be changed to any value you like; it is just an example). At that point it deletes the sub-10000-byte image and exits.
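
The stop-on-a-small-file idea can be sketched on its own, separate from the network call. This is only an illustration: `download_until_small` and its `fetch` parameter are made-up names, and `fetch` is assumed to take (url, filename) the way urlretrieve does. As far as I know, Python 3's `urllib.request.urlretrieve` raises an HTTPError on a missing page instead of saving a small error page, so on Python 3 the `fetch` passed in would have to deal with that itself.

```python
import os


def download_until_small(url_prefix, start, fetch, min_bytes=10000):
    """Download comics start, start+1, ... until a file under min_bytes shows up.

    Returns the number of the last comic kept (start - 1 if none was kept).
    """
    n = start
    while True:
        name = "comic-" + str(n) + ".png"
        fetch(url_prefix + str(n) + ".png", name)
        if os.path.getsize(name) < min_bytes:
            os.remove(name)  # discard the undersized file and stop
            return n - 1
        n += 1
```

The 10000-byte threshold is the same example value as in the post and should be tuned to the comic in question.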

    I'm brushing up on regular expressions now to handle comics with date-based URLs.

  10. #9
    Davison is offline Newbie
    Join Date
    Sep 2009
    Posts
    9
    Rep Power
    0

    Re: downloading images

    I'm an idiot; regular expressions are not needed.

    I love Python's included libraries, by the way. This code variant is for sites whose image names follow the pattern

    yyyymmdd.jpg
    You can change the URL and the extension to fit.

    Code:
    from urllib import urlretrieve  # Python 2 location of urlretrieve
    import sys, os, datetime

    # Both command-line arguments are dates written as yyyymmdd
    one = sys.argv[1]
    start = datetime.date(int(one[:4]), int(one[4:6]), int(one[6:]))
    two = sys.argv[2]
    end = datetime.date(int(two[:4]), int(two[4:6]), int(two[6:]))

    print('Start: %s  End: %s' % (start, end))

    site = 'urlurlurlurlurl'
    while start <= end:
        # strftime zero-pads month and day, giving yyyymmdd
        stamp = start.strftime("%Y%m%d")
        url = site + stamp + '.jpg'
        name = 'ctrlaltdel - ' + stamp + '.jpg'
        urlretrieve(url, name)  # creates and writes the local file itself

        # index 6 of os.stat() is the file size in bytes;
        # anything under 20000 bytes is treated as "no comic that day"
        if os.stat(name)[6] < 20000:
            os.remove(name)
        else:
            print('downloaded %s' % url)
        start = start + datetime.timedelta(1)

    Process:
    Get the start date and end date from the command line
    Fetch the image for the current date and check whether it is at least 20000 bytes
    If it is less than 20000 bytes, delete it
    Add 1 day and repeat until the end date has been handled

    Edit: I made the loop condition start <= end, so that if there is a comic on the end date, it will also be downloaded.
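
The date-stepping part of that process can be sketched without any downloading. The generator name `dated_urls` and the example site are assumptions made for illustration; it only builds the yyyymmdd URLs for an inclusive date range and leaves the fetching and size check to the caller.

```python
import datetime


def dated_urls(site, first, last, ext=".jpg"):
    """Yield (url, date) for each day from first to last inclusive (both yyyymmdd strings)."""
    day = datetime.datetime.strptime(first, "%Y%m%d").date()
    end = datetime.datetime.strptime(last, "%Y%m%d").date()
    while day <= end:  # <= so the end date itself is included
        yield site + day.strftime("%Y%m%d") + ext, day
        day += datetime.timedelta(days=1)
```

Because the stepping is done on real dates, month ends are handled for free: a range from 20090228 to 20090302 yields 20090228.jpg, 20090301.jpg and 20090302.jpg.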

  11. #10
    Hot_Milo23 is offline Programmer
    Join Date
    Jun 2009
    Location
    Western Australia
    Posts
    120
    Rep Power
    11

    Re: downloading images

    Davison,
    you have been an awesome help, but for some reason it still won't work.
    I've used urlretrieve to download the Google logo, and I've used it to download every file type (including PNG, which is what the comic is saved as). Both worked, but when I try to do the exact same thing with the comic, it won't.

    I'm stumped.

    Thanks for the help, though.
