I'm looking for a way to extract the text from some PDF files and write it to a .txt file, so the data is easier to manipulate (I need to sort the entries by highest average, among other things).
My first thought was Python, since sorting data from a .txt file is pretty easy there. I found the pyPDF library, but because of the way these PDF files are formatted it gives me weird strings, so I don't think that library will do.
Here's an example of one of the PDF files in question:
complexejuliequilles.com/files/_leagues_Leagues1_1240.pdf
Ideally the result would look like what you get when you use Select Text in Adobe Reader, then copy and paste into a .txt file. A different layout is fine too, as long as I can work with the data extracted from the PDF files.
It doesn't have to be Python either. If you've got an alternative, just let me know: I know some C++ and Java, and I can pick up another language fairly easily.
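For context, here's a rough sketch of the post-processing I have in mind once the text is in a .txt file. It assumes each useful line ends with a numeric average (e.g. "Some Name 172.3"), which is only my guess at what the extracted text will look like. The function names and the sample lines are made up for illustration, and it uses nothing beyond plain Python.

```python
def parse_line(line):
    """Return (name, average) for a line, or None if it doesn't fit.

    Assumption: the last whitespace-separated token is the average.
    """
    parts = line.split()
    if len(parts) < 2:
        return None
    try:
        average = float(parts[-1])
    except ValueError:
        # Headers and other non-data lines end up here and are skipped.
        return None
    return " ".join(parts[:-1]), average


def sort_by_average(lines):
    """Keep only parseable lines and sort them by average, highest first."""
    records = [parse_line(line) for line in lines]
    records = [record for record in records if record is not None]
    return sorted(records, key=lambda record: record[1], reverse=True)


if __name__ == "__main__":
    # Made-up sample lines standing in for the extracted text.
    sample = ["Player A 150.5", "Player B 172", "Some header line", ""]
    for name, average in sort_by_average(sample):
        print(name, average)
```

So what I really need is the extraction step that gets me from the PDF to lines like these.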
Thanks a lot in advance!
Jumbala102

