02-15-2008, 10:23 PM
Amber8
Automating Text Recognition

Hi everybody, I need to make a bunch of PDFs text-searchable. The Acrobat OCR function can't do it because the resolution is lower than the minimum required (144 dpi). What I started doing is saving the PDF pages as image files, increasing the resolution in an imaging package, then printing them to PDF again and running the OCR. Obviously very repetitive and boring; I can think of much better things to do on a Saturday night, LOL.
I was thinking of writing a script for it (using Python, since that's the only language I've played with in the past), but I was wondering whether some piece of code to do this already exists. I imagine it's a common problem, since there is a fair bit on the web talking about it, but I haven't been able to find automated code to do it. Or does anyone have any ideas on whether another language might be a better match for this?
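
For what it's worth, the manual steps above (pages to images at a higher resolution, then OCR back to PDF) could be sketched in Python roughly like this. This is only a sketch under assumptions that are not in the post: it assumes the Ghostscript (`gs`) and Tesseract (`tesseract`) command-line tools are installed and on the PATH, it swaps Tesseract in for the Acrobat OCR step, and the function names, the `page-%03d.png` naming and the 300 dpi default are all made up for illustration. Upsampling a low-resolution scan does not add detail, so OCR accuracy may still be limited.

```python
import shutil
import subprocess
from pathlib import Path

MIN_OCR_DPI = 144  # minimum resolution quoted in the post


def render_command(pdf_path, out_dir, dpi=300):
    """Ghostscript command that rasterises every page of pdf_path to PNG at dpi."""
    if dpi < MIN_OCR_DPI:
        raise ValueError(f"dpi must be at least {MIN_OCR_DPI}, got {dpi}")
    # Ghostscript replaces %03d with the page number (hypothetical naming scheme).
    pattern = Path(out_dir) / "page-%03d.png"
    return ["gs", "-dNOPAUSE", "-dBATCH", "-sDEVICE=png16m",
            f"-r{dpi}", f"-sOutputFile={pattern}", str(pdf_path)]


def ocr_command(image_path):
    """Tesseract command that writes a searchable PDF next to image_path."""
    image_path = Path(image_path)
    out_base = image_path.with_suffix("")  # Tesseract appends ".pdf" itself
    return ["tesseract", str(image_path), str(out_base), "pdf"]


def make_searchable(pdf_path, work_dir, dpi=300):
    """Render pdf_path to images, OCR each page, return the per-page PDF paths."""
    for tool in ("gs", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(render_command(pdf_path, work_dir, dpi), check=True)
    pages = sorted(work_dir.glob("page-*.png"))
    for page in pages:
        subprocess.run(ocr_command(page), check=True)
    return [page.with_suffix(".pdf") for page in pages]
```

Calling something like `make_searchable("scan.pdf", "work")` in a loop over a folder would cover the batch case. The result is one searchable PDF per page, so merging them back into a single file would still need a separate step. On Windows the Ghostscript executable usually has a different name (for example `gswin64c`), so the command would need adjusting there.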