Hi,
Is it possible to process uploaded files such as .pdf, .doc, .docx, .jpg and .html so they can be used in a quote form on a translation website?
quote form and word counting
Started by fardamm, Dec 07 2010 07:37 AM
7 replies to this topic
#1
Posted 07 December 2010 - 07:37 AM
#2
Posted 07 December 2010 - 12:10 PM
Everything is possible as long as you know how to decode the file formats :-)
__________________________________________
I study Information Systems at Karlstad University when I'm not on CodeCall
#3
Posted 07 December 2010 - 01:47 PM
Hi,
I appreciate your answer, but I am looking for a solution.
I would like a "free quote" form that calculates the translation fee when a customer uploads a file with one of these extensions: .pdf, .doc, .docx, .jpg, .html.
#4
Posted 07 December 2010 - 05:34 PM
This will be a somewhat hacked-together project, as there are not really any standard PHP libraries for working with these formats. For most doc formats you can use a PHP class such as this one (I have not tested it):
http://www.weberdev....ample-3211.html
For PDF there are quite a few PDF libraries out there, such as these:
http://www.tcpdf.org/
http://www.fpdf.org/
For HTML this will be easy: you can apply strip_tags() to the HTML document and get only the text back, which is enough to produce a valid quote.
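A minimal sketch of the strip_tags() idea, assuming PHP 7 or later. The helper name countHtmlWords() and the list of block-level tags are my own assumptions, not something from the thread. It drops script/style blocks first, because strip_tags() removes the tags but keeps the text between them, and it pads block-level tags with a space so that words in adjacent paragraphs do not run together.

```php
<?php
// Sketch: estimate the word count of an HTML document for a quote.
// countHtmlWords() is a hypothetical helper name, not part of PHP.
function countHtmlWords(string $html): int
{
    // Remove <script> and <style> blocks including their contents;
    // strip_tags() alone would leave the code inside them as text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Put a space before common block-level tags so "</p><p>" does not glue
    // the last word of one paragraph to the first word of the next.
    // The tag list is an assumption; extend it as needed.
    $html = preg_replace('#<(/?)(p|div|br|h[1-6]|li|td|th|tr)\b#i', ' <$1$2', $html);

    // Strip the remaining tags and turn entities such as &amp; into characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Split on whitespace and count the non-empty pieces.
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return count($words);
}

$html = '<html><body><h1>Hello world</h1><p>This is a <b>test</b>.</p></body></html>';
echo countHtmlWords($html), "\n";
```

Splitting on whitespace is only a rough measure. It will not work for languages written without spaces between words, so the count should be checked against real customer documents before it is used for pricing.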
For images I would of course quote by size, as it is impossible to get OCR to work perfectly with every font/language in existence. 100x100 could be X dollars, 200x800 could be $20, etc. Image size can be found with getimagesize().
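Quoting by size could look roughly like this, assuming PHP 7.1 or later. The tier limits, the prices and the helper names priceForDimensions() and quoteImage() are made-up placeholders for illustration; only getimagesize() is a real PHP function.

```php
<?php
// Sketch: quote an image by its pixel dimensions instead of running OCR.
// The tier limits and prices below are placeholders, not real rates.
const IMAGE_PRICE_TIERS = [
    // [maximum pixel area, price in dollars]
    [100 * 100, 5.00],
    [200 * 800, 20.00],
    [1000 * 1000, 50.00],
];
const IMAGE_PRICE_OVERSIZE = 100.00; // anything larger than the last tier

// priceForDimensions() is a hypothetical helper name.
function priceForDimensions(int $width, int $height): float
{
    $area = $width * $height;
    foreach (IMAGE_PRICE_TIERS as [$maxArea, $price]) {
        if ($area <= $maxArea) {
            return $price;
        }
    }
    return IMAGE_PRICE_OVERSIZE;
}

// quoteImage() is also hypothetical; it returns null when the
// uploaded file cannot be read as an image.
function quoteImage(string $path): ?float
{
    // getimagesize() returns false on failure, otherwise an array
    // with the width at index 0 and the height at index 1.
    $info = @getimagesize($path);
    if ($info === false) {
        return null;
    }
    return priceForDimensions($info[0], $info[1]);
}

printf("%.2f\n", priceForDimensions(200, 800));
```

Keeping the price table separate from the file handling makes it easy to test the prices without needing real image files.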
The most important thing is to test these methods extensively: create many documents and images and check that each price is what you would expect. You can create docx and doc files with OpenOffice (or Microsoft Office if you have it). This will be a hard project in itself.
Be sure to read the updated FAQ! || Health is achieved through the same 10,000 steps.
If a suggested code/method fails, informing us is less important than telling us why or what errors occurred.
#5
Posted 08 December 2010 - 02:02 AM
Thanks for your detailed answer and information.
The programmer has already told me that a Windows server can do it for .doc but a Linux server cannot. I will send your information to him.
Thanks a lot for your help.
#6
Posted 08 December 2010 - 02:06 AM
fardamm said:
The programmer has already told me that a Windows server can do it for .doc but a Linux server cannot.
He must be a VB coder
#7
Posted 08 December 2010 - 03:18 AM
Hi,
The links you provided are classes for creating PDF files and for converting .doc files to other formats, which still require the Word application on Windows.
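For .docx at least, Word should not be needed, because a .docx file is a ZIP archive whose main text is stored in word/document.xml. The sketch below assumes PHP 7.1 or later with the zip extension enabled, and countDocxWords() is a hypothetical helper name. It only reads the document body (not headers, footers or footnotes) and does nothing for the older binary .doc format or for PDF.

```php
<?php
// Sketch: count words in a .docx on any server, without Word installed.
// countDocxWords() is a hypothetical helper name, not a library function.
function countDocxWords(string $path): ?int
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        return null; // not a readable ZIP/.docx file
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        return null; // a ZIP file, but not a Word document
    }

    // Paragraph ends, tabs and line breaks become spaces so that
    // words on either side of them stay separated.
    $xml = preg_replace('#</w:p>|<w:tab\b[^>]*/>|<w:br\b[^>]*/>#', ' ', $xml);

    // Strip the XML markup and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');

    // Split on whitespace and count the non-empty pieces.
    return count(preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY));
}
```

A call such as countDocxWords('/path/to/upload.docx') returns the word count, or null when the upload is not a valid .docx, so the quote form can ask the customer for a different file.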
#8
Posted 08 December 2010 - 03:21 AM
Hi,
Do you mean that with VB we can do a word count on .doc, .pdf, etc. files?