and now we need help.
We would like to create a readable (searchable) PDF file from existing PDF files.
What is the correct command string to convert these files?
Do we need 2 steps: convert from PDF to TIF, and then from TIF to PDF_Readable?
Testfiles:
Please HELP
Thanks
====================================
You don't need to convert the PDF file to a TIFF file first and then convert the TIFF file back to PDF. You can simply run the following command line to convert a scanned PDF file to a searchable PDF file in one step.
If your PDF file contains color pages, you can add the "-autobitcount" parameter to handle them; pdfocrjb2.exe will then skip the OCR process for color pages automatically:
pdfocrjb2.exe -ocr 1 -autobitcount D:\temp3\testfile_2.pdf D:\temp3\testfile_2_out.pdf
pdfocrjb2.exe works well on black and white TIFF and PDF files. For color files, it is better to convert them to black and white TIFF or PDF files first, because our OCR engine recognizes characters much more reliably in black and white images.
====================================
One additional question:
We also need to convert an image PDF into text format. How can that be done in a single command line,
e.g. pdfocrjb2.exe -ocr 1 D:\temp3\testfile_2.pdf D:\temp3\testfile_2_out.txt ?
Thanks for help!!!
====================================
You can use the "-ocrtxt" option to create a text file alongside the searchable PDF, for example:
pdfocrjb2.exe -ocr 1 -ocrtxt D:\temp3\testfile_2_out.txt D:\temp3\testfile_2.pdf D:\temp3\testfile_2_out.pdf
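
If you need to run this for many files from a script, the two command lines above can be assembled programmatically. The following is only a minimal Python sketch, not something shipped with pdfocrjb2.exe: the helper names (build_ocr_command, derive_outputs, run_ocr) are made up for illustration, it uses only the options shown in this thread ("-ocr 1", "-autobitcount", "-ocrtxt"), and combining "-autobitcount" with "-ocrtxt" in one call is an assumption that has not been confirmed here.

```python
import subprocess
from pathlib import PureWindowsPath


def build_ocr_command(src, dst, txt_out=None, autobitcount=False,
                      exe="pdfocrjb2.exe"):
    """Build the argument list in the same order as the commands in this thread.

    Only options quoted in the thread are used; 'exe' is assumed to be on PATH.
    """
    cmd = [exe, "-ocr", "1"]
    if autobitcount:
        # Shown in the thread for PDFs that contain color pages.
        cmd.append("-autobitcount")
    if txt_out is not None:
        # Shown in the thread for writing the recognized text to a file.
        cmd += ["-ocrtxt", str(txt_out)]
    # Input PDF first, output PDF last, as in the examples above.
    cmd += [str(src), str(dst)]
    return cmd


def derive_outputs(src):
    """Hypothetical naming scheme: name.pdf -> name_out.pdf and name_out.txt."""
    p = PureWindowsPath(src)
    return (str(p.with_name(p.stem + "_out.pdf")),
            str(p.with_name(p.stem + "_out.txt")))


def run_ocr(src, autobitcount=False):
    """Run one conversion and return the process exit code.

    What the exit code means is not documented in this thread, so the
    caller should treat it as informational only.
    """
    dst, txt = derive_outputs(src)
    cmd = build_ocr_command(src, dst, txt_out=txt, autobitcount=autobitcount)
    return subprocess.run(cmd).returncode
```

For example, run_ocr(r"D:\temp3\testfile_2.pdf") would issue the same command line as the "-ocrtxt" example above. Passing the arguments as a list avoids quoting problems when a path contains spaces.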