Question: pdf2txt removes whitespace in some PDFs?

I find your pdf2txt.exe utility very useful and fast too, congratulations.

Before ordering, I would like to make sure there is a solution to the following problem:
with certain PDF files, all whitespace is removed whether or not I use the -whitespace command line option.
See examples attached.

Am I doing something wrong, or is this a known limitation?

thank you for your feedback,
best,
==============================
Please run the following command line to convert your PDF file to a text file and try again (the -breaker parameter inserts a page break symbol into the converted .txt file):

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode -breaker

You can also run the following command line to convert the PDF file to a text file without page break symbols:

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode

We hope the "-unicode" parameter will work better for you; please give it a try.
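
For anyone scripting the two command lines above, here is a minimal Python sketch. It uses only the install path and the -unicode and -breaker parameters quoted in this reply. The helper names `build_command` and `convert` are hypothetical, and whether pdf2txt.exe reports failures through its exit code is an assumption not confirmed in this thread.

```python
import subprocess

# Install path copied from the command lines above; adjust for your machine.
PDF2TXT = r"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe"


def build_command(pdf_path, txt_path, page_breaks=False):
    """Build the argument list for pdf2txt.exe (hypothetical helper)."""
    cmd = [PDF2TXT, pdf_path, txt_path, "-unicode"]
    if page_breaks:
        # -breaker asks the tool to insert a page break symbol into the output
        cmd.append("-breaker")
    return cmd


def convert(pdf_path, txt_path, page_breaks=False):
    """Run the conversion (hypothetical helper).

    check=True raises CalledProcessError if the process exits with a
    non-zero code; whether pdf2txt.exe does so on failure is an assumption.
    """
    subprocess.run(build_command(pdf_path, txt_path, page_breaks), check=True)
```

Passing the arguments as a list avoids quoting problems with the space in "Program Files". A call such as `convert(r"C:\in.pdf", r"C:\out.txt", page_breaks=True)` would correspond to the first command line.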

VeryPDF
==============================
Thank you, -unicode is much better.
FYI: I don't see a difference with or without -breaker, and the original formatting is kept with or without -format.

But the result is OK for me. License purchased.

thanks again,
==============================
If you use the -breaker parameter, the PDF2TXT software inserts a page break symbol (0x0C) after each page.
If you use the -unicode parameter, the -format parameter is ignored.
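
The 0x0C symbol is the ASCII form feed character, which many text editors do not display visibly; that may be why the output looks the same with and without -breaker. As a rough sketch, the converted text could be split into pages like this in Python. The function name `split_pages` is hypothetical, and the sketch works on an already decoded string because the encoding of the -unicode output file is not stated in this thread.

```python
FORM_FEED = "\x0c"  # the 0x0C page break symbol inserted by -breaker


def split_pages(text):
    """Split converted text into a list of pages (hypothetical helper)."""
    pages = text.split(FORM_FEED)
    # A break after the last page leaves an empty trailing piece; drop it.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


sample = "first page\x0csecond page\x0c"
for number, page in enumerate(split_pages(sample), start=1):
    print(number, page)
```

Counting the pages returned by `split_pages` is also a quick way to check that -breaker actually took effect.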

VeryPDF
