I find your pdf2txt.exe utility very useful and fast too, congratulations.
Before ordering I would like to make sure I can find a solution to the following problem:
with certain PDF files, all whitespace is removed whether I use the -whitespace command line option or not.
See examples attached.
Am I doing something wrong, or is this a known limitation?
thank you for your feedback,
best,
==============================
Please run the following command line to convert your PDF file to a text file and try again (the -breaker parameter inserts a page break character into the converted .txt file):
"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode -breaker
You can also run the following command line to convert the PDF file to a text file without page break characters:
"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode
We hope the "-unicode" parameter works better for you; please give it a try.
VeryPDF
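
For anyone scripting this conversion, here is a minimal Python sketch that assembles the same command lines as above. The helper names build_command and convert are hypothetical, the install path is copied from the reply and may differ on your machine, and only the -unicode and -breaker options mentioned in this thread are used. How pdf2txt.exe reports failure is not stated here, so the sketch returns the exit code without interpreting it.

```python
import subprocess

# Install path as quoted in the reply above; adjust for your machine.
PDF2TXT = r"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe"


def build_command(pdf_path, txt_path, page_breaks=False, exe=PDF2TXT):
    """Return the argument list for one pdf2txt.exe run (hypothetical helper)."""
    # -unicode is always passed, since it is the option that fixed the
    # missing-whitespace problem in this thread.
    cmd = [exe, str(pdf_path), str(txt_path), "-unicode"]
    if page_breaks:
        # -breaker adds a page break character after each page.
        cmd.append("-breaker")
    return cmd


def convert(pdf_path, txt_path, page_breaks=False):
    """Run the converter and return its exit code.

    Assumption: the meaning of the exit code is not documented in this
    thread, so the caller decides what to do with it.
    """
    result = subprocess.run(build_command(pdf_path, txt_path, page_breaks))
    return result.returncode
```

Passing the arguments as a list means the space in "Program Files" needs no manual quoting. A call such as convert(r"C:\in.pdf", r"C:\out.txt", page_breaks=True) corresponds to the first command line above.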
==============================
Thank you, -unicode is much better.
FYI: I don't see a difference with or without -breaker, and the original formatting is kept with or without -format.
But the result is OK for me. License purchased.
thanks again,
==============================
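If you use the -breaker parameter, PDF2TXT inserts a page break character (0x0C, form feed) after each page. Many text editors do not display this character, which may be why the output looks the same with and without -breaker.
If you use the -unicode parameter, the -format parameter is ignored.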
VeryPDF
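
Since the 0x0C character is easy to miss in an editor, a short Python sketch may help show what -breaker adds: it splits converted text back into pages at each form feed. The function names split_pages and read_pages are hypothetical, and the encoding of the file written with -unicode is not stated in this thread, so the caller has to supply it (for example "utf-16" or "utf-8", whichever matches the actual output).

```python
def split_pages(text):
    """Split converted text into pages at each 0x0C (form feed)."""
    pages = text.split("\x0c")
    # The break is written after each page, so the text after the last
    # break is normally empty; drop that trailing chunk.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


def read_pages(txt_path, encoding):
    """Read a converted .txt file and return a list of page strings.

    Assumption: the caller knows the file's encoding; it is not
    documented in this thread.
    """
    with open(txt_path, encoding=encoding) as handle:
        return split_pages(handle.read())
```

If the returned list has one entry per PDF page, the page breaks are present even though the file looks unchanged in an editor.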