How can I extract text from PDF files using Visual Basic?

Question: I know there are many PDF extraction methods and techniques, but I'm after a reliable text extractor for PDFs that works from Visual Basic. All I want is to extract words, without numbers or special characters. Is there a VeryPDF product that can do this?

Answer: VeryPDF offers two products that can extract text from PDF: VeryPDF PDF to TXT Converter and VeryPDF PDF to Text OCR Converter Command Line. Both convert PDF to text and can be run from Visual Basic. The first cannot process image PDFs, scanned PDFs, or other scanned image files; the second has an OCR function that covers those cases. Below, I take the first one as the example to show how to extract text from PDF with Visual Basic. For more details, please visit each product's homepage.

Step 1. Download VeryPDF PDF to TXT COM for free

  • There are two license types to choose from: server version and developer version. If you are not using this software for development, simply choose the server version, which costs $199.00 and can be used for life.
  • When the download finishes, you will have a zip file. Extract it to a folder to see the included components. When using the software, refer to the code templates in the extracted folder.

Step 2. Extract text from PDF in Visual Basic

  • With the COM version of PDF to Text Converter, you can integrate the software into applications written in C/C++, Delphi, ASP, PHP, C#, and .NET.
  • Here is a code template for extracting text from PDF in Visual Basic:

Private Sub bconvert_Click()
    Dim TxtFile As String

    ' Show only PDF files in the Open dialog
    cdialog.Filter = "PDF Files (*.PDF)|*.PDF"
    cdialog.ShowOpen

    ' Continue only if a file was selected
    If cdialog.FileName <> "" Then
        ' Write the output next to the application executable
        TxtFile = App.Path + "\Temp.Txt"

        ' Delete any output left over from a previous run
        If Dir(TxtFile) <> "" Then
            Kill TxtFile
        End If

        ' Register the PDF2TXT SDK with your license code
        PDF2TXTSetLicenseCode "XXXXXXXXX"
        SetTXTFormat 1
        SetGetDocumentSummary 0
        SetPageSeparator "*****************>>>>>>>>>>>>>>>"

        ' Convert the file
        pdf2txt cdialog.FileName, TxtFile
        ' Open the text file in its associated application (usually Notepad)
        ShellExecute 0, "open", TxtFile, "", "", SW_SHOW

        ' Convert again with pdf2txtEx, writing to a second file
        TxtFile = App.Path + "\Temp2.Txt"
        pdf2txtEx cdialog.FileName, TxtFile, 0, 0, "", ""
        ShellExecute 0, "open", TxtFile, "", "", SW_SHOW
    End If
End Sub
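
The template above calls ShellExecute and uses SW_SHOW without showing where they come from. Both belong to the Windows API, not to the PDF2TXT SDK. In a VB6 form they would typically be declared as in the sketch below, placed in the form's declarations section. The declarations for the PDF2TXT functions (pdf2txt, pdf2txtEx, and so on) are not reproduced here because the post does not show them; take those from the code templates shipped in the extracted folder.

```vb
' Standard Win32 declaration of ShellExecute for VB6 (ANSI version).
' Must be Private when placed in a form module.
Private Declare Function ShellExecute Lib "shell32.dll" Alias "ShellExecuteA" ( _
    ByVal hwnd As Long, _
    ByVal lpOperation As String, _
    ByVal lpFile As String, _
    ByVal lpParameters As String, _
    ByVal lpDirectory As String, _
    ByVal nShowCmd As Long) As Long

' Show-window constant from the Windows API: activate and display the window.
Private Const SW_SHOW As Long = 5
```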

Please note that this software can only convert text-based PDF to text. It supports nearly all common languages, including English, French, German, Italian, Chinese Simplified, Chinese Traditional, Czech, Danish, Dutch, Japanese, Korean, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, and Thai. To process image PDFs, please use the product with the OCR function. If you have any questions while using the software, please contact us.
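
The question also asked for words only, without numbers or special characters. The converter writes everything it finds in the PDF, so one option is to filter the text after conversion. The function below is a minimal sketch of that idea and is not part of the VeryPDF SDK: KeepWordsOnly is a hypothetical helper name, and it assumes the text contains plain ASCII letters only. Accented letters and the non-Latin languages listed above would need a broader letter test. Note that a token such as "abc123def" is split into "abc def" instead of being dropped.

```vb
' Hypothetical helper (not part of the SDK): keeps runs of ASCII letters
' and replaces every run of other characters with a single space.
Public Function KeepWordsOnly(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        If ch Like "[A-Za-z]" Then
            ' Letter: keep it
            result = result & ch
        ElseIf Len(result) > 0 And Right$(result, 1) <> " " Then
            ' Digit, punctuation, or whitespace: emit one separator space
            result = result & " "
        End If
    Next i

    KeepWordsOnly = Trim$(result)
End Function
```

For example, KeepWordsOnly("Page 12: total = 3,400 USD") returns "Page total USD". To apply it to the converter output, read Temp.Txt into a string first and pass that string to the function. For very large files, building the result by repeated concatenation is slow, so a buffer-based approach may be preferable.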
