VeryPDF Knowledge Base

Structured Data Extraction from PDF with VeryPDF pdf2Data

2023/03/29, updated 2023/03/30

Introducing VeryPDF pdf2Data, a simple and efficient solution for extracting structured data from PDF documents. The tool is available as an SDK for Java and C# (.NET), and as a command-line (CLI) version.

With VeryPDF pdf2Data, you can intelligently recognize and extract data from PDF documents using selection rules that you define in a template. This approach offers a significant advantage over AI-based solutions that require extensive training to recognize documents.

https://www.verypdf.com/app/pdf-extract-tool/index.html

Moreover, the intuitive pdf2Data Editor is browser-based and allows anyone, from marketers to information managers to HR staff, to create and update templates. You don't need to be a developer to benefit from VeryPDF pdf2Data's user-friendly template-based solution for PDF data extraction.

If your documents are not in PDF format, don't worry! VeryPDF has got you covered. The VeryPDF OCR to Any Converter Command Line can turn scanned documents and images into PDF, or PDF/A for long-term archiving compliance. Once converted, the documents are ready to be processed by VeryPDF pdf2Data.

== Why Choose VeryPDF pdf2Data?
Data is a valuable asset, and you may have more of it than you realize locked inside your PDF documents. Manual data collection can be time-consuming and resource-intensive, with the risk of input errors or security issues to consider.

With VeryPDF pdf2Data, you can automate the data capture process and extract data in a secure manner. By creating a template from a single reference file, VeryPDF pdf2Data allows you to recognize and extract data from all PDFs that follow the same predictable format. This extraction method provides you with a high level of confidence from the outset, without requiring extensive datasets for training recognition models.

VeryPDF pdf2Data templates are flexible and reusable, so there is no need to redefine extraction rules for each new document from scratch. Instead, you can easily modify or reuse existing templates to process documents with new or different layouts.

== Core Capabilities of VeryPDF pdf2Data
VeryPDF pdf2Data operates by defining the areas, fonts, patterns, or tables of interest in a template that is used for all PDFs created in the same format, such as an invoice or other commercial documents.

You can then define areas of interest with data field selectors. Each selector uses a different method of identifying important information. Selectors can also be combined to fine-tune data identification and capture based on your requirements.

The data is output in a structured, reusable format for further processing, with access to the page coordinates of the extracted content.

== What Does VeryPDF pdf2Data Do?
Many businesses deal with PDF documents that follow a predictable structure, such as invoices and registration forms. These documents often contain specific information, such as invoice numbers, supplier addresses, and purchase order numbers, located in the same place. Although the content of these documents, such as item descriptions, quantities, and costs, can vary, businesses can utilize a template based on a sample invoice to identify and categorize the data they need to extract.

VeryPDF pdf2Data offers a simple way to extract data from these types of PDF documents by creating a template that outlines the specific areas and rules for extracting the desired content. The template can be visually validated with other documents to ensure accurate data recognition before it is processed by the pdf2Data software development kit (SDK) for all subsequent documents that match the template.

Unlike AI-based data extraction solutions, VeryPDF pdf2Data does not require hundreds of samples or intensive supervision to train the recognition process. Instead, the template configuration controls the content recognition, which means no training is necessary before starting the data extraction process. With just one example document, businesses can extract data from all subsequent documents.

AI recognition solutions have their limitations. Any changes in the required output, such as adding a new field, will necessitate retraining the models, and multiple language support is minimal at best. Documents that have the same layout but different content in various languages can yield inconsistent results.

Fortunately, VeryPDF pdf2Data has none of these drawbacks. Modifying templates is fast and easy, and it offers excellent language support. It also provides robust table recognition functionality, which is one of the primary limitations of other data extraction solutions.

== How Does VeryPDF pdf2Data Work?
Want to know how VeryPDF pdf2Data works? With its intuitive browser-based pdf2Data Editor, creating a template for data extraction is a breeze. All you need to do is create a template PDF by defining data field selectors for areas of interest based on a sample document. These selectors are configurable rules that can detect different types of content for extraction.

VeryPDF pdf2Data comes with approximately two dozen selectors that can intelligently recognize and extract text, images, and barcodes. You can configure these selectors to detect various parameters such as page range, position on the page, specific font styles and colors, text patterns, fixed keywords next to the data, and even automatic recognition of table structures. Additionally, you can combine multiple selectors to fine-tune the detection parameters to your liking.

Once you have created your extraction template, it can be used to parse all future PDFs that match the template. To test your extraction template and ensure that the data field selectors are configured correctly to recognize the data you require, you can use the pdf2Data Editor to upload a document.
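
To make the template idea concrete, here is a minimal Python sketch of rule-based field extraction. It is not the pdf2Data API or its template format. It assumes the page text has already been extracted as a plain string, and the `TEMPLATE` dictionary, the field names, and the sample invoice text are all hypothetical. Each rule pairs a fixed keyword with a text pattern, loosely mirroring the "fixed keywords next to the data" and "text patterns" selectors described above.

```python
import re

# Hypothetical template: field name -> regex with exactly one capture group.
TEMPLATE = {
    "invoice_number": r"Invoice\s+No\.?:\s*([A-Z0-9-]+)",
    "po_number": r"PO\s+Number:\s*(\d+)",
    "total": r"Total:\s*\$?([\d,]+\.\d{2})",
}

def extract_fields(text, template):
    """Apply every rule to the text; fields that are not found map to None."""
    result = {}
    for field, pattern in template.items():
        match = re.search(pattern, text)
        result[field] = match.group(1) if match else None
    return result

# Made-up text standing in for the content of one invoice page.
SAMPLE_TEXT = """ACME Supplies
Invoice No: INV-2023-0042
PO Number: 778812
Total: $1,250.00
"""

fields = extract_fields(SAMPLE_TEXT, TEMPLATE)
```

The same `TEMPLATE` can be applied to every document that follows the layout, and adding a field only means adding one rule, which is the property the template approach relies on.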

== Schedule a demo
If you want to see how VeryPDF pdf2Data works in action, you can schedule a demo. Upon submitting your request, a specialist will contact you within two business days to inquire about details and set up a time for the walk-through. The link for the demo will be sent to you shortly after. In case you do not receive it, make sure to check your "spam" or "junk email" folders.


docprint pro, hookprinter, mini emf printer driver, verypdf sdk & com

Introducing VeryPDF Dummy Network Shared Printer Solution

2023/03/29

If you need to capture print jobs and save them as image or PDF files, VeryPDF's Dummy Network Shared Printer Solution can be a valuable tool for you. With this solution, you can create a virtual printer on your computer that captures print jobs from other computers on the network and saves them as BMP, TIFF, JPG, or PDF files.

With VeryPDF Dummy Network Shared Printer Solution, you can easily create a dummy network shared printer to capture print jobs.

The solution works as follows:

1. Download and install VeryPDF Dummy Printer software on the computer where you want to create the virtual printer.

2. Connect your Dummy Printer to the target printer that you want to forward print jobs to by selecting your Dummy Printer and clicking on the "Set Printer" button. Choose the target printer from the list and click "OK".

3. Set additional options for your Dummy Printer, such as output format, output folder, and file naming conventions. These options can be accessed by selecting your Dummy Printer and clicking on the "Printer Options" button.

4. Share your Dummy Printer as a network printer so that other users on the network can print to it. Right-click on your Dummy Printer and choose "Printer Properties". Go to the "Sharing" tab and select "Share this printer". You can also give your printer a share name that will be displayed to other users.

5. When a user prints a document to the Dummy Printer, it will store the print job as an image or PDF file, and then forward the job to the target printer.

By following these simple steps, you can create a Dummy Network Shared Printer that accepts print jobs from any device on your network, including Windows, Mac, iOS, and Android systems. VeryPDF's solution is easy to use and can help you monitor print activity or troubleshoot printing issues. Contact VeryPDF today and see how it can improve your printing workflow!


pdf text replacer

Effortlessly Replace Text in PDF Files with PDF Text Replacer Command Line

2023/03/29

Software: PDF Text Replacer
When I use the text replace software, the generated file is more than three times the size of the original: it grows from 205 KB to 773 KB.

Second, the text is not actually replaced but only made transparent, so it can still be recovered by copying and pasting from that location.

Command used:
c:\pdftr\pdftr.exe -searchandoverlaytext "AUTHORISED SIGNATORY=> ||SAPAN S CHOKSI=> ||Signature & Date=> ||For VENUS JEWEL=> ||FOR VENUS JEWEL=> ||SAPAN S CHOKSI=> " c:\digi_files\vj_e_96887_inv.pdf c:\digi_files\vj_e_96887_invdigi.pdf

Customer
----------------------------------

https://www.verypdf.com/app/pdf-text-replacer/try-and-buy.html

Thanks for your message. Please use the "-contentreplace" option instead of the "-searchandoverlaytext" option and try again:

pdftr.exe -contentreplace "AUTHORISED SIGNATORY=> ||SAPAN S CHOKSI=> ||Signature & Date=> ||For VENUS JEWEL=> ||FOR VENUS JEWEL=> ||SAPAN S CHOKSI=> " "D:\Downloads\rebuilt.FW\vj_e_96887_inv.pdf" "D:\Downloads\rebuilt.FW\vj_e_96887_inv_out.pdf"

The "-contentreplace" option solves the following two problems:

1. The generated file is more than three times the size of the original (it grows from 205 KB to 773 KB).

2. The text is not replaced but only made transparent, so it can still be recovered by copying and pasting from that location.
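
The rule string passed to both options in the commands above appears to be a list of `search=>replacement` pairs joined by `||`. That reading is inferred from the example only, not from the pdftr documentation. Under that assumption, a small Python helper (hypothetical, not part of PDF Text Replacer) can build the string from a list of pairs, which helps avoid typos when many phrases need replacing.

```python
def build_rules(pairs):
    """Join (search, replacement) pairs as 'search=>replacement||...'.

    The '=>' and '||' separators are inferred from the example command,
    so treat this format as an assumption.
    """
    for search, replacement in pairs:
        for part in (search, replacement):
            if "=>" in part or "||" in part:
                raise ValueError(f"separator found inside text: {part!r}")
    return "||".join(f"{s}=>{r}" for s, r in pairs)

def parse_rules(rule_string):
    """Split a rule string back into (search, replacement) pairs."""
    pairs = []
    for item in rule_string.split("||"):
        search, sep, replacement = item.partition("=>")
        if not sep:
            raise ValueError(f"missing '=>' in rule: {item!r}")
        pairs.append((search, replacement))
    return pairs

# Each phrase is replaced by a single space, as in the command above.
rules = build_rules([
    ("AUTHORISED SIGNATORY", " "),
    ("Signature & Date", " "),
])
```

The resulting `rules` string would then be passed, quoted, as the argument after "-contentreplace".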

VeryPDF
----------------------------------
PDF Text Replacer Command Line is a powerful software tool designed to replace or edit text in PDF files quickly and easily. Whether you need to update a date, change a name, or replace a word or phrase, this command line tool provides a simple and efficient solution for all your PDF text replacement needs.

PDF Text Replacer Command Line is easy to use. You can replace text in PDF files with just a few simple commands, making it a great choice for both experienced users and those who are new to working with command line tools. The software supports a wide range of text formats, including Unicode, which means you can replace text in multiple languages.

PDF Text Replacer Command Line is also highly customizable. You can specify the exact location in the PDF file where you want to replace text, and you can also choose to replace text in a specific page range or even in multiple PDF files at once. This makes it a great tool for businesses, academic institutions, or anyone who needs to replace text in large volumes of PDF files.

PDF Text Replacer Command Line can also perform batch processing. You can use the software to replace text in multiple PDF files simultaneously, saving you a significant amount of time and effort. This feature is especially useful for businesses or organizations that need to update large volumes of documents on a regular basis.

PDF Text Replacer Command Line is a powerful and versatile software tool that can save you time and effort when it comes to replacing or editing text in PDF files. Its ease of use, customizability, and batch processing capabilities make it a great choice for businesses, academic institutions, and individuals alike. Whether you're looking to update a single PDF file or thousands of documents, this tool provides a reliable and efficient solution.


ocr products, pdf to text ocr command line, pdf to word ocr converter, scan to word ocr converter, screen ocr, table extractor ocr

Automate Document Conversion with Highly Accurate OCR – VeryPDF Server OCR Software

2023/03/27

VeryPDF Server OCR Software: Streamlining Document Conversion with Powerful Scan and OCR Functions.

In today's fast-paced business world, digitization is the key to efficiency and productivity. Converting paper documents to electronic files is an essential step in this process, and VeryPDF Server OCR Software is a powerful tool to accomplish this task. The software is designed to automate high-volume conversion of scanned paper and image documents to searchable PDF files, making it an ideal solution for enterprise document scanning, archiving, and digitization.

https://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html

One of the key features of VeryPDF Server OCR Software is its highly accurate OCR engine. This engine is capable of recognizing and converting scanned documents with exceptional accuracy, minimizing the need for manual processing. This not only saves time but also reduces the costs associated with manual data entry. Additionally, the software's batch OCR and multi-threading capabilities enable rapid, high-volume processing of large numbers of documents.

The software's watch folder mode is another powerful feature that enables touchless, automated OCR. This mode monitors a designated folder and automatically converts any scanned documents that are placed into it. This eliminates the need for manual intervention, making the process of document conversion even more efficient.
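
The watch folder idea can be illustrated with a short polling loop. The Python sketch below is not how VeryPDF Server OCR Software is implemented or configured. Here `convert` is a hypothetical placeholder for whatever OCR step you run, and the file extensions and polling interval are assumptions.

```python
import tempfile
import time
from pathlib import Path

# Assumed set of scan file types; adjust to your own inputs.
SCAN_EXTENSIONS = {".tif", ".tiff", ".jpg", ".png", ".pdf"}

def find_new_files(folder, seen):
    """Return scan files in `folder` that are not yet in `seen`, and record them."""
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        is_scan = path.is_file() and path.suffix.lower() in SCAN_EXTENSIONS
        if is_scan and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files

def watch(folder, convert, poll_seconds=5.0, max_polls=None):
    """Poll `folder` and call `convert(path)` once for each new file."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(folder, seen):
            convert(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)

# Demo: a temporary folder and a stand-in "convert" that only records file names.
converted = []
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "contract.tif").write_bytes(b"")
    (Path(tmp) / "notes.txt").write_text("ignored")
    watch(tmp, lambda p: converted.append(p.name), poll_seconds=0, max_polls=2)
```

A production watcher would also need to wait until a file has been completely written before converting it, which this sketch does not attempt.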

VeryPDF Server OCR Software's robust OCR functionality includes confidence reports and multiple voting engines. This ensures that the converted documents are accurate and of high quality. The software also provides control over the output generated, with the ability to generate documents in 13 different formats, including PDF and PDF/A.

The software's scan to PDF feature streamlines workflows by converting paper contracts, agreements, and other documents to electronic PDF files in a single step. This saves time and reduces the risk of errors associated with manual data entry. The optical character recognition (PDF OCR) feature enables the conversion of scanned or image-based content into selectable, searchable, and editable text. This is particularly valuable when dealing with hardcopy documents.

In addition, VeryPDF Server OCR Software allows users to edit text in scanned PDF documents. The software's PDF OCR feature generates editable text from scanned documents, enabling paragraph editing. This is especially useful when dealing with hardcopy documents that need to be converted to electronic format.

The software also provides the ability to correct suspect OCR PDF results. This feature enables users to find and correct incorrect OCR PDF results, which is essential for accurate file indexing for effective PDF searching. Finally, the software allows for the insertion of scan to PDF pages directly into an existing PDF document, further streamlining workflows.

In conclusion, VeryPDF Server OCR Software is a powerful tool for automating high-volume document conversion. Its highly accurate OCR engine, watch folder mode, batch OCR and multi-threading capabilities, and robust OCR functionality make it an ideal solution for enterprise document scanning, archiving, and digitization. With its ability to generate documents in 13 different formats and its features for editing scanned PDF documents and correcting suspect OCR PDF results, the software offers a comprehensive solution for all document conversion needs.


pdf split-merge, pdf stamp

Do you have 64bit PDF Stamper SDK Component and 64bit PDF Split-Merge SDK for Developers?

2023/03/23

Hi there

We are using PDF Stamper SDK Component in our app, and would like to upgrade to 64bit.

Do you have a 64bit version of this component available?

Best regards,
Customer
-----------------------
The latest version of PDF Stamper SDK already supports 64bit systems. Please follow these steps to call PDF Stamper SDK from your 64bit application:

1. Download the latest version of PDF Stamper SDK from the following web page:

https://www.verypdf.com/app/pdf-stamp/try-and-buy.html#buy-sdk
https://www.verypdf.com/dl2.php/pdfstamp_sdk.zip

2. Run "bin/install_for_x64.bat" with administrator privileges to register PDFStampCom.exe. Alternatively, open a CMD window with administrator privileges and run the following command to register it manually:

PDFStampCom.exe /regserver

3. Run "test-pdfstampcom-x64.vbs" to test the stamp functions. Please refer to the VBScript sample code below; the "PDFStampCOM.CPDFStamp" COM interface can be called from both 32bit and 64bit applications:

Const Very_Set_Range = 131
Const Very_Set_Opacity = 240
Const Very_Get_Opacity = 240
Const Very_Get_PdfPageCount = 206
Const Very_Get_PageBoxForStamp = 260
Const Very_Set_EmbedFont = 241
Const Very_Get_EmbedFont = 241
Const Very_Set_TransparentColor = 242
Const Very_Set_ImageLossless = 243
Const Very_Set_InsertMultipleImageCopy = 244

Set fso = CreateObject("Scripting.FileSystemObject")
strFolder = fso.GetParentFolderName(wscript.ScriptFullName)

strPDFFile = strFolder & "\example.pdf"
strOutFile = strFolder & "\vbcom-test.pdf"
Set pdfstamp = CreateObject("PDFStampCOM.CPDFStamp")
pdfstamp.veryRegEx "XXXXXXXXXXXXXX"
id = pdfstamp.veryOpenEx(strPDFFile, strOutFile)
'id = pdfstamp.VeryStampLayerOpenEx(strPDFFile, strOutFile, "PDFManWatermark_Overlayer", "PDFManWatermark_Underlayer")
If (id > 0) Then
    Page = 1
    iRet = pdfstamp.verySetFunctionEx(id, Very_Set_InsertMultipleImageCopy, 1, 0, 0, 0)
    pagecount = pdfstamp.veryGetFunctionEx(id, Very_Get_PdfPageCount, 0, 0, 0, 0)
    MsgBox "PDF file: " & strPDFFile & ", Page Count = " & CStr(pagecount)
    For Page = 1 To pagecount

        leftpos = pdfstamp.veryGetFunctionEx(id, Very_Get_PageBoxForStamp, Page, 0, "left", 0)
        top = pdfstamp.veryGetFunctionEx(id, Very_Get_PageBoxForStamp, Page, 0, "top", 0)
        pagewidth = pdfstamp.veryGetFunctionEx(id, Very_Get_PageBoxForStamp, Page, 0, "width", 0)
        pageheight = pdfstamp.veryGetFunctionEx(id, Very_Get_PageBoxForStamp, Page, 0, "height", 0)

        'Control which pages the stamp goes on
        iRet = pdfstamp.verySetFunctionEx(id, Very_Set_Range, Page, 1, 0, 0)
        strStampBuf = "Page:" + CStr(pagecount - Page + 1) + ", Page Box: [" + CStr(leftpos) + " " + CStr(top) + " " + CStr(pagewidth) + " " + CStr(pageheight) + "]"

        'Do not embed a TTF font for the general text stamp
        iRet = pdfstamp.verySetFunctionEx(id, Very_Set_EmbedFont, 0, 0, 0, 0)
        iRet = pdfstamp.veryAddTextEx(id, 2, strStampBuf, 255, 0, 0, 0, 0, 0, 0, 300, 0, 10, 1, "https://www.verypdf.com", 0)

        'Embed a special TTF font into the PDF file; any TTF font can be used
        iRet = pdfstamp.verySetFunctionEx(id, Very_Set_EmbedFont, 1, 0, 0, 0)
        iRet = pdfstamp.veryAddTextEx(id, 1, "0123456789", RGB(0, 0, 0), 0, 0, 0, 0, 0, 0, 0, "Code-39-20", 8, 1, "https://www.verypdf.com/", 0)

        iRet = pdfstamp.verySetFunctionEx(id, Very_Set_ImageLossless, 1, 0, 0, 0)
        iRet = pdfstamp.veryAddImageEx(id, 9, strFolder & "\watermark.gif", 0, 0, 0, 0, 50, 50, 0, 0, 0)
    Next
    pdfstamp.veryCloseEx (id)
End If

iRet = pdfstamp.VeryStampDeleteStampFromPagesEx(strOutFile, strOutFile + "-StampRemoval.pdf", "1,5-6,8,13-15")

iRet = pdfstamp.VeryStampDeleteImagesFromPagesEx(strOutFile, strOutFile + "-ImageRemoval.pdf", 1240, 1240, "1,5-6,8,13-15")

nIsStamped1 = pdfstamp.VeryStampIsStampedEx(strPDFFile)
nIsStamped2 = pdfstamp.VeryStampIsStampedEx(strOutFile)
strMessage = strPDFFile & ", Check Stamp Status: " & CStr(nIsStamped1) & vbCrLf
strMessage = strMessage & strOutFile & ", Check Stamp Status: " & CStr(nIsStamped2)
MsgBox strMessage

strOutFile = strFolder & "\vbcom-newLayer-test.pdf"
id = pdfstamp.VeryStampLayerOpenEx(strPDFFile, strOutFile, "Foreground Layer", "Background Layer")
If (id > 0) Then
    iRet = pdfstamp.veryAddTextEx(id, 2, "Test VeryStampLayerOpen() function", 255, 0, 0, 0, 0, 0, 0, 300, 0, 10, 1, "https://www.verypdf.com", 0)
    pdfstamp.veryCloseEx (id)
End If

strOutFile = strFolder & "\vbcom-Encrypted-test.pdf"
iRet = pdfstamp.VeryStampEncryptPDFEx(strPDFFile, strOutFile, "", "123", 1, 3900)

4. You can now call "PDFStampCOM.CPDFStamp" from your 64bit application without any problem.
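
The page list strings used in the script above, such as "1,5-6,8,13-15", combine single pages and page ranges. When such strings are built programmatically, it can help to validate them first. The Python sketch below is a hypothetical helper, not part of PDF Stamper SDK, and it assumes 1-based page numbers and inclusive ranges.

```python
def expand_page_list(spec):
    """Expand a string like '1,5-6,8,13-15' into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty entry in page list")
        first, sep, last = part.partition("-")
        start = int(first)
        # A single page such as '8' is treated as the range 8-8.
        end = int(last) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)

pages = expand_page_list("1,5-6,8,13-15")
```

Comparing the largest expanded page number against the page count returned by the SDK is one way to catch out-of-range requests before calling the removal functions.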

VeryPDF
-----------------------
Hi again

We are also looking for a PDF Split-Merge SDK component in 64 bit. Do you have this as well?

Best regards,
Customer
-----------------------

Thanks for your message. Yes, we have a 64bit version of PDF Split-Merge SDK Developer License. The cost is USD1200, and you may buy it from our website directly:

https://www.verypdf.com/order_pdfpgsdk_dev.html

You may also download the trial version of PDF Split-Merge SDK Developer License from this web page:

https://www.verypdf.com/app/pdf-split-merge/try-and-buy.html#buy-sdk

After you download and unzip it to a folder, go to the "bin" folder and run "install.vbs" to register "VeryPDFSplitMergeCOM.exe" on your system. You can then use the following C# source code to split and merge PDF files:

// These handlers belong in a Windows Forms class and require:
// using System; using System.IO; using System.Windows.Forms;

// Merge: append testcmd.pdf to a copy of itself four times, then report the page count.
private void button1_Click(object sender, EventArgs e)
{
    string appPath = Path.GetDirectoryName(Application.ExecutablePath);
    System.Type VeryPDFSplitMergeCOMType = System.Type.GetTypeFromProgID("VeryPDFSplitMergeCOM.com");
    VeryPDFSplitMergeCOM.com VeryPDFSplitMergeCOM = (VeryPDFSplitMergeCOM.com)System.Activator.CreateInstance(VeryPDFSplitMergeCOMType);

    VeryPDFSplitMergeCOM.com_PDF_SetCode("XXXXXXXXXXXXXXXXXXXXXXXXXXX");

    string szPDFFile = appPath + "\\_out1.pdf";
    System.IO.File.Copy(appPath + "\\testcmd.pdf", szPDFFile, true);

    int nRet = 0;
    nRet = VeryPDFSplitMergeCOM.com_VeryAppendPDF(szPDFFile, appPath + "\\testcmd.pdf");
    nRet = VeryPDFSplitMergeCOM.com_VeryAppendPDF(szPDFFile, appPath + "\\testcmd.pdf");
    nRet = VeryPDFSplitMergeCOM.com_VeryAppendPDF(szPDFFile, appPath + "\\testcmd.pdf");
    nRet = VeryPDFSplitMergeCOM.com_VeryAppendPDF(szPDFFile, appPath + "\\testcmd.pdf");
    int nPageCount = VeryPDFSplitMergeCOM.com_VeryGetPDFFilePageCount(szPDFFile);
    MessageBox.Show(szPDFFile + " contains " + nPageCount.ToString() + " pages.");
}

// Split: extract pages 2-5 of testcmd.pdf into a new file, then report the page count.
private void button2_Click(object sender, EventArgs e)
{
    string appPath = Path.GetDirectoryName(Application.ExecutablePath);
    System.Type VeryPDFSplitMergeCOMType = System.Type.GetTypeFromProgID("VeryPDFSplitMergeCOM.com");
    VeryPDFSplitMergeCOM.com VeryPDFSplitMergeCOM = (VeryPDFSplitMergeCOM.com)System.Activator.CreateInstance(VeryPDFSplitMergeCOMType);

    VeryPDFSplitMergeCOM.com_PDF_SetCode("XXXXXXXXXXXXXXXXXXXXXXXXXXX");

    string szPDFFile = appPath + "\\testcmd.pdf";
    VeryPDFSplitMergeCOM.com_VerySplitMergePDFEx(szPDFFile, "2-5", appPath + "\\_page2-5.pdf");
    int nPageCount = VeryPDFSplitMergeCOM.com_VeryGetPDFFilePageCount(appPath + "\\_page2-5.pdf");
    MessageBox.Show(appPath + "\\_page2-5.pdf" + " contains " + nPageCount.ToString() + " pages.");
}

You may also refer to the sample on the following web page:

https://www.verypdf.com/wordpress/201404/verypdf-release-notes-split-and-merge-pdf-files-from-c-source-code-40394.html

>>Do you have any documentation for using "PDF Split-Merger Pdf PDF" for Classic ASP?

You can find all documents about PDF Split-Merge SDK on the following web pages:

https://www.verypdf.com/wordpress/category/pdf-split-merge
https://www.verypdf.com/wordpress/?s=VeryPDFSplitMergeCOM

If you encounter any problem with this product, please feel free to let us know.

VeryPDF
