Can you please provide a sample API URL for extracting text from an image, with positions, into a text file?

Can you please provide us with a sample API URL that we can use to extract the text of an image, together with the position of each word, into a text file?

Customer
-------------------------------------------------------
Sure, no problem. You can open the following URL in a web browser:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/multipage.tif&outfile=out&lang=eng&format
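
If you prefer to build the request URL in code rather than by hand, the following is a minimal Python sketch. The function name `build_ocr_url` and its defaults are only illustrative; the parameter names (`apikey`, `app`, `infile`, `outfile`, `lang`) are taken from the sample URL above. The sample URL also carries a `format` parameter whose value is not shown, so the sketch leaves it out.

```python
from urllib.parse import urlencode

# Endpoint taken from the sample URL above.
API_ENDPOINT = "http://online.verypdf.com/api/"


def build_ocr_url(api_key, infile, outfile="out", lang="eng"):
    """Assemble the OCR request URL (hypothetical helper, not part of the API)."""
    params = {
        "apikey": api_key,
        "app": "ocr",
        "infile": infile,
        "outfile": outfile,
        "lang": lang,
        # The sample URL also has a "format" parameter; its value is not
        # shown there, so it is intentionally omitted from this sketch.
    }
    # Keep ":" and "/" unescaped so the result looks like the sample URL.
    return API_ENDPOINT + "?" + urlencode(params, safe=":/")


if __name__ == "__main__":
    print(build_ocr_url(
        "XXXXXXXXXXXXX",
        "http://online.verypdf.com/examples/cloud-api/multipage.tif",
    ))
```

Replace the placeholder API key with your own key before opening the resulting URL.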

You will get an output URL, e.g.,

[Output] http://online.verypdf.com/u/public/api/20140703-214752-683223135-out.html
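
If you call the API from a script, you can pick the output URL out of the response text. This is a small sketch that assumes the response is plain text containing a line in the `[Output] <URL>` form shown above; `extract_output_url` is a hypothetical helper name.

```python
import re


def extract_output_url(response_text):
    """Return the URL that follows "[Output]", or None if there is none.

    Assumes the response is plain text in the "[Output] <URL>" form.
    """
    match = re.search(r"\[Output\]\s*(\S+)", response_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    sample = "[Output] http://online.verypdf.com/u/public/api/20140703-214752-683223135-out.html"
    print(extract_output_url(sample))
```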

Please open http://online.verypdf.com/u/public/api/20140703-214752-683223135-out.html in a web browser and view its source code. You will see source code like the following:

-------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name='ocr-system' content='VeryPDF Cloud API System -- VeryPDF Cloud OCR API' />
<meta name='ocr-capabilities' content='ocr_page ocr_carea ocr_par ocr_line ocrx_word'/>
</head>
<body>
<div class='ocr_page' id='page_1' title='image "20140703-214752-1177385112.tif"; bbox 0 0 2000 2388; ppageno 0'>
<div class='ocr_carea' id='block_1_1' title="bbox 22 28 1013 87">
<p class='ocr_par' dir='ltr' id='par_1' title="bbox 23 30 1012 83">
<span class='ocr_line' id='line_1' title="bbox 23 30 1012 83"><span class='ocrx_word' id='word_1' title="bbox 23 32 262 74">Universal</span> <span class='ocrx_word' id='word_2' title="bbox 278 31 569 73">Declaration</span> <span class='ocrx_word' id='word_3' title="bbox 586 31 637 73">of</span> <span class='ocrx_word' id='word_4' title="bbox 649 31 836 72">Human</span> <span class='ocrx_word' id='word_5' title="bbox 853 30 1012 83">Rights</span>
</span>
</p>
</div>
-------------------------------------

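The HTML code above contains the coordinates of each word in the TIFF image file. Each word is a span with class 'ocrx_word', and its title attribute holds the word's bounding box ("bbox") on the page.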
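
To turn this HTML output into a plain text file with one word and its position per line, you could parse it with a short script. The following is a sketch using only the Python standard library. It assumes the four `bbox` numbers are the left, top, right and bottom pixel coordinates, as is usual for hOCR-style output; the names `HocrWordParser` and `hocr_to_lines` are illustrative only.

```python
import re
from html.parser import HTMLParser

# Matches titles such as "bbox 23 32 262 74".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (x0, y0, x1, y1, text) for every 'ocrx_word' span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None  # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocrx_word":
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside an open word span.
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(self._bbox + ("".join(self._text).strip(),))
            self._bbox = None


def hocr_to_lines(hocr_html):
    """Return lines of the form 'x0 y0 x1 y1 word'."""
    parser = HocrWordParser()
    parser.feed(hocr_html)
    parser.close()
    return ["%d %d %d %d %s" % word for word in parser.words]


if __name__ == "__main__":
    sample = ('<span class="ocrx_word" id="word_1" '
              'title="bbox 23 32 262 74">Universal</span>')
    print("\n".join(hocr_to_lines(sample)))
```

Joining the returned lines with newlines and writing them to a file of your choice gives the text file with positions.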

You can also look at the following web page for more information:

https://www.verypdf.com/wordpress/201308/verypdf-cloud-api-platform-verypdf-ocr-cloud-api-online-ocr-engine-to-recognize-scanned-pdf-and-image-files-to-editable-document-formats-37980.html

VeryPDF

