How to use the VeryPDF Cloud API OCR SDK for automation and for getting the coordinates (position) of text in an image

Hi,

Greetings of the day.

We are developing a web application that requires an OCR SDK to get the position of particular text in an image.
We also require PDF to image conversion for the same purpose.

So my query is:
If we upload a PDF and send it to you for conversion, what will we get in return? Will you provide the converted images, or only URLs for the converted images?

Our requirement:
We want the converted images on our server for further processing. We also want your OCR SDK for automation and for getting the coordinates (position) of text in an image.

Please provide details on these matters as soon as possible.

Thanks & Regards,
Customer
---------------------------------------------------
>>If we upload a PDF and send it to you for conversion, what will we get in return? Will you provide the converted images, or only URLs for the converted images?

We will return the URLs of the converted images. You can easily download these images to your own server from your code.
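
As a rough illustration of the download step, here is a minimal Python sketch that saves a list of returned image URLs into a local folder. The function names (local_path_for, download_all) and the opener parameter are hypothetical helpers introduced for this example. They are not part of the VeryPDF API, and the sketch assumes the returned URLs can be fetched with a plain HTTP GET.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def local_path_for(url, dest_dir):
    """Map a result URL to a path inside dest_dir, keeping the remote file name."""
    name = os.path.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"URL has no file name: {url}")
    return os.path.join(dest_dir, name)


def download_all(urls, dest_dir, opener=urlopen):
    """Download every URL into dest_dir and return the list of saved paths.

    opener is injectable so the function can be exercised without network access.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = local_path_for(url, dest_dir)
        with opener(url) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

A call such as download_all(list_of_result_urls, "converted_images") would then leave the converted images on your own server for further processing.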

>>Our requirement:
>>We want the converted images on our server for further processing. We also want your
>>OCR SDK for automation and for getting the coordinates (position) of text in an image.
>>Please provide details on these matters as soon as possible.

Yes, this is possible. You can subscribe to the "VeryPDF Cloud API Platform" from the following web page:

https://www.verypdf.com/online/cloud-api/try-and-buy.html

After you have subscribed to the "VeryPDF Cloud API Platform", please send us your Order ID and we will send you the detailed instructions as soon as possible.

For testing purposes, you can open the following URL in a web browser:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr&infile=http://online.verypdf.com/examples/cloud-api/multipage.tif&outfile=out&lang=eng&format

You will get a result URL like the following in return:

http://online.verypdf.com/u/public/api/20140619-025657-1646062729-out.html
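
The same test request can be issued from code instead of a browser. The Python sketch below builds the query string from the parameters shown in the test URL (apikey, app, infile, outfile, lang). The trailing "format" parameter has no value in the test URL, so it is left out here. The function names are hypothetical, and it is an assumption that the response body is simply the result URL as plain text.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "http://online.verypdf.com/api/"


def build_ocr_request(api_key, infile_url, outfile="out", lang="eng"):
    """Build the OCR request URL using the parameters from the test URL."""
    params = {
        "apikey": api_key,
        "app": "ocr",
        "infile": infile_url,  # urlencode percent-encodes the nested URL
        "outfile": outfile,
        "lang": lang,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def request_ocr(api_key, infile_url):
    """Send the request and return the response body with whitespace stripped.

    Assumption: the body is the result URL as plain text.
    """
    with urlopen(build_ocr_request(api_key, infile_url)) as response:
        return response.read().decode("utf-8", errors="replace").strip()
```

Percent-encoding the infile value keeps its own "&" or "?" characters, if any, from being read as parameters of the outer request.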

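The returned HTML contains the coordinates of each word and each line in the "bbox" field of each element's title attribute. In the hOCR convention these four numbers are the left, top, right and bottom edges in pixels, from which "X, Y, Width, Height" can be derived. You can easily write a simple application to parse this HTML: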

<div class='ocr_page' id='page_1' title='image "20140619-025657-4646439576.tif"; bbox 0 0 2000 2388; ppageno 0'>
<div class='ocr_carea' id='block_1_1' title="bbox 22 28 1013 87">


Universal Declaration of Human Rights


</div>
<div class='ocr_carea' id='block_13_13' title="bbox 24 2434 1944 2587">


Nothing in this Declaration may be interpreted as implying for any State, group or person any
right to engage in any activity or to perform any act aimed at the destruction of any of the rights
and freedoms set forth herein.


</div>
</div>
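
A minimal Python sketch of such a parser, using only the standard library, is shown below. It assumes the output follows the hOCR convention, where "bbox" holds the left, top, right and bottom pixel edges, and it derives width and height from those. The class and function names are hypothetical, and the sketch expects well-formed markup with matching closing tags.

```python
import re
from html.parser import HTMLParser

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no closing tag


class BoxCollector(HTMLParser):
    """Collect every element whose title attribute carries a bbox."""

    def __init__(self):
        super().__init__()
        self.boxes = []   # one dict per element that has a bbox
        self._open = []   # stack of open elements (dict, or None if no bbox)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        match = BBOX_RE.search(attr.get("title") or "")
        box = None
        if match:
            x0, y0, x1, y1 = map(int, match.groups())
            box = {"id": attr.get("id"), "class": attr.get("class"),
                   "x": x0, "y": y0, "width": x1 - x0, "height": y1 - y0,
                   "words": []}
            self.boxes.append(box)
        self._open.append(box)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Attach text to the innermost open element that has a bbox.
        for box in reversed(self._open):
            if box is not None:
                box["words"].extend(data.split())
                break


def parse_boxes(html):
    """Return a list of dicts with id, class, x, y, width, height and words."""
    collector = BoxCollector()
    collector.feed(html)
    collector.close()
    return collector.boxes
```

Each returned dict can be written out with json.dumps if a JSON result is preferred over HTML.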

------------------------------------------------------------------

>>(1). As I already asked: will you provide the converted images after conversion, or only the URLs for these images?
>>Your reply was: you will provide URLs.
>>Requirement: That is fine, but we want the images on our server. Is there any way to get the images onto our server with these packages, or at extra cost?

VeryPDF Cloud API will return the URLs of the converted images to you. You can download these images to your own server and store them there.

>>(2). As per our next requirement, we want your OCR SDK for getting the position of text on images.
>>Your reply was: you will provide a URL to an HTML file that contains the coordinates of all text in our image.
>>Requirement: Can you provide these coordinates and information in a text file or in JSON format?

Yes, no problem. We will return text content that contains the coordinates of all text in your image. You can easily parse this text content to get the coordinates and words.

>>(3). Is there any search feature in your cloud OCR?
>>The type of search we require:
>>When we give you a piece of text that appears in the PDF, can you return the coordinates of that text, including every position where it is repeated on the image?

Yes, this is already supported in our Cloud API Platform. You can use the "PDF Extraction API" function to get the coordinates of each word in a PDF file. You can evaluate this function on the following web page:

https://www.verypdf.com/app/pdf-extract-tool/online.html

The output format for text positions looks like this:

//Text Positions for each Word
word: x=157.06..188.76 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Home'
word: x=197.88..257.12 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'PDF-Tools'
word: x=266.21..287.18 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Doc'
word: x=288.38..323.97 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'ument'
word: x=333.65..379.00 y=18.60..32.55 base=30.17 fontSize=11.52 rot=0 link=00000000 'Support'
word: x=65.66..182.76  y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Advanced'
word: x=190.02..237.43 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'PDF'
word: x=245.12..307.43 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Tools'
word: x=314.31..432.03 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Command'
word: x=439.29..488.87 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Line'
word: x=496.13..550.23 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'User'
word: x=557.23..643.64 y=46.95..72.70 base=68.32 fontSize=21.26 rot=0 link=00000000 'Manual'
word: x=8.87..62.31    y=86.81..100.7 base=98.39 fontSize=11.52 rot=0 link=00000000 'Version:'
word: x=65.64..94.2    y=86.81..100.7 base=98.39 fontSize=11.52 rot=0 link=00000000 'v2.0'
word: x=8.87..79.14    y=117.82..137  base=133.8 fontSize=15.95 rot=0 link=00000000 'Content'
word: x=79.86..133.67  y=155.91..169  base=167.4 fontSize=11.52 rot=0 link=00000000 'Overview'
word: x=79.86..131.13  y=172.74..186  base=184.3 fontSize=11.52 rot=0 link=00000000 'Features'

//Text Positions for each Line
line: x=157.06..379.00 y=18.60..32.55   base=30.17  'Home PDF-Tools Doc ument Support'
line: x= 65.66..643.64 y=46.95..72.70   base=68.32  'Advanced PDF Tools Command Line User Manual'
line: x=  8.87..94.23  y=86.81..100.76  base=98.39  'Version: v2.0'
line: x=  8.87..79.14  y=117.82..137.13 base=133.85 'Content'
line: x= 79.86..133.67 y=155.91..169.86 base=167.49 'Overview'
line: x= 79.86..131.13 y=172.74..186.69 base=184.31 'Features'
line: x= 79.86..203.05 y=189.57..203.52 base=201.15 'Command Line Usage'
line: x=115.36..263.33 y=223.23..237.18 base=234.81 'Input and output PDF file'
line: x=115.36..264.56 y=240.07..254.01 base=251.64 'Show PDF file information'
line: x=115.36..253.03 y=256.90..270.84 base=268.47 'Set PDF file information'
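
To show how this output could be searched for a given text, including repeated occurrences, here is a minimal Python sketch. It parses the "word:" lines into records, finds every run of consecutive words equal to the search phrase, and returns the combined bounding box of each hit. The function names are hypothetical and the regular expression is based only on the sample output above. Results can be serialized with json.dumps if JSON is required.

```python
import json
import re

NUM = r"(\d+(?:\.\d+)?)"
WORD_RE = re.compile(
    r"^word:\s+x=\s*" + NUM + r"\.\." + NUM +
    r"\s+y=\s*" + NUM + r"\.\." + NUM + r"\s+.*?'(.*)'\s*$"
)


def parse_words(text):
    """Turn 'word:' lines into dicts with text, x0, x1, y0, y1 (other lines are skipped)."""
    words = []
    for line in text.splitlines():
        match = WORD_RE.match(line.strip())
        if match:
            x0, x1, y0, y1 = map(float, match.groups()[:4])
            words.append({"text": match.group(5),
                          "x0": x0, "x1": x1, "y0": y0, "y1": y1})
    return words


def find_phrase(words, phrase):
    """Return one bounding box per occurrence of phrase (exact, case-sensitive)."""
    tokens = phrase.split()
    hits = []
    if not tokens:
        return hits
    for i in range(len(words) - len(tokens) + 1):
        window = words[i:i + len(tokens)]
        if [w["text"] for w in window] == tokens:
            hits.append({
                "text": phrase,
                "x0": min(w["x0"] for w in window),
                "x1": max(w["x1"] for w in window),
                "y0": min(w["y0"] for w in window),
                "y1": max(w["y1"] for w in window),
            })
    return hits


def hits_as_json(words, phrase):
    """Serialize all occurrences of phrase as a JSON array."""
    return json.dumps(find_phrase(words, phrase), indent=2)
```

Two limitations of this sketch: a word that the extractor splits into pieces (such as 'Doc' and 'ument' in the sample) will not match a search for the whole word, and a phrase that wraps across two lines yields one large box covering both lines.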

The following web page is the user guide for the PDF Extraction Tools software:

https://www.verypdf.com/app/pdf-extract-tool/user-guide.html

VeryPDF
