VeryPDF Cloud API Platform :: VeryPDF OCR Cloud API :: Online OCR engine to recognize scanned PDF and Image files to editable document formats.

VeryPDF OCR Cloud API is part of the VeryPDF Cloud API Platform. It allows you to convert scanned PDF, TIFF and other image formats (PNG, JPG, BMP, GIF, PCX, TGA, etc.) to plain text (TXT), editable Word (DOC, DOCX), Excel (XLS, XLSX), PowerPoint (PPT, PPTX), RTF, HTML, XML, PDF and other document formats.

VeryPDF Cloud API Platform Home Page:

https://www.verypdf.com/online/cloud-api/index.html

The list of supported languages:

Language Code	Language Description
grc	Ancient Greek language
epo_alt	Esperanto alternative language
eng	English language
ukr	Ukrainian language
tur	Turkish language
tha	Thai language
tgl	Tagalog language
tel	Telugu language
tam	Tamil language
swe	Swedish language
swa	Swahili language
srp	Serbian (Latin) language
sqi	Albanian language
spa	Spanish language
slv	Slovenian language
slk	Slovakian language
ron	Romanian language
por	Portuguese language
pol	Polish language
nor	Norwegian language
nld	Dutch language
msa	Malay language
mlt	Maltese language
mkd	Macedonian language
mal	Malayalam language
lit	Lithuanian language
lav	Latvian language
kor	Korean language
kan	Kannada language
ita	Italian language
isl	Icelandic language
ind	Indonesian language
chr	Cherokee language
hun	Hungarian language
hrv	Croatian language
hin	Hindi language
heb	Hebrew language
glg	Galician language
frm	Middle French (ca. 1400-1600) language
frk	Frankish language
fra	French language
fin	Finnish language
eus	Basque language
est	Estonian language
epo	Esperanto language
enm	Middle English (1100-1500) language
ell	Greek language
deu	German language
dan	Danish language
ces	Czech language
cat	Catalan language
bul	Bulgarian language
ben	Bengali language
bel	Belarusian language
aze	Azerbaijani language
ara	Arabic language
afr	Afrikaans language
jpn	Japanese language
chi_sim	Chinese (Simplified) language
chi_tra	Chinese (Traditional) language
rus	Russian language
vie	Vietnamese language

You can use the lang=XXXX parameter to set the OCR language, where XXXX is one of the language codes listed above.

The following URL converts a TIFF file to a text file using English as the OCR language:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/test0002.tif
&outfile=out&lang=eng

The following URL converts a TIFF file to an HTML file using English as the OCR language. The output HTML file contains the position of each word and character:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/test0002.tif
&outfile=out&lang=eng&format

The following URL converts a multipage TIFF file to a text file using English as the OCR language:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr&infile=http://online.verypdf.com/examples/cloud-api/multipage.tif&outfile=out&lang=eng

The following URL converts a multipage TIFF file to an HTML file using English as the OCR language. The output HTML file contains the position of each word and character:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/multipage.tif
&outfile=out&lang=eng&format

Convert the Japanese characters in a TIFF file to a Japanese text file:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/japanese.tif
&outfile=out&lang=jpn

Convert the German characters in a TIFF file to a German text file:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr
&infile=http://online.verypdf.com/examples/cloud-api/german.tif
&outfile=out&lang=deu
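
The request URLs above all follow the same pattern, so they can also be assembled in code. The following is a minimal Python sketch, not an official client: the helper name build_ocr_url and its default values are assumptions made for illustration, while the endpoint and the parameter names (apikey, app, infile, outfile, lang) are taken from the examples above. It percent-encodes every value, which assumes the service decodes query strings in the standard way.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URLs in this article.
API_ENDPOINT = "http://online.verypdf.com/api/"


def build_ocr_url(api_key, infile, outfile="out", lang="eng", **extra):
    """Return an OCR request URL (build_ocr_url is a hypothetical helper)."""
    params = {
        "apikey": api_key,
        "app": "ocr",
        "infile": infile,
        "outfile": outfile,
        "lang": lang,
    }
    # Extra options are appended after the standard ones, as given.
    params.update(extra)
    # urlencode percent-encodes each value, so characters such as "&" or "+"
    # inside infile cannot be mistaken for parameter separators.
    return API_ENDPOINT + "?" + urlencode(params)


url = build_ocr_url(
    "XXXXXXXXXXXXX",
    "http://online.verypdf.com/examples/cloud-api/german.tif",
    lang="deu",
)
print(url)
```

Changing the lang argument to any code from the language list, for example "jpn", produces the corresponding request for another language.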

More articles about the VeryPDF Cloud API Platform:

https://www.verypdf.com/wordpress/category/verypdf-cloud-api

If you need functions that are not included in the VeryPDF Cloud API Platform, please feel free to let us know:

http://support.verypdf.com


3 Replies to “VeryPDF Cloud API Platform :: VeryPDF OCR Cloud API :: Online OCR engine to recognize scanned PDF and Image files to editable document formats.”

We have implemented “Extract text from image rectangles” today. We have added a “rectangle” parameter that runs OCR only on the characters inside a rectangle of the image. You can use it as shown below:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXXXXX&app=ocr&infile=https://dl.dropboxusercontent.com/u/5570462/49AD37032CCC2C0_newfilename10.tif&format=1&dumpwordpos=1&lang=swe&rectangle=200x1674+822+379

The meaning of “200x1674+822+379” is:

200 is the width,
1674 is the height,
822 is the left position,
379 is the top position.

When you call this URL from PHP code, it is better to encode the rectangle string with the urlencode() function, because an unencoded “+” in a query string is usually decoded as a space, e.g.,
function get_data($url) {
    $ch = curl_init();
    $timeout = 5; // connection timeout in seconds
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$strURL = 'http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXXXXX&app=ocr&infile=';
$strURL .= 'https://dl.dropboxusercontent.com/u/5570462/49AD37032CCC2C0_newfilename10.tif';
$strURL .= '&format=1&dumpwordpos=1&lang=swe';
// urlencode() turns each "+" into "%2B" so that it is not decoded as a space
$strURL .= '&rectangle=' . urlencode('200x1674+822+379');

$returned_content = get_data($strURL);
echo $returned_content;

The “rectangle” option lets you easily extract the characters from a specific rectangle of an image file.
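
The rectangle value has the form WIDTHxHEIGHT+LEFT+TOP, so it is easy to build from numbers and to encode before it goes into a URL. Here is a small Python sketch of that idea. The helper names format_rectangle and parse_rectangle are hypothetical and not part of the VeryPDF API; only the string layout comes from the explanation above.

```python
from urllib.parse import quote


def format_rectangle(width, height, left, top):
    """Build a WIDTHxHEIGHT+LEFT+TOP string (hypothetical helper)."""
    return f"{width}x{height}+{left}+{top}"


def parse_rectangle(value):
    """Split a WIDTHxHEIGHT+LEFT+TOP string into four integers."""
    size, left, top = value.split("+")
    width, height = size.split("x")
    return int(width), int(height), int(left), int(top)


rect = format_rectangle(200, 1674, 822, 379)
# quote() turns each "+" into "%2B"; a raw "+" in a query string is
# commonly decoded as a space, which would corrupt the rectangle value.
encoded = quote(rect, safe="")
print(rect)     # 200x1674+822+379
print(encoded)  # 200x1674%2B822%2B379
```

The encoded string is what would be appended after “&rectangle=” in the request URL, in the same way the PHP example uses urlencode().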


>>What products can be used to convert scanned PDF to searchable PDF file?

Thanks for your message. The following products can all convert scanned PDF files to searchable PDF files. The output PDF files contain a hidden text layer, so you can open the OCRed PDF files in Adobe Reader and search their text contents properly:

Image to PDF OCR Converter Command Line,
http://www.verypdf.com/app/image-to-pdf-ocr-converter/try-and-buy.html#buy-ocr-cmd

PDF to Text OCR Converter Command Line,
http://www.verypdf.com/app/pdf-to-text-ocr-converter/try-and-buy.html#buy

VeryPDF OCR to Any Converter Command Line,
http://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html


“VeryPDF OCR Cloud API” can run OCR on scanned TIFF and PDF files. You can convert an online TIFF or PDF file to a text file using the following URLs:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr&infile=https://dl.dropboxusercontent.com/u/5570462/verypdf-cloud-api/table.tif&outfile=out.txt&lang=eng

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr&infile=https://dl.dropboxusercontent.com/u/5570462/verypdf-cloud-api/table.pdf&outfile=out.txt&lang=eng

If you want to get the position of each word, add the “&format=1&dumpwordpos=1” parameters, for example:

http://online.verypdf.com/api/?apikey=XXXXXXXXXXXXX&app=ocr&infile=https://dl.dropboxusercontent.com/u/5570462/verypdf-cloud-api/table.tif&outfile=out.txt&lang=eng&format=1&dumpwordpos=1
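
Any of these request URLs can be called from a script as well as from a browser. The Python sketch below shows one way to do it with the standard library only. The helper name fetch_text, the 30-second timeout and the UTF-8 fallback are assumptions for illustration. The format of the response body is not documented here, so the sketch returns it as plain text without parsing it.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Fetch a URL and return the response body as text.

    fetch_text is a hypothetical helper; the 30-second default timeout is
    an assumption, chosen because OCR of large files may take a while.
    """
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_text() with the word-position URL above (and a real API key in place of the XXXXXXXXXXXXX placeholder) would return whatever the service sends back for that request.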

“VeryPDF OCR Cloud API” is an online, cloud-based application. If you want to do batch conversion, a desktop application may work better for you, because it is fast and is not affected by network connection problems. In that case, we suggest you download and try “VeryPDF OCR to Any Converter Command Line” from the following web page. “VeryPDF OCR to Any Converter Command Line” is a powerful product that converts scanned TIFF and PDF files to plain text (TXT), editable Word (DOC, DOCX), Excel (XLS, XLSX), PowerPoint (PPT, PPTX), RTF, HTML, XML, plain-text-based PDF and other document formats. The OCR engine in this software can reach up to 99.9% accuracy:

http://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html
