I want to get the text contents and positions from a PDF file


I have already bought VeryPDF to Any Converter and would like to evaluate pdf-extract-tool. I am interested in the text position functionality. I have downloaded it but can't get it to work.

Can someone help out?

Thanks for your message. You can download and purchase "PDF Extract Tool Command Line" from its product web page.


After you download it, you can run the following command lines to extract text and positions from a PDF file and save them to a text file:

pdfextract.exe -outfolder D:\out\ D:\in.pdf
pdfextract.exe -textpos D:\in.pdf D:\out.txt
pdfextract.exe -textpos -nopgbrk D:\in.pdf D:\out.txt
pdfextract.exe -$ "XXXXXXXXXXXXXXXX" -textpos test-form.pdf _test-form-pos.txt
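
If you would rather call the tool from a script than type the commands by hand, a minimal Python sketch could look like the one below. It only uses the -textpos and -nopgbrk options shown above. The function names are hypothetical, it assumes pdfextract.exe can be found on your PATH, and it assumes the tool returns a non-zero exit code when it fails, which may not be the case.

```python
import subprocess
from pathlib import Path


def build_textpos_command(pdf_path, out_path, no_page_breaks=False,
                          exe="pdfextract.exe"):
    """Build the argument list for a -textpos run.

    Hypothetical helper: the flags are copied from the command lines above,
    and the executable name/location is an assumption.
    """
    cmd = [exe, "-textpos"]
    if no_page_breaks:
        # -nopgbrk appears in the commands above; its exact effect is
        # assumed from the name (no page breaks in the output).
        cmd.append("-nopgbrk")
    cmd += [str(pdf_path), str(out_path)]
    return cmd


def extract_text_positions(pdf_path, out_path, **kwargs):
    """Run the tool and return the path of the text file it was asked to write."""
    # check=True raises CalledProcessError on a non-zero exit code
    # (assumes the tool reports failures that way).
    subprocess.run(build_textpos_command(pdf_path, out_path, **kwargs), check=True)
    return Path(out_path)


if __name__ == "__main__":
    # Only prints the command; call extract_text_positions() to actually run it.
    print(build_textpos_command(r"D:\in.pdf", r"D:\out.txt", no_page_breaks=True))
```

Passing the arguments as a list avoids quoting problems when a path contains spaces.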

The converted text file contains the following information, and you can easily write a script to parse the position of each word:

[Page #1] *** initial words ***

word: x=242.30..293.90 y=49.94..61.93 base=59.50 fontSize=13.94 'Request'
word: x=298.62..315.08 y=49.94..61.93 base=59.50 fontSize=13.94 'for'
word: x=319.80..376.42 y=49.94..61.93 base=59.50 fontSize=13.94 'Taxpayer'
word: x=491.92..510.83 y=52.40..60.43 base=58.80 fontSize=8.96 'Give'
word: x=513.89..533.95 y=52.40..60.43 base=58.80 fontSize=8.96 'form'
word: x=537.01..545.64 y=52.40..60.43 base=58.80 fontSize=8.96 'to'
word: x=548.70..562.31 y=52.40..60.43 base=58.80 fontSize=8.96 'the'
word: x=46.85..63.02 y=55.78..62.13 base=60.75 fontSize=6.97 'Form'
word: x=70.22..108.70 y=42.83..65.27 base=60.75 fontSize=23.90 'W-9'
word: x=46.85..62.60 y=64.54..70.90 base=69.52 fontSize=6.97 '(Rev.'
word: x=64.95..97.37 y=64.54..70.90 base=69.52 fontSize=6.97 'November'
word: x=99.72..117.04 y=64.54..70.90 base=69.52 fontSize=6.97 '2005)'
word: x=491.92..535.55 y=62.36..70.39 base=68.76 fontSize=8.96 'requester.'
word: x=538.62..550.74 y=62.36..70.39 base=68.76 fontSize=8.96 'Do'
word: x=553.81..567.75 y=62.36..70.39 base=68.76 fontSize=8.96 'not'
word: x=187.09..269.66 y=64.87..76.86 base=74.44 fontSize=13.94 'Identification'
word: x=274.38..322.35 y=64.87..76.86 base=74.44 fontSize=13.94 'Number'
word: x=327.07..349.66 y=64.87..76.86 base=74.44 fontSize=13.94 'and'
word: x=354.38..431.65 y=64.87..76.86 base=74.44 fontSize=13.94 'Certification'
word: x=46.85..78.43 y=75.31..80.76 base=79.58 fontSize=5.98 'Department'
word: x=80.48..85.68 y=75.31..80.76 base=79.58 fontSize=5.98 'of'
word: x=87.74..96.15 y=75.31..80.76 base=79.58 fontSize=5.98 'the'
word: x=98.21..121.35 y=75.31..80.76 base=79.58 fontSize=5.98 'Treasury'
word: x=491.92..512.66 y=72.32..80.35 base=78.72 fontSize=8.96 'send'
word: x=515.72..524.35 y=72.32..80.35 base=78.72 fontSize=8.96 'to'
word: x=527.41..541.02 y=72.32..80.35 base=78.72 fontSize=8.96 'the'
word: x=544.08..561.49 y=72.32..80.35 base=78.72 fontSize=8.96 'IRS.'
word: x=46.85..66.68 y=82.28..87.73 base=86.55 fontSize=5.98 'Internal'
word: x=68.81..92.18 y=82.28..87.73 base=86.55 fontSize=5.98 'Revenue'
word: x=94.31..114.13 y=82.28..87.73 base=86.55 fontSize=5.98 'Service'
word: x=75.46..93.93 y=92.56..98.92 base=97.54 fontSize=6.97 'Name'
word: x=96.29..105.33 y=92.56..98.92 base=97.54 fontSize=6.97 '(as'
word: x=107.69..128.22 y=92.56..98.92 base=97.54 fontSize=6.97 'shown'
word: x=130.59..138.46 y=92.56..98.92 base=97.54 fontSize=6.97 'on'
word: x=140.83..154.52 y=92.56..98.92 base=97.54 fontSize=6.97 'your'
word: x=156.88..179.74 y=92.56..98.92 base=97.54 fontSize=6.97 'income'
word: x=182.11..191.66 y=92.56..98.92 base=97.54 fontSize=6.97 'tax'
word: x=194.02..214.17 y=92.56..98.92 base=97.54 fontSize=6.97 'return)'
word: x=75.46..103.74 y=116.57..122.93 base=121.55 fontSize=6.97 'Business'
word: x=106.15..125.41 y=116.57..122.93 base=121.55 fontSize=6.97 'name,'
word: x=127.82..131.43 y=116.57..122.93 base=121.55 fontSize=6.97 'if'
word: x=133.84..159.54 y=116.57..122.93 base=121.55 fontSize=6.97 'different'
word: x=161.95..176.28 y=116.57..122.93 base=121.55 fontSize=6.97 'from'
word: x=178.69..197.81 y=116.57..122.93 base=121.55 fontSize=6.97 'above'
word: x=168.85..200.85 y=143.08..149.44 base=148.06 fontSize=6.97 'Individual/'
word: x=391.93..396.91 y=146.08..152.55 base=150.81 fontSize=4.98 '?'
word: x=500.05..523.95 y=143.08..149.44 base=148.06 fontSize=6.97 'Exempt'
word: x=526.32..540.65 y=143.08..149.44 base=148.06 fontSize=6.97 'from'
word: x=543.02..566.28 y=143.08..149.44 base=148.06 fontSize=6.97 'backup'
word: x=248.05..285.14 y=147.33..153.68 base=152.30 fontSize=6.97 'Corporation'
word: x=312.85..348.68 y=147.20..153.55 base=152.17 fontSize=6.97 'Partnership'
word: x=370.45..387.95 y=147.33..153.68 base=152.30 fontSize=6.97 'Other'
word: x=75.46..95.48 y=151.05..157.41 base=156.03 fontSize=6.97 'Check'
word: x=97.84..133.87 y=151.05..157.41 base=156.03 fontSize=6.97 'appropriate'
word: x=136.23..149.91 y=151.05..157.41 base=156.03 fontSize=6.97 'box:'
word: x=168.85..182.67 y=151.05..157.41 base=156.03 fontSize=6.97 'Sole'
word: x=184.90..215.63 y=151.05..157.41 base=156.03 fontSize=6.97 'proprietor'
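
As a starting point for such a script, here is a minimal Python sketch that parses the "word:" lines shown above. The Word type and the parse_words and group_by_baseline functions are hypothetical names, and the line format is assumed from this sample only, so other PDFs may produce variations the regular expression does not cover. Grouping words into text lines by identical "base" value is likewise an assumption that happens to hold for this sample.

```python
import re
from typing import NamedTuple, Optional


class Word(NamedTuple):
    page: Optional[int]  # page number from the last page header seen, if any
    x0: float
    x1: float
    y0: float
    y1: float
    base: float
    font_size: float
    text: str


NUM = r"(-?\d+(?:\.\d+)?)"
# Format assumed from the sample: word: x=A..B y=C..D base=E fontSize=F 'text'
WORD_RE = re.compile(
    rf"^word: x={NUM}\.\.{NUM} y={NUM}\.\.{NUM} base={NUM} fontSize={NUM} '(.*)'$"
)
PAGE_RE = re.compile(r"Page #(\d+)\]")


def parse_words(lines):
    """Return a list of Word records from the lines of a -textpos output file."""
    page = None
    words = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("word:"):
            header = PAGE_RE.search(line)
            if header:
                page = int(header.group(1))
            continue
        m = WORD_RE.match(line)
        if m:  # lines that do not match the assumed format are skipped
            x0, x1, y0, y1, base, size = map(float, m.groups()[:6])
            words.append(Word(page, x0, x1, y0, y1, base, size, m.group(7)))
    return words


def group_by_baseline(words):
    """Join words sharing the same page and baseline, ordered left to right."""
    rows = {}
    for w in words:
        rows.setdefault((w.page, w.base), []).append(w)
    ordered = sorted(rows.items(), key=lambda kv: (kv[0][0] or 0, kv[0][1]))
    return [" ".join(w.text for w in sorted(ws, key=lambda w: w.x0))
            for _, ws in ordered]


SAMPLE = [
    "Page #1] *** initial words ***",
    "word: x=242.30..293.90 y=49.94..61.93 base=59.50 fontSize=13.94 'Request'",
    "word: x=298.62..315.08 y=49.94..61.93 base=59.50 fontSize=13.94 'for'",
    "word: x=319.80..376.42 y=49.94..61.93 base=59.50 fontSize=13.94 'Taxpayer'",
    "word: x=46.85..63.02 y=55.78..62.13 base=60.75 fontSize=6.97 'Form'",
    "word: x=70.22..108.70 y=42.83..65.27 base=60.75 fontSize=23.90 'W-9'",
]

if __name__ == "__main__":
    for text_line in group_by_baseline(parse_words(SAMPLE)):
        print(text_line)
```

To use it on a real output file, pass the open file to parse_words, for example parse_words(open(path, errors="replace")). The encoding of the output file is not stated here, so you may need to set it explicitly.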



