Hi
I have downloaded your trial version, which appears to work on our .ps
files when run from a command line. I have about 20 files I want to
convert to text, preferably in a batch job. How do I do this, and what
license do I need?
Regards
=================================
You can run the following command line to convert all PS files to text
files at one time.
#1: Convert all PS files to TXT files in the D:\temp folder:
for %F in (D:\temp\*.ps) do "C:\VeryPDF\ps2txt.exe" "%F" "%~dpnF.txt"
The Postscript to Text Converter Server License is USD195 per server. You
can purchase it directly from our website:
http://www.verydoc.com/ps-to-text.html
VeryPDF
===================================
Is it possible to run this from within a Python script, and if so, how do I do it?
===================================
ps2txt.exe is a command line application, so you can call it from any script, including a Python script. Please refer to the following sample code:
~~~~~~~~~~~~~~~~~~~~~~
import commands
# Raw string, so that "\t" in C:\test.pdf is not read as a tab character
( stat, output ) = commands.getstatusoutput( r"C:\VeryPDF\ps2txt.exe C:\test.pdf C:\out.txt" )
if stat == 0:
    print "Command succeeded, here is the output: %s" % output
else:
    print "Command failed, here is the output: %s" % output
~~~~~~~~~~~~~~~~~~~~~~
VeryPDF
======================================
Hi, sorry to be a pain, but I have 2 questions. Firstly, when I ran the script you suggested I got:
Command failed, here is the output: '{' is not recognized as an internal or external command,
operable program or batch file.
So I am not sure what I should be changing. I am using Python 2.6.
Secondly, assuming we do get it to work, what changes do I need to make to pick out all .ps files from a mixed directory and produce a txt file of the same name for each ps file?
Regards
======================================
You need to use the correct path to ps2txt.exe. For example, if you place ps2txt.exe in the C:\ folder, you need to use the following code:
( stat, output ) = commands.getstatusoutput( r"C:\ps2txt.exe C:\test.pdf C:\out.txt" )
You can call the following command line from a .bat file to convert all PS files to TXT files in the D:\temp folder:
for %%F in (D:\temp\*.ps) do "C:\VeryPDF\ps2txt.exe" "%%F" "%%~dpnF.txt"
You can then call this .bat file from your Python code, for example:
( stat, output ) = commands.getstatusoutput( r"C:\test.bat" )
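
Alternatively, the loop over the folder could be written in Python itself instead of in a .bat file. The following is only a sketch and has not been run against ps2txt.exe: the function names find_jobs and convert_folder are made up for illustration, the install path is the one used in the examples above, and it assumes ps2txt.exe returns a non-zero exit code on failure, as the sample code above already does.

```python
import os
import subprocess

# Assumed install location (taken from the examples above); adjust as needed.
PS2TXT = r"C:\VeryPDF\ps2txt.exe"


def find_jobs(folder):
    """Return (input.ps, output.txt) path pairs for every .ps file in folder."""
    jobs = []
    for name in sorted(os.listdir(folder)):
        base, ext = os.path.splitext(name)
        # Only pick out .ps files; everything else in a mixed folder is skipped.
        if ext.lower() == ".ps":
            jobs.append((os.path.join(folder, name),
                         os.path.join(folder, base + ".txt")))
    return jobs


def convert_folder(folder, exe=PS2TXT):
    """Run the converter once per .ps file and return the inputs that failed."""
    failed = []
    for ps_path, txt_path in find_jobs(folder):
        # Passing a list of arguments avoids quoting problems with spaces in paths.
        status = subprocess.call([exe, ps_path, txt_path])
        if status != 0:
            failed.append(ps_path)
    return failed
```

Calling convert_folder(r"D:\temp") would then write one .txt file next to each .ps file, with the same base name, and return the list of files that could not be converted.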
VeryPDF
======================================
Hi, I have now got ps2txt.exe running from Python under Windows. "import commands" does not work under Windows, so I had to use this instead:
def getstatusoutput(cmd):
    """Return (status, output) of executing cmd in a shell.

    This new implementation should work on all platforms.
    """
    import subprocess
    print cmd
    pipe = subprocess.Popen(cmd, shell=True, universal_newlines=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Read everything the command prints, then wait for it to exit
    output = "".join(pipe.stdout.readlines())
    sts = pipe.wait()
    print sts
    print output
    if sts is None:
        sts = 0
    return sts, output

cmd = 'C:/ps2txt/ps2txt/ps2txt.exe "C:/ps2txt/ps2txt/test.ps" "C:/ps2txt/ps2txt/ab.txt"'
getstatusoutput(cmd)
which worked for a single file. So I then tried your for statement, initially from a bat file and then from the command line, and I got this
So the question now is: what am I doing wrong on the command line or in the bat file, which is behind the command line window in the picture?
Thank you
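
For reference, the wrapper above can be written a little more compactly. This is a sketch under a few assumptions rather than a tested replacement: run_command is a made-up name, and it takes the program and its arguments as a list instead of one shell string, so no shell and no manual quoting are involved. It uses only subprocess.Popen and communicate(), which should be available in Python 2.6 as well as in later versions.

```python
import subprocess


def run_command(args):
    """Run args (program followed by its arguments) and return (status, output)."""
    proc = subprocess.Popen(args, universal_newlines=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() collects all output and waits for the process to exit.
    output, _ = proc.communicate()
    return proc.returncode, output


# Example call, using the paths from the message above:
# status, output = run_command(["C:/ps2txt/ps2txt/ps2txt.exe",
#                               "C:/ps2txt/ps2txt/test.ps",
#                               "C:/ps2txt/ps2txt/ab.txt"])
```

Because stderr is merged into stdout, any error message printed by the program ends up in output together with the normal messages.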
========================================
We have sent an example to you just now, please check it.
Thank you.
VeryPDF
======================================
Hi,
I want to extract text from a PostScript file.
I used ps2txt, but it doesn't work.
This is my file content:
%!PS-Adobe-3.0
%%Pages: (atend)
%%BoundingBox: 59 584 104 611
%%HiResBoundingBox: 59.100000 584.300000 103.200000 610.700000
%……………………………..
%%Creator: GPL Ghostscript 905 (pswrite)
%%CreationDate: 2012/08/21 14:51:28
%%DocumentData: Clean7Bit
%%LanguageLevel: 3
%%EndComments
%%BeginProlog
% This copyright applies to everything between here and the %%EndProlog:
% Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
%%BeginResource: procset GS_pswrite_3_0_1001 1.001 0
/GS_pswrite_3_0_1001 80 dict dup begin
/PageSize 2 array def/setpagesize{ PageSize aload pop 3 index eq exch
4 index eq and{ pop pop pop}{ PageSize dup 1
5 -1 roll put 0 4 -1 roll put dup null eq {false} {dup where} ifelse{ exch get exec}
{ pop/setpagedevice where
{ pop 1 dict dup /PageSize PageSize put setpagedevice}
{ /setpage where{ pop PageSize aload pop pageparams 3 {exch pop} repeat
setpage}if}ifelse}ifelse}ifelse} bind def
/!{bind def}bind def/#{load def}!/N/counttomark #
/rG{3{3 -1 roll 255 div}repeat setrgbcolor}!/G{255 div setgray}!/K{0 G}!
/r6{dup 3 -1 roll rG}!/r5{dup 3 1 roll rG}!/r3{dup rG}!
/w/setlinewidth #/J/setlinecap #
/j/setlinejoin #/M/setmiterlimit #/d/setdash #/i/setflat #
/m/moveto #/l/lineto #/c/rcurveto #
/p{N 2 idiv{N -2 roll rlineto}repeat}!
/P{N 0 gt{N -2 roll moveto p}if}!
/h{p closepath}!/H{P closepath}!
/lx{0 rlineto}!/ly{0 exch rlineto}!/v{0 0 6 2 roll c}!/y{2 copy c}!
/re{4 -2 roll m exch dup lx exch ly neg lx h}!
/^{3 index neg 3 index neg}!
/f{P fill}!/f*{P eofill}!/s{H stroke}!/S{P stroke}!
/q/gsave #/Q/grestore #/rf{re fill}!
/Y{P clip newpath}!/Y*{P eoclip newpath}!/rY{re Y}!
/|={pop exch 4 1 roll 1 array astore cvx 3 array astore cvx exch 1 index def exec}!
/|{exch string readstring |=}!
/+{dup type/nametype eq{2 index 7 add -3 bitshift 2 index mul}if}!
/@/currentfile #/${+ @ |}!
/B{{2 copy string{readstring pop}aload pop 4 array astore cvx
3 1 roll}repeat pop pop true}!
/Ix{[1 0 0 1 11 -2 roll exch neg exch neg]exch}!
/,{true exch Ix imagemask}!/If{false exch Ix imagemask}!/I{exch Ix image}!
/Ic{exch Ix false 3 colorimage}!
/F{/Columns counttomark 3 add -2 roll/Rows exch/K -1/BlackIs1 true>>
/CCITTFaxDecode filter}!/FX{<</EndOfBlock false F}!
/X{/ASCII85Decode filter}!/@X{@ X}!/&2{2 index 2 index}!
/@F{@ &2<<F}!/@C{@X &2 FX}!
/$X{+ @X |}!/&4{4 index 4 index}!/$F{+ @ &4<AqWn”8Ko$EO^]~>
646 6049 33 44 @C
,
-R^/Faq/tbT220pa+PLqIf8p=rr:aIg[KWO*q4F`1Wm!&s7Mab#*p1cf+k@($4T:QMu~>
690 6049 59 44 @C
,
OiC=YN%_+/s8W-!s8W-!s8W-!s8W-!s8W,Tf$jT#e*uaJRTUfO#_[06pt2aunC([?\7dRiKGYiK
Ll;~>
760 6049 37 44 @C
,
3f,6d+TmB<CICn=0];tg5f%/PWCLe!!@]`!o_Li”/~>
808 6049 10 41 @X
,
!!%KKIt7QLs+(-“s+(-“s+#S!)uos=zzzzzzzzz!!%KKIt7QLs+(-“s+(-“s+#TL4ob~>
830 6049 59 44 @C
,
OiC=YN%_+/s8W-!s8W-!s8W-!s8W-!s8W,Tf$jT#e*uaJRTUfO#_[06pt2aunC([?\7dRiKGYiK
Ll;~>
902 6049 59 44 @C
,
OiC=YN%_+/s8W-!s8W-!s8W-!s8W-!s8W,Tf$jT#e*uaJRTUfO#_[06pt2aunC([?\7dRiKGYiK
Ll;~>
973 6049 59 44 @C
,
OiC=YN%_+/s8W-!s8W-!s8W-!s8W-!s8W,Tf$jT#e*uaJRTUfO#_[06pt2aunC([?\7dRiKGYiK
Ll;~>
591 5843 8 58 @X
,
!.Y%Ks8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!s8W-!IfK~>
611 5843 37 62 @C
,
+dC=bbc!]:I6O9bnG37rH?qmLoUimHJ$8nPs8W-!hu<Va?*;c4m;+Jb4!
662 5843 10 41 @X
,
!!%KKIt7QLs+(-“s+(-“s+#S!)uos=zzzzzzzzz!!%KKIt7QLs+(-“s+(-“s+#TL4ob~>
681 5843 36 59 @C
,
,RZd=5mIZVHcUSV6k/(,Fq`gn*rg`_lWR8]@$LF7V1k”=3’2/N/2n$qGP;(:q0d5[1pFjT_H?2/
L7lD9″W.~>
727 5843 36 59 @C
,
,RZd=5mIZVHcUSV6k/(,Fq`gn*rg`_lWR8]@$LF7V1k”=3’2/N/2n$qGP;(:q0d5[1pFjT_H?2/
L7lD9″W.~>
772 5843 36 59 @C
,
,RZd=5mIZVHcUSV6k/(,Fq`gn*rg`_lWR8]@$LF7V1k”=3’2/N/2n$qGP;(:q0d5[1pFjT_H?2/
L7lD9″W.~>
818 5844 37 58 @C
,
/fIaiq]_iJG4″%nDV_pJmG%@%gUDUc^?sg%qnN/\rDm7u\7o?fik;:*_Li1(.)5~>
863 5844 37 58 @C
,
/fIaiq]_iJG4″%nDV_pJmG%@%gUDUc^?sg%qnN/\rDm7u\7o?fik;:*_Li1(.)5~>
cleartomark end end pagesave restore
showpage
%%PageTrailer
%%Trailer
%%Pages: 1
%%EOF
======================================
This PS file contains embedded fonts and does not contain readable characters. It is better to convert the PS file to a PDF file with our PS to PDF Converter first, and then convert that PDF file to a text file with PDF to Text OCR Converter.
http://www.verydoc.com/ps-to-pdf.html
http://www.verypdf.com/pdf2txt/pdf-to-text-ocr-converter.htm
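
Before running a batch, it may help to spot files like this one in advance. The sample above names "GPL Ghostscript 905 (pswrite)" in its %%Creator comment, and its page content consists of image mask data rather than text strings. The sketch below reads the header comments of a PS file and flags files written by pswrite. Treat it as a heuristic only: the function names are made up for illustration, this is not a feature of ps2txt, and the assumption that pswrite output generally lacks readable characters is based on this one sample.

```python
def read_dsc_header(ps_text):
    """Collect '%%Key: value' header comments up to %%EndComments into a dict."""
    header = {}
    for line in ps_text.splitlines():
        if line.startswith("%%EndComments"):
            break
        if line.startswith("%%") and ":" in line:
            key, value = line[2:].split(":", 1)
            header[key.strip()] = value.strip()
    return header


def probably_bitmap_text(ps_text):
    """Heuristic (an assumption): True if the file was written by pswrite."""
    creator = read_dsc_header(ps_text).get("Creator", "")
    return "pswrite" in creator.lower()
```

Files for which probably_bitmap_text() returns True could then be sent through the PS to PDF and OCR route instead of ps2txt.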