Counting the exact number of pages in any PDF document


'-----------------------------------------------------------------
' This approach also works for PDF version 1.3 files:
' Open the PDF file in binary mode
' Read the last 50 lines of that file
' Somewhere among them you will find a line
'   "/Count xx", where xx is the number of pages

' Made on 14 Aug 2006
'-----------------------------------------------------------------------
' Open the PDF in binary mode and count the pages:
' search for "/N xx"
'         or "/Count xx"

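' Module-level result set by pagecount (the original snippet used it undeclared)
Public pages As Long

Public Sub pagecount(sfilename As String)
    On Error GoTo a
    Dim nFileNum As Integer
    Dim s As String
    Dim c As Long
    Dim pos As Long, keyLen As Long, i As Long
    Dim digits As String
    c = 0
    ' Get an available file number from the system
    nFileNum = FreeFile
    ' Open the PDF file in binary mode
    Open sfilename For Binary Lock Read Write As #nFileNum
    ' Read the data from the file, one item at a time
    Do Until EOF(nFileNum)
        Input #nFileNum, s
        c = c + 1
        pos = 0
        ' "/N" is only looked for in the first 10 reads
        If c <= 10 Then
            pos = InStr(s, "/N ")
            keyLen = 2
        End If
        ' PDF names are case-sensitive: the key is "/Count", not "/count"
        If pos = 0 Then
            pos = InStr(s, "/Count")
            keyLen = 6
        End If
        If pos > 0 Then
            ' Collect the digits that follow the key, skipping leading spaces
            digits = ""
            For i = pos + keyLen To Len(s)
                If Mid$(s, i, 1) Like "#" Then
                    digits = digits & Mid$(s, i, 1)
                ElseIf digits <> "" Or Mid$(s, i, 1) <> " " Then
                    Exit For
                End If
            Next
            pages = Val(digits)
            ' Fall back to 1 page when no usable number was found
            If pages < 1 Then
                pages = 1
            End If
            Close #nFileNum
            Exit Sub
        End If
        ' Important: only the first 1000 reads are searched
        If c >= 1000 Then
            GoTo a
        End If
    Loop
    Close #nFileNum
    Exit Sub
a:
    ' Error or nothing found: close the file and assume 1 page
    Close #nFileNum
    pages = 1
    Exit Sub
End Sub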
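
The "/N" check near the top of the file most likely targets linearized ("fast web view") PDFs, whose linearization dictionary at the start of the file stores the page count under /N. Below is a minimal Python sketch of that lookup only. The function name `linearized_page_count` and the idea of passing in the first chunk of the file are illustrative assumptions, not part of the VB routine above.

```python
import re
from typing import Optional

# Matches "/N <digits>" but not longer names such as "/Names".
_N_ENTRY = re.compile(rb"/N\s+(\d+)")


def linearized_page_count(head: bytes) -> Optional[int]:
    """Return /N from a linearization dictionary in `head`, or None.

    `head` is the first chunk of the file (for example the first 1024 bytes).
    /N is trusted only inside a dictionary that also contains /Linearized,
    because /N means other things elsewhere in a PDF.
    """
    marker = head.find(b"/Linearized")
    if marker == -1:
        return None
    start = head.rfind(b"<<", 0, marker)   # opening of the dictionary
    end = head.find(b">>", marker)         # closing of the dictionary
    if start == -1 or end == -1:
        return None
    match = _N_ENTRY.search(head, start, end)
    return int(match.group(1)) if match else None


def linearized_page_count_from_file(path: str, head_size: int = 1024) -> Optional[int]:
    """Read the head of the file in binary mode and look for /N."""
    with open(path, "rb") as fh:
        return linearized_page_count(fh.read(head_size))
```

A `None` result only means the file is not linearized (or the dictionary is not in the first chunk), so a scan for /Count is still needed in that case.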
============================================
I actually went with a combined approach. Since exec is disabled on my server, I wanted to stick with a PHP-based solution, so I ended up with this:

Code:

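function getNumPagesPdf($filepath){
    // Strip any "[...]" part from the path before opening the file
    $fp = @fopen(preg_replace("/\[(.*?)\]/i", "", $filepath), "r");
    $max = 0;
    // Only scan when the file could be opened; feof() on a failed handle would loop forever
    if ($fp !== false) {
        while (($line = fgets($fp, 255)) !== false) {
            // Keep the largest /Count value seen so far
            if (preg_match('/\/Count ([0-9]+)/', $line, $matches)) {
                $max = max($max, (int) $matches[1]);
            }
        }
        fclose($fp);
    }
    // No /Count entry found: fall back to the (slower) Imagick extension
    if ($max == 0) {
        $im = new Imagick($filepath);
        $max = $im->getNumberImages();
    }

    return $max;
}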
If it can't find any /Count entries, it falls back to the Imagick PHP extension. I use this two-step approach because the Imagick fallback is quite slow.
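
The same fast-path-then-fallback idea can be sketched in Python. This is only an illustration under stated assumptions: `fast_page_count` and `page_count` are hypothetical names, and `slow_count` stands for whatever full parser or renderer is available (the role Imagick plays above). One common reason the fast scan finds nothing is that PDF 1.5+ files may keep the page tree inside compressed object streams, where /Count is not visible as plain text.

```python
import re
from typing import Callable

_COUNT = re.compile(rb"/Count\s+(\d+)")


def fast_page_count(data: bytes) -> int:
    """Largest /Count value visible in the raw bytes, or 0 if there is none."""
    return max((int(m.group(1)) for m in _COUNT.finditer(data)), default=0)


def page_count(data: bytes, slow_count: Callable[[bytes], int]) -> int:
    """Try the cheap scan first; call the slow parser only if it finds nothing."""
    found = fast_page_count(data)
    return found if found > 0 else slow_count(data)
```

Because the slow path is passed in as a callable, it is invoked only when the scan returns 0, so files with a plain-text page tree never pay its cost.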
==================================================
Try this:

<?php
$file = $_REQUEST['file'];
if (!$fp = @fopen($file, "r")) {
        // Escape the user-supplied name before echoing it back
        echo 'failed opening file '.htmlspecialchars($file);
}
else {
        $max = 0;
        while (($line = fgets($fp, 255)) !== false) {
                // Keep the largest /Count value seen so far
                if (preg_match('/\/Count ([0-9]+)/', $line, $matches)) {
                        $max = max($max, (int) $matches[1]);
                }
        }
        fclose($fp);
        echo 'There '.($max==1?'is ':'are ').$max.' page'.($max==1?'':'s').' in '.htmlspecialchars($file).'.';
}
?>

Each node in the PDF page tree has a /Count entry giving the number of pages beneath that node. The parent node's /Count is the sum of its children's, so this script simply looks for the maximum value, which is the total number of pages.
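
To see why the maximum works, here is a small Python sketch that models a page tree as nested lists (a list is an intermediate node, a string is a single page) and computes the /Count each node would carry. The representation and the name `node_counts` are made up for this illustration.

```python
def node_counts(node: list) -> list:
    """Return this node's /Count followed by the /Count of every descendant node."""
    counts = []
    total = 0
    for kid in node:
        if isinstance(kid, list):       # intermediate page-tree node
            sub = node_counts(kid)
            total += sub[0]             # a child's /Count adds to the parent's
            counts.extend(sub)
        else:                           # leaf: one page
            total += 1
    return [total] + counts


# Root with two intermediate nodes (3 and 2 pages) and one direct page.
tree = [["p1", "p2", "p3"], ["p4", "p5"], "p6"]
counts = node_counts(tree)
print(counts)        # [6, 3, 2]
print(max(counts))   # 6, the root's /Count
```

Note that this is still a heuristic on real files: other structures, such as outline (bookmark) dictionaries, also use a /Count key, so the largest value in the file is not guaranteed to come from the page tree.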

You can also use Spool File Page Counter SDK to count the pages in a PDF file. It can be downloaded from the following web page:

http://www.verydoc.com/spool-page-count.html

Advanced PDF Tools Command Line can count PDF pages too. It can be downloaded from the following web page:

http://www.verypdf.com/pdfinfoeditor/index.html#dl


14 Responses to Counting the exact number of pages in any PDF document

  1. Hans says:

    I am trying out your Pdf Page Counter SDK for VB.Net.
    Is it possible to check, for each page in a PDF file, whether it is black and white or color?

    Thanks for your answer.

    • VeryPDF says:

      Yes, of course. You can use Pdf Page Counter SDK to count the black and white pages and the color pages in a PDF. Please refer to the following sample code; the ReadInfoFromAllFormats() function supports PCL, PDF, PS, and other formats.

      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <windows.h>
      /* The ReadInfo* functions come from the SDK; add its declarations here. */

      int main(int argc, char *argv[])
      {
          if(argc != 2)
          {
              /* Print usage examples and stop */
              printf("%s C:\\test.ps\n", argv[0]);
              printf("%s C:\\test.pcl\n", argv[0]);
              printf("%s C:\\test.spl\n", argv[0]);
              return 1;
          }

          char *lpInFile = argv[1];
          char drive[_MAX_DRIVE];
          char dir[_MAX_DIR];
          char fname[_MAX_FNAME];
          char ext[_MAX_EXT];

          /* Split the path so the file extension can select the parser */
          _splitpath(lpInFile, drive, dir, fname, ext);

          BOOL bIsRenderToPDF = TRUE;
          DWORD bwPageCount = 0;
          DWORD colorPageCount = 0;
          DWORD copyCount = 0;
          double nPageWidth = 0;
          double nPageHeight = 0;
          char szPaperSizeName[200] = {0};
          BOOL bRet = FALSE;

          ReadInfoSetCode("XXXXXXXXXXXXXXXXXXX");
          ReadInfoEnableDebug(1);
          if(!_stricmp(ext, ".ps") || !_stricmp(ext, ".eps"))
              bRet = ReadInfoFromPSFile(lpInFile, bIsRenderToPDF, &bwPageCount, &colorPageCount, &copyCount, &nPageWidth, &nPageHeight, szPaperSizeName);
          else if(!_stricmp(ext, ".pcl"))
              bRet = ReadInfoFromPCLFile(lpInFile, bIsRenderToPDF, &bwPageCount, &colorPageCount, &copyCount, &nPageWidth, &nPageHeight, szPaperSizeName);
          else
              bRet = ReadInfoFromAllFormats(lpInFile, bIsRenderToPDF, &bwPageCount, &colorPageCount, &copyCount, &nPageWidth, &nPageHeight, szPaperSizeName);

          /* Report the results; DWORD values are printed with %lu */
          printf("=======================================\n");
          printf("File = '%s'\n", lpInFile);
          printf("Return Value = %s\n", bRet?"TRUE":"FALSE");
          printf("bIsRenderToPDF = %d\n", bIsRenderToPDF);
          printf("bwPageCount = %lu\n", bwPageCount);
          printf("colorPageCount = %lu\n", colorPageCount);
          printf("copyCount = %lu\n", copyCount);
          printf("PageWidth = %g\n", nPageWidth);
          printf("PageHeight = %g\n", nPageHeight);
          printf("PaperSizeName = '%s'\n", szPaperSizeName);
          return 0;
      }

      We can also get the color depth for each page in the PCL, PS and PDF formats. Please send an email to support@verypdf.com and we will assist you further.

      VeryPDF

  2. Hans says:

    Thanks for your very quick answer.
    But the code is in C++, I think, and I don't know that language.
    Can you give it to me in VB?

    Thanks in advance for your help.

  3. VeryPDF says:

    Please refer to the VB code below:

    Private Declare Function ReadInfoFromPSFile Lib "ReadInfo.dll" (ByVal fileName As String, ByVal bIsRenderToPDF As Long, _
    ByRef bwPageCount As Long, ByRef colorPageCount As Long, ByRef copyCount As Long, ByRef pagewidth As Double, _
    ByRef pageheight As Double, ByVal paperSizeName As String) As Long

    Private Declare Function ReadInfoFromPCLFile Lib "ReadInfo.dll" (ByVal fileName As String, ByVal bIsRenderToPDF As Long, _
    ByRef bwPageCount As Long, ByRef colorPageCount As Long, ByRef copyCount As Long, ByRef pagewidth As Double, _
    ByRef pageheight As Double, ByVal paperSizeName As String) As Long

    Private Declare Sub ReadInfoSetCode Lib "ReadInfo.dll" (ByVal strCode As String)

    Private Sub Command1_Click()
    Dim bIsRenderToPDF As Long
    Dim bwPageCount As Long
    Dim colorPageCount As Long
    Dim copyCount As Long
    Dim nPageWidth As Double
    Dim nPageHeight As Double
    Dim strPaperSizeName As String
    Dim nRet As Long
    Dim strMsg As String
    Dim strFileName As String

    bIsRenderToPDF = 1
    bwPageCount = 0
    colorPageCount = 0
    copyCount = 0
    nPageWidth = 0
    nPageHeight = 0
    strPaperSizeName = Space$(300)

    strFileName = App.Path & "\test_tiger.eps"

    ReadInfoSetCode ("XXXXXXXXXXXXXXXXXXXXXX")

    nRet = ReadInfoFromPSFile(strFileName, bIsRenderToPDF, bwPageCount, colorPageCount, copyCount, _
    nPageWidth, nPageHeight, strPaperSizeName)

    strMsg = strMsg + "FileName = " + strFileName + vbCrLf
    strMsg = strMsg + "bIsRenderToPDF = " + CStr(bIsRenderToPDF) + vbCrLf
    strMsg = strMsg + "bwPageCount = " + CStr(bwPageCount) + vbCrLf
    strMsg = strMsg + "colorPageCount = " + CStr(colorPageCount) + vbCrLf
    strMsg = strMsg + "copyCount = " + CStr(copyCount) + vbCrLf
    strMsg = strMsg + "PageWidth = " + CStr(nPageWidth) + vbCrLf
    strMsg = strMsg + "PageHeight = " + CStr(nPageHeight) + vbCrLf
    strMsg = strMsg + "PaperSizeName = " + CStr(strPaperSizeName) + vbCrLf
    MsgBox strMsg

    strFileName = App.Path & "\test_grid.pcl"

    nRet = ReadInfoFromPCLFile(strFileName, bIsRenderToPDF, bwPageCount, colorPageCount, copyCount, _
    nPageWidth, nPageHeight, strPaperSizeName)

    strMsg = ""
    strMsg = strMsg + "FileName = " + strFileName + vbCrLf
    strMsg = strMsg + "bIsRenderToPDF = " + CStr(bIsRenderToPDF) + vbCrLf
    strMsg = strMsg + "bwPageCount = " + CStr(bwPageCount) + vbCrLf
    strMsg = strMsg + "colorPageCount = " + CStr(colorPageCount) + vbCrLf
    strMsg = strMsg + "copyCount = " + CStr(copyCount) + vbCrLf
    strMsg = strMsg + "PageWidth = " + CStr(nPageWidth) + vbCrLf
    strMsg = strMsg + "PageHeight = " + CStr(nPageHeight) + vbCrLf
    strMsg = strMsg + "PaperSizeName = " + CStr(strPaperSizeName) + vbCrLf
    MsgBox strMsg
    End Sub

  4. VeryPDF says:

    You can also download the test package from the following web pages:

    http://www.verydoc.com/spool-page-count.html

    http://www.verydoc.com/ps-and-pcl-info-sdk.zip

    This test package contains C#, VB, VB.NET, VC++, and SDK/COM interface examples, among others. You can download it and test it on your system easily.

  5. VeryPDF says:

    You can also run the test application in a CMD window to determine whether a PDF page is BW or color. Please refer to the following test case:

    C:\>E:\ps-and-pcl-info-sdk\bin\C#_ParsingTest.exe D:\temp\TestDoc.pdf
    args length is 1
    args index 0 is [D:\temp\TestDoc.pdf]
    =============================
    Page 1 is [Color]
    Page 2 is [Color]
    Page 3 is [ BW]
    Page 4 is [ BW]
    Page 5 is [ BW]
    Page 6 is [ BW]
    Page 7 is [ BW]
    =============================
    Statistics: bwPageCount=5, colorPageCount=2
    File: D:\temp\PoemsTestDoc.pdf
    Render To PDF: 1
    BW Pages: 5
    Color Pages: 2
    Width: 0
    Height: 0
    Paper name:

    As you can see, you can easily tell whether each page is BW or color.
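
If the per-page result is only available as console text like the output above, it can be captured and parsed. Here is a minimal Python sketch, assuming the lines keep the exact "Page N is [Color]" / "Page N is [ BW]" shape shown in this test case; `parse_page_modes` is a hypothetical helper, not part of the SDK.

```python
import re
from typing import Dict

# Matches lines such as "Page 1 is [Color]" or "Page 3 is [ BW]".
_PAGE_LINE = re.compile(r"Page\s+(\d+)\s+is\s+\[\s*(Color|BW)\s*\]")


def parse_page_modes(output: str) -> Dict[int, str]:
    """Map each page number to 'Color' or 'BW' from the tool's console output."""
    return {int(m.group(1)): m.group(2) for m in _PAGE_LINE.finditer(output)}


sample = """\
Page 1 is [Color]
Page 2 is [Color]
Page 3 is [ BW]
Statistics: bwPageCount=1, colorPageCount=2
"""
modes = parse_page_modes(sample)
color_pages = [page for page, mode in modes.items() if mode == "Color"]
```

Parsing console text is fragile: if the tool changes its wording, the pattern has to change with it.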

  6. Hans says:

    Dear Support,

    Thanks for your help. I'm a beginner at programming with VB, and I don't know how to get the color information for each page in a PDF file, like this:

    For Each Page in strFileName
    ListBox1.Item.Add(Page, bwPageCount, colorPageCount, PageWidth, PageHeight)

    Can you help me?

  7. VeryPDF says:

    In the demo version, the following information is printed to the console only:
    =============================
    Page 1 is [Color]
    Page 2 is [Color]
    Page 3 is [ BW]
    Page 4 is [ BW]
    Page 5 is [ BW]
    Page 6 is [ BW]
    Page 7 is [ BW]
    =============================
    After you purchase it, we will send you a new version of the SDK, from which you can easily get the color information for each page. The demo version does not have this function yet.

  8. Pingback: Detect color pages or BW pages in PDF | VeryPDF Knowledge Base

  9. Pingback: HTML2PDF / DocConverterCOM.pdfout | VeryPDF Knowledge Base

  10. Hans Mustermann says:

    Dear Support,

    Is there a limit on the number of pages in your SDK?

    Thanks for your answer

    • VeryPDF says:

      Can you please let us know which product you are using? Different products have different limitations in the trial version.

      • Hans says:

        I use the ps-and-pcl-info-sdk with C#_ParsingTest.exe from VB.Net, and now it works.
        But if there is a space in the path or filename, the script doesn't run.
        Can you tell me why?

        • john says:

          You need to enclose the input and output filenames in double quotes ("").
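
As a rough illustration of that quoting rule, here is a Python sketch that builds a command line with each path wrapped in double quotes when it contains a space. The helper names `quote_arg` and `build_command_line` are hypothetical, and the quoting is deliberately simplified.

```python
def quote_arg(arg: str) -> str:
    """Wrap an argument in double quotes if it contains whitespace.

    Simplified on purpose: it does not handle embedded quotes or a
    trailing backslash, both of which need extra escaping on Windows.
    """
    already_quoted = arg.startswith('"') and arg.endswith('"')
    if any(ch.isspace() for ch in arg) and not already_quoted:
        return f'"{arg}"'
    return arg


def build_command_line(exe: str, pdf_path: str) -> str:
    """Return the command line as it would be typed in a CMD window."""
    return " ".join(quote_arg(part) for part in (exe, pdf_path))


print(build_command_line(r"E:\sdk\bin\tool.exe", r"D:\my temp\Test Doc.pdf"))
# E:\sdk\bin\tool.exe "D:\my temp\Test Doc.pdf"
```

Without the quotes, the shell splits the path at the space and the program receives two arguments instead of one. When launching a process from Python, passing the arguments as a list to `subprocess.run` avoids manual quoting altogether.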

