Thursday, August 6, 2009

Find page count in TIFF or PDF with perl

The Perl module you'll want to install is PerlMagick (Image::Magick). Rather than loading the image into memory, you can just use the Ping method. Ping returns an array with 4 elements of information (width, height, file size, and format) for each page of a multi-page TIFF/PDF. If we divide the number of elements by 4, we know how many pages there are in the TIFF/PDF.
use Image::Magick;

my $im = Image::Magick->new();

my @ping_info = $im->Ping('fax.tif');

## An array used in scalar (numeric) context yields its element count
my $count = @ping_info / 4;

print $count;

There you have it, good luck.


The example above may or may not work properly with PDFs. Even if it does work, it appears that ImageMagick makes an external call to the Ghostscript (gs) executable on the system to determine the properties of the PDF. Here is an example that will work for PDFs:
use PDF::API2;

my $pdf = PDF::API2->open('2.pdf');
print $pdf->pages();

I'm guessing that eventually ImageMagick won't be "broken" but you can use the above in the meantime.


We recently updated to Debian 6.0, and Ghostscript seems to have problems with PDFs rendered by various software (it takes 5 minutes to get a page count on a 4-page document). My first line of defense for checking the page count of a PDF is now the following:
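open(my $fh, '<', $temp_file) or die "Cannot open $temp_file: $!";
my $page_count;
while (my $line = <$fh>) {
    # Keep overwriting so that the last /Count match wins
    $page_count = $1 if $line =~ m/\/Count\s+(\d+)/;
}
close($fh);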

Don't be tempted to stop searching for the Count after the first match. Some PDFs will list a Count for every page (like a page number), so you'll want to use whatever the last match is as your actual count. If this code doesn't produce a page count, I fall back to the Image::Magick Ping. That appears to cover all the scenarios I've come across.
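
The two-step approach described above (scan for the last Count, fall back only if nothing is found) can be sketched roughly as follows. This is a minimal sketch and not the exact code I run: the sub names count_from_pdf_text and pdf_page_count are made up for illustration, and the fallback is passed in as a code reference so the sketch itself does not depend on Image::Magick being installed. It also checks every match on a line rather than only the first, since some PDFs pack several objects onto one line. PDFs that store their page tree in compressed object streams contain no readable /Count at all, which is one case where the fallback is needed.

```perl
use strict;
use warnings;

# Scan a PDF for "/Count N" entries and return the last N seen,
# or undef if the file cannot be opened or has no such entry.
# (Hypothetical helper name, used only for this illustration.)
sub count_from_pdf_text {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or return;
    my $count;
    while (my $line = <$fh>) {
        # Walk through every match on the line; the last one wins
        $count = $1 while $line =~ m{/Count\s+(\d+)}g;
    }
    close($fh);
    return $count;
}

# Try the cheap text scan first. Only if it finds nothing, call the
# optional fallback code reference (for example one wrapping Ping).
sub pdf_page_count {
    my ($path, $fallback) = @_;
    my $count = count_from_pdf_text($path);
    return $count if defined $count;
    return $fallback ? $fallback->($path) : undef;
}
```

To plug in the Ping approach from the top of this post, pass a code reference that calls Ping on the file and returns the element count divided by 4 as the second argument to pdf_page_count.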


  1. tiffinfo output_file_name.tif | grep "Page Number" | grep -c "P"

    It's simple on the Linux command line.