How To Convert Selected Pages from Multipage PDF into Images (PHP)

programming Posted on Sep 18, 2016

Last month we were confronted with a challenge: convert certain pages from a large PDF document to TIFF files (or any other image format) so that people could preview them before downloading. After hours of hard work and problem solving, we not only found a solution but also outlined the steps on Medium. Check out the article or get the steps here.

Step 1: Upload a multi-page PDF file to the server (e.g., a 40-page PDF).
Step 2: Count the number of pages in the PDF.
Step 3: Let the user select pages (e.g., pages 10, 12, 30, and 40).
Step 4: Generate another PDF containing only the selected pages.
Step 5: Convert that PDF to an image.

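We used Fine Uploader to upload multiple documents to the server. The second piece of software you'll need is PDFlib, which the code below uses to count the pages and to build a new PDF from the selected ones. The final conversion to an image is done with NConvert in Step 5.
uploadify.php contains the following code, which stores the uploaded file and counts the pages in the PDF.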

Steps 1 & 2

/* $tempFile, $targetFile and $new_name are assumed to be set earlier in uploadify.php. */
if (move_uploaded_file($tempFile, $targetFile))
{
    $searchpath = dirname(__FILE__) . "/input";
    $outfile_basename = "split_document";
    $title = "Split PDF Document";
    $infile = "uploads/newfile.pdf";

    $p = new pdflib();

    $p->set_option("searchpath={" . $searchpath . "}");

    /* This means we must check return values of load_font() etc. */
    $p->set_option("errorpolicy=return");
    $p->set_option("stringformat=utf8");

    $indoc = $p->open_pdi_document($infile, "");
    if ($indoc == 0)
        throw new Exception("Error: " . $p->get_errmsg());

    /* Determine the number of pages in the input document. */
    $page_count = (int) $p->pcos_get_number($indoc, "length:pages");

    /* Close the input document; it is reopened in Step 4. */
    $p->close_pdi_document($indoc);

    echo json_encode(array('success' => true, 'filename' => $new_name, 'page_count' => $page_count, 'display_filename' => $_FILES['qqfile']['name']));
}
else
{
    echo json_encode(array('success' => false));
}
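
Before handing the upload to PDFlib, it can help to confirm that the file at least looks like a PDF. The following is a minimal sketch and not part of the original uploadify.php: the helper name looksLikePdf is hypothetical, and it only inspects the first five bytes, so treat it as a cheap sanity check rather than real validation.

```php
<?php
// Hypothetical helper (an assumption, not from the original script):
// returns true when the file starts with the "%PDF-" signature.
function looksLikePdf(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // missing or unreadable file
    }
    $header = fread($handle, 5);
    fclose($handle);

    // fread() yields a shorter string (or false) for tiny files,
    // so the strict comparison also rejects those.
    return $header === '%PDF-';
}
```

It could be called right after move_uploaded_file() succeeds, returning the 'success' => false response when the check fails.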

Step 3: Once you get the response from uploadify.php, use the returned page count to let the user select the page(s).
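
The selection arrives from the browser, so it is worth cleaning it up on the server before Step 4 uses it. Here is a rough sketch under a few assumptions: the helper name parseSelectedPages is hypothetical, and the input is assumed to be either an array or a comma-separated string such as "10,12,30,40" (the article does not say which format the form sends).

```php
<?php
// Hypothetical helper: turns the user's selection into a sorted list of
// unique page numbers between 1 and $pageCount. The comma-separated
// string format is an assumption about what the form posts.
function parseSelectedPages($raw, int $pageCount): array
{
    $items = is_array($raw) ? $raw : explode(',', (string) $raw);
    $pages = [];
    foreach ($items as $item) {
        if (!is_scalar($item)) {
            continue; // ignore nested arrays and other odd input
        }
        $item = trim((string) $item);
        // Accept only plain non-negative integers, then range-check them.
        if (ctype_digit($item)) {
            $n = (int) $item;
            if ($n >= 1 && $n <= $pageCount) {
                $pages[$n] = $n; // keyed by page number to drop duplicates
            }
        }
    }
    ksort($pages);
    return array_values($pages);
}
```

The result could then be used in place of the raw $_POST["userSelectedPages"] value, for example $new_pages = parseSelectedPages($_POST["userSelectedPages"], $page_count);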

Step 4: Generate another PDF with selected pages.

/*
 * SUBDOC_PAGES is the number of pages per sub-document. Setting it to
 * total_pages + 1 keeps every page in a single sub-document, so the
 * inner loop below visits all pages of the input.
 */
define("SUBDOC_PAGES", (int) $_REQUEST["total_pages"] + 1);
/* Page numbers chosen by the user (expected to be an array, 1-based). */
$new_pages = $_POST["userSelectedPages"];
/* $searchpath, $infile, $outfile_basename and $title are set as in Steps 1 & 2. */
try {
    $p = new pdflib();

    $p->set_option("searchpath={" . $searchpath . "}");
    /* This means we must check return values of load_font() etc. */
    $p->set_option("errorpolicy=return");
    $p->set_option("stringformat=utf8");
    $indoc = $p->open_pdi_document($infile, "");
    if ($indoc == 0)
        throw new Exception("Error: " . $p->get_errmsg());

    /*
     * Determine the number of pages in the input document and compute
     * the number of output documents.
     */
    $page_count = (int) $p->pcos_get_number($indoc, "length:pages");

    $outdoc_count = (int) ($page_count / SUBDOC_PAGES) + ($page_count % SUBDOC_PAGES > 0 ? 1 : 0);

    /*
     * The loop only produces a single output document.
     *
     * For producing all output documents, change the loop condition like this:
     *
     * $outdoc_counter < $outdoc_count
     */
    for ($outdoc_counter = 0, $page = 0; $outdoc_counter < 1; $outdoc_counter += 1)
    {
        $outfile = $outfile_basename . "_" . ($outdoc_counter + 1) . ".pdf";

        /* Open new sub-document (empty filename: generate it in memory). */
        if ($p->begin_document("", "") == 0)
            throw new Exception("Error: " . $p->get_errmsg());

        $p->set_info("Creator", "Khuram Noman");
        $p->set_info("Title", $title . ' $Revision: 1.2 $');
        $p->set_info("Subject", "Sub-document " . ($outdoc_counter + 1)
            . " of " . $outdoc_count . " of input document '" . $infile . "'");

        for ($i = 1; $page < $page_count && $i < SUBDOC_PAGES; $page += 1, $i += 1)
        {
            /* Copy only the pages the user selected. */
            if (in_array($i, $new_pages))
            {
                /* Dummy page size; will be adjusted later */
                $p->begin_page_ext(10, 10, "");

                $pagehdl = $p->open_pdi_page($indoc, $page + 1, "");
                if ($pagehdl == 0)
                    throw new Exception("Error opening page: " . $p->get_errmsg());

                /*
                 * Place the imported page on the output page, and adjust
                 * the page size
                 */
                $p->fit_pdi_page($pagehdl, 0, 0, "adjustpage");
                $p->close_pdi_page($pagehdl);

                $p->end_page_ext("");
            }
        }

        /* Close the current sub-document */
        $p->end_document("");

        /*
         * Fetch the sub-document from memory. To return it directly over
         * HTTP instead of saving it, enable the commented-out headers below.
         * If all split documents are to be processed, do something
         * different, e.g. write the documents to disk and create an HTML
         * page with a list of links for the sub-documents.
         */
        $buf = $p->get_buffer();
        $len = strlen($buf);
        /* header("Content-type: application/pdf");
        header("Content-Length: $len");
        header("Content-Disposition: inline; filename=" . $outfile);
        print $buf; */
    }

    /* Close the input document */
    $p->close_pdi_document($indoc);
}
catch (Exception $e) {
    /* A try block needs a catch; stop here because $buf was not created. */
    die("Could not create the PDF: " . $e->getMessage());
}

/* basename() keeps a crafted file_name from escaping the uploads folder. */
$createPDF = "uploads/converted_" . basename($_REQUEST["file_name"]);
$fh = fopen($createPDF, 'w');
fwrite($fh, $buf);
fclose($fh);

$source_file = "\\uploads\\" . basename($_POST["file_name"]);

Step 5: Convert the PDF to an image.

/* -o names the output file; the input PDF is the last argument. */
$output = exec('nconvert.exe -multi -out tiff -c 7 -o \\downloads\\1.tif \\uploads\\pdf_filename.pdf');
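
In practice the file names come from the request, so the command line should be assembled with escaped arguments and the exit code checked. The following is a sketch only: the helper names buildConvertCommand and convertPdfToTiff are hypothetical, it reuses only flags that already appear in the command above, and it assumes that -o names the output file and that the input file is passed as the last argument.

```php
<?php
// Hypothetical helper: builds the NConvert command line for one PDF.
// Assumption: "-o" is the output file and the input file comes last.
function buildConvertCommand(string $pdfPath, string $tiffPath): string
{
    return 'nconvert.exe -multi -out tiff -c 7 -o '
        . escapeshellarg($tiffPath) . ' '
        . escapeshellarg($pdfPath);
}

// Hypothetical wrapper: runs the command and reports success.
// Assumes the tool exits with code 0 when the conversion worked.
function convertPdfToTiff(string $pdfPath, string $tiffPath): bool
{
    $lines = [];
    $exitCode = 1;
    exec(buildConvertCommand($pdfPath, $tiffPath), $lines, $exitCode);
    return $exitCode === 0;
}
```

A call such as convertPdfToTiff($createPDF, 'downloads/1.tif') would then replace the hard-coded exec() line, with a false return value reported back to the user.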

Happy converting!
