PDF Functions

Introduction

The PDF functions in PHP can create PDF files using the PDFlib library which was initially created by Thomas Merz and is now maintained by » PDFlib GmbH.

The documentation in this section is only meant to be an overview of the available functions in the PDFlib library and should not be considered an exhaustive reference. For the full and detailed explanation of each function, consult the PDFlib Reference Manual which is included in all PDFlib packages distributed by PDFlib GmbH. It provides a very good overview of what PDFlib is capable of doing and contains the most up-to-date documentation of all functions.

For a jump start we urge you to take a look at the programming samples which are contained in all PDFlib distribution packages. These samples demonstrate basic text, vector, and graphics output as well as higher-level functions, such as the PDF import facility (PDI).

All of the functions in PDFlib and the PHP module have identical function names and parameters. Unless configured otherwise, all lengths and coordinates are measured in PostScript points. One PostScript point is 1/72 of an inch, so there are 72 points to an inch. Please see the PDFlib Reference Manual included in the PDFlib distribution for a more thorough explanation of the coordinate system used.
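
As a rough illustration of the unit (not taken from the PDFlib distribution), the sketch below converts millimetres and inches to points, assuming the default coordinate system with 72 points to the inch. The helper names mmToPoints and inchesToPoints are made up for this example and are not PDFlib functions. It shows where a page size such as 595 x 842 comes from: it is A4 (210 mm x 297 mm) rounded to whole points.

```php
<?php
// Hypothetical helpers (not part of PDFlib): convert physical lengths to
// PostScript points, assuming the default unit of 1/72 inch.
function inchesToPoints($inches) {
    return $inches * 72.0;
}

function mmToPoints($mm) {
    // 1 inch = 25.4 mm
    return inchesToPoints($mm / 25.4);
}

// A4 paper is 210 mm x 297 mm.
printf("%.0f x %.0f\n", mmToPoints(210), mmToPoints(297)); // prints "595 x 842"
```

If a different unit is more convenient, the PDFlib Reference Manual describes how to change the coordinate system instead of converting every value by hand.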

With version 6, PDFlib offers an object-oriented API for PHP 5 in addition to the function-oriented API for PHP 4. The main difference is the following:

In PHP 4, first a PDF resource has to be retrieved with a function call like

$p = PDF_new().

This PDF resource is used as the first parameter in all further function calls, such as in

PDF_begin_document($p, "", "").

In PHP 5 however, a PDFlib object is created with

$p = new PDFlib().

This object offers all PDFlib API functions as methods, e.g. as with

$p->begin_document("", "").

In addition, exceptions have been introduced in PHP 5 which are supported by PDFlib 6 and later as well.

Please see the examples below for more information.

Note:

If you're interested in alternative free PDF generators that do not utilize external PDF libraries, see this related FAQ.

Requirements

PDFlib Lite is available as open source. However, the PDFlib Lite license allows free use only under certain conditions. PDFlib Lite supports a subset of PDFlib's functionality; please see the PDFlib web site for details. The full version of PDFlib is available for download at » http://www.pdflib.com/products/pdflib-family/, but requires that you purchase a license for commercial use.

Issues with older versions of PDFlib

Any version of PHP 4 after March 9, 2000 does not support versions of PDFlib older than 3.0.

PDFlib 4.0 or greater is supported by PHP 4.3.0 and later.

Installation

This » PECL extension is not bundled with PHP. Information for installing this PECL extension may be found in the manual chapter titled Installation of PECL extensions. Additional information such as new releases, downloads, source files, maintainer information, and a CHANGELOG, can be located here: » http://pecl.php.net/package/pdflib.

To get these functions to work in PHP < 4.3.9, you have to compile PHP with --with-pdflib[=DIR]. DIR is the PDFlib base install directory, which defaults to /usr/local.

Resource Types

PDF_new() creates a new PDFlib object required by most PDF functions.

Remarks about Deprecated PDFlib Functions

Starting with PHP 4.0.5, the PHP extension for PDFlib is officially supported by PDFlib GmbH. This means that all the functions described in the PDFlib Reference Manual are supported by PHP 4 with exactly the same meaning and the same parameters. However, with PDFlib Version 5.0.4 or higher all parameters have to be specified. For compatibility reasons, this binding for PDFlib still supports most of the deprecated functions, but they should be replaced by their new versions. PDFlib GmbH will not support any problems arising from the use of these deprecated functions. The documentation in this section indicates old functions as "Deprecated" and gives the replacement function to be used instead.

Examples

Most of the functions are fairly easy to use. The most difficult part is probably creating your first PDF document. The following example should help to get you started. It is developed for PHP 4 and creates the file hello.pdf with one page. It defines some document info field contents, loads the Helvetica-Bold font and outputs the text "Hello world! (says PHP)".

Example 1735. Hello World example from PDFlib distribution for PHP 4

<?php
$p = PDF_new();

/*  open new PDF file; insert a file name to create the PDF on disk */
if (PDF_begin_document($p, "", "") == 0) {
    die("Error: " . PDF_get_errmsg($p));
}

PDF_set_info($p, "Creator", "hello.php");
PDF_set_info($p, "Author", "Rainer Schaaf");
PDF_set_info($p, "Title", "Hello world (PHP)!");

PDF_begin_page_ext($p, 595, 842, "");

$font = PDF_load_font($p, "Helvetica-Bold", "winansi", "");

PDF_setfont($p, $font, 24.0);
PDF_set_text_pos($p, 50, 700);
PDF_show($p, "Hello world!");
PDF_continue_text($p, "(says PHP)");
PDF_end_page_ext($p, "");

PDF_end_document($p, "");

$buf = PDF_get_buffer($p);
$len = strlen($buf);

header("Content-type: application/pdf");
header("Content-Length: $len");
header("Content-Disposition: inline; filename=hello.pdf");
print $buf;

PDF_delete($p);
?>


The following example comes with the PDFlib distribution for PHP 5. It uses the new exception handling and object encapsulation features available in PHP 5. It creates the file hello.pdf with one page. It defines some document info field contents, loads the Helvetica-Bold font and outputs the text "Hello world! (says PHP)".

Example 1736. Hello World example from PDFlib distribution for PHP 5

<?php

try {
    $p = new PDFlib();

    /*  open new PDF file; insert a file name to create the PDF on disk */
    if ($p->begin_document("", "") == 0) {
        die("Error: " . $p->get_errmsg());
    }

    $p->set_info("Creator", "hello.php");
    $p->set_info("Author", "Rainer Schaaf");
    $p->set_info("Title", "Hello world (PHP)!");

    $p->begin_page_ext(595, 842, "");

    $font = $p->load_font("Helvetica-Bold", "winansi", "");

    $p->setfont($font, 24.0);
    $p->set_text_pos(50, 700);
    $p->show("Hello world!");
    $p->continue_text("(says PHP)");
    $p->end_page_ext("");

    $p->end_document("");

    $buf = $p->get_buffer();
    $len = strlen($buf);

    header("Content-type: application/pdf");
    header("Content-Length: $len");
    header("Content-Disposition: inline; filename=hello.pdf");
    print $buf;
}
catch (PDFlibException $e) {
    die("PDFlib exception occurred in hello sample:\n" .
        "[" . $e->get_errnum() . "] " . $e->get_apiname() . ": " .
        $e->get_errmsg() . "\n");
}
catch (Exception $e) {
    die($e);
}
$p = 0;
?>


Table of Contents

PDF_activate_item — Activate structure element or other content item
PDF_add_annotation — Add annotation [deprecated]
PDF_add_bookmark — Add bookmark for current page [deprecated]
PDF_add_launchlink — Add launch annotation for current page [deprecated]
PDF_add_locallink — Add link annotation for current page [deprecated]
PDF_add_nameddest — Create named destination
PDF_add_note — Set annotation for current page [deprecated]
PDF_add_outline — Add bookmark for current page [deprecated]
PDF_add_pdflink — Add file link annotation for current page [deprecated]
PDF_add_table_cell — Add a cell to a new or existing table
PDF_add_textflow — Create Textflow or add text to existing Textflow
PDF_add_thumbnail — Add thumbnail for current page
PDF_add_weblink — Add weblink for current page [deprecated]
PDF_arc — Draw a counterclockwise circular arc segment
PDF_arcn — Draw a clockwise circular arc segment
PDF_attach_file — Add file attachment for current page [deprecated]
PDF_begin_document — Create new PDF file
PDF_begin_font — Start a Type 3 font definition
PDF_begin_glyph — Start glyph definition for Type 3 font
PDF_begin_item — Open structure element or other content item
PDF_begin_layer — Start layer
PDF_begin_page_ext — Start new page
PDF_begin_page — Start new page [deprecated]
PDF_begin_pattern — Start pattern definition
PDF_begin_template_ext — Start template definition
PDF_begin_template — Start template definition [deprecated]
PDF_circle — Draw a circle
PDF_clip — Clip to current path
PDF_close_image — Close image
PDF_close_pdi_page — Close the page handle
PDF_close_pdi — Close the input PDF document [deprecated]
PDF_close — Close pdf resource [deprecated]
PDF_closepath_fill_stroke — Close, fill and stroke current path
PDF_closepath_stroke — Close and stroke path
PDF_closepath — Close current path
PDF_concat — Concatenate a matrix to the CTM
PDF_continue_text — Output text in next line
PDF_create_3dview — Create 3D view
PDF_create_action — Create action for objects or events
PDF_create_annotation — Create rectangular annotation
PDF_create_bookmark — Create bookmark
PDF_create_field — Create form field
PDF_create_fieldgroup — Create form field group
PDF_create_gstate — Create graphics state object
PDF_create_pvf — Create PDFlib virtual file
PDF_create_textflow — Create textflow object
PDF_curveto — Draw Bezier curve
PDF_define_layer — Create layer definition
PDF_delete_pvf — Delete PDFlib virtual file
PDF_delete_table — Delete table object
PDF_delete_textflow — Delete textflow object
PDF_delete — Delete PDFlib object
PDF_encoding_set_char — Add glyph name and/or Unicode value
PDF_end_document — Close PDF file
PDF_end_font — Terminate Type 3 font definition
PDF_end_glyph — Terminate glyph definition for Type 3 font
PDF_end_item — Close structure element or other content item
PDF_end_layer — Deactivate all active layers
PDF_end_page_ext — Finish page
PDF_end_page — Finish page
PDF_end_pattern — Finish pattern
PDF_end_template — Finish template
PDF_endpath — End current path
PDF_fill_imageblock — Fill image block with variable data
PDF_fill_pdfblock — Fill PDF block with variable data
PDF_fill_stroke — Fill and stroke path
PDF_fill_textblock — Fill text block with variable data
PDF_fill — Fill current path
PDF_findfont — Prepare font for later use [deprecated]
PDF_fit_image — Place image or template
PDF_fit_pdi_page — Place imported PDF page
PDF_fit_table — Place table on page
PDF_fit_textflow — Format textflow in rectangular area
PDF_fit_textline — Place single line of text
PDF_get_apiname — Get name of unsuccessful API function
PDF_get_buffer — Get PDF output buffer
PDF_get_errmsg — Get error text
PDF_get_errnum — Get error number
PDF_get_font — Get font [deprecated]
PDF_get_fontname — Get font name [deprecated]
PDF_get_fontsize — Font handling [deprecated]
PDF_get_image_height — Get image height [deprecated]
PDF_get_image_width — Get image width [deprecated]
PDF_get_majorversion — Get major version number [deprecated]
PDF_get_minorversion — Get minor version number [deprecated]
PDF_get_parameter — Get string parameter
PDF_get_pdi_parameter — Get PDI string parameter [deprecated]
PDF_get_pdi_value — Get PDI numerical parameter [deprecated]
PDF_get_value — Get numerical parameter
PDF_info_font — Query detailed information about a loaded font
PDF_info_matchbox — Query matchbox information
PDF_info_table — Retrieve table information
PDF_info_textflow — Query textflow state
PDF_info_textline — Perform textline formatting and query metrics
PDF_initgraphics — Reset graphic state
PDF_lineto — Draw a line
PDF_load_3ddata — Load 3D model
PDF_load_font — Search and prepare font
PDF_load_iccprofile — Search and prepare ICC profile
PDF_load_image — Open image file
PDF_makespotcolor — Make spot color
PDF_moveto — Set current point
PDF_new — Create PDFlib object
PDF_open_ccitt — Open raw CCITT image [deprecated]
PDF_open_file — Create PDF file [deprecated]
PDF_open_gif — Open GIF image [deprecated]
PDF_open_image_file — Read image from file [deprecated]
PDF_open_image — Use image data [deprecated]
PDF_open_jpeg — Open JPEG image [deprecated]
PDF_open_memory_image — Open image created with PHP's image functions [not supported]
PDF_open_pdi_page — Prepare a page
PDF_open_pdi — Open PDF file [deprecated]
PDF_open_tiff — Open TIFF image [deprecated]
PDF_pcos_get_number — Get value of pCOS path with type number or boolean
PDF_pcos_get_stream — Get contents of pCOS path with type stream, fstream, or string
PDF_pcos_get_string — Get value of pCOS path with type name, string, or boolean
PDF_place_image — Place image on the page [deprecated]
PDF_place_pdi_page — Place PDF page [deprecated]
PDF_process_pdi — Process imported PDF document
PDF_rect — Draw rectangle
PDF_restore — Restore graphics state
PDF_resume_page — Resume page
PDF_rotate — Rotate coordinate system
PDF_save — Save graphics state
PDF_scale — Scale coordinate system
PDF_set_border_color — Set border color of annotations [deprecated]
PDF_set_border_dash — Set border dash style of annotations [deprecated]
PDF_set_border_style — Set border style of annotations [deprecated]
PDF_set_char_spacing — Set character spacing [deprecated]
PDF_set_duration — Set duration between pages [deprecated]
PDF_set_gstate — Activate graphics state object
PDF_set_horiz_scaling — Set horizontal text scaling [deprecated]
PDF_set_info_author — Fill the author document info field [deprecated]
PDF_set_info_creator — Fill the creator document info field [deprecated]
PDF_set_info_keywords — Fill the keywords document info field [deprecated]
PDF_set_info_subject — Fill the subject document info field [deprecated]
PDF_set_info_title — Fill the title document info field [deprecated]
PDF_set_info — Fill document info field
PDF_set_layer_dependency — Define relationships among layers
PDF_set_leading — Set distance between text lines [deprecated]
PDF_set_parameter — Set string parameter
PDF_set_text_matrix — Set text matrix [deprecated]
PDF_set_text_pos — Set text position
PDF_set_text_rendering — Determine text rendering [deprecated]
PDF_set_text_rise — Set text rise [deprecated]
PDF_set_value — Set numerical parameter
PDF_set_word_spacing — Set spacing between words [deprecated]
PDF_setcolor — Set fill and stroke color
PDF_setdash — Set simple dash pattern
PDF_setdashpattern — Set dash pattern
PDF_setflat — Set flatness
PDF_setfont — Set font
PDF_setgray_fill — Set fill color to gray [deprecated]
PDF_setgray_stroke — Set stroke color to gray [deprecated]
PDF_setgray — Set color to gray [deprecated]
PDF_setlinecap — Set linecap parameter
PDF_setlinejoin — Set linejoin parameter
PDF_setlinewidth — Set line width
PDF_setmatrix — Set current transformation matrix
PDF_setmiterlimit — Set miter limit
PDF_setpolydash — Set complicated dash pattern [deprecated]
PDF_setrgbcolor_fill — Set fill rgb color values [deprecated]
PDF_setrgbcolor_stroke — Set stroke rgb color values [deprecated]
PDF_setrgbcolor — Set fill and stroke rgb color values [deprecated]
PDF_shading_pattern — Define shading pattern
PDF_shading — Define blend
PDF_shfill — Fill area with shading
PDF_show_boxed — Output text in a box [deprecated]
PDF_show_xy — Output text at given position
PDF_show — Output text at current position
PDF_skew — Skew the coordinate system
PDF_stringwidth — Return width of text
PDF_stroke — Stroke path
PDF_suspend_page — Suspend page
PDF_translate — Set origin of coordinate system
PDF_utf16_to_utf8 — Convert string from UTF-16 to UTF-8
PDF_utf32_to_utf16 — Convert string from UTF-32 to UTF-16
PDF_utf8_to_utf16 — Convert string from UTF-8 to UTF-16

Code Examples / Notes » ref.pdf

thodge

Yet another addition to the PDF text extraction code last posted by jorromer. The code only seemed to work for PDF 1.2 (Acrobat 3.x) or below. This pdfExtractText function uses regular expressions to cover cases I have found in PDF 1.3 and 1.4 documents. The code also handles closing brackets in the text stream, which were ignored by the previous version. My regular expression skills are somewhat lacking, so improvements may be possible by a more skilled programmer. I'm sure there are still cases that this function will not handle, but I haven't come across any yet...
<?php
function pdf2string($sourcefile) {
    $fp = fopen($sourcefile, 'rb');
    $content = fread($fp, filesize($sourcefile));
    fclose($fp);
    $searchstart = 'stream';
    $searchend = 'endstream';
    $pdfText = '';
    $pos = 0;
    $pos2 = 0;
    $startpos = 0;
    while ($pos !== false && $pos2 !== false) {
        $pos = strpos($content, $searchstart, $startpos);
        $pos2 = strpos($content, $searchend, $startpos + 1);
        if ($pos !== false && $pos2 !== false){
            if ($content[$pos] == 0x0d && $content[$pos + 1] == 0x0a) {
                $pos += 2;
            } else if ($content[$pos] == 0x0a) {
                $pos++;
            }
            if ($content[$pos2 - 2] == 0x0d && $content[$pos2 - 1] == 0x0a) {
                $pos2 -= 2;
            } else if ($content[$pos2 - 1] == 0x0a) {
                $pos2--;
            }
            $textsection = substr(
                $content,
                $pos + strlen($searchstart) + 2,
                $pos2 - $pos - strlen($searchstart) - 1
            );
            $data = @gzuncompress($textsection);
            $pdfText .= pdfExtractText($data);
            $startpos = $pos2 + strlen($searchend) - 1;
        }
    }
    return preg_replace('/(\s)+/', ' ', $pdfText);
}
function pdfExtractText($psData){
    if (!is_string($psData)) {
        return '';
    }
    $text = '';
    // Handle brackets in the text stream that could be mistaken for
    // the end of a text field. I'm sure you can do this as part of the
    // regular expression, but my skills aren't good enough yet.
    $psData = str_replace('\)', '##ENDBRACKET##', $psData);
    $psData = str_replace('\]', '##ENDSBRACKET##', $psData);
    preg_match_all(
        '/(T[wdcm*])[\s]*(\[([^\]]*)\]|\(([^\)]*)\))[\s]*Tj/si',
        $psData,
        $matches
    );
    for ($i = 0; $i < sizeof($matches[0]); $i++) {
        if ($matches[3][$i] != '') {
            // Run another match over the contents.
            preg_match_all('/\(([^)]*)\)/si', $matches[3][$i], $subMatches);
            foreach ($subMatches[1] as $subMatch) {
                $text .= $subMatch;
            }
        } else if ($matches[4][$i] != '') {
            $text .= ($matches[1][$i] == 'Tc' ? ' ' : '') . $matches[4][$i];
        }
    }
    // Translate special characters and put back brackets.
    $trans = array(
        '...' => '…',
        '\205' => '…',
        '\221' => chr(145),
        '\222' => chr(146),
        '\223' => chr(147),
        '\224' => chr(148),
        '\226' => '-',
        '\267' => '•',
        '\(' => '(',
        '\[' => '[',
        '##ENDBRACKET##' => ')',
        '##ENDSBRACKET##' => ']',
        chr(133) => '-',
        chr(141) => chr(147),
        chr(142) => chr(148),
        chr(143) => chr(145),
        chr(144) => chr(146),
    );
    $text = strtr($text, $trans);
    return $text;
}
?>
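
The comment in pdfExtractText wonders whether escaped brackets could be handled inside the regular expression itself. The following is one possible sketch of that idea, not a replacement for the function above: it matches a parenthesised literal string whose body is made of either a backslash-escaped character or any character that is not a backslash or parenthesis. The function name extractLiteralStrings is made up for this example, and the pattern does not cover unescaped nested parentheses or octal escapes.

```php
<?php
// Hypothetical helper (not from the note above): pull literal strings out of
// a content-stream fragment, honouring backslash-escaped parentheses.
function extractLiteralStrings($psData) {
    // Body of a string: an escaped character (\\.) or any character that is
    // neither a backslash nor a parenthesis.
    preg_match_all('/\(((?:\\\\.|[^\\\\()])*)\)/s', $psData, $m);

    $out = array();
    foreach ($m[1] as $raw) {
        // Unescape \( \) and \\ ; other escape sequences are left untouched.
        $out[] = preg_replace('/\\\\([()\\\\])/', '$1', $raw);
    }
    return $out;
}

print_r(extractLiteralStrings('[(A) -20 (B\))] TJ'));
```

With this approach the ##ENDBRACKET## placeholder substitution is not needed for round brackets, because the pattern never stops at an escaped closing bracket.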


jaymaity

A totally free open source alternative is also available, without any license cost, at
http://fpdf.org/


uwe

Those looking for a free replacement of pdflib may consider
pslib at http://pslib.sourceforge.net which produces PostScript but it can be easily turned into PDF by Acrobat Distiller or ghostscript. The API is very similar and even hypertext functions are supported. There
is also a php extension for pslib in PECL, called ps.


taufiq

There is an XPDF Win32 binary package at SourceForge that works for pdftotext purposes.
I've tried the PHP code below, but it didn't work.


jonathon hibbard

The other issue with DOMpdf is that it has some pretty painful flaws.
You have to supply full paths to everything (images, includes, javascript files, etc).  And boy, do I mean everything.
Even then, it is not 100% sound.  If you have complex sites, it cannot handle them.  It instead breaks the design and only provides you with about a million broken images.
Don't get me wrong, it's GREAT for use with lower-end, simpler sites, but if you have a site that, say, has a javascript navigation, flash, and a bunch of container divs, it's really not going to do the job.
The above library seems to be the best fit, as about the only way to get high-end sites to work is just to manually write it out yourself using the functions above.
Sorry to bust anyone's bubble.  Good luck.


pitvanester

Sorry, both versions of pdf2txt don't work...

17-sep-2005 07:26

some code that can be very helpful for starters.
<?php
   // Declare PDF File
   $pdf = pdf_new();
   PDF_open_file($pdf);
   // Set Document Properties
   PDF_set_info($pdf, "author", "Alexander Pas");
   PDF_set_info($pdf, "title", "PDF by PHP Example");
   PDF_set_info($pdf, "creator", "Alexander Pas");
   PDF_set_info($pdf, "subject", "Testing Code");
   // Get fonts to use
   pdf_set_parameter($pdf, "FontOutline", "Arial=arial.ttf"); // get a custom font
   $font1 = PDF_findfont($pdf, "Helvetica-Bold",  "winansi", 0); // declare default font
   $font2 = PDF_findfont($pdf, "Arial",  "winansi", 1); // declare custom font & embed into file
   /*
   You can safely use the following 14 font types (the default fonts)
   Courier, Courier-Bold, Courier-Oblique, Courier-BoldOblique
   Helvetica, Helvetica-Bold, Helvetica-Oblique, Helvetica-BoldOblique
   Times-Roman, Times-Bold, Times-Italic, Times-BoldItalic
   Symbol, ZapfDingbats
   */
   // make the images
   $image1 = PDF_open_image_file($pdf, "gif", "image.gif"); //supported filetypes are: jpeg, tiff, gif, png.
   //Make First Page
   PDF_begin_page($pdf, 450, 450); // page width and height.
   $bookmark = PDF_add_bookmark($pdf, "Front"); // add a top level bookmark.
   PDF_setfont($pdf, $font1, 12); // use this font from now on.
   PDF_show_xy($pdf, "First Page!", 5, 225); // show this text; x and y are measured from the lower-left corner by default.
   pdf_place_image($pdf, $image1, 255, 5, 1); // last number is the scale factor.
   PDF_end_page($pdf); // End of Page.
   //Make Second Page
   PDF_begin_page($pdf, 450, 225); // page width and height.
   $bookmark1 = PDF_add_bookmark($pdf, "Chapter1", $bookmark); // add a nested bookmark. (can be nested multiple times.)
   PDF_setfont($pdf, $font2, 12); // use this font from now on.
   PDF_show_xy($pdf, "Chapter1!", 225, 5);
   PDF_add_bookmark($pdf, "Chapter1.1", $bookmark1); // add a nested bookmark (already in a nested one).
   PDF_setfont($pdf, $font1, 12);
   PDF_show_xy($pdf, "Chapter1.1", 225, 5);
   PDF_end_page($pdf);
   
   // Finish the PDF File
   
   PDF_close($pdf); // End Of PDF-File.
   $output = PDF_get_buffer($pdf); // assemble the file in a variable.
   // Output Area
   header("Content-type: application/pdf"); //set filetype to pdf.
   header("Content-Length: ".strlen($output)); //content length
   header("Content-Disposition: attachment; filename=test.pdf"); // you can use inline or attachment.
   echo $output; // actual print area!
   // Cleanup
   PDF_delete($pdf);
?>


bmironov

RedHat 9 + Apache 2.0 + PHP 4.3.2 + Oracle 9i + PDFlib 5.0.1 (binary distribution)
It seems to be a working bundle if you do some magic with ./configure:
RedHat 9:
kernel-2.4.20-18.9
Apache 2.0.46:
./configure --enable-so --enable-rewrite=shared --enable-status --enable-mpm=prefork
PHP 4.3.2:
./configure \
--program-prefix= \
--prefix=/usr \
--exec-prefix=/usr \
--bindir=/usr/bin \
--sbindir=/usr/sbin \
--sysconfdir=/etc \
--datadir=/usr/share \
--includedir=/usr/include \
--libdir=/usr/lib \
--libexecdir=/usr/libexec \
--localstatedir=/var \
--sharedstatedir=/usr/com \
--mandir=/usr/share/man \
--infodir=/usr/share/info \
--with-config-file-path=/etc \
--with-config-file-scan-dir=/etc/php.d \
--without-tsrm-pthreads \    # !!!!!!!!!!!!!!!!!!!!
--with-zlib \
--with-gd \
--enable-gd-native-ttf \
--with-ttf \
--without-mysql \
--with-apxs2filter=/usr/local/apache2/bin/apxs \
--with-oci8 \
--enable-sigchild \
--enable-inline-optimization
Oracle9i:
ln -s $ORACLE_HOME/rdbms/public/nzerror.h $ORACLE_HOME/rdbms/demo/nzerror.h
ln -s $ORACLE_HOME/rdbms/public/nzt.h $ORACLE_HOME/rdbms/demo/nzt.h
ln -s $ORACLE_HOME/rdbms/public/ociextp.h $ORACLE_HOME/rdbms/demo/ociextp.h
If you want to use bundled GD-library then:
1) install following packages: libjpeg, libjpeg-devel, libpng, libpng-devel, freetype, freetype-devel, libtiff, libtiff-devel, zlib, zlib-devel
2) ln -s /usr/lib/libjpeg.so.62 /usr/lib/libjpeg.so
ln -s /usr/lib/libpng.so.62 /usr/lib/libpng.so
It seems to be a working combination, because it does NOT give you any of the following:
1) error message in Apache's error_log:
Module compiled with module API=20020429, debug=0, thread-safety=0
PHP compiled with module API=20020429, debug=0, thread-safety=1
2) error message in Apache's error_log:
[notice] child pid 12345 exit signal Segmentation fault (11)
3) crashes in MS Internet Explorer: it can show PDF output from your PHP script via the Acrobat plug-in without crashing, and without confusing messages about opening "Adobe Acrobat Control for ActiveX".
Hope it will save you some time.
Good luck,
Boris


pbierans

Load extension, open a PDF, add a font, modify PDF in memory and send
it to browser:
<?php
 // no cache headers:
 header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
 header("Last-Modified: ".gmdate("D, d M Y H:i:s")." GMT");
 header("Cache-Control: no-store, no-cache, must-revalidate");
 header("Cache-Control: post-check=0, pre-check=0", false);
 header("Pragma: no-cache");
 $ext_name="libpdf_php.so";
   // libpdf_php.so is the PDFLIB for SunOS by "PDFlib GmbH"
   // visit http://www.pdflib.com
 // if the extension is not automatically loaded by Apache
 // dl() will try to load it on demand:
 if (!extension_loaded($ext_name) && !@dl($ext_name))
 {
   ?>
   <table width="100%" border="0"><tr><td align="center">
     <table style="border: solid #f0f0f0 2px;"><tr>
       <td valign="middle" style="padding: 20px; margin: 0px;">
         <p style="font-family: arial; font-size: 12px; ">
         <b>Sorry,</b>
         &nbsp;
         A PDF can not be generated right now.
         The administrator has been informed and will fix this as
         soon as possible.
         Please try again later.
         </p>
     </td></tr></table>
   </td></tr></table>
   <?php
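   // double quotes so that \n becomes a real line break; $_SERVER is used
   // because the bare globals are only set with register_globals enabled
   mail('admin@domain.com','Error: PDFLib not found',
        "Called by script:\n  ".$_SERVER['SCRIPT_FILENAME'].'?'.$_SERVER['QUERY_STRING'],
        "From: warnings@domain.com\n");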
   exit;
 } // verify that extension is usable
 // unique serial number:
 srand(microtime()*10000);
 $usnr= gmdate("Ymd-His-").rand(1000,9999).'-';
 $pdf_file=$usnr.'result.pdf';
 $src_file='source.pdf';
 // create pdf object
 $pdf = pdf_new();
 pdf_open_file($pdf);
 pdf_set_parameter($pdf, 'serial',      'if-you-have-one');
 // fonts to embed, they are in the folder of this file:
 pdf_set_parameter($pdf, 'FontAFM',     'TradeGothic=Tg______.afm');
 pdf_set_parameter($pdf, 'FontOutline', 'TradeGothic=Tg______.pfb');
 pdf_set_parameter($pdf, 'FontPFM',     'TradeGothic=Tg______.pfm');
 // load the source file:
 $src_doc   =pdf_open_pdi($pdf,$src_file,'', 0);
 $src_page  =pdf_open_pdi_page($pdf,$src_doc,1,'');
 $src_width =pdf_get_pdi_value($pdf,'width' ,$src_doc,$src_page,0);
 $src_height=pdf_get_pdi_value($pdf,'height',$src_doc,$src_page,0);
 pdf_begin_page($pdf, $src_width, $src_height);
 {
   // place the sourcefile to the background of the actual page:
   pdf_place_pdi_page($pdf,$src_page,0,0,1,1);
   pdf_close_pdi_page($pdf,$src_page);
   // modify the page:
   pdf_set_font($pdf, 'TradeGothic', 8, 'host');
   pdf_show_xy($pdf, 'Now: '.gmdate("Y-m-d H:i:s"),50,50);
 }
 pdf_end_page($pdf);
 pdf_close($pdf);
 // prepare output:
 $pdfdata = pdf_get_buffer($pdf); // to echo the pdf-data
 $pdfsize = strlen($pdfdata);     // IE requires the datasize
 // real datatype headers:
 header('Content-type: application/pdf');
 header('Content-disposition: attachment; filename="'.$pdf_file.'"');
 header('Content-length: '.$pdfsize);
 echo $pdfdata;
 exit; // keep this one so no #13#10 or #32 will be written
?>


28-aug-2005 05:58

If you want to display the number of pages (for example: page 1 of 3) then the following code could be helpful:
<?php
// ...
$pdf->begin_page_ext(842, 595, "");
// ... add text, images, ...
$pdf->suspend_page("");
$pdf->begin_page_ext(842, 595, "");
// ... add text, images, ...
$pdf->suspend_page("");
// ... create all pages
$pdf->resume_page("pagenumber 1");
// ... add number of pages to page 1
$pdf->end_page_ext("");
$pdf->resume_page("pagenumber 2");
// ... add number of pages to page 2
$pdf->end_page_ext("");
// ...
?>
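
The resume/finish steps above repeat once per page, so they can be wrapped in a loop. The sketch below is one way to do that, under these assumptions: $pdf is a PDFlib object with an open document, every page was suspended with suspend_page("") rather than ended, and a font was set on each page before it was suspended. The function name stampPageTotals and the default position are made up for this example.

```php
<?php
// Hypothetical helper: resume every suspended page, write "Page n of N"
// at ($x, $y), and finish the page.
function stampPageTotals($pdf, $pageCount, $x = 50, $y = 30) {
    for ($n = 1; $n <= $pageCount; $n++) {
        $pdf->resume_page("pagenumber " . $n);
        $pdf->fit_textline("Page $n of $pageCount", $x, $y, "");
        $pdf->end_page_ext("");
    }
}
```

Calling stampPageTotals($pdf, 3) after all three pages have been created and suspended, and before end_document(), would finish the pages in order.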


sven.schuberth

I've improved the code snippet for pdf2txt version 1.2.
Now it's possible to translate PDF versions > 1.2 into plain text.
Sven
<?php
// Function    : pdf2txt()
// Arguments   : $filename - Filename of the PDF you want to extract
// Description : Reads a pdf file, extracts data streams, and manages
//               their translation to plain text - returning the plain
//               text at the end
// Authors     : Jonathan Beckett, 2005-05-02
//             : Sven Schuberth, 2007-03-29
function pdf2txt($filename){
    $data = getFileData($filename);

    $s = strpos($data, "%") + 1;

    $version = substr($data, $s, strpos($data, "%", $s) - 1);
    if (substr_count($version, "PDF-1.2") == 0)
        return handleV3($data);
    else
        return handleV2($data);
}

// handles version 1.2
function handleV2($data){
    $j = 0;
    $a_chunks = array();
    $result_data = "";

    // grab objects and then grab their contents (chunks)
    $a_obj = getDataArray($data, "obj", "endobj");

    foreach ($a_obj as $obj){
        $a_filter = getDataArray($obj, "<<", ">>");

        if (is_array($a_filter)){
            $j++;
            $a_chunks[$j]["filter"] = $a_filter[0];
            $a_data = getDataArray($obj, "stream\r\n", "endstream");
            if (is_array($a_data)){
                $a_chunks[$j]["data"] = substr($a_data[0],
                    strlen("stream\r\n"),
                    strlen($a_data[0]) - strlen("stream\r\n") - strlen("endstream"));
            }
        }
    }

    // decode the chunks
    foreach ($a_chunks as $chunk){
        // look at each chunk and decide how to decode it - by looking at the contents of the filter
        $a_filter = explode("/", $chunk["filter"]); // split() no longer exists in current PHP

        if (isset($chunk["data"]) && $chunk["data"] != ""){
            // look at the filter to find out which encoding has been used
            // (strpos() searches for the filter name; the original called substr() here)
            if (strpos($chunk["filter"], "FlateDecode") !== false){
                $data = @gzuncompress($chunk["data"]);
                if (trim($data) != ""){
                    $result_data .= ps2txt($data);
                } else {
                    //$result_data .= "x";
                }
            }
        }
    }

    return $result_data;
}

// handles versions >1.2
function handleV3($data){
    // grab objects and then grab their contents (chunks)
    $a_obj = getDataArray($data, "obj", "endobj");
    $result_data = "";
    foreach ($a_obj as $obj){
        // check if it is a string
        if (substr_count($obj, "/GS1") > 0){
            // the strings are between ( and )
            preg_match_all("|\((.*?)\)|", $obj, $field, PREG_SET_ORDER);
            if (is_array($field))
                foreach ($field as $data)
                    $result_data .= $data[1];
        }
    }
    return $result_data;
}

function ps2txt($ps_data){
    $result = "";
    $a_data = getDataArray($ps_data, "[", "]");
    if (is_array($a_data)){
        foreach ($a_data as $ps_text){
            $a_text = getDataArray($ps_text, "(", ")");
            if (is_array($a_text)){
                foreach ($a_text as $text){
                    $result .= substr($text, 1, strlen($text) - 2);
                }
            }
        }
    } else {
        // the data may just be in raw format (outside of [] tags)
        $a_text = getDataArray($ps_data, "(", ")");
        if (is_array($a_text)){
            foreach ($a_text as $text){
                $result .= substr($text, 1, strlen($text) - 2);
            }
        }
    }
    return $result;
}

function getFileData($filename){
    $handle = fopen($filename, "rb");
    $data = fread($handle, filesize($filename));
    fclose($handle);
    return $data;
}

function getDataArray($data, $start_word, $end_word){
    $start = 0;
    $end = 0;
    // stays null when nothing matches, so the callers' is_array() checks still work
    $a_result = null;

    while ($start !== false && $end !== false){
        $start = strpos($data, $start_word, $end);
        if ($start !== false){
            $end = strpos($data, $end_word, $start);
            if ($end !== false){
                // data is between start and end
                $a_result[] = substr($data, $start, $end - $start + strlen($end_word));
            }
        }
    }
    return $a_result;
}
?>


donatas

I've been looking for a way to extract plain text from PDF documents (needed to search for text inside 'em). Not being able to find one, I wrote the needed functions myself. Here you go, folks.
<?php
 function pdf2string ($sourceFile)
 {
   $textArray = array ();
   $objStart = 0;
   
   $fp = fopen ($sourceFile, 'rb');
   $content = fread ($fp, filesize ($sourceFile));
   fclose ($fp);
   
   $searchTagStart = chr(13).chr(10).'stream';
   $searchTagStartLenght = strlen ($searchTagStart);
   
   while ((($objStart = strpos ($content, $searchTagStart, $objStart)) && ($objEnd = strpos ($content, 'endstream', $objStart+1))))
   {
     $data = substr ($content, $objStart + $searchTagStartLenght + 2, $objEnd - ($objStart + $searchTagStartLenght) - 2);
     $data = @gzuncompress ($data);
     
     if ($data !== FALSE && strpos ($data, 'BT') !== FALSE && strpos ($data, 'ET') !== FALSE)
     {
       $textArray [] = ExtractText ($data);
     }
     
     $objStart = $objStart < $objEnd ? $objEnd : $objStart + 1;
   }
   
   return $textArray;
 }
 
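 function ExtractText ($postScriptData)
 {
   $plainText = '';
   $textStart = 0;
   
   // walk through every "(...)" string; "!== FALSE" keeps a match at offset 0 valid
   while (($textStart = strpos ($postScriptData, '(', $textStart)) !== FALSE && ($textEnd = strpos ($postScriptData, ')', $textStart + 1)) !== FALSE)
   {
     // a ")" preceded by a backslash is escaped and does not end the string
     while ($textEnd !== FALSE && substr ($postScriptData, $textEnd - 1, 1) == '\\')
     {
       $textEnd = strpos ($postScriptData, ')', $textEnd + 1);
     }
     if ($textEnd === FALSE) break;
     
     $plainText .= substr ($postScriptData, $textStart + 1, $textEnd - $textStart - 1);
     if (substr ($postScriptData, $textEnd + 1, 1) == ']') //this adds quite some additional spaces between the words
     {
       $plainText .= ' ';
     }
     
     $textStart = $textStart < $textEnd ? $textEnd : $textStart + 1;
   }
   
   return stripslashes ($plainText);
 }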
?>


ontwerp

I was searching for a low-cost, open-source way to combine static HTML files [as templates] with dynamic output from Perl or PHP routines. Sooner or later I found that the following was the most stable, fastest and most customizable way to produce usable PDFs with nice formatting:
1] Create the HTML page output [Perl -> HTML output, direct HTML output from any application, PHP echo statements, etc.] [sort these HTML files locally]
2] Convert all the HTML [including web image links, tables, font formatting, etc.] to [E]PS files with the Perl application html2ps [as mentioned beneath]
http://user.it.uu.se/~jan/html2ps.html [sort all PS files by their future PDF page positions]
3] Use the free ps2pdf/ps2pdfwr Linux application
http://www.ps2pdf.com/convert/index.htm [uses Ghostscript, the Ghostview libraries and so on]
Has great formatting options such as headers, footers and numbering
[sort the PDF files]
4] Merge all the PDF files into one PDF file with pdftk [the PDF toolkit], which optionally provides compression, encryption, background stamps, etc.
One might ask why I use different scripts:
- The Perl/PHP combination works well: in my experience Perl is faster at some tasks, such as conversion to PS files.
- PS to PDF is quicker than going directly from PHP to PDF [in my experience!].
- I have total control over every file: whenever I change the HTML files used as templates, I only need an editor or another application [online or offline].
P.S. I had to build an open-source solution for creating simple analysis reports based on things like:
- a first page [name / title / # / date]
- some static information [such as an introduction, copyright notice, etc.]
- some dynamic information [output from PHP -> database queries] combined
with HTML tags, images, etc.
All of this is mixed together [and therefore separated into files for transparency]. The three-step approach (data -> HTML, HTML -> PS, PS -> PDF) also makes each step easier and quicker to program or adjust.
Correct me if I'm wrong [mail me].
ing. Valentijn Langendorff
Design & Technologist
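The four steps described above could be chained from PHP roughly as follows. This is only a minimal sketch: the function name buildPdfPipeline() and all file names are hypothetical, it assumes that html2ps (with its -o output option), ps2pdf and pdftk are installed and on the PATH, and it only builds the command lines so they can be inspected before being run.

```php
<?php
// Hypothetical helper: builds the shell commands for the
// HTML -> PS -> PDF -> merged PDF pipeline. Nothing is executed here.
function buildPdfPipeline(array $htmlFiles, $mergedPdf)
{
    $commands = array();
    $pdfFiles = array();

    foreach ($htmlFiles as $html) {
        // Derive the intermediate file names from the HTML file name.
        $base = preg_replace('/\.html?$/i', '', $html);
        $ps   = $base . '.ps';
        $pdf  = $base . '.pdf';

        // Step 2: HTML -> PostScript (assumes html2ps accepts -o <file>).
        $commands[] = 'html2ps -o ' . escapeshellarg($ps) . ' ' . escapeshellarg($html);
        // Step 3: PostScript -> PDF.
        $commands[] = 'ps2pdf ' . escapeshellarg($ps) . ' ' . escapeshellarg($pdf);

        $pdfFiles[] = escapeshellarg($pdf);
    }

    // Step 4: concatenate the per-page PDFs in the given order.
    $commands[] = 'pdftk ' . implode(' ', $pdfFiles)
                . ' cat output ' . escapeshellarg($mergedPdf);

    return $commands;
}

// Usage: print the commands; pass each one to exec() to actually run it.
foreach (buildPdfPipeline(array('cover.html', 'report.html'), 'report-all.pdf') as $cmd) {
    echo $cmd, "\n";
}
```

Keeping the command construction separate from execution makes it easy to log or review each step, which fits the "adjust every step" idea of the note.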


spingary

I was having trouble streaming inline PDFs using PHP 5.0.2 and Apache 2.0.54.
This is my code:
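<?php
// $file is assumed to hold the path of the PDF to send
header("Pragma: public");
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");
header("Cache-Control: must-revalidate");
header("Content-type: application/pdf");
header("Content-Length: ".filesize($file));
// quote the name and strip the directory part
header('Content-disposition: inline; filename="'.basename($file).'"');
// Accept-Ranges takes a unit name, not a file size; this script serves no byte ranges
header("Accept-Ranges: none");
readfile($file);
exit();
?>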
It would work fine in Mozilla Firefox (1.0.7) but with IE (6.0.2800.1106) it would not bring up the Adobe Reader plugin and instead ask me to save it or open it as a PHP file.
Oddly enough, I turned off ZLib.compression and it started working.  I guess the compression is confusing IE.  I tried leaving out the content-length header thinking maybe it was unmatched filesize (uncompressed number vs actual received compressed size), but then without it it screws up Firefox too.  
What I ended up doing was disabling Zlib compression for the PDF output pages using ini_set:
<?
ini_set('zlib.output_compression','Off');
?>
Maybe this will help someone. Will post over in the PDF section as well.


sam from dogmaconsult.de

I seriously tried to get PDF parsing to work so that I could use it for indexing in the full-text search of a document management system. But none of the pdf2text functions below worked for my test cases (among them an OpenOffice-generated PDF file and a file generated by FPDF).
But I found a REALLY WORKING SOLUTION! On Linux systems, install the Xpdf package. It comes with a tool called pdftotext. Use PHP code similar to the following to get the text content of your PDF files:
<?php
$file = "test.pdf";
// pdftotext writes its output next to the input, with ".pdf" replaced by ".txt"
$outpath = preg_replace("/\.pdf$/", "", $file).".txt";

// escapeshellarg() quotes the file name as one single shell argument
system("pdftotext ".escapeshellarg($file), $ret);
if ($ret == 0)
{
$value = file_get_contents($outpath);
unlink($outpath);
print $value;
}
if ($ret == 127)
print "Could not find pdftotext tool.";
if ($ret == 1)
print "Could not find pdf file.";
?>
The solution works on all test cases and is much more powerful than any of the previous pure php functions posted here, although only available on linux.
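A variation on the approach above avoids the temporary .txt file: pdftotext writes the text to standard output when the output file name is given as "-". The following is only a sketch; the function names pdftotextCommand() and pdfToText() are hypothetical, and it assumes pdftotext is on the PATH and that exec() is not disabled.

```php
<?php
// Hypothetical helper: build the pdftotext command line.
// A "-" as output file makes pdftotext write the text to standard output.
function pdftotextCommand($pdfFile)
{
    return 'pdftotext ' . escapeshellarg($pdfFile) . ' -';
}

// Hypothetical helper: returns the extracted text, or false on failure.
function pdfToText($pdfFile)
{
    $lines  = array();
    $status = 1;
    exec(pdftotextCommand($pdfFile), $lines, $status);

    // pdftotext exits with status 0 on success.
    return $status === 0 ? implode("\n", $lines) : false;
}
```

A call such as pdfToText("test.pdf") would then return the text directly, with nothing to unlink afterwards.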


jorromer

I recently used mattb's code below to extract text from PDF files. I modified the code so that it extracts only text fields.
I hope this helps someone.
Here is the function:
<?php
 $text = pdf2string("file.pdf");
 echo $text;
 function pdf2string($sourcefile){
   $fp = fopen($sourcefile, 'rb');
   $content = fread($fp, filesize($sourcefile));
   fclose($fp);
   $searchstart = 'stream';
   $searchend = 'endstream';
   $pdfdocument = '';
   $pos = 0;
   $pos2 = 0;
   $startpos = 0;
 
   while( $pos !== false && $pos2 !== false ){
     $pos = strpos($content, $searchstart, $startpos);
     $pos2 = strpos($content, $searchend, $startpos + 1);

     if ($pos !== false && $pos2 !== false){
if ($content[$pos]==0x0d && $content[$pos+1]==0x0a) $pos+=2;
else if ($content[$pos]==0x0a) $pos++;
if ($content[$pos2-2]==0x0d && $content[$pos2-1]==0x0a) $pos2-=2;
else if ($content[$pos2-1]==0x0a) $pos2--;
       $textsection = substr($content, $pos + strlen($searchstart) + 2, $pos2 - $pos - strlen($searchstart) - 1);
       $data = @gzuncompress($textsection);
       $data = ExtractText2($data);
       $startpos = $pos2 + strlen($searchend) - 1;

       if ($data === false){
 return -1;}
 
       $pdfdocument .= $data;}}
  return $pdfdocument;}
function ExtractText2($postScriptData){
  $text = '';
  // gzuncompress() returns false for streams that are not zlib-compressed
  if (!is_string($postScriptData)) return $text;
  $textStart = 0;
  $len = strlen($postScriptData);
  while ($textStart < $len){
    $ini = strpos($postScriptData, '(', $textStart);
    if ($ini === false) break;
    $end = strpos($postScriptData, ')', $ini + 1);
    if ($end === false) break;
    // keep only strings followed, one character later, by the Tj (show text) operator
    $valtext = strpos($postScriptData, 'Tj', $end + 1);
    if ($valtext === $end + 2)
      $text .= substr($postScriptData, $ini + 1, $end - $ini - 1);
    $textStart = $end + 1;
  }

  // replace octal escapes of accented letters; drop \223 and \224 (curly quotes in WinAnsi encoding)
  $trans = array("\\341" => "a","\\351" => "e","\\355" => "i","\\363" => "o","\\223" => "","\\224" => "");
  $text  = strtr($text, $trans);
  return $text;
}
?>


mattb

I recently tested Donatas' code below for the extraction of text from PDF files.  After running into a few problems where PDF files were not being read at all, I've modified it somewhat.  It still isn't perfect, but should work great for searching.  Thanks Donatas.
<?php
$test = pdf2string("<pathtoPDFfile>");
echo "$test";
# Returns a -1 if uncompression failed
function pdf2string($sourcefile)
{
  $fp = fopen($sourcefile, 'rb');
  $content = fread($fp, filesize($sourcefile));
  fclose($fp);
  # Locate all text hidden within the stream and endstream tags
  $searchstart = 'stream';
  $searchend = 'endstream';
  $pdfdocument = "";
  $pos = 0;
  $pos2 = 0;
  $startpos = 0;
  # Iterate through each stream block
  while( $pos !== false && $pos2 !== false )
  {
     # Grab beginning and end tag locations if they have not yet been parsed
     $pos = strpos($content, $searchstart, $startpos);
     $pos2 = strpos($content, $searchend, $startpos + 1);
     if( $pos !== false && $pos2 !== false )
     {
        # Extract compressed text from between stream tags and uncompress
        $textsection = substr($content, $pos + strlen($searchstart) + 2, $pos2 - $pos - strlen($searchstart) - 1);
        $data = @gzuncompress($textsection);
        # Clean up text via a special function
        $data = ExtractText($data);
        # Increase our PDF pointer past the section we just read
        $startpos = $pos2 + strlen($searchend) - 1;
        if( $data === false ) { return -1; }
        $pdfdocument = $pdfdocument . $data;
     }
  }
  return $pdfdocument;
}
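function ExtractText($postScriptData)
{
  $plainText = '';
  $textStart = 0;
  // "!== false" keeps a match at offset 0 valid and stops cleanly when nothing is left
  while( ($textStart = strpos($postScriptData, '(', $textStart)) !== false && ($textEnd = strpos($postScriptData, ')', $textStart + 1)) !== false )
  {
     // A ")" preceded by a backslash is escaped and does not end the string
     while( $textEnd !== false && substr($postScriptData, $textEnd - 1, 1) == '\\' )
     {
        $textEnd = strpos($postScriptData, ')', $textEnd + 1);
     }
     if( $textEnd === false ) { break; }
     $plainText .= substr($postScriptData, $textStart + 1, $textEnd - $textStart - 1);
     if( substr($postScriptData, $textEnd + 1, 1) == ']' ) // This adds quite some additional spaces between the words
     {
        $plainText .= ' ';
     }
     $textStart = $textStart < $textEnd ? $textEnd : $textStart + 1;
  }
  return stripslashes($plainText);
}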
?>


webadmin

I found this info about pdflib scope on a Chinese (I think) site and translated it.  I was trying to do pdf_setfont and kept getting the wrong scope error.  Turns out it has to be in the Page scope.  So pdf_setfont will only work when called between pdf_begin_page and pdf_end_page.
#########################################
When a PDFlib API function is called, the error "Can't - in 'document' scope" occurs.
PDFlib has a concept of "scope": every PDFlib API function may only be called within certain scopes, which are fixed for each function. This error occurs when a function is called outside the scopes assigned to it. Use the table below to check where your API call is placed.
Path: started by one of PDF_moveto(), PDF_circle(), PDF_arc(), PDF_arcn(), PDF_rect(); ended by one of PDF_stroke(), PDF_closepath_stroke(), PDF_fill(), PDF_fill_stroke(), PDF_closepath_fill_stroke(), PDF_clip(), PDF_endpath()
Page: between PDF_begin_page() and PDF_end_page(), but outside of path scope
Template: between PDF_begin_template() and PDF_end_template(), but outside of path scope
Pattern: between PDF_begin_pattern() and PDF_end_pattern(), but outside of path scope
Font: between PDF_begin_font() and PDF_end_font(), but outside of glyph scope
Glyph: between PDF_begin_glyph() and PDF_end_glyph(), but outside of path scope
Document: between PDF_open_*() and PDF_close(), but outside of page, template and pattern scope
Object: between PDF_new() and PDF_delete(), but outside of document scope
Null: outside of object scope
Any: all scopes except null
##########################################
Hope this helps others as much as it helped me!!!
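The scope rules can be pictured as a small state machine. The class below is not part of PDFlib: it is a hypothetical, heavily simplified model (only the object, document and page scopes, with invented method names) that mimics how a call made in the wrong scope gets rejected.

```php
<?php
// Hypothetical, simplified model of PDFlib scopes. This is NOT the real API.
class ScopeModel
{
    private $scope = 'object';

    // Throw if the current scope is not one of the allowed scopes.
    private function requireScope($function, array $allowed)
    {
        if (!in_array($this->scope, $allowed, true)) {
            throw new LogicException("Can't call $function in '{$this->scope}' scope");
        }
    }

    public function openDocument() { $this->requireScope('openDocument', array('object'));   $this->scope = 'document'; }
    public function beginPage()    { $this->requireScope('beginPage',    array('document')); $this->scope = 'page'; }
    public function setFont()      { $this->requireScope('setFont',      array('page')); }
    public function endPage()      { $this->requireScope('endPage',      array('page'));     $this->scope = 'document'; }
    public function close()        { $this->requireScope('close',        array('document')); $this->scope = 'object'; }
}

$pdf = new ScopeModel();
$pdf->openDocument();
try {
    $pdf->setFont();            // still in document scope: rejected
} catch (LogicException $e) {
    echo $e->getMessage(), "\n";
}
$pdf->beginPage();
$pdf->setFont();                // page scope: accepted
$pdf->endPage();
$pdf->close();
```

In this model, as in the note above, the font call only succeeds once a page has been started.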


luc

I am trying to extract the text from PDF files and use it to feed a search engine (an intranet tool). I tried several of the "PDF2TXT" functions posted below, but they do not produce the expected result. At the very least, all words need to be separated by spaces (so they can be used as keywords), and the "junk" codes (for example binary data and pictures) need to be removed. I started by modifying the interesting function posted by Sven, and here is my current version, which is starting to work quite well (with PDF version 1.2). Sorry for my rather different programming style. Luc
<?php
// Patch for pdf2txt() posted by Sven Schuberth
// Add/replace following code (cannot post full program, size limitation)
// handles the version 1.2
// New version of handleV2($data), only one line changed
function handleV2($data){
    $result_data = "";
    $a_chunks = array();
    $j = 0;

    // grab objects and then grab their contents (chunks)
    $a_obj = getDataArray($data,"obj","endobj");
    if (!is_array($a_obj)) return $result_data;

    foreach($a_obj as $obj){

        $a_filter = getDataArray($obj,"<<",">>");

        if (is_array($a_filter)){
            $j++;
            $a_chunks[$j]["filter"] = $a_filter[0];
            $a_data = getDataArray($obj,"stream\r\n","endstream");
            if (is_array($a_data)){
                $a_chunks[$j]["data"] = substr($a_data[0],
                    strlen("stream\r\n"),
                    strlen($a_data[0])-strlen("stream\r\n")-strlen("endstream"));
            }
        }
    }
    // decode the chunks
    foreach($a_chunks as $chunk){
        // look at each chunk and decide how to decode it - by looking at the contents of the filter
        if (isset($chunk["data"]) && $chunk["data"]!=""){
            // strpos(), not substr(), tells whether the filter names FlateDecode
            if (strpos($chunk["filter"],"FlateDecode")!==false){
                $data = @gzuncompress($chunk["data"]);
                if ($data !== false && trim($data)!=""){
                    // CHANGED HERE, before: $result_data .= ps2txt($data);
                    $result_data .= PS2Text_New($data);
                }
            }
        }
    }
    return $result_data;
}
// New function - Extract text from PS codes
function ExtractPSTextElement($SourceString)
{
    $Result = '';
    $CurStartPos = 0;
    while (($CurStartText = strpos($SourceString, '(', $CurStartPos)) !== FALSE)
    {
        // New text element found
        if ($CurStartText - $CurStartPos > 8) $Spacing = ' ';
        else {
            // The text between two strings is read as a number; a large negative value is treated as a word gap
            $SpacingSize = substr($SourceString, $CurStartPos, $CurStartText - $CurStartPos);
            if (floatval($SpacingSize) < -25) $Spacing = ' '; else $Spacing = '';
        }
        $CurStartText++;
        $StartSearchEnd = $CurStartText;
        while (($CurStartPos = strpos($SourceString, ')', $StartSearchEnd)) !== FALSE)
        {
            // A ")" preceded by a backslash is escaped; keep searching
            if (substr($SourceString, $CurStartPos - 1, 1) != '\\') break;
            $StartSearchEnd = $CurStartPos + 1;
        }
        if ($CurStartPos === FALSE) break; // something wrong happened

        // Remove ending '-'
        if (substr($Result, -1, 1) == '-')
        {
            $Spacing = '';
            $Result = substr($Result, 0, -1);
        }
        // Add to result
        $Result .= $Spacing . substr($SourceString, $CurStartText, $CurStartPos - $CurStartText);
        $CurStartPos++;
    }
    // Add line breaks (otherwise, result is one big line...)
    return $Result . "\n";
}
// Global table for codes replacement
$TCodeReplace = array ('\(' => '(', '\)' => ')');
// New function, replacing old "ps2txt" function
function PS2Text_New($PS_Data)
{
global $TCodeReplace;
// Catch up some codes
if (ord($PS_Data[0]) < 10) return '';
if (substr($PS_Data, 0, 8) == '/CIDInit') return '';
// Some text inside (...) can be found outside the [...] sets, then ignored
// => disable the processing of [...] is the easiest solution
$Result = ExtractPSTextElement($PS_Data);
// echo "Code=$PS_Data\nRES=$Result\n\n";
// Remove/translate some codes
return strtr($Result, $TCodeReplace);
}
?>


chu61 dot tw

How do you find out how many pages a PDF has? I read the PDF specification, version 1.6, and found this:
PDF uses "page tree nodes" to define the ordering of pages in the document. The tree structure allows PDF applications to quickly open a document containing thousands of pages while using only little memory.
If a PDF has 63 pages, the page tree node will look like this:
2 0 obj
<< /Type /Pages
   /Kids [ 4 0 R
              10 0 R
            ]
    /Count 63        <---- the number of pages
>>
endobj
P.S. A PDF may contain more than one page tree node. The right answer is in the root page tree node: a node that has /Count XX together with a /Parent XXX entry is not the root page tree node.
So you must find the node that has /Count XX and no /Parent entry, and you'll get the total number of pages in the PDF.
This works for %PDF-1.0 through %PDF-1.5.
Alex from Taipei, Taiwan
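The rule above (find the /Type /Pages node that has a /Count but no /Parent) could be sketched in PHP as follows. The function name countPdfPages() is hypothetical, and the sketch assumes the objects are stored uncompressed in the file, so it will not see page tree nodes packed into compressed object streams.

```php
<?php
// Hypothetical helper: count pages by locating the root page tree node.
// Returns the page count, or false if no root node is found.
function countPdfPages($pdfData)
{
    // Grab the body of every indirect object ("N G obj ... endobj").
    if (!preg_match_all('/\d+\s+\d+\s+obj(.*?)endobj/s', $pdfData, $matches)) {
        return false;
    }
    foreach ($matches[1] as $body) {
        // A page tree node has /Type /Pages; only the root one lacks /Parent.
        if (preg_match('/\/Type\s*\/Pages\b/', $body)
            && !preg_match('/\/Parent\b/', $body)
            && preg_match('/\/Count\s+(\d+)/', $body, $count)) {
            return (int) $count[1];
        }
    }
    return false;
}

// Usage with a tiny hand-written fragment instead of a real file:
$sample = "2 0 obj\n<< /Type /Pages /Kids [ 4 0 R 10 0 R ] /Count 63 >>\nendobj\n"
        . "4 0 obj\n<< /Type /Pages /Parent 2 0 R /Kids [ 5 0 R ] /Count 30 >>\nendobj\n";
var_dump(countPdfPages($sample));
```

For a real document the data would come from file_get_contents() on the PDF file.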


samcontact

Here is another great tutorial on basic PDF building w/ PHP:
http://hotwired.lycos.com/webmonkey/02/20/index3a.html?tw=programming
=======================
http://myteks.com
Computer Repair & Web Design
=======================


brendandonhue

Here is a function to test whether a file is a PDF without using any external library.
<?php
define('PDF_MAGIC', "\x25\x50\x44\x46\x2D"); // the five bytes "%PDF-"
function is_pdf($filename) {
 // read only as many bytes as the signature is long and compare them
 return file_get_contents($filename, false, null, 0, strlen(PDF_MAGIC)) === PDF_MAGIC;
}
?>
It's not checking if the whole file is valid, just if the correct header is present at the beginning of the file.


jkndrkn

For those of us that do not want to pay for a commercial license to use PDFlib in a closed-source project, there are at least two good alternatives: FPDF and TCPDF
http://www.fpdf.org/
PHP4 and PHP5 support
http://sourceforge.net/projects/pdf-php
PHP5 support only


brain23

For FPDF there is also an add-on (FPDI) available, which lets you import existing PDF documents:
http://www.setasign.de/products/pdf-php-solutions/fpdi/


david

The easiest way to get the text of a PDF is to install xpdf (on Red Hat: yum -y install xpdf)
and then run pdftotext your.pdf, which will generate your.txt.


praokean

domPDF is not such a great PDF creator, because it doesn't support foreign characters.

magnum

domPDF is also a great PDF creation interface. it basically converts your code to CSS and then builds the PDF from that with the absolute positions, and what not...

ragnar

After one whole day spent understanding how pdflib works, I came to the conclusion that it is hard enough to draw using words alone, and furthermore that drawing a single line may take something like four lines of code. So I wrote my own functions to make life easier and the code more understandable to modify and draw with. I also made a function that draws a rectangle with rounded corners, with the option to fill it ;)
You can get it from http://www.deulos.com/pdf_php.php
feel free to make suggestions or whatever u like ;o)


senortz senortz

About creating a PDF document based on the content of another document (let's say a text file):
I tried to pass the name of the file whose content should go into the PDF from a link on the sender page to the PDF-creator page. The problem is that when I referred to the PDF-creator page via the link your_root/create_pdf.php?filename=$your_file_name, the PDF-creator page did not behave well when it had a line like $filename = $_GET["filename"] before creating the PDF document.
I solved this by using a form with a button on the sender page instead of the link: the form has "create_pdf.php" as its action, "post" as its method, and a hidden field containing the "filename" value. It works this way if the PDF-creator page has a line like $filename = $_POST["filename"].
I would like to understand why it works this way and not the other way.
I hope this helps. Here are the pieces of code I used.
Sender page:
print("<form name='to_pdf' action='see_pdf_file.php' method='post'>");
print("<br/><input type='submit' value='PDF'><input type='hidden' name='filename' value='$filename'></form>");
PDF-creator page:
<?php
$filename = $_POST["filename"];
// file_get_contents() opens and closes the file itself; no fopen()/fclose() is needed
$file_content = file_get_contents($filename);
//
$file_content = wordwrap($file_content,72,"|");
$a_row = explode("|",$file_content);
//
$pdf = pdf_new();
pdf_open_file($pdf, "");
pdf_begin_page($pdf, 595, 842);
pdf_set_font($pdf, "Times-Roman", 16, "host");
pdf_add_outline($pdf, "Page 1");
pdf_set_value($pdf, "textrendering", 1);
pdf_show_xy($pdf, 'The content of the file:',50,700);
// print every row; an empty row no longer ends the output early
foreach ($a_row as $row)
{
      pdf_continue_text($pdf,$row);
}
pdf_end_page($pdf);
pdf_close($pdf);
//
$data = pdf_get_buffer($pdf);
//
header("Content-type: application/pdf");
header("Content-disposition: inline; filename=test.pdf");
header("Content-length: " . strlen($data));
//
echo $data;
?>
PDFlib and PHP 4.3.1 were used.
Thanks.
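The script above splits the file content on "|", which breaks as soon as the text itself contains that character. A minimal sketch of an alternative row splitter follows; the function name wrapTextToRows() is hypothetical, and the default width of 72 characters is taken from the code above.

```php
<?php
// Hypothetical helper: split text into rows of at most $width characters,
// honouring line breaks that are already present in the text.
function wrapTextToRows($text, $width = 72)
{
    // Normalise Windows and old Mac line endings first.
    $text = str_replace(array("\r\n", "\r"), "\n", $text);
    // Wrap at word boundaries; the fourth argument also cuts over-long words.
    $wrapped = wordwrap($text, $width, "\n", true);
    return explode("\n", $wrapped);
}

// Usage: each row could be handed to pdf_continue_text() in turn.
foreach (wrapTextToRows("The quick brown fox jumps over the lazy dog", 20) as $row) {
    echo $row, "\n";
}
```

Splitting on the newline that wordwrap() itself inserts means no extra separator character has to be reserved.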


spadmore1980

http://www.fpdf.org/ is also quite good. No library installation is required.
-Shelon Padmore


tatlar

http://www.digitaljunkies.ca/dompdf/index.php
PHP5 class that converts HTML to PDF. From the website:
"At its heart, dompdf is (mostly) CSS2.1 compliant HTML layout and rendering engine written in PHP. It is a style-driven renderer: it will download and read external stylesheets, inline style tags, and the style attributes of individual HTML elements. It also supports most presentational HTML attributes."


michi alt+q marel.at

<?PHP
/* A little helpful function to calculate millimeters to points */
function calcToPt($intMillimeter) {
 $intPoints = ($intMillimeter*72)/25.4;
 $intPoints = round($intPoints);
 return $intPoints;
}
/* For example: Create DIN A4 210x297 mm */
pdf_begin_page( $pdf, calcToPt(210), calcToPt(297)); // 595x842 pt
?>

