phppdflib is a class written in php for dynamically generating PDF files from a web server.
In order to use the library, you will need a working installation of php version 4.0.6 or later. There are no other requirements that we are aware of. The library does not need any special compile options to php, and should work with any web server that php does.
In order to really use the library, you're going to need a good knowledge of php and how a web server works. The library has been written specifically so that you shouldn't need an extensive knowledge of the PDF format, but some basic knowledge is required.
If this is the first time you've used phppdflib, you'll do well by first installing the library and then trying out the files in the examples directory. If these scripts all produce valid PDF files, then you know the installation of the required components is sound. If not, check your webserver configuration and be sure that PHP is working. Check the install documentation for your software and seek out mailing lists and chat rooms that offer to help.
Once you know everything is working, take a look at the scripts
in the examples directory. example.php
and
template.php
are the best to start with, since they
contain a lot of comments that explain what the code is doing. They
also cover almost every method available in the library. Make copies
of these scripts and edit them to see what your changes do.
Once you feel that you understand how these scripts work, have a try at it yourself. Write a few scripts that create simple pages. Use the examples as templates and refer often to this documentation. Everything you need to know should be documented here.
Once you feel comfortable creating your own scripts, review the rest of the example files. There are some advanced tricks in them, but they are not as well documented. Also, the charting interface is still experimental.
The following are some things that you should understand about PDF files in order to effectively create them. This information is important whether you are using phppdflib or any other method to generate PDF files.
PDF files are designed around the print industry's needs. This may explain much of why things are done the way they are. The PDF format was also designed for digital publishing, so much about the way computers work was taken into account when developing it.
Any PDF file's resolution is infinity units per inch. This is limited by practical constraints (computers can only calculate so accurately, and a PDF file where every number was specified to an infinite number of decimal places would be, well infinately large) but the important thing to understand is that the precision to which you enter values does not determine the dots/inch. Your PDF viewer will automatically calculate the maximum resolution available for the output media. So when you display it on the screen, it may be 72dpi, but when you print it may be 600dpi.
Some people are confused by the scale at which PDF files operate. PDF files define their own units - we'll call them "PDF units" for simplicity. 1 inch equals 72 pdf units. This means that (for most of the world) 2.835 pdf units = 1 mm. To reiterate, this in no way defines the resolution of the PDF file.
PDF and HTML couldn't be more different. I've been searching, so if anyone has found a program to convert HTML into PDF, please point me at it.
The PDF format is really unlike anything you've used before (most likely). It isn't even like most word processor formats. When you place text in a PDF file, you place it - there is no automatic wrapping, or "flowing around an image" or anything like that. If you want the text to wrap at a certain point, or flow around another object on the page, you have to tell it to. phppdflib tries to insulate you from some of this with functions that do some of the work for you, and we have more planned. Part of the reason I bring this up is because many people expect phppdflib to be a html -> PDF converter, and it's not. Look at how poorly most browsers print HTML pages. If Microsoft and Netscape (and the Mozilla team) don't have the resources to make it work, why would you think that we do?
Don't take this to mean that I don't think it's possible, it's just extremely complicated
The PDF file is centered around the idea of a "page". Don't confuse this with HTML pages or pages of a word processor document. Moving objects from one page to another in a PDF file is non-trivial. Doing so does not cause the other objects on the page to reorient themselves to make room. The PDF doesn't care if your text doesn't fit on the page, it'll will just draw it off the page into a "limbo" where it can't be seen by a viewer. (notice how our paragraph drawing functions still don't fully insulate you from this problem.)
The order in which you draw things is even important. Remember how PDF is designed around the printing industry. If you place a lot of text on a page, and then color the page green, you end up with nothing but a green page. Much like an offset printer would, the green "paint" covers the text, thus hiding it.
If you're not familiar with object-oriented programming in php and the use of classes, please study that section in the php manual before going any further. It will save your sanity.
It will also help to refer to the files in the examples directory while reading this.
phppdflib is implemented as a php class. Once you have created an object from that class, you use methods of the class to add objects to the PDF and control how those objects will look. The basic process is: Configure settings for an object, then place the object on a page. There is no method to alter the characteristics of an object once it's been placed on a page, nor do we plan to create any.
While there are some functions in version 1 that automatically create pages for you (and more planned for version 2), the burdon of creating new pages when required is mostly on your shoulders. The order in which you create pages controls the order in which they will appear in the resultant document, although there are functions to alter page order after the fact. There is no requirement as to what order objects are placed on a page.
Once all the objects are created, the resultant PDF file is requrested from the class by way of the ->generate() method. What you do with this resultant file is totally up to you, but (in light of the fact that most people will want to send it to a browser) you should probably read the section on headers below to get what you're expecting. (even if you already familiar with headers, mime-types and http, there may be some informatin in this next section you'll like)
If you're not familiar with the HTTP spec, and how it works, the authoritative reference is here.
If you're going to do any programming in php, you should have a good working knowledge of that document. In case you're just playing around, or you need a little introduction, I'll summarize the parts that are important to understanding this section.
When your browser talks to a web server, a lot more goes on than just the web page you see as a result. Your browser tells the web server what make, model, and year it is, as well as a lot more information. The web server responds similarly, with make, model, and year, as well as some additional information about the document that it will be sending. This additional information takes the form of http headers and they're important because the browser uses the information (that you never see) in the headers to determine what to do with the rest of the information the web server sends it.
If you're confused or alarmed by what I just said, I suggest you install a copy of Ethereal and take a look at just how much chatter is really going on behind your back.
The first important header is the one labeled
Content-Type
. This is automatically set to "text/html"
by PHP,
which tells the browser that it can display the information as
a web page. But for the browser to know that it should fire up
your pdf viewer, it has to be set to "application/pdf". See the
files in the example directory for how this is done.
Unfortunately, Internet Explorer can get confused when the Content-Type is set to "application/pdf" but the file does not have a .pdf extension. In fact, there are a large number of oddities relating to Microsoft browsers.
Please see the php documentation on the header() function for some additional information on Internet Explorer problems.
You can control what the browser does with your PDF by using the
Content-Disposition
header to change
the filename. There are two distinct methods, and each causes the
browser to react differently. Using something like
"Content-Disposition: attachment; filename=somename.pdf" causes
the browser to assume that it should ask you what you want to do
with the file. If you decide to open it, the browser should be
able to figure out from the supplied filename that it needs to
fire up your PDF viewer.
Using "Content-Disposition: inline; filename=somename.pdf" causes the
browser to try to display the file. If a proper plug-in for viewing
PDF files has been installed, it should automatically start.
Most browsers seem to default to this behaviour if you don't supply
a Content-Disposition
header. Using
"Content-Disposition: filename=somename.pdf" seems to have the
same effect as specifying "inline". Please note that the use of the
Content-Disposition
header in http is not strictly to
spec, but most browsers seem to make use of it.
Another header that is very important is the "Content-Length" header. Although this header is not required by the HTTP spec, it appears to be required by Internet Explorer. IE tends to lock up while trying to launch the Acrobat plugin if this header is not specified. PHP doesn't figure it out for you, so you need to add it yourself. The following procedure should do the trick:
$temp = $pdf->generate(); header('Content-Length: ' . strlen($temp)); // strlen() is binary safe echo $temp;
We have heard reports that using a "Cache-Control: private" header also works around IE bugs that show up under certain configurations of IE. While I have not witnessed this header doing anything to fix any problems, others have reported that it helps, and I can't say that it breaks anything. The "private" setting tells the browser that it's allowed to cache the file, while telling proxy servers that they may not cache it.
I recommend you always use a Content-Disposition
,
Content-Type
, and Content-Length
header.
Not doing so will result in some visitors being unable to view the
PDF that you are generating at least some of the time. By properly
using these three headers, dynamic PDF files seem to display properly
on all platforms. If you still have problems, add a
Cache-Control
header and let me know if it helps.
See the files in the examples directory for, well, examples.
A telltale sign that you've got this wrong is when the browser
tries to display the PDF as if it were HTML. This pretty much looks
like garbeled junk (unless you're familiar with the inner workings
of PDF). Unfortunately, however, it could also be a problem with
your browser. Try displaying a PDF file from another location to
determine where the problem lie. Our experiments have also caused
lockups if the Content-Length
header is missing or
wrong.
If you do much more with phppdflib than install it and try out the example files, you're most likely going to hit a point where something you're trying to do doesn't work. Because of the nature of the whole thing, this can be a royal pain to fix. Here are some tricks:
With all versions of phppdflib, you should be checking the return
value of all method calls for a false
. This indicates
a detectable error occurred in that method. If you're using version
2, use the error methods to determine the nature of the error. All
your method calls should be wrapped in an if
statement
that checks for a valid return value, if they aren't, this is the
first step in debugging.
Please not that the example scripts are not good examples of this rule ... I'm working on it. Patches are welcome.
If you get foreach
errors originating from within
the library, you probably fed the method in question a single value
when it expected an array. If you're using version 2 and this happens,
please file a bug report as these types of mistakes should be caught
by the error reporting system.
It's likely that ->generate() is causing errors and mixing the error
message with the PDF output. This corrupts the PDF file. Comment out the
header()
lines in your code and view the result in your
browser. If you're getting errors, hunt them down as outlined above.
Once the first line to appear in the browser is "PDF%" followed by
gibberish, you're golden. Uncomment the header()
lines
and view the results of your efforts.
We've seen IE/Acrobat plugin respond that it could not find the file after it had downloaded it. This seems pretty ridiculous to me, but it happens.
We've also seen IE freeze up or display nothing but a blank PDF (even when other viewers displayed properly). In this case, saving the PDF to a file and launching your viewer independently of the browser yields the display you would expect.
In every case we've seen this is a problem with headers. Please read the headers section of this document for solutions.
phpdflib does not alter the display of your images at all. If the quality of your images is poor, you need to examine the procces that was used on the image before it was embedded into the PDF.
If your viewer can't display your images, first make sure you have the most recent version of the viewer program. Many viewers are known to have bugs in JFIF/JPEG display. To our knowledge, the most recent versions of the viewer software fixes these problems in each case.
If a recent viewer program still bonks on your images, try recreating them with different options. We've seen that saving a JPEG/JFIF from Gimp with the "Optimize" box checked creates images that don't display in many viewers. This is not a bug in Gimp, it's just that JPEG/JFIF images have so many options available that it's tough for viewers to properly handle them all.
Read this document and file a bug report based on the instructions in it.