How to convert HTML or XHTML to PDF
Apache FOP is an XSL-FO processor, so it cannot read HTML directly. To convert HTML to PDF, you first have to convert the HTML to XSL-FO; FOP can then render the XSL-FO as PDF. There are several possible approaches:
If the original data is available as XML, the best approach is probably to start from that XML and write a separate XSLT stylesheet that converts it to XSL-FO.
Convert the HTML to XHTML (e.g. using JTidy) and then convert the XHTML to XSL-FO using XSLT (e.g. with the Xalan included in the FOP distribution). For this you will need an XSLT stylesheet that transforms XHTML to XSL-FO.
A stylesheet for XHTML to XSL-FO transformation is available from Antenna House, but it is probably not completely compatible with FOP.
FOP currently doesn't support automatic table layout (table-layout="auto"), so column widths have to be specified explicitly, for example with fo:table-column elements.
Convert the HTML to XSL-FO directly using a specialized tool called html2fo (see the Tools section below). This approach is easy, but it offers little or no control over the design of the PDF output.
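The XHTML-to-XSL-FO step of the second approach can be sketched in Java using only the JDK's built-in XSLT processor (javax.xml.transform). The stylesheet here is an illustrative assumption, not the Antenna House stylesheet: it handles only h1, p and simple tables (tr/td directly inside table, no thead/tbody). The class name XhtmlToFo and the method toFo are made up for this example. Because column widths have to be specified for FOP, the table template emits one fo:table-column per cell of the first row, each with an equal proportional width. The input is assumed to be well-formed XHTML in the XHTML namespace without a DOCTYPE, since a DOCTYPE may cause the parser to try to fetch the DTD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: transforms a small XHTML subset to XSL-FO.
class XhtmlToFo {

    // Minimal XSLT 1.0 stylesheet (an assumption for illustration only).
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:fo='http://www.w3.org/1999/XSL/Format'",
        "    xmlns:h='http://www.w3.org/1999/xhtml'",
        "    exclude-result-prefixes='h'>",
        "  <xsl:template match='/h:html'>",
        "    <fo:root>",
        "      <fo:layout-master-set>",
        "        <fo:simple-page-master master-name='A4'",
        "            page-height='29.7cm' page-width='21cm'>",
        "          <fo:region-body margin='2cm'/>",
        "        </fo:simple-page-master>",
        "      </fo:layout-master-set>",
        "      <fo:page-sequence master-reference='A4'>",
        "        <fo:flow flow-name='xsl-region-body'>",
        "          <xsl:apply-templates select='h:body/*'/>",
        "        </fo:flow>",
        "      </fo:page-sequence>",
        "    </fo:root>",
        "  </xsl:template>",
        "  <xsl:template match='h:h1'>",
        "    <fo:block font-size='18pt' font-weight='bold' space-after='6pt'>",
        "      <xsl:apply-templates/>",
        "    </fo:block>",
        "  </xsl:template>",
        "  <xsl:template match='h:p'>",
        "    <fo:block space-after='6pt'><xsl:apply-templates/></fo:block>",
        "  </xsl:template>",
        "  <xsl:template match='h:table'>",
        "    <fo:table table-layout='fixed' width='100%'>",
        "      <!-- one explicit column per cell of the first row -->",
        "      <xsl:for-each select='h:tr[1]/h:td'>",
        "        <fo:table-column column-width='proportional-column-width(1)'/>",
        "      </xsl:for-each>",
        "      <fo:table-body><xsl:apply-templates select='h:tr'/></fo:table-body>",
        "    </fo:table>",
        "  </xsl:template>",
        "  <xsl:template match='h:tr'>",
        "    <fo:table-row><xsl:apply-templates select='h:td'/></fo:table-row>",
        "  </xsl:template>",
        "  <xsl:template match='h:td'>",
        "    <fo:table-cell><fo:block><xsl:apply-templates/></fo:block></fo:table-cell>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Applies the stylesheet to an XHTML string and returns the XSL-FO as a string.
    static String toFo(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XHTML to XSL-FO transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
            + "<h1>Report</h1><p>Hello FOP.</p>"
            + "<table><tr><td>a</td><td>b</td></tr></table>"
            + "</body></html>";
        System.out.println(toFo(xhtml));
    }
}
```

The printed XSL-FO can be saved to a file and rendered with FOP, for example with fop -fo out.fo -pdf out.pdf. A real stylesheet needs templates for many more elements (lists, links, images, inline markup); any element without a template falls through to XSLT's built-in rules here, which will usually produce invalid XSL-FO.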