This page describes the new proposed design for FOP's intermediate format. The goals can be found on the parent page.
Basic ideas
Try to put the processing load on the layout side as much as possible. On the rendering side you should only have basic painting functions (basically a simple, XML-based graphical metafile).
The format needs to be streamable, i.e. no DOM is necessary; commands can be processed one after another, with only a minimal stack of state information.
Difficulty: this works only for FOP's native feature set (basically text and border painting plus coordinate system handling). The processing of images, fonts and other resources is heavily output-format specific and has to be done in the rendering stage, so the shift of processing focus to the layout side has its limits.
The intermediate format has to become much easier and faster to parse.
Benefit: the renderers may become leaner and therefore easier to maintain and implement.
Difficulty: features needed in the future (mainly: tagged PDF) could become a little more difficult since a graphical metafile will not contain structure information per se. OTOH, the current area tree doesn't have enough structure information, either.
Difficulty: until now, supporting extensions was relatively easy as each area tree object could carry extension attributes. With a graphical metafile, individual area tree objects can no longer be identified directly in the stream.
When going in this direction, one question is obvious: Why not take SVG (or Adobe Mars) as the intermediate format? It would most probably be much slower than an optimized, proprietary format. But what if we restricted ourselves to a subset and didn't use Batik for the rendering? Just a thought.
It is worth noting that the working draft SVG 1.2 specification would provide pagination for the layout with the <page/> and <pageSet/> elements (see http://www.w3.org/TR/2004/WD-SVG12-20041027/multipage.html) [2].
It is important to note that the XML representation of the area tree is still very important. The new IF is no replacement for unit testing the layout engine because the area tree contains much more detailed information about the layout result.
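The streamability requirement listed above can be illustrated with a small sketch. It assumes a plain SAX parser from the JDK and a made-up handler class (StreamingSketch, not part of FOP) that handles each element as it arrives and keeps nothing but a small stack of open box elements. The element and attribute names follow the format sketched on this page; everything else is an assumption for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: processes each element as it arrives and keeps
// only a small stack of open <box> elements as state (no DOM is built).
class StreamingSketch extends DefaultHandler {
    final Deque<String> boxStack = new ArrayDeque<>();
    int pages = 0;
    int textCommands = 0;
    int maxBoxDepth = 0;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        switch (qName) {
            case "page":
                pages++;
                break;
            case "text":
                textCommands++;
                break;
            case "box":
                // Remember the transform of the box that was just opened.
                boxStack.push(String.valueOf(atts.getValue("transform")));
                maxBoxDepth = Math.max(maxBoxDepth, boxStack.size());
                break;
            default:
                break;
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if ("box".equals(qName)) {
            boxStack.pop(); // state of the closed box is no longer needed
        }
    }

    // Parses the given XML in one streaming pass and returns the handler.
    static StreamingSketch process(String xml) {
        StreamingSketch handler = new StreamingSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        String xml = "<document><page><content>"
                + "<box transform='translate(5000, 6000)'>"
                + "<text x='1233' y='803'>Hello</text>"
                + "<box transform='translate(1233, 1200)'/>"
                + "</box></content></page></document>";
        StreamingSketch h = process(xml);
        System.out.println(h.pages + " page(s), " + h.textCommands
                + " text command(s), max box depth " + h.maxBoxDepth);
    }
}
```

The memory needed here grows with the nesting depth of boxes, not with the size of the document, which is the property the basic ideas ask for.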
Sketching out a new XML format
<document xmlns="http://xmlgraphics.apache.org/fop/metafile" xmlns:xlink="http://www.w3.org/1999/xlink">
<header>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title>New Intermediate Format Demo Document</dc:title>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<bookmarks....
[PDF bookmarks]
</bookmarks>
</header>
<page index="1" name="1">
<page-header>
<ps:ps-setup-code>%FOPTestPSSetupCode: General setup code here!</ps:ps-setup-code>
</page-header>
<content>
<box transform="translate(5000, 6000)" width="18000" height="10000">
<font family="Helvetica" style="normal" weight="400" variant="normal" size="12000"
color="black"/>
<text x="1233" y="803" dx="0 0 20 0 0">Hello</text>
<draw-rect x="1233" y="1200" width="20000" height="20000" fill="yellow" stroke="none"/>
<box transform="translate(1233, 1200)" width="20000" height="20000" clip="true">
<image xlink:href="myimage.svg" x="0" y="0" width="20000" height="20000"/>
</box>
</box>
[..]
</content>
</page>
<page...
</document>
box: pushes the graphics state on a stack and applies an optional transformation (SVG-style). This is mainly used for reference areas.
image: in case of an instream-foreign-object, its content would simply be put in the image element's content instead of using xlink:href.
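The box semantics described above (push the graphics state, then apply an optional transformation) could be modelled roughly as follows. This is only a sketch: BoxStateSketch is a hypothetical helper, and the state is reduced to the current transformation matrix, whereas a real state object would also carry font traits and possibly the clip.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of FOP: models the "box" semantics as
// save state / concatenate transform on start, restore state on end.
class BoxStateSketch {
    private final Deque<AffineTransform> saved = new ArrayDeque<>();
    private AffineTransform current = new AffineTransform(); // identity

    void startBox(AffineTransform transform) {
        saved.push(new AffineTransform(current)); // push a copy of the state
        if (transform != null) {                  // the transformation is optional
            current.concatenate(transform);
        }
    }

    void endBox() {
        current = saved.pop(); // restore the state saved by startBox()
    }

    // Maps a point from the current box's coordinates to page coordinates.
    Point2D toPage(double x, double y) {
        return current.transform(new Point2D.Double(x, y), null);
    }

    public static void main(String[] args) {
        BoxStateSketch state = new BoxStateSketch();
        state.startBox(AffineTransform.getTranslateInstance(5000, 6000));
        state.startBox(AffineTransform.getTranslateInstance(1233, 1200));
        System.out.println(state.toPage(0, 0)); // origin of the inner box
        state.endBox();
        System.out.println(state.toPage(0, 0)); // origin of the outer box
        state.endBox();
    }
}
```

With the two nested translations from the sample document, the origin of the inner box ends up at (6233, 7200) in page coordinates.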
Other needed elements
color-profile element
destination (for PDF)
shape painting facility (for optimized, complex border painting, ex. in tables)
infrastructure for tagged PDF [1]: Structure tree (in the document header) plus content containers (in the page content), word-break info (could be modelled in a similar way Adobe Mars does it for their XML-based format)
Accessibility elements: language info and possibly other stuff
Clearly defined extension points (places where foreign namespaces can be used, at least: document header & page header)
maybe element-level metadata support
Old design (for reference)
New design
IFPainter design
IFPainter (working title, better suggestions welcome!) is a central interface, like Renderer. There's one implementation for each output format that is useful in the context of the intermediate format (probably all current renderers except text; the most important are PostScript, AFP and PCL). Ideally, the IFPainter interface is a direct equivalent to the possible SAX stream for the new IF format, i.e. it is possible to convert between IFPainter and the IF-NG SAX stream with no losses. The IFContentHandler in the graphic above would convert the SAX stream to IFPainter calls and a special IFPainter implementation used by IFRenderer could convert the calls to the SAX stream. That way, the IFRenderer could actually render to an IFPainter without the detour via XML.
The IFPainter interface is not fully designed yet, so the following is just to give an idea of what it could look like (all methods will probably throw SAXException):
public interface IFPainter {
void setUserAgent(FOUserAgent userAgent);
void setResult(Result result);
boolean supportsPagesOutOfOrder();
void startDocument();
void endDocument();
void startDocumentHeader();
void endDocumentHeader();
void startPageSequence(String id);
void endPageSequence();
void startPage(int index, String name, Dimension size);
void endPage();
void startPageHeader();
void endPageHeader();
void startPageContent();
void endPageContent();
void startPageTrailer();
void addTarget(String name, int x, int y);
void endPageTrailer();
void startBox(AffineTransform transform, Dimension size, boolean clip);
void startBox(AffineTransform[] transforms, Dimension size, boolean clip);
//For transform parsing, Batik's org.apache.batik.parser.TransformListHandler/Parser can be reused
void endBox();
void setFont(String family, String style, Integer weight, String variant, Integer size, String color);
//All of setFont()'s parameters can be null if no state change is necessary
void drawText(int x, int y, int[] dx, int[] dy, String text);
void drawRect(Rectangle rect, Paint fill, Color stroke);
void drawImage(String uri, Rectangle rect); //external images
void startImage(Rectangle rect); //followed by a SAX stream (SVG etc.)
void endImage();
void handleExtensionObject();
//etc. etc.
}
public class IFState {
//all font traits
//list of transforms since the last state save (by startBox())
//maybe the effective clip shape
}
//additional needed classes
public class IFSerializer implements IFPainter {
public IFSerializer(ContentHandler handler) {
[..]
}
//convert IFPainter calls to XML (IF-NG)
}
public class IFContentHandler implements ContentHandler {
public IFContentHandler(IFPainter painter) {
[..]
}
//convert SAX stream calls to IFPainter calls
}
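To make the conversion from a SAX stream to painter calls more concrete, here is a minimal sketch. It does not use the IFPainter interface itself (which is not final) but a hypothetical, heavily reduced stand-in called MiniPainter with only three methods; MiniContentHandler and RecordingPainter are likewise invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, heavily reduced stand-in for the IFPainter interface.
interface MiniPainter {
    void startBox(String transform);
    void endBox();
    void drawText(int x, int y, String text);
}

// Converts SAX events to painter calls, in the spirit of IFContentHandler.
class MiniContentHandler extends DefaultHandler {
    private final MiniPainter painter;
    private final StringBuilder text = new StringBuilder();
    private int textX;
    private int textY;

    MiniContentHandler(MiniPainter painter) {
        this.painter = painter;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if ("box".equals(qName)) {
            painter.startBox(atts.getValue("transform")); // one lookup per parameter
        } else if ("text".equals(qName)) {
            textX = Integer.parseInt(atts.getValue("x"));
            textY = Integer.parseInt(atts.getValue("y"));
            text.setLength(0); // start collecting the text content
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // content may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if ("box".equals(qName)) {
            painter.endBox();
        } else if ("text".equals(qName)) {
            painter.drawText(textX, textY, text.toString());
        }
    }
}

// Painter that records the calls it receives, standing in for a real output format.
class RecordingPainter implements MiniPainter {
    final List<String> calls = new ArrayList<>();

    public void startBox(String transform) { calls.add("startBox " + transform); }
    public void endBox() { calls.add("endBox"); }
    public void drawText(int x, int y, String text) {
        calls.add("drawText " + x + " " + y + " " + text);
    }

    // Parses the XML and returns the painter calls it was converted to.
    static List<String> replay(String xml) {
        RecordingPainter painter = new RecordingPainter();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(xml)), new MiniContentHandler(painter));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
        return painter.calls;
    }

    public static void main(String[] args) {
        System.out.println(replay("<content><box transform='translate(5000, 6000)'>"
                + "<text x='1233' y='803'>Hello</text></box></content>"));
    }
}
```

A serializer would be the mirror image: a MiniPainter implementation that emits one SAX element per call, which is what makes a lossless round trip plausible.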
Note that the IFPainter should be designed so that it is easy to write some kind of filter (like FilterOutputStream) where implementors can react to certain events like startPageContent() and add their own content calls (content enrichment) for things like barcodes, OMR marks, background images etc.
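The filter idea might look like the following sketch. Again, PagePainter is a hypothetical three-method stand-in for IFPainter, and FilterPagePainter, PageMarkFilter and LoggingPainter are made-up names; the marker text is just a placeholder for something like a barcode or an OMR mark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily reduced stand-in for the IFPainter interface.
interface PagePainter {
    void startPageContent();
    void drawText(int x, int y, String text);
    void endPageContent();
}

// Pass-through base class, analogous to java.io.FilterOutputStream.
class FilterPagePainter implements PagePainter {
    protected final PagePainter delegate;

    FilterPagePainter(PagePainter delegate) {
        this.delegate = delegate;
    }

    public void startPageContent() { delegate.startPageContent(); }
    public void drawText(int x, int y, String text) { delegate.drawText(x, y, text); }
    public void endPageContent() { delegate.endPageContent(); }
}

// Example enrichment: adds a marker text at the start of every page's content.
class PageMarkFilter extends FilterPagePainter {
    private int page = 0;

    PageMarkFilter(PagePainter delegate) {
        super(delegate);
    }

    @Override
    public void startPageContent() {
        super.startPageContent();
        page++;
        delegate.drawText(0, 0, "MARK-" + page); // the added content call
    }
}

// Painter that logs the calls it receives, standing in for a real output format.
class LoggingPainter implements PagePainter {
    final List<String> log = new ArrayList<>();

    public void startPageContent() { log.add("startPageContent"); }
    public void drawText(int x, int y, String text) {
        log.add("drawText " + x + " " + y + " " + text);
    }
    public void endPageContent() { log.add("endPageContent"); }

    public static void main(String[] args) {
        LoggingPainter target = new LoggingPainter();
        PagePainter painter = new PageMarkFilter(target);
        painter.startPageContent();
        painter.drawText(1233, 803, "Hello");
        painter.endPageContent();
        System.out.println(target.log);
    }
}
```

Only the events a filter cares about need to be overridden; all other calls pass through unchanged, which keeps such filters small.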
Performance evaluation compared to previous approach
Performance is expected to be higher for the following reasons:
We have fewer content elements, which makes for a leaner implementation and a reduction to essential content. There are no potentially empty container structures like we have now.
Currently, the IF is implemented over our area tree which contains a generic structure for traits. Maps present a certain amount of overhead by themselves which this approach avoids. Furthermore, the AreaTreeParser has to inspect various trait sets per area tree object which causes too many Map operations.
The conversion from ContentHandler to IFPainter only requires one lookup (see AttributesImpl.getValue()) per parameter. The number of parameters is very small. Some parameters are complex Strings which need to be parsed themselves (for example startBox()'s transform parameter). The overhead here should be relatively low since these are not extremely frequent operations. setFont()'s parameters could be changed from String to an enumeration class which would cause a Map lookup, but again setFont() is not a very frequent operation and not every parameter will be set and therefore parsed on every call.
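As an illustration of the string parsing mentioned for startBox()'s transform parameter, here is a deliberately small sketch. It is a hand-rolled stand-in, not the Batik parser mentioned earlier, and it assumes that only SVG-style translate() and scale() functions occur; TransformSketch is a hypothetical class name.

```java
import java.awt.geom.AffineTransform;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses a small subset of SVG-style transform lists.
// Only translate() and scale() are recognized; anything else is ignored.
class TransformSketch {
    private static final Pattern FUNCTION =
            Pattern.compile("(translate|scale)\\s*\\(\\s*([^)]*?)\\s*\\)");

    static AffineTransform parse(String value) {
        AffineTransform result = new AffineTransform(); // identity
        if (value == null) {
            return result; // attribute not set, nothing to parse
        }
        Matcher m = FUNCTION.matcher(value);
        while (m.find()) {
            // Arguments may be separated by commas and/or whitespace.
            String[] args = m.group(2).split("[\\s,]+");
            double a = Double.parseDouble(args[0]);
            if ("translate".equals(m.group(1))) {
                // In SVG, a missing second argument means ty = 0.
                double b = args.length > 1 ? Double.parseDouble(args[1]) : 0;
                result.translate(a, b);
            } else {
                // In SVG, a missing second argument means sy = sx.
                double b = args.length > 1 ? Double.parseDouble(args[1]) : a;
                result.scale(a, b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("translate(5000, 6000)"));
        System.out.println(parse("translate(10,20) scale(2)"));
    }
}
```

The functions are concatenated from left to right, as in SVG. Since startBox() is not called very often, even a simple parser like this should not dominate the processing time.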
Resource Management (Idea)
Some formats like PostScript and AFP require special processing to optimize resources (images, fonts etc.). The PostScript renderer currently supports an optional two-pass approach where the resources are only added in the second pass to the beginning of the PostScript file, i.e. after you know which resources are needed. The idea now is to enrich the IF renderer with a mechanism to track used resources so a second pass can be avoided when producing the final output format. After all, the IF renderer already processes the full document and knows which resources are necessary.
We can make three categories of renderers:
no resource optimization (Java2D-based output formats, PCL, Text, SVG)
implicit resource optimization (PDF due to its object structure, Mars)
explicit resource optimization (PostScript, AFP)
Please note that this mechanism is only useful to the third category, so it will not be enabled unless this is done explicitly. For formats like PDF this processing is not necessary since resources are added as they are needed.
We define a listener interface that receives notification of resource usage (ResourceUsageListener). The RenderingResource interface will be implemented by a handful of classes (at the beginning: FontResource & ImageResource). The object identity of these classes is defined by what the IF supports:
Font: font-family, -weight, -style, -variant, -size
Image: URI
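The object identity rules above could be expressed through equals() and hashCode(), roughly as in the following sketch. RenderingResource, FontResource and ImageResource are the names used on this page; the field types are assumptions (in particular, the font size is assumed to be an int in millipoints), and the marker interface is left empty because its methods have not been defined yet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Marker interface; its methods are not defined on this page, so it is left empty.
interface RenderingResource {
}

// Identity: font-family, -weight, -style, -variant, -size.
final class FontResource implements RenderingResource {
    final String family;
    final String weight;
    final String style;
    final String variant;
    final int size; // assumption: size in millipoints

    FontResource(String family, String weight, String style, String variant, int size) {
        this.family = family;
        this.weight = weight;
        this.style = style;
        this.variant = variant;
        this.size = size;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FontResource)) {
            return false;
        }
        FontResource f = (FontResource) o;
        return size == f.size
                && Objects.equals(family, f.family)
                && Objects.equals(weight, f.weight)
                && Objects.equals(style, f.style)
                && Objects.equals(variant, f.variant);
    }

    @Override
    public int hashCode() {
        return Objects.hash(family, weight, style, variant, size);
    }
}

// Identity: the image URI only.
final class ImageResource implements RenderingResource {
    final String uri;

    ImageResource(String uri) {
        this.uri = uri;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ImageResource && Objects.equals(uri, ((ImageResource) o).uri);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(uri);
    }

    public static void main(String[] args) {
        Set<RenderingResource> used = new HashSet<>();
        used.add(new FontResource("Arial", "normal", "normal", "normal", 10000));
        used.add(new FontResource("Arial", "normal", "normal", "normal", 10000)); // same identity
        used.add(new ImageResource("http://xmlgraphics.apache.org/fop/images/logo.jpg"));
        System.out.println(used.size() + " distinct resources");
    }
}
```

Defining identity this way lets the tracking infrastructure use ordinary sets and maps to detect repeated use of the same resource.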
We need some infrastructure to keep track of resource usage on page-, page-sequence- and document-level. The most important is page-level. The other levels just summarize the accumulated data. Keeping track of resources down to page-level serves the following purposes:
Formats like PostScript note which resources are used by page.
The IF can be split and merged. Resource usage on document-level alone would not be sufficient to list which resources are effectively used in the final print file.
The resource usage information can optionally be integrated into the IF. For this purpose the IF is extended by a structure that is inserted into the page trailer and the document trailer. If desired the IF renderer could also support writing a separate file parallel to the generated IF, if the information needs to be tracked somewhere (could be implemented as a special resource listener).
Subformat:
<resource-usage>
  <font family="Arial" weight="normal" style="normal" variant="normal" size="10pt" count="1"/>
  <image uri="http://xmlgraphics.apache.org/fop/images/logo.jpg" count="1"/>
</resource-usage>
On a side note, resource counting makes it possible to move only those resources to the document resources which are needed on more than one page. Resource consumption on the target device can become too high if every resource is moved to the document resources unconditionally.
This whole idea adds some complexity but will make it possible to avoid a two-pass approach for PS and AFP generation, which reduces throughput.
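The page-level tracking and the counting described in this section could be sketched as follows. ResourceUsageListener is the name used above, but its method signature here is an assumption, as is the ResourceUsageTracker class; resources are identified by plain strings to keep the sketch short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical listener signature; this page names the interface but not its methods.
interface ResourceUsageListener {
    void resourceUsed(String resourceKey);
}

// Tracks usage per page and accumulates a document-level summary.
class ResourceUsageTracker implements ResourceUsageListener {
    private Map<String, Integer> currentPage = new LinkedHashMap<>();
    private final Map<String, Integer> documentCounts = new LinkedHashMap<>();
    private final Map<String, Integer> pagesUsing = new LinkedHashMap<>();

    void startPage() {
        currentPage = new LinkedHashMap<>();
    }

    @Override
    public void resourceUsed(String resourceKey) {
        currentPage.merge(resourceKey, 1, Integer::sum);
    }

    // Returns the page-level usage (what a page trailer would list) and
    // folds it into the document-level summary.
    Map<String, Integer> endPage() {
        for (Map.Entry<String, Integer> e : currentPage.entrySet()) {
            documentCounts.merge(e.getKey(), e.getValue(), Integer::sum);
            pagesUsing.merge(e.getKey(), 1, Integer::sum);
        }
        return currentPage;
    }

    // Total number of uses per resource over all finished pages.
    Map<String, Integer> documentUsage() {
        return documentCounts;
    }

    // Resources needed on more than one page: candidates for document-level resources.
    Set<String> documentLevelCandidates() {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> e : pagesUsing.entrySet()) {
            if (e.getValue() > 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ResourceUsageTracker tracker = new ResourceUsageTracker();
        tracker.startPage();
        tracker.resourceUsed("font:Arial");
        tracker.resourceUsed("image:logo.jpg");
        System.out.println("page 1: " + tracker.endPage());
        tracker.startPage();
        tracker.resourceUsed("font:Arial");
        System.out.println("page 2: " + tracker.endPage());
        System.out.println("document-level candidates: " + tracker.documentLevelCandidates());
    }
}
```

A page-sequence level could be added in the same way as the document level, by summarizing the page-level maps between the start and end of each page-sequence.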
TODO
[done] Check usage of PPML in this context. (Result: Wouldn't really help in this context but the resemblance of the basic structure to the new IF is remarkable.)
[OPEN] Don't forget different writing modes
[OPEN] Decide whether to split IFPainter into IFDocumentHandler and IFPainter to reduce the number of methods per interface.
Comments
[1] JM: I wonder how far we'll need to go for tagged PDF. Do people only need the basic structure of the document (separating headers from flow content, maybe indicating block roles (para, title, footnote, ...) through extension attributes on fo:block)? Or does someone need the whole tagged PDF feature set, which basically allows many semantics of the original FO document to be carried over into the PDF (spaces, indents, baseline shifts, alignment, etc.)?
[2] AC: Taking a SVG 1.2 subset (Tiny?) format, how slow would Batik be as a final renderer for FOP? JM: as already indicated in the first place: quite slow. The process: build-up of DOM tree, build-up of GVT tree, rendering GVT tree to Graphics2D, conversion of Graphics2D calls to final format. Too many steps in between. I suspect it would even be slower than today's solution. What we need is something that can be streamed and processed on the fly without building up too many intermediate structures in memory. BTW, Tiny is already too powerful for what we need. And I really don't intend to write a second Batik. If we decide to use an SVG subset, it would be an "SVG Nano".
I'll try to formulate a minimal format first, trying to stay as close to SVG as possible. From there, we can check if this can be fully mapped to SVG elements. The risk is not needing all of SVG's features but needing many extensions, which could again make the parsing too slow because of the growing complexity.

