This page was copied from http://xml.apache.org/fop/dev/fonts.html. Once we're done here the content should be put back to XML and published on the website.


Authors:


Glossary

output handler: A set of classes making up an implementation of an output format (i.e. not just the renderer, but for example the PDF Renderer plus the PDF library)

rendering run: One instance of the process of converting an XSL-FO document to a target format like PDF (or multiple target formats at once)

rendering instance: One instance of the process of converting an XSL-FO document to exactly one target format. Note that there may be multiple rendering instances that are part of one rendering run.

typeface: A set of bitmap or outline information that defines glyphs for a set of characters. For example, Arial Bold might be a typeface. This concept is often also called a "font", which we have defined somewhat differently (see below).

typeface family: A group of related typefaces having similar characteristics. For example, one typeface family might include the following typefaces: Arial, Arial Bold, Arial Italic, and Arial Bold Italic. A typeface family may be named in a way ambiguous with its members -- for example, the family mentioned in the previous sentence might also be named "Arial".

font: A typeface rendered at a specific size. For example -- Arial, Bold, 12pt.

Goals

Issues

Design

Concern areas

There are several concern areas within FOP:

Thoughts

Central font registry

Until now each renderer set up a FontInfo object containing all available fonts. Some renderers then used the font setup from other renderers (for example, PostScript used the one from PDF). This suggests that there are simply several font sources, of which each output handler supports a subset.

So the font management should be extracted from the renderers and made standalone, hence the idea of a central font registry. The font registry would manage various font sources (AWT fonts, Type 1 fonts, TrueType fonts). An additional layer is then needed between the font registry and the layout engine. I call it the font selector (for now), because it determines the subset of fonts available to the layout engine based on the set of output handlers (one or more) to be used in a rendering run. It could also do font substitution as the layout engine looks up fonts and selects the most appropriate font. If a font is available from more than one font source, the font selector should select the one favoured by the output handler(s). This means that we need some way for the output handlers to express a priority on font sources.

The font selector will also have to tell the output formats which fonts were used during layout.
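The selection step described above could be sketched roughly as follows. This is only an illustration under assumptions: `FontSource`, `OutputHandler` and `FontSelectorSketch` are hypothetical names, not existing FOP classes, and expressing a handler's priority as an ordered list of source names is just one possible way to do it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical: one origin of fonts (AWT, Type 1, TrueType, ...). */
final class FontSource {
    final String name;
    final Set<String> typefaces;
    FontSource(String name, Set<String> typefaces) {
        this.name = name;
        this.typefaces = typefaces;
    }
}

/** Hypothetical: a handler lists the sources it supports, most preferred first. */
final class OutputHandler {
    final String name;
    final List<String> preferredSources;
    OutputHandler(String name, List<String> preferredSources) {
        this.name = name;
        this.preferredSources = preferredSources;
    }
}

/** Hypothetical selector sitting between the registry and the layout engine. */
final class FontSelectorSketch {
    private final List<FontSource> registry;
    private final List<OutputHandler> handlers;
    private final Set<String> used = new LinkedHashSet<>();

    FontSelectorSketch(List<FontSource> registry, List<OutputHandler> handlers) {
        this.registry = registry;
        this.handlers = handlers;
    }

    /** Returns the source supported by all handlers with the best summed rank. */
    Optional<FontSource> select(String typeface) {
        FontSource best = null;
        int bestScore = Integer.MAX_VALUE;
        for (FontSource source : registry) {
            if (!source.typefaces.contains(typeface)) {
                continue;
            }
            int score = 0;
            boolean supportedByAll = true;
            for (OutputHandler handler : handlers) {
                int rank = handler.preferredSources.indexOf(source.name);
                if (rank < 0) {
                    supportedByAll = false;
                    break;
                }
                score += rank; // lower rank means higher priority
            }
            if (supportedByAll && score < bestScore) {
                best = source;
                bestScore = score;
            }
        }
        if (best != null) {
            used.add(typeface + "@" + best.name); // remembered for the output formats
        }
        return Optional.ofNullable(best);
    }

    /** Fonts actually handed to the layout engine, in first-use order. */
    List<String> usedFonts() {
        return new ArrayList<>(used);
    }
}
```

With a PDF handler that prefers `type1` over `awt`, a typeface present in both sources would resolve to the Type 1 source, and `usedFonts()` would later report that choice to the output formats.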

Common Resources

There are some possible efficiencies to be gained by using one FOP session to generate multiple documents, and by generating multiple output formats for one document. However, to accomplish this, we probably need to explicitly distinguish which resources are available to / used by a Session, a Document (rendering run), and a Rendering Instance.
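One way to picture the three scopes is the sketch below. The class names `SessionSketch`, `DocumentSketch` and `RenderingInstanceSketch` are hypothetical (the concepts may already be covered by existing classes), and the choice of which font data lives at which level is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical: lives for the whole FOP session and is shared by all documents. */
final class SessionSketch {
    final Set<String> loadedTypefaces = new LinkedHashSet<>();

    DocumentSketch newDocument() {
        return new DocumentSketch(this);
    }
}

/** Hypothetical: one rendering run, with one rendering instance per target format. */
final class DocumentSketch {
    final SessionSketch session;
    final List<RenderingInstanceSketch> instances = new ArrayList<>();

    DocumentSketch(SessionSketch session) {
        this.session = session;
    }

    RenderingInstanceSketch newInstance(String format) {
        RenderingInstanceSketch instance = new RenderingInstanceSketch(this, format);
        instances.add(instance);
        return instance;
    }
}

/** Hypothetical: exactly one target format; tracks what it must embed. */
final class RenderingInstanceSketch {
    final DocumentSketch document;
    final String format;
    final Set<String> typefacesToEmbed = new LinkedHashSet<>();

    RenderingInstanceSketch(DocumentSketch document, String format) {
        this.document = document;
        this.format = format;
    }

    void useTypeface(String name) {
        document.session.loadedTypefaces.add(name); // shared, loaded once per session
        typefacesToEmbed.add(name);                 // private to this output format
    }
}
```

The point of the split is that expensive work (loading a typeface) happens once per session, while bookkeeping such as the embedding list stays with the rendering instance that needs it.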

Proposed Interface for Fonts (wvm)

We wish to design a unified font interface that will be available to all other sections of FOP. In other words, the interface will provide all needed information about the font to other sections of FOP, while hiding all details about what type of font it is, etc.

{{{
/**
 * Hidden from FOP-general.
 */
package class TypeFaceFamily

/**
 * Hidden from FOP-general.
 */
package interface TypeFace
    package getEmbeddingStream() {}

public class Font
    private TypeFace typeface;

    /** some accessor methods **/
    String getTypeFace();

    /**
     * The following methods are either computed by the implementations of the TypeFace
     * interface, or are computed based on information returned from the TypeFace interface
     */
}}}

Jeremias and I (wvm) have gone around a bit about whether the Font should be a class or an interface. The place where the interface is needed is at the TypeFace level, which is where the differences between AWT, TrueType custom, PostScript custom, etc. exist. The rest of FOP needs only the following:

So while the interface for TypeFace is good (to handle the many variations), any attempt to expose it to FOP-general makes FOP-general's interaction with fonts more complex than it needs to be.
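The split argued for here, a concrete `Font` class in front of a hidden `TypeFace` interface, might look like the following sketch. Only `Font`, `TypeFace` and `getTypeFace()` come from the proposal above; `getName()`, `getAscender()`, `getSize()`, the `StubTypeFace` class and the 1/1000-em unit are assumptions made for the example.

```java
/** Package-private, so hidden from FOP-general; the methods are assumed. */
interface TypeFace {
    String getName();

    /** Ascender in 1/1000 of the font size (assumed unit). */
    int getAscender();
}

/** Hypothetical stand-in for an AWT, TrueType or Type 1 backed implementation. */
final class StubTypeFace implements TypeFace {
    private final String name;
    private final int ascender;

    StubTypeFace(String name, int ascender) {
        this.name = name;
        this.ascender = ascender;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public int getAscender() {
        return ascender;
    }
}

/** The only font type the rest of FOP would need to see. */
final class Font {
    private final TypeFace typeface;
    private final double sizeInPoints;

    Font(TypeFace typeface, double sizeInPoints) {
        this.typeface = typeface;
        this.sizeInPoints = sizeInPoints;
    }

    /** Accessor: callers get a name, never the TypeFace object itself. */
    String getTypeFace() {
        return typeface.getName();
    }

    double getSize() {
        return sizeInPoints;
    }

    /** Computed from TypeFace data: ascender scaled to points. */
    double getAscender() {
        return typeface.getAscender() * sizeInPoints / 1000.0;
    }
}
```

Callers work only with `Font` and never learn which kind of `TypeFace` sits behind it, which is the simplification the paragraph above asks for.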

Hardware fonts

Definition: "hardware" fonts are fonts implicitly available on a target platform (printer or document format) without the need to do anything to make them available. Examples: PDF and PostScript specifications both define a Base 14 font set which consists of the Helvetica, Times and Courier families (four styles each) plus Symbol and ZapfDingbats. PCL printers also provide a common set of fonts available without uploading anything.

The Base 14 fonts are already implemented in FOP. We have their font metrics as XML files that will be converted to Java classes through XSLT. The same must be done for all other "hardware" fonts supported by the different output handlers. The layout engine simply needs some font metrics to do the layout.
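To illustrate why metrics alone are enough for layout, here is a minimal sketch of a metrics table and a text-width calculation. The class name `HelveticaMetricsSketch` is hypothetical, it is not the class FOP generates from the XML files, and the width values, the fallback width and the 1/1000-em unit should be read as illustrative assumptions.

```java
import java.util.Map;

/** Hypothetical hand-written stand-in for a generated metrics class. */
final class HelveticaMetricsSketch {
    /** Character widths in 1/1000 of the font size (illustrative values). */
    private static final Map<Character, Integer> WIDTHS = Map.of(
            ' ', 278,
            'H', 722,
            'e', 556,
            'l', 222,
            'o', 556);

    /** Assumed fallback for characters missing from the small table above. */
    private static final int DEFAULT_WIDTH = 556;

    private HelveticaMetricsSketch() {
    }

    static int widthOf(char c) {
        return WIDTHS.getOrDefault(c, DEFAULT_WIDTH);
    }

    /** Width of a run of text in points at the given font size. */
    static double textWidth(String text, double fontSizePt) {
        int units = 0;
        for (int i = 0; i < text.length(); i++) {
            units += widthOf(text.charAt(i));
        }
        return units * fontSizePt / 1000.0;
    }
}
```

With these values, `textWidth("Hello", 12.0)` sums 722 + 556 + 222 + 222 + 556 = 2278 units, which is 27.336pt. No glyph outlines are involved, so the same approach would work for any "hardware" font whose metrics are known.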

OpenType Fonts

OpenType font files come in three varieties: 1) TrueType font files, 2) TrueType collections, and 3) wrappers around CFF (PostScript) fonts. The first two varieties can be treated just like normal TTF and TTC files. The font metric information for all three is stored in identical (i.e. TrueType) tables. The CFF files require a small amount of additional logic to unwrap the contents and find the appropriate tables.
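The three varieties can be told apart from the first four bytes of the file. The sketch below shows one way to do that; `FontFileKind` and `FontFileSniffer` are hypothetical names, and a real parser would go on to read the table directory rather than stop at the tag.

```java
import java.nio.ByteBuffer;

/** Hypothetical classification of a font file by its leading tag. */
enum FontFileKind { TRUETYPE, TRUETYPE_COLLECTION, OPENTYPE_CFF, UNKNOWN }

final class FontFileSniffer {
    private FontFileSniffer() {
    }

    /** Classifies a font file from its first four bytes. */
    static FontFileKind detect(byte[] header) {
        if (header == null || header.length < 4) {
            return FontFileKind.UNKNOWN;
        }
        // ByteBuffer reads big-endian by default, matching the font file layout.
        int tag = ByteBuffer.wrap(header, 0, 4).getInt();
        switch (tag) {
            case 0x00010000: // sfnt version 1.0: TrueType outlines
            case 0x74727565: // 'true': TrueType as written by some Apple tools
                return FontFileKind.TRUETYPE;
            case 0x74746366: // 'ttcf': TrueType collection
                return FontFileKind.TRUETYPE_COLLECTION;
            case 0x4F54544F: // 'OTTO': CFF (PostScript) outlines
                return FontFileKind.OPENTYPE_CFF;
            default:
                return FontFileKind.UNKNOWN;
        }
    }
}
```

Only the `OPENTYPE_CFF` case would need the extra unwrapping logic mentioned above; the other two could be passed to the existing TTF and TTC handling.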

How to define a "font" in various contexts

A "font" can have various definitions in different contexts:


[1] It is important, because these values are used to create the font descriptor in PDF. If these values are wrong you get error messages from Acrobat Reader. (jm) See http://xml.apache.org/fop/fonts.html where this issue is discussed in a note. If I understand the note correctly, we need to read in the pfb file to get this information. (wvm)

[2] Actually OpenType is the unified next generation font format that also replaces Type 1 Fonts. An OpenType font contains more information than either a Type 1 or TrueType font contains. It can wrap either of the two methods for describing the font outline information. (wvm)

[3] Please define what is meant by this term. Are we talking about which font to use when we can't find the one that is requested? (wvm)

[4] I am not sure about RTF, but I think that MIF will need some access to this information. (wvm) <wvm date="20030718"> OK, I think I was wrong about this. The StructureRenderers should be able to get everything they need about fonts directly from the XSL-FO input. If they need to aggregate similar fonts, or track which ones have been used, they should do that themselves.</wvm>

[5] My thought is that this should never happen. If the font registry is centralized, then when "XYZ, bold, 12pt" is requested, the same font should be selected every time. (wvm)

[6] I envision this information to be stored in the appropriate objects -- Session, Document, or RenderingInstance (these are concepts, not class names, because classes filling these concepts may already exist). Session should either be static or a singleton, and includes a list of all fonts (actually probably typefaces) used in this session. Document may not need to list anything, but RenderingInstance needs to know which fonts need to be embedded, among other things. (wvm)

[7] I think I am against allowing this. We would first need to resolve how to register hardware fonts and get their metric information, which seems almost impossible. Then we would have to build a mechanism that maps font sources to output media -- PDF can use software fonts, but not hardware. PCL can use hardware, and depending on the printer, perhaps can use downloadable software fonts as well. This seems like an ugly, slippery slope, at least for a 1.0 release. I think it better to say that we support only soft fonts, and let the user build a workaround. The other really ugly aspect of this is that if you allow two different fonts to be used for two different rendering contexts, I think you have to have two different area trees to handle the layout differences. (wvm)

[8] See http://www.w3.org/Fonts/Panose/pan2.html. (wvm)

[9] Actually, multiple masters are used to generate specific instances of .pfm and .pfb files, so these live in separate files. We do have an issue with .cff (Compact Font Format, which can contain multiple Type 1 faces), and .ttc (TrueType Collection, which contains multiple TrueType faces). Until we have parsing tools, these are really unusable to us. OpenType fonts have native support for multiple typefaces within a font file, and I think they support both of these formats. (wvm)

[10] With regard to the design aspect, when a font object is requested by the layout classes, the same object should always be returned for the same basic information that is passed. (wvm)

[11] A fixed-size font can be thought of as merely an instance at a specific point size of a typeface. So, for text, we probably need to take the first point size passed to us and use that throughout, spitting out an error if other point sizes are subsequently used. For the others, I suppose we have to first resolve the issues of whether/how to support hardware fonts. (wvm)

[12] I don't think this is really a strong argument unless we refrain from using Batik for SVGs too. (pij) Response from wvm: <wvm> I don't understand this comment. The point is that we use our own registry for fonts because it is the only way to get font information in a headless environment. </wvm>

FOPFontSubsystemDesign (last edited 2009-09-20 23:52:34 by localhost)