Differences between revisions 5 and 6
Revision 5 as of 2009-09-20 23:52:12
Size: 5522
Editor: localhost
Comment: converted to 1.6 markup
Revision 6 as of 2009-12-14 13:25:11
Size: 5771
Comment: new things learned about AFP fonts

Notes on AFP output

Please make sure you've read the tips on AFPResources.

Interoperability

As noted in AFPResources, great care has to be taken so files generated by FOP are printable on a wide range of output platforms.

Problematic features include:

  • IOCA function sets other than FS10
  • Include Object (IOB) (use page segments instead!)
  • Begin/End Image Object (BIM/EIM): native image formats are only supported on newer platforms (see "native" configuration setting in AFP renderer configuration)
  • Interchange Sets: it's important to stay at a low level. Respect maximum field sizes.

GOCA support

GOCA's vector graphics support is better than that of, for example, HP/GL2 (which can be used for PCL output), but it is still quite restricted compared to the capabilities of PDF or PostScript. Simple vector images can easily be generated using GOCA, but more complex images will have to be rasterized to achieve decent quality. The fox:conversion-mode="bitmap" extension attribute can be used for this.

Here's an incomplete list of restrictions which can cause suboptimal representation of vector graphics:

  • arbitrary transformations (like skewing)
  • line caps are undefined (one viewer painted rounded caps while another used butt caps)
  • no miter limits or line join settings
  • dash styles are restricted
  • limited clipping support (only the outer image edges, and not yet implemented)
  • general accuracy problems due to reduced resolution with certain painting commands.
  • line widths are ambiguously defined in the spec and need to be assumed to be device-specific. In the extreme case, we may even need to make the "normal line width" configurable so FOP can adjust the calculation accordingly.
  • accuracy problems when using bitmap fonts because you can't scale the font by fractional point sizes.
  • ...and probably more
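
To illustrate the line width point above, here is a minimal sketch of how a configurable "normal line width" could feed into the calculation. It assumes that the line width is expressed as a whole-number multiple of the device's normal line width and that the multiplier must fit into one byte; both are assumptions for illustration. The class and method names are hypothetical and not part of FOP.

```java
/** Hypothetical helper, not FOP code: derives a line width multiplier. */
final class GocaLineWidth {

    private GocaLineWidth() { }

    /**
     * Converts a desired stroke width into a multiplier of the (configurable)
     * normal line width. Assumption: the multiplier is an integer from 1 to 255.
     *
     * @param desiredWidthPt    stroke width requested by the source graphic, in points
     * @param normalLineWidthPt assumed device-specific normal line width, in points
     */
    static int multiplier(double desiredWidthPt, double normalLineWidthPt) {
        if (normalLineWidthPt <= 0) {
            throw new IllegalArgumentException("normal line width must be positive");
        }
        if (desiredWidthPt < 0) {
            throw new IllegalArgumentException("desired width must not be negative");
        }
        long rounded = Math.round(desiredWidthPt / normalLineWidthPt);
        // Clamp so that thin lines stay visible and the value fits the assumed range.
        return (int) Math.max(1, Math.min(255, rounded));
    }
}
```

With an assumed normal line width of 0.5pt, a 1pt stroke would map to a multiplier of 2. A different device would need a different setting to get the same visual result, which is why the value may have to be configurable.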

Fonts

At the moment, only AFP bitmap and outline fonts (single-byte) are supported. That limits the set of printable characters to what the specified codepage file provides. Many fonts, however, have more than 256 glyphs. There are three different directions we could take for better Unicode coverage:

  1. Add support for double-byte fonts. This is similar to what we do with single-byte fonts in PDF and PostScript, where multiple single-byte encodings are generated. AFP provides double-byte fonts by spanning a matrix of codepages: the first byte selects the codepage and the second byte selects the glyph.

  2. CID Keyed Fonts (Type 0) which use GCUIDs (Graphic Character UCS Identifier) for glyph names (also called "Unicode Fonts" in AFP jargon).
  3. Add support for TrueType fonts. These are available in newer environments and would allow embedding Unicode characters as glyph selectors.
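
The double-byte addressing from option 1 can be sketched as follows. The code point here is the font's own 16-bit code point, not a Unicode value, and the class is a hypothetical illustration rather than existing FOP code.

```java
/** Hypothetical helper: splits a 16-bit double-byte code point into its two bytes. */
final class DoubleByteCodePoint {

    private DoubleByteCodePoint() { }

    /** First byte: selects the codepage within the matrix of codepages. */
    static int codepageIndex(int codePoint) {
        checkRange(codePoint);
        return (codePoint >> 8) & 0xFF;
    }

    /** Second byte: selects the glyph within the selected codepage. */
    static int glyphIndex(int codePoint) {
        checkRange(codePoint);
        return codePoint & 0xFF;
    }

    private static void checkRange(int codePoint) {
        if (codePoint < 0 || codePoint > 0xFFFF) {
            throw new IllegalArgumentException("not a 16-bit code point: " + codePoint);
        }
    }
}
```

For the code point 0x4E2D, for example, the codepage byte would be 0x4E and the glyph byte 0x2D.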

One difficulty will be mapping Unicode scalar values to GCGIDs (Graphic Character Global Identifier from IBM's CDRA). GCGIDs are character names with a length of 4 to 8 characters. A large collection of Unicode characters can be mapped to GCGIDs, but there's no 1:1 relationship. However, for most use cases, using this subset would probably be sufficient. So far, no machine-readable map from Unicode to GCGID could be found. A human-readable map can be found here: http://www-01.ibm.com/software/globalization/gcgid/gcgid.jsp. If you have an AFP font with GCUIDs (Graphic Character UCS Identifier) instead of GCGIDs, i.e. a "Unicode font", that might also work because GCUIDs are just Unicode values expressed as an 8-byte string with the prefix "U000". At the moment, the mapping from Unicode to code points (8-bit values inside the codepage) happens through the "encoding" setting on the font configuration element. However, that codepage file/object might also be generated by FOP in the future.
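
As a minimal sketch of the GCUID naming described above, the following builds the 8-character name from the "U000" prefix and four hexadecimal digits. The class name is hypothetical, the use of uppercase hex digits is an assumption, and values outside the Basic Multilingual Plane are rejected because the notes above don't say how they would be named.

```java
/** Hypothetical helper, not FOP code: builds GCUID glyph names from Unicode values. */
final class GcuidNames {

    private GcuidNames() { }

    /**
     * Returns the GCUID for a BMP code point: the prefix "U000" followed by
     * four hex digits, eight characters in total.
     * Assumption: hex digits are uppercase.
     */
    static String toGcuid(int codePoint) {
        if (codePoint < 0 || codePoint > 0xFFFF) {
            throw new IllegalArgumentException(
                    "only BMP code points are covered by this sketch: " + codePoint);
        }
        return String.format("U000%04X", codePoint);
    }
}
```

Under these assumptions, the euro sign (U+20AC) would get the name "U00020AC". Any encoding of the resulting name into the font resource is out of scope here.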

More details on AFP fonts can be found on the AFPFonts page.

Glyph positioning

It is best practice to use PTOCA's SIA (Set Intercharacter Adjustment) and SVI (Set Variable Space Character Increment) structures for glyph positioning where possible. Relative cursor movements should be used only for kerning effects or fixed-width spaces. This keeps print file sizes small and performance high. However, there's a catch: experiments have shown that there's no consensus about the interpretation of those two features. Currently, FOP uses them, but the result may not be as expected on every target environment. In the new AFPPainter there are now two different implementations, one using the above structures and one doing glyph positioning exclusively using RMI (Relative Move Inline). The latter is currently disabled. If the need should arise, an option could be implemented to let the user choose which implementation to use.
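
The trade-off between the two approaches can be illustrated with a small decision helper. This is a sketch under simplifying assumptions, not the logic in AFPPainter: it only models the intercharacter adjustment (SIA) and ignores SVI, and all names are hypothetical. If every glyph in a run needs the same extra advance, a single adjustment can cover the whole run. Otherwise per-glyph relative moves would be needed.

```java
import java.util.OptionalInt;

/** Hypothetical sketch, not the AFPPainter implementation. */
final class GlyphPositioningSketch {

    private GlyphPositioningSketch() { }

    /**
     * Checks whether one uniform adjustment can turn the font's natural advances
     * into the advances requested by the layout (both in the same unit).
     *
     * @return the uniform adjustment, or empty if the differences vary, in which
     *         case relative moves would be required for this run
     */
    static OptionalInt uniformAdjustment(int[] naturalAdvances, int[] targetAdvances) {
        if (naturalAdvances.length != targetAdvances.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        if (naturalAdvances.length == 0) {
            return OptionalInt.of(0); // nothing to adjust in an empty run
        }
        int first = targetAdvances[0] - naturalAdvances[0];
        for (int i = 1; i < naturalAdvances.length; i++) {
            if (targetAdvances[i] - naturalAdvances[i] != first) {
                return OptionalInt.empty();
            }
        }
        return OptionalInt.of(first);
    }
}
```

A run with letter-spacing applied evenly would return a value here, while a run containing a kerning pair would return empty and fall back to relative moves for the affected glyphs.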

Random Notes

  • We should include resources (especially fonts) at the beginning of the print file. That makes sure the resources are available to the viewers and reduces the complex setup of fonts. Furthermore, it guarantees that the printer always has the most current font files.
  • Instead of just referencing Charset and Codepage we should generate "Coded Fonts" (FOCA BCF/ECF) in the file-level resource group. It appears to be best practice.
  • There are established naming conventions for various kinds of objects. At some point we should use those. (TODO need to provide a list of those)
  • Add support for TLEs on page-sequence level
  • PTX should always begin with a PTO triplet
  • possibly generate Composed Text Control (obsolete but recommended)
  • at some point we'll need to support form maps
  • integrate higher-quality conversion to bi-level images that was created for PCL (MonochromeBitmapConverter). Add setting to favor quality or speed.
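
For the last item, the "speed" end of the quality/speed setting could be as simple as a fixed luminance threshold. The following is only an illustrative sketch and is not the existing MonochromeBitmapConverter code. The class name, the luminance weights and the threshold parameter are assumptions. A "quality" variant would use some form of dithering instead.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch of a fast bi-level conversion using a fixed threshold. */
final class BiLevelSketch {

    private BiLevelSketch() { }

    /**
     * Converts an image to a 1-bit black and white image.
     * Pixels with a luminance at or above the threshold (0 to 255) become white.
     */
    static BufferedImage threshold(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness (assumed weights).
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```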

AFPOutput (last edited 2009-12-14 13:25:11 by JeremiasMaerki)