I'm planning to read the following literature:

1. "Digital Typography": Breaking Paragraphs into Lines

The algorithm works even when each line of the paragraph has a different length. This may be interesting for implementing side-floats, provided that we know in advance what height each line will have. And I think this may depend on the line-stacking strategy.

There are three main aspects which characterize the Knuth approach:

To go into a bit more detail:

Regarding (before-)floats:

2. "On the Pagination of Complex Documents"

By Brüggemann-Klein, Klein, and Wohlfeil. This article may be found here. Also, from the same authors, the article "Pagination Reconsidered" gives the details of the algorithm.

2.1. Some Theoretical Background

This article helped me understand some of the abstract background of document layout. In fact, document layout is somewhat similar to a bin-packing problem: bins correspond to pages, objects to pieces of typeset material of some height. Those objects usually come from different flows: before-floats, main flow, footnotes. The order within each flow must be respected (a figure appearing after another one in the list must appear later on the pages), but there is some freedom in choosing which flow the next typeset element will be taken from.
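To make the flow-ordering constraint concrete, here is a small sketch (hypothetical code, not taken from Fop) that enumerates every order in which elements from several flows may be placed on the pages while preserving the internal order of each flow:

```python
def interleavings(flows):
    """Yield every ordering of typeset elements that respects the
    internal order of each flow (e.g. main text, floats, footnotes).

    flows: a list of lists; each inner list is one flow, in order."""
    if all(not f for f in flows):
        yield []
        return
    for i, flow in enumerate(flows):
        if flow:
            # take the next element of flow i, then interleave the rest
            rest = flows[:i] + [flow[1:]] + flows[i + 1:]
            for tail in interleavings(rest):
                yield [flow[0]] + tail
```

For two flows ["t1", "t2"] (text) and ["fig1"] (a figure), this yields the three valid orderings; "t2" can never precede "t1". A pagination algorithm searches this space for the ordering (and page assignment) of least cost.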

An optimal way of putting those elements on the pages must be found; in other words, there is some cost function to minimize. This cost function should reflect what a human would call a good pagination. Among various criteria we may cite the following: figures should appear as near to their first reference as possible; footnotes must appear on the same page as their citation; facing pages (or columns) should be balanced; etc.

As with packing problems, page layout will usually be a combinatorial NP-hard problem. However, if the cost function has certain properties (namely optimal substructure), it becomes possible to solve it in polynomial time using dynamic programming. The cost function used by Knuth and Plass for computing the demerits of each line of a paragraph has such a property.

For the page layout problem, the main issue is then to find a cost function:

Note that techniques other than dynamic programming may be used to lay out pages, for example some heuristic approach combined with genetic algorithms or simulated annealing (as in this article). But exploring those approaches would lead us much too far. Let's concentrate on the dynamic programming approach implemented in Fop.

2.2. Comments on the Article

2.3. Resulting Thoughts

The algorithm cannot be used directly as it is presented, mainly because of the last two items of the previous section. However, it provides a strong basis IMO. Some points:

When there are neither figures nor footnotes, pagination becomes equivalent to breaking paragraphs into lines. In that particular case the Knuth algorithm may be used. It would thus be great to find a cost function and an associated algorithm that become equivalent to line-breaking in such a case. This would also make it possible to utilize the full potential of the g/b/p model for page-breaking and, moreover, to rely as much as possible on the current source code. More to come when I have looked at it.

3. "Optimal Pagination Techniques for Automatic Typesetting Systems"

Michael Frederick Plass -- PhD thesis

3.1. Introduction

Plass starts by having a look at the practices that were in use in the past. In short, compositors mostly performed a first-fit pagination, because of the difficulty of looking ahead, and because re-typesetting the n-1 previous pages to get a better-looking page n demanded too much time. Footnotes were particularly difficult to handle. Other problems were balancing facing pages and, obviously, float placement.

The g/b/p model used for line-breaking does not readily apply to page-breaking; it must be extended with some "insert" operator to handle figures and footnotes.
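As an illustration of the basic glue/box/penalty model (a sketch only; the names and the acceptability test are my own, not Fop's API), the three element types and the adjustment-ratio computation could look like this:

```python
from dataclasses import dataclass

@dataclass
class Box:       # a fixed-size piece of typeset material (a word, a line)
    width: float

@dataclass
class Glue:      # stretchable/shrinkable space between boxes
    width: float
    stretch: float
    shrink: float

@dataclass
class Penalty:   # a potential break point, with an aesthetic cost
    width: float
    penalty: float

def adjustment_ratio(elements, target):
    """Ratio by which the glue must stretch (> 0) or shrink (< 0)
    so that the elements fill exactly `target` units."""
    natural = sum(e.width for e in elements)
    stretch = sum(e.stretch for e in elements if isinstance(e, Glue))
    shrink = sum(e.shrink for e in elements if isinstance(e, Glue))
    if natural < target:
        return (target - natural) / stretch if stretch > 0 else float('inf')
    if natural > target:
        return (target - natural) / shrink if shrink > 0 else float('-inf')
    return 0.0
```

The page-breaking extension Plass describes would add an "insert" element carrying floating material (a figure, a footnote body) alongside these three.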

3.2. Pagination may be an NP-complete Problem

The basic algorithm, which consists in choosing the best pagination among all possible ones, is obviously exponential. But depending on the chosen badness function, the problem may become solvable in polynomial time.

Basically Plass shows that, among others, "quadratic" badness functions lead to an NP-complete problem. Such functions compute the squares of the differences between the page numbers of citations and the page numbers of the corresponding figures.

In reality this is a bit more subtle. Indeed, under certain constraints, quadratic functions may still be solvable in polynomial time. Plass actually shows that it is quite easy to choose constraints which make the problem exponential. For example, simply allowing glue between the elements suffices to make the problem NP-complete. The issue is then to find the right constraints (number of citations per figure, number of objects per page, constant page size or not, and so on) and the right badness function, so that the problem remains a good representation of reality yet is solvable in polynomial time.
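For illustration, a "quadratic" badness in the above sense is simply the following (a hypothetical sketch; Plass's actual formulation carries more constraints):

```python
def quadratic_badness(citation_page, figure_page):
    """Squared page distance between a figure and one of its citations."""
    return (figure_page - citation_page) ** 2

def total_badness(placements):
    """Sum over all (citation_page, figure_page) pairs of a pagination.

    A figure cited on page 2 but placed on page 4 already contributes
    (4 - 2)^2 = 4, so distant figures are punished much harder than
    near ones -- which is exactly what makes this family of functions
    attractive, and what makes minimizing them hard in general."""
    return sum(quadratic_badness(c, f) for c, f in placements)
```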

3.3. Making Pagination Polynomial

It may be interesting to quickly explain the case where there are neither footnotes nor figures. The problem becomes equivalent to line-breaking, and helps in understanding the main basis on which the whole dynamic programming approach relies.

We define Bjk to be the badness corresponding to the best way to put the first j lines on the first k pages. Then the following formula holds:

    Bjk = min over 0 <= i < j of ( Bi,k-1 + βijk )

where βijk is the badness due to placing lines i+1 through j on page k. The best way to put the first j lines on the first k pages only depends on the best ways to put i lines on the first k-1 pages. The only problem is to find the particular i for which (the badness of the first k-1 pages) + (the badness of page k) is minimal.

We can easily see the recursive nature of this problem. The rest, well, "only" corresponds to implementation details.
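To make this concrete, here is a sketch of the recurrence for the footnote- and figure-free case. The per-page badness is an assumption of mine (squared leftover space), and since it does not depend on the page number k, the Bjk table collapses to one dimension:

```python
import math

def best_pagination(line_heights, page_height):
    """Dynamic program: B[j] = badness of the best way to break the
    first j lines into pages (the number of pages is left free).
    beta(i, j) is the badness of putting lines i+1..j on one page."""
    def beta(i, j):
        used = sum(line_heights[i:j])
        if used > page_height:
            return math.inf                 # overfull page: forbidden
        return (page_height - used) ** 2    # assumed: squared leftover space

    n = len(line_heights)
    B = [math.inf] * (n + 1)
    B[0] = 0.0
    best_i = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):                  # try every last-page start i
            cost = B[i] + beta(i, j)
            if cost < B[j]:
                B[j], best_i[j] = cost, i
    # recover the breakpoints by walking backwards, as in the thesis
    breaks, j = [], n
    while j > 0:
        breaks.append(j)
        j = best_i[j]
    return B[n], list(reversed(breaks))
```

Four lines of height 3 on pages of height 6, for instance, split perfectly into two pages of two lines each, with total badness 0.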

When introducing footnotes and figures, the challenge is to find a formula which has a similar structure. Plass begins with a formula which computes the distance (in absolute value) between a figure and each(!) of its citations. The problem is solvable but the constraints are unrealistic: there may be only one type of element on each page, either text or a figure. Plass then generalizes the problem by handling any number of independent flows (normal text, figures, tables, etc.), but again with only one element per page.

Ok, I've decided to stop here. At the end of the paragraph, Plass presents a differently generalized problem and tries to find sufficient conditions which make it solvable by dynamic programming. In this case the previous weird constraints are relaxed: there may be any number of elements on each page, several flows, several references from any flow to any other, etc. He then formulates the cost function as depending on 4 independent arbitrary sub-functions, which gives some freedom in the characterization of a good pagination.

Then there is an entire chapter describing an implementation of the algorithm, which is very similar to the line-breaking algorithm. Here I think it is more interesting to refer to the work of Brüggemann et al., which is easier to read and more directly re-usable than Plass' work. It will only have to be adapted to the g/b/p model, which shouldn't be too difficult.

It may be worth returning to Plass' work if we encounter unexpected difficulties during the implementation of floats (e.g., if the chosen algorithm doesn't give satisfactory results), as it provides a strong theoretical basis. For now I think I know enough to proceed to the implementation.

4. The Fop Source Code

Even if it is well explained, the Knuth line-breaking algorithm isn't that easy to understand. ATM I've concentrated on the class layoutmgr.BreakingAlgorithm, which contains the part of the algorithm that is common to page- and line-breaking. It is split into parts which follow pretty closely those described in "Digital Typography". It relies on the following skeleton:

findBreakingPoints(a paragraph, max allowable ratio for stretching, force or not the finding of breakpoints)
    initialize the various fields and variables
    For each Knuth element of the paragraph Do
        update totalWidth, totalStretch, totalShrink
        If the element is a feasible breakpoint Then
            > considerLegalBreak(at this element)
    If no set of feasible breakpoints could be found Then
        reactivate the last node and restart the algorithm from it
    Select some active nodes among the available ones
            > filterActiveNodes()
    For each remaining active node Do
        Compute the corresponding set of optimal breakpoints
                > calculateBreakPoints

considerLegalBreak(Knuth element)
    For each currently active node Do
        compute the adjustment ratio of a line starting at the active node and ending at this element
        If the adjustment ratio is unacceptable Then
            deactivate this node
        If the adjustment ratio is acceptable Then
            compute the fitness class of the corresponding line
                    > computeFitness
            update the total demerits by adding the demerits of this line
                    > computeDemerits
            If the result is better than the best recorded demerits for the given fitness class Then
                register this new best active node
                        > BestRecords.addRecord
        (perform something when the adjustment ratio is unacceptable and we are in forced mode)
    add new active nodes for breaks at this element
            > addBreaks

addBreaks(element ending the line, line number)
    compute the new totalWidth, totalStretch, totalShrink (after this element)
    For each fitness class Do
        If the recorded best active node for this fitness class is to be retained Then
            create a new active node for a break from this node to the element

calculateBreakPoints(last active node, corresponding paragraph, corresponding number of lines)
    For j = number of lines DownTo 1 Do
        breakpoint = node.element
        node = node.previous
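As a complement to the skeleton above, here is a rough sketch of what computeFitness and computeDemerits compute, following the formulas from "Digital Typography". This is a simplification (the fitness-mismatch term is omitted, and Fop's actual constants and thresholds may differ):

```python
def fitness_class(ratio):
    """Knuth/Plass fitness classes for a line with adjustment ratio r:
    0 = tight, 1 = normal, 2 = loose, 3 = very loose."""
    if ratio < -0.5:
        return 0
    if ratio <= 0.5:
        return 1
    if ratio <= 1.0:
        return 2
    return 3

def badness(ratio):
    """Badness grows with the cube of the adjustment ratio."""
    return 100 * abs(ratio) ** 3

def demerits(ratio, penalty, line_penalty=10):
    """Demerits of one line, following the three-case Knuth/Plass
    structure; -1000 plays the role of minus infinity (forced break)."""
    b = badness(ratio)
    if penalty >= 0:
        return (line_penalty + b + penalty) ** 2
    if penalty > -1000:
        return (line_penalty + b) ** 2 - penalty ** 2
    return (line_penalty + b) ** 2      # forced break: penalty ignored
```

A perfectly set line (ratio 0, no penalty) still costs (10 + 0 + 0)^2 = 100 demerits, which is how the line_penalty term discourages paragraphs from using more lines than necessary.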
Note: don't forget to have a look at Luca's LayoutExtensions on this issue.

GoogleSummerOfCode2006/FloatsImplementationProgress/Phase1Documentation (last edited 2009-09-20 23:52:30 by localhost)