Template Engine Design

For the original post, see http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=110132582421156&w=2. The idea is to keep this design document updated to reflect the implementation; so far it is just a slightly edited version of the post.

Here are some ideas about how to implement JXTG 2.0.

I'll focus on describing how pre-compilation can be implemented and how to connect to executable tags written in Java.

After having evaluated Jelly and taglib, I don't think it is worthwhile trying to implement JXTG2 on top of them. It is certainly a good idea to reuse ideas and code from JXTG. One could possibly start from JXTG and get to JXTG2 through a sequence of refactoring steps, but the monstrous size of the JXTG implementation makes that less attractive, at least for me. I would probably start from Jonas' taglib, as it contains some things that we are going to need anyway and is a good starting point. In any case it is IMHO a good idea to take a look at it and see that a taglib implementation can be small and lean and does not have to be 150+ kB of source code.

I'll continue by describing a possible design of a template generator (TG) that uses pre-compilation and has custom tags. This will be done in three steps: we start with a trivial pre-compiled template language (TL) without expressions, in the next step we add an expression language to it, and we finish by adding executable tags.

A Trivial Pre-Compiled "Template Generator"

This step is not that useful in itself; the result will just be an unnecessarily complicated implementation of the file generator. Its purpose is to introduce some ideas.

The pre-compiling TG works in two steps. First it compiles the content of the input source to a script and stores the script in its cache together with an appropriate cache key. The cached script is then used as long as the cache key is valid. It is important that the script is thread safe, otherwise it is unsafe to reuse. Cocoon's transient store could be used as the store.

In the second step the script is executed and SAX output is generated.
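The compile-then-cache step could look roughly like the sketch below. It is only an illustration: Script, ScriptCache and the use of a last-modified stamp as the validity check are names and choices I made up here, and a plain map stands in for Cocoon's transient store.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a compiled, thread-safe script.
interface Script {
    String execute();
}

// Caches compiled scripts per source URI and recompiles when the
// validity stamp (here: a last-modified time) no longer matches.
final class ScriptCache {
    private static final class Entry {
        final long lastModified;
        final Script script;

        Entry(long lastModified, Script script) {
            this.lastModified = lastModified;
            this.script = script;
        }
    }

    private final Map<String, Entry> store = new ConcurrentHashMap<>();
    private final Function<String, Script> compiler;
    // Exposed only to make the behaviour of the sketch observable.
    final AtomicInteger compilations = new AtomicInteger();

    ScriptCache(Function<String, Script> compiler) {
        this.compiler = compiler;
    }

    Script get(String uri, long lastModified) {
        Entry cached = store.get(uri);
        if (cached != null && cached.lastModified == lastModified) {
            return cached.script; // cache key still valid: reuse the script
        }
        Script compiled = compiler.apply(uri); // first step: compile
        compilations.incrementAndGet();
        store.put(uri, new Entry(lastModified, compiled));
        return compiled;
    }
}
```

Two requests with the same stamp share one script object, which is why the script has to be thread safe.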

The Trivial Script

The task of the trivial "template generator" is to output its input template document as is :) For this task there already is a feature complete script implementation in Cocoon; o.a.c.xml.SAXBuffer.

The SAXBuffer contains a list of SAXBits, where each SAXBit is an (immutable) representation of a SAX event. The SAXBit interface contains the method send(ContentHandler), and the SAXBuffer contains an inner class implementing SAXBit for each SAX event type.

The SAXBuffer implements XMLConsumer and records the input events as a list of SAXBits. It also implements XMLizable, whose toSAX(ContentHandler) method iterates through the recorded SAXBits and executes send(ContentHandler) on each of them.

So far, compiling a script means sending the input source to a SAXBuffer, and executing a script means calling the toSAX method on the SAXBuffer.
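To make the record-and-replay idea concrete, here is a much reduced sketch using only the SAX classes from the JDK. It is not Cocoon's SAXBuffer: Bit and RecordingBuffer are hypothetical names, and only three event types are recorded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// One recorded SAX event that can be replayed to any handler.
interface Bit {
    void send(ContentHandler handler) throws SAXException;
}

// Simplified stand-in for the SAXBuffer: records three event types only.
class RecordingBuffer extends DefaultHandler {
    private final List<Bit> bits = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // Copy the attributes, since parsers may reuse the Attributes object.
        Attributes copy = new AttributesImpl(atts);
        bits.add(h -> h.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        bits.add(h -> h.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Copy the characters, since the parser owns and reuses the array.
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        bits.add(h -> h.characters(copy, 0, copy.length));
    }

    // "Executing" the script replays the recorded events in order.
    public void toSAX(ContentHandler handler) throws SAXException {
        for (Bit bit : bits) {
            bit.send(handler);
        }
    }
}
```

Since the recorded bits are never modified after compilation, the same buffer can be replayed by several threads at once.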

JXTG contains code that is similar to the SAXBuffer.

A Real Template Generator

The defining property of an XML TL is that you can embed expressions in the XML attributes and text. These expressions are replaced with the result of evaluating them in some context. To add this to our TG we need to define the context, get an expression evaluator and handle the expression embedding in the script.

Expression Language

It would be best if we could decide to use JXPath everywhere in Cocoon ;) But as that is not going to happen, we need to support several expression languages (EL). This is not just a need in JXTG but also in CForms, input modules and many other places. So we have a need for pluggable EL in Cocoon.

Dmitri Plotnikov, the main author of JXPath, has written a common API for ELs, http://www.plotnix.com/jex/index.html, and committed the code to the Jakarta Commons Sandbox, http://cvs.apache.org/viewcvs.cgi/jakarta-commons-sandbox/jex/. AFAICS the project never took off, so we shouldn't use it. But we could probably steal some good ideas from it, although I think that we should keep it much simpler.

ELs often make it possible to create a compiled expression object from the expression string. When that is possible (and given that the compiled expression is thread safe), we can get better performance by storing compiled expressions in the script.

Besides pluggable EL we also want to be able to plug in a locale sensitive and configurable converter component for handling the results from the EL.

Now, we don't have to implement everything at once. We can start with supporting one EL and refactor to using the pluggable EL, when/if someone implements it.
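A very small pluggable EL API could be sketched as below. All names (CompiledExpression, ExpressionLanguage, ExpressionFactory, VariableLookupLanguage) are invented for illustration and are not taken from JEX or Cocoon; the toy language just treats an expression as a variable name.

```java
import java.util.HashMap;
import java.util.Map;

// A compiled expression; implementations are assumed to be thread safe.
interface CompiledExpression {
    Object evaluate(Map<String, Object> context);
}

// One pluggable expression language.
interface ExpressionLanguage {
    CompiledExpression compile(String source);
}

// Maps a language name to its implementation.
final class ExpressionFactory {
    private final Map<String, ExpressionLanguage> languages = new HashMap<>();

    void register(String name, ExpressionLanguage language) {
        languages.put(name, language);
    }

    CompiledExpression compile(String name, String source) {
        ExpressionLanguage language = languages.get(name);
        if (language == null) {
            throw new IllegalArgumentException("Unknown expression language: " + name);
        }
        return language.compile(source);
    }
}

// Toy language used only for illustration: an expression is a variable name.
final class VariableLookupLanguage implements ExpressionLanguage {
    @Override
    public CompiledExpression compile(String source) {
        String name = source.trim(); // done once at compile time
        return context -> context.get(name); // evaluated many times
    }
}
```

Starting with a single EL then just means registering one language in the factory.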

Expression Context

The expression must be evaluated in some context. JXTG has a lot of code that packages the different objects in the Cocoon object model so that they look the same as in flowscripts, but from JXPath or JEXL. Carsten has factored out this code from JXTG into o.a.c.environment.TemplateObjectModelHelper in the scratchpad. It is marked as "work in progress" and I don't know its current state. But IMHO we should base JXTG2 on that (and possibly improve it) rather than having yet another implementation of the script object model.

Embedding Expressions in the Template Language

In the TL we are using expressions like this:

<tag attr1="text ${expr1} text ${expr2}">
  Text ${expr3} more text ${expr4}, even more text.
</tag>

We can see that it is the SAX startElement and the characters events that are affected. During compilation of the TL into a script, we need to parse the attribute content and characters content and create a list of text elements and compiled expression elements.

During execution of the script the expression context must be available when executing the EL parts of the script.

In JXTG this is implemented in the TextEvent and StartElement classes, both of which contain lists of interspersed expression and text objects.

If we continue to base our script implementation on the SAXBuffer, we need to extend it with new handling of startElement and characters, as in JXTG. We need new StartElement and Characters classes that can handle expressions as described above. At script compilation time they need an EL factory. At script execution time they need an expression context, so their interface must be extended with something like send(ContentHandler, ExpressionContext), and similarly the script (the extended SAXBuffer) needs a method toSAX(ContentHandler, ExpressionContext).
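Splitting attribute or text content into literal and expression parts could be sketched like this. Part and TemplateText are hypothetical names, a Map stands in for the expression context, and the "expression language" is again just a variable lookup, so this shows the parsing idea rather than the JXTG code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// One piece of a compiled attribute value or text node.
interface Part {
    String evaluate(Map<String, Object> context);
}

final class TemplateText {
    private final List<Part> parts = new ArrayList<>();

    // Compile time: split the raw string into literal and expression parts.
    static TemplateText compile(String raw) {
        TemplateText result = new TemplateText();
        int pos = 0;
        while (pos < raw.length()) {
            int open = raw.indexOf("${", pos);
            int close = open < 0 ? -1 : raw.indexOf('}', open + 2);
            if (close < 0) {
                // No further complete expression: the rest is literal text.
                String rest = raw.substring(pos);
                result.parts.add(ctx -> rest);
                break;
            }
            if (open > pos) {
                String literal = raw.substring(pos, open);
                result.parts.add(ctx -> literal);
            }
            // Toy "expression language": the expression is a variable name.
            String name = raw.substring(open + 2, close).trim();
            result.parts.add(ctx -> String.valueOf(ctx.get(name)));
            pos = close + 1;
        }
        return result;
    }

    // Execution time: evaluate each part against the expression context.
    String evaluate(Map<String, Object> context) {
        StringBuilder out = new StringBuilder();
        for (Part part : parts) {
            out.append(part.evaluate(context));
        }
        return out.toString();
    }
}
```

The parsing happens once per template, while evaluate is called on every request with a fresh context.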

Adding Executable Tags

There are numerous ways to add executable tags (ET) to a TL. Of those I have seen I prefer Jelly's way, and therefore I'll describe how to implement something similar. Take a look at the implementation of some Jelly tags http://cvs.apache.org/viewcvs.cgi/jakarta-commons/jelly/src/java/org/apache/commons/jelly/tags/core/ and compare them with the corresponding tags in e.g. taglib and decide for yourself what you prefer.

For concreteness let's say that we have a for-tag in our TL (example from Jonas):

<core:for var="index" begin="0" end="8" step="2">

We have a TagRepository where a URI and tag name are mapped to a tag object. Tags that are registered in the TagRepository (executable tags) get special treatment. All others are handled as described in the previous sections.
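A TagRepository along these lines might be sketched as follows. The ExecutableTagBean interface and the method names are assumptions of mine, and I register factories rather than tag instances so that tags that are not thread safe can get a fresh object per use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical minimal tag contract, for illustration only.
interface ExecutableTagBean {
    String describe();
}

final class TagRepository {
    // Keyed by namespace URI plus local name, e.g. "{uri}for".
    private final Map<String, Supplier<ExecutableTagBean>> factories = new HashMap<>();

    private static String key(String namespaceUri, String localName) {
        return "{" + namespaceUri + "}" + localName;
    }

    void register(String namespaceUri, String localName,
                  Supplier<ExecutableTagBean> factory) {
        factories.put(key(namespaceUri, localName), factory);
    }

    // The template parser asks this for every start element it sees.
    boolean isExecutable(String namespaceUri, String localName) {
        return factories.containsKey(key(namespaceUri, localName));
    }

    ExecutableTagBean create(String namespaceUri, String localName) {
        Supplier<ExecutableTagBean> factory = factories.get(key(namespaceUri, localName));
        if (factory == null) {
            throw new IllegalArgumentException(
                "No executable tag registered for " + key(namespaceUri, localName));
        }
        return factory.get();
    }
}
```

Elements for which isExecutable returns false are simply recorded as ordinary SAX events.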

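When an ET is executed it will have access to the execution context, its attributes, the script for its body and its parent tag (the parent is needed for e.g. choose/when, take a look at the Jelly tags).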

Execution Context

The execution context contains the expression context as before, but now we want the expression context to be implemented as a stack of (variable name, value) bindings, searched from the top downwards for a variable binding (see e.g. Jonas' code for details). This is needed for creating local variables in template "macros" and for handling context info in recursive templates. Recursive templates are needed as soon as you have recursively defined data structures like the class stuff in CForms.
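Such a stack of bindings could be sketched as below. VariableStack and its method names are made up for this example, and each stack entry is a frame of bindings rather than a single binding, which is one possible way to do it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class VariableStack {
    private final Deque<Map<String, Object>> frames = new ArrayDeque<>();

    VariableStack() {
        frames.push(new HashMap<>()); // the outermost (global) frame
    }

    // Entering a macro or a recursive template call opens a new frame.
    void pushFrame() {
        frames.push(new HashMap<>());
    }

    void popFrame() {
        if (frames.size() == 1) {
            throw new IllegalStateException("Cannot pop the outermost frame");
        }
        frames.pop();
    }

    // Declares a variable in the innermost frame, shadowing outer bindings.
    void declare(String name, Object value) {
        frames.peek().put(name, value);
    }

    // Searches from the top of the stack downwards.
    Object lookup(String name) {
        for (Map<String, Object> frame : frames) { // iteration starts at the top
            if (frame.containsKey(name)) {
                return frame.get(name);
            }
        }
        return null;
    }
}
```

A recursive template pushes a frame on each call, so the inner invocation's variables disappear again when its frame is popped.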

The execution context gives access to the tag repository so that it is possible to add tags and use tags in user defined tags. It might also contain access to Cocoon components, and the source resolver.

The Executable Tag

The most complicated question in deciding how the ETs should work is whether we should require them to be thread safe. If they are thread safe they can be made part of the script and be completely prepared at compile time. But OTOH they will be harder to write, as one cannot use member variables for state info.

For the moment I think that not requiring thread safety is enough and that we can see more efficient treatment of thread safe tags as a future optimization. In Jelly, tags can implement CompilableTag, which gives them the possibility to do some work at compile time; I have no idea how it works though.

Anyway, the tag will be created at execution time and needs access to the different data specified above.

Jelly uses the following interface for tags:

public interface Tag {
    public Tag getParent();
    public void setParent(Tag parent);
    public Script getBody();
    public void setBody(Script body);
    public JellyContext getContext();
    public void setContext(JellyContext context) throws JellyTagException;
    public void doTag(XMLOutput output) throws MissingAttributeException, JellyTagException;
    public void invokeBody(XMLOutput output) throws JellyTagException;
}

and setter injection for the attributes. Or rather, this is the default way; Jelly also uses various reflection-based tricks and tries to handle almost any bean as a tag.

The "for" tag above would be implemented like this:

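import org.apache.commons.jelly.JellyTagException;
import org.apache.commons.jelly.TagSupport;
import org.apache.commons.jelly.XMLOutput;

public class For extends TagSupport {
  String var;
  int begin, end, step;

  public void setBegin(int begin) { this.begin = begin; }
  // other setters and getters

  public void doTag(XMLOutput output) throws JellyTagException {
    // Bind the loop variable (if one is named) and run the body once per iteration.
    for (int i = begin; i <= end; i += step) {
      if (var != null)
        getContext().setVariable(var, Integer.valueOf(i));
      invokeBody(output);
    }
  }
}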

(I'm using Jelly interfaces, so the example isn't completely consistent with the other discussion about interfaces in this mail.)

I have no strong opinions about whether we should use setter injection, getAttribute methods in doTag, etc.

Compiling the Script

Now we continue to extend our script implementation so that it can contain an ExecutableTag element as well as the previous ones. At compile time an ExecutableTag object is set up with its attributes, the script for its body, its parent and a reference to the factory for creating its tag bean. During execution we need a send(ContentHandler, Context) method that creates and initializes the tag object (as described above) and then calls its doTag method. The script also needs a toSAX(ContentHandler, Context) method for executing all of the script.
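How an ExecutableTag element could sit in the script and create its tag bean at execution time is sketched below. To keep it short, a StringBuilder stands in for the ContentHandler, a Map for the execution context, attributes are passed to doTag instead of being injected through setters, and the parent reference is left out; all of these are simplifications of mine, as are the names ScriptElement and ForTag.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// One element of a compiled script (stand-in types, see the text above).
interface ScriptElement {
    void send(StringBuilder out, Map<String, Object> context);
}

final class Script {
    private final List<ScriptElement> elements = new ArrayList<>();

    Script add(ScriptElement element) {
        elements.add(element);
        return this;
    }

    void toSAX(StringBuilder out, Map<String, Object> context) {
        for (ScriptElement element : elements) {
            element.send(out, context);
        }
    }
}

// Hypothetical tag contract; attributes are passed to doTag directly.
interface Tag {
    void doTag(Map<String, String> attributes, Script body,
               StringBuilder out, Map<String, Object> context);
}

// Prepared at compile time; the tag bean itself is created per execution.
final class ExecutableTag implements ScriptElement {
    private final Supplier<Tag> factory;
    private final Map<String, String> attributes;
    private final Script body;

    ExecutableTag(Supplier<Tag> factory, Map<String, String> attributes, Script body) {
        this.factory = factory;
        this.attributes = attributes;
        this.body = body;
    }

    @Override
    public void send(StringBuilder out, Map<String, Object> context) {
        Tag tag = factory.get(); // fresh instance, so tags need not be thread safe
        tag.doTag(attributes, body, out, context);
    }
}

final class ForTag implements Tag {
    @Override
    public void doTag(Map<String, String> attributes, Script body,
                      StringBuilder out, Map<String, Object> context) {
        int begin = Integer.parseInt(attributes.get("begin"));
        int end = Integer.parseInt(attributes.get("end"));
        int step = Integer.parseInt(attributes.get("step"));
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        String var = attributes.get("var");
        for (int i = begin; i <= end; i += step) {
            if (var != null) {
                context.put(var, i);
            }
            body.toSAX(out, context); // execute the body script once per iteration
        }
    }
}
```

The ExecutableTag object itself is immutable and shared between requests; only the short-lived tag bean holds per-execution state.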

Now we need a template parser; it is not enough to feed the input source into the SAXBuffer anymore, as we are going to create a new script for the body of every executable tag. The parser needs stacks for keeping track of the current tag and the current script; take a look at e.g. Jonas' TemplateTransformer to get an idea of how such parsing can be implemented.
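The stack handling in such a parser could be sketched as below. This is not Jonas' TemplateTransformer: Node and TemplateParser are hypothetical, a set of qualified names stands in for the TagRepository lookup, and a single stack is used because each tag node here carries its own body script.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// A compiled script node: a plain event, or a tag with a nested body script.
class Node {
    final String label;
    final List<Node> body = new ArrayList<>();

    Node(String label) {
        this.label = label;
    }

    // Renders the tree as text, e.g. "script(start:p, end:p)".
    String dump() {
        if (body.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder(label).append('(');
        for (int i = 0; i < body.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(body.get(i).dump());
        }
        return sb.append(')').toString();
    }
}

class TemplateParser extends DefaultHandler {
    private final Set<String> executableTags; // stand-in for a TagRepository lookup
    private final Deque<Node> scripts = new ArrayDeque<>();
    final Node root = new Node("script");

    TemplateParser(Set<String> executableTags) {
        this.executableTags = executableTags;
        scripts.push(root);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (executableTags.contains(qName)) {
            Node tag = new Node("tag:" + qName);
            scripts.peek().body.add(tag);
            scripts.push(tag); // following events go to the tag's own body script
        } else {
            scripts.peek().body.add(new Node("start:" + qName));
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (executableTags.contains(qName)) {
            scripts.pop(); // back to the enclosing script
        } else {
            scripts.peek().body.add(new Node("end:" + qName));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        scripts.peek().body.add(new Node("text:" + new String(ch, start, length)));
    }
}
```

Nested executable tags fall out of the same mechanism, since each one pushes its own body script on top of its parent's.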

Implementing the Tags

The idea is that all tags in JXTG are implemented as executable tags in the framework described above, including tags for defining new tags, e.g. macros.

Besides implementing the JXTG tags, we would need to implement the ESQL tags (or a replacement for them) and port (if that is needed) the CForms tags. After that we would have a replacement for XSP, AFAIU :)

TemplateEngineDesign (last edited 2009-09-20 23:41:37 by localhost)