Help for Sitemap Browser

Abstract

A simple Cocoon sitemap can be clean and elegant. But as pipelines aggregate calls to other pipelines, and the number of pipelines increases, a sitemap can become difficult to follow. Sitemap Browser (SB) addresses this problem by visualizing a sitemap as an HTML document, displaying each pipeline next to the pipeline(s) it calls, and by hyperlinking related pipelines to each other for easy navigation. SB works to some degree on unmodified sitemaps but works better if you add sb:* markup to help handle the harder cases. SB can also be a convenient aid in unit testing, as a framework for linking to a sample invocation of each pipeline.

Purpose

The Sitemap Browser (SB) helps the user explore the connections between pipelines (technically, <map:match> elements) in a sitemap. It does this by lining up each pipeline with the pipelines it uses, left to right. For an example, see How to use SB. SB reduces the need to hunt all over a sitemap to trace the flow of data.

Sitemap Browser is most useful for sitemaps in which pipelines call each other a lot.

Suggested Installation

Put SB's files into a folder that is a sibling of the folders of applications whose sitemaps you want to explore. For example, if your application is installed at cocoon-2.1.10/build/webapp/myapp, install the SB files (sitemap.xmap, *.xsl, and *.html) into cocoon-2.1.10/build/webapp/sb. Then the index page for SB will be http://(hostname)/sb/.

How to use SB

The easiest way to start SB is to go to the index page (http://(hostname)/sb/), click "Browse..." to find the sitemap file you would like to browse, and press the "Go" button. Alternatively, you can run SB via the URL .../sb?sitemap=xyz/sitemap.xmap. The ... depends on where SB is installed. The value of the sitemap parameter is the path to the sitemap file, either absolute or relative to the directory where SB is installed. For example, if you want to browse the sitemap in the clam directory, and SB is installed in the directory sb, which is a sibling of clam, use .../sb?sitemap=../clam/sitemap.xmap.

When SB runs, you will see a diagram something like the following:

Pipeline(s) in focus | Pipeline(s) used
display (pattern="display-stats") [hide] code
<!-- produce web page displaying statistics -->
<map:match pattern="display-stats">
  <map:aggregate element="agg">
    <map:part src="cocoon:/data/{request-param:foo}/1"/>
    <map:part src="cocoon:/data/{request-param:foo}/2"/>
  </map:aggregate>
  <map:transform src="format-data.xsl"/>
  <map:serialize type="html"/>
</map:match>
data1 (pattern="data/*/1") [hide] code
<!-- fetch data from source 1 -->
<map:match pattern="data/*/1">
  <map:generate src="data1.xsp" type="serverpages"/>
  <map:serialize type="xml"/>
</map:match>
data2 (pattern="data/*/2") [hide] code
<!-- fetch data from source 2 -->
<map:match pattern="data/*/2">
  <map:generate src="data2.xsp" type="serverpages"/>
  <map:serialize type="xml"/>
</map:match>

The above is a diagram of a sitemap containing three pipelines. In the left column is an "outer pipeline" labeled display. By default, when SB starts up with no particular pipeline specified, it puts all pipelines in the left column that are not (as far as it can tell) used by other pipelines. In the right column are two pipelines used by display: data1 and data2. For each pipeline, the label is shown first, then the match pattern, and below that, the code. The label is a string used to identify the pipeline; by default, it is the same as the match pattern, unless you have specifically added an sb:id attribute to the pipeline (as has been done for the above example). See Marking up your sitemap for more information.

You can click on the [hide]/[show] buttons to toggle the display of pipeline code, allowing more pipelines to be shown at once. You can also click on the label of a pipeline to focus on that pipeline. When this happens, SB creates a new diagram with that pipeline in the left column, and any pipelines it uses in the right column. By using these links and the Back button, you can browse through a hierarchy of pipelines that use other pipelines.

If the sitemap is marked up for SB, there may also be [test] links. Clicking on one of these links runs the associated pipeline, possibly with some sample parameters. This makes it easy to see what the output of a given pipeline might look like. (This may work only if your sitemap is specified with a relative URL.) You may want to run the test in another web browser window; for example, in some web browsers, if you Ctrl+click on the [test] link, the output will appear in a new window or tab.

Marking up your sitemap

SB tries to do a decent job on any sitemap as-is, but it can do more if you add some SB-specific markup to the sitemap.

sb:id and sb:uses

One thing you can do is help SB know which pipelines are used by other pipelines. Without markup, SB tries to figure this out by using primitive heuristics to match src attributes with pattern attributes. Basically, it takes the beginning of each attribute, up to the first '*' or '?'. It treats this prefix as the sb:id attribute (for a map:match) or the sb:uses attribute (for something inside a map:match that has a src attribute). The @sb:id is what is shown in bold at the beginning of each table cell in the display. When this works, it's convenient, because you don't have to do anything to your sitemap to use SB. It works well when src and pattern attributes are mostly literals, not wildcards or parameters. In other cases it doesn't work well, so you can add your own @sb:id attributes to your <map:match> elements to make sure they're properly identified. If you do this, you must declare the sb namespace prefix at the top of your sitemap, e.g.

<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0"
          xmlns:sb="http://apache.org/cocoon/sitemap-browser1.0">

The value of the sb:id attribute can be any unique string, preferably something readable and intuitive. Then you also need to add an sb:uses attribute at each place where one pipeline uses another (typically where there is a src="cocoon:/..." attribute). Just make sure the value of sb:uses matches the value of an sb:id somewhere.

For example, the simple sitemap shown above could be marked up as follows:

<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0"
  xmlns:sb="http://apache.org/cocoon/sitemap-browser1.0">
  ...
  <map:pipelines>
    <map:pipeline>
      <!-- produce web page displaying statistics -->
      <map:match pattern="display-stats" sb:id="display">
        <map:aggregate element="agg">
          <map:part src="cocoon:/data/{request-param:foo}/1" sb:uses="data1"/>
          <map:part src="cocoon:/data/{request-param:foo}/2" sb:uses="data2"/>
        </map:aggregate>
        <map:transform src="format-data.xsl" />
        <map:serialize type="html" />
      </map:match>
      <!-- fetch data from source 1 -->
      <map:match pattern="data/*/1" sb:id="data1">
        <map:generate src="data1.xsp" type="serverpages" />
        <map:serialize type="xml" />
      </map:match>
      <!-- fetch data from source 2 -->
      <map:match pattern="data/*/2" sb:id="data2">
        <map:generate src="data2.xsp" type="serverpages" />
        <map:serialize type="xml" />
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>

Note that SB does not show the sb:* attributes when displaying a sitemap.
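
To see why explicit ids help in this sitemap, consider what the prefix heuristic would derive if the sb:id and sb:uses attributes were left off. The sketch below applies the rule described earlier (take the text up to the first '*' or '?') by hand to the three patterns. The derived values in the comments are worked out from that rule and are not captured SB output, and the treatment of a pattern with no wildcard is an assumption.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- no '*' or '?': assumed to yield the whole pattern, "display-stats" -->
      <map:match pattern="display-stats">
        <!-- pipeline body omitted -->
      </map:match>
      <!-- text before the first '*': "data/" -->
      <map:match pattern="data/*/1">
        <!-- pipeline body omitted -->
      </map:match>
      <!-- text before the first '*': "data/" again, the same prefix as the previous pipeline -->
      <map:match pattern="data/*/2">
        <!-- pipeline body omitted -->
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Because both data pipelines reduce to the same prefix under this rule, the prefix alone would not tell them apart, which is the kind of case that distinct sb:id values such as data1 and data2 are meant to resolve.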

sb:example

You can also add sb:example="string" to each <map:match> element. For example,

<map:match pattern="data/*/1" sb:id="data1" sb:example="data/Johnson/1?record=124">

The string given as a value for sb:example provides concrete values for wildcards and any URL parameters required for the pipeline to run as intended. This enables SB to display a link, which you can use to test the pipeline and see its output. The value of sb:example should, of course, match the pattern given in the pattern attribute, and should match no earlier pattern. Otherwise, you could click on the test link and the wrong pipeline would run.
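
As a sketch of how this might look across the example sitemap, the fragment below adds sb:example to all three pipelines. The value display-stats?foo=Johnson is a hypothetical choice, made on the assumption that the display pipeline needs a foo request parameter because its parts reference {request-param:foo}. The value for data2 simply mirrors the data1 example and is likewise an assumption about what that pipeline needs.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0"
  xmlns:sb="http://apache.org/cocoon/sitemap-browser1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- hypothetical sample: supplies the foo parameter read via {request-param:foo} -->
      <map:match pattern="display-stats" sb:id="display"
                 sb:example="display-stats?foo=Johnson">
        <map:aggregate element="agg">
          <map:part src="cocoon:/data/{request-param:foo}/1" sb:uses="data1"/>
          <map:part src="cocoon:/data/{request-param:foo}/2" sb:uses="data2"/>
        </map:aggregate>
        <map:transform src="format-data.xsl"/>
        <map:serialize type="html"/>
      </map:match>
      <!-- "Johnson" fills the '*' wildcard; record=124 is the sample URL parameter -->
      <map:match pattern="data/*/1" sb:id="data1"
                 sb:example="data/Johnson/1?record=124">
        <map:generate src="data1.xsp" type="serverpages"/>
        <map:serialize type="xml"/>
      </map:match>
      <!-- assumed to take the same kind of sample values as data1 -->
      <map:match pattern="data/*/2" sb:id="data2"
                 sb:example="data/Johnson/2?record=124">
        <map:generate src="data2.xsp" type="serverpages"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Each sample path here matches only its own pattern: data/Johnson/1 does not match display-stats, so the test link should reach the intended pipeline. If a broader pattern such as data/** appeared earlier in the sitemap, it would catch these sample URLs first.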

sb:main

Sometimes a pipeline of central importance in the structure of an application is used by other pipelines. If this major pipeline is two "uses" deep, it won't even show up on the SB page by default, when only top-level pipelines and the pipelines they use are displayed. To fix this problem, you can add an sb:main="true" attribute to the <map:match> elements of pipelines that you want to show up in the left column by default. (They will also show up in the right column next to pipelines that use them.)
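
A minimal sketch of this markup follows, assuming a hypothetical pipeline with the id core-data that is used only indirectly by the top-level pipelines. The pattern, id, and source file name are illustrative and do not come from the sitemap shown earlier.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0"
  xmlns:sb="http://apache.org/cocoon/sitemap-browser1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- hypothetical central pipeline; sb:main="true" asks SB to list it
           in the left column by default -->
      <map:match pattern="core-data/*" sb:id="core-data" sb:main="true">
        <map:generate src="core-data.xsp" type="serverpages"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```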

Limitations of SB

Future directions

Feedback

If you have some ideas for improving SB, or if you improve it yourself, please let me know.