Motivation (or problems I want to solve)

I don't think that a technical solution will directly lead to better docs. I just think that making the doc authoring process easier will invite many more people to help out.

Architecture

Overview

    +------------------+              +------------------+
    |SVN REPOSITORY    |              |  COCOON WEBSITE  |
    | +--------------+ |period. update| +--------------+ |
    | |global docs   |-+--------------+>|/             |-|----+
    | +--------------+ |              | +--------------+ |    |
    | +--------------+ |period. update| +--------------+ |    |
    | |2.3 docs      |-+--------------+>|/2.3          |-|----+
    | +--------------+ |              | +--------------+ |    |
    | +--------------+ |              | +--------------+ |    |
    | |2.2 docs      |-+--------------+>|/2.2          |-|----+
    | +--------------+ |              | +--------------+ |    |
 +->| +--------------+ |              |                  |    |
 |  | |...           | |              |                  |----+
 |  | +--------------+ |              |                  |    |
 |  +------------------+              +------------------+    |
 |                                                            |
 |                                                            |
 |         +---------------+                                  |
 |         |WEB APPLICATION|                +-------------+   |
 |         |               |  redirect to   | unique edit |   |
 |         | - edit        |<---------------|    link     |<--+
 +-------->| - comment     |                +-------------+
 update &  | - approve     |   ,---.
  commit   | - report      |  /     \
           |               | / local \
           |               |( tempor. )
           +---------------+ \ repos /
                              \     /
                               `---'

The center of the new architecture is SVN, which contains all Forrest repositories. Initially, these are\[1\]:

There are two ways in which the docs of these repositories are published:

Let's look at the structure of the website:

http://cocoon.apache.org ................... contains the global docs
http://cocoon.apache.org/2.2/............... contains the periodically (e.g. every 6 hours)
                                             published docs

Note that publishing multiple versions is good but can lead to confusion when people search on the web and get a page from a different release than the one they are using. I'd suggest a note/link on each page, clearly stating which version of Cocoon the page applies to, and giving access to the version navigation (ideally a link which searches for the same info in other versions, but that's for a future release (wink)) - BD

After the discussions on dev@cocoon I dropped the idea of publishing patch release docs. See the new drawing above that already reflects this. If you're interested in the original version of it, look into one of the former versions of this wiki page. ReinhardPoetz


The "living docs" that are periodically published out of the current repositories contain an "edit" and a "comment" link. These links redirect to a web application where authors can write new docs or comments, e.g.

http://cocoon.apache.org/edit/2.2/23.html --> http://someapacheserver.apache.org/webedit/edit/cocoon/2.2/23.html

http://cocoon.apache.org/comment/2.2/23.html --> http://someapacheserver.apache.org/webedit/comment/cocoon/2.2/23.html
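The mapping in the two examples above could be sketched roughly as follows. This is only an illustration, not part of the proposal: the function name `webedit_target` and the `project` parameter are made up here, and the host is the placeholder used in the examples.

```python
from urllib.parse import urlsplit

# Placeholder host copied from the examples above; the real server is undecided.
WEBEDIT_BASE = "http://someapacheserver.apache.org/webedit"
ACTIONS = {"edit", "comment"}


def webedit_target(url, project="cocoon"):
    """Map a public edit/comment link to the matching web application URL."""
    # Split the path into non-empty segments, e.g. ["edit", "2.2", "23.html"].
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) < 3 or segments[0] not in ACTIONS:
        raise ValueError(f"not an edit/comment link: {url}")
    action = segments[0]
    rest = "/".join(segments[1:])  # version and document, e.g. "2.2/23.html"
    return f"{WEBEDIT_BASE}/{action}/{project}/{rest}"
```

In practice the redirect would more likely be a rewrite rule on the web server; the function only spells out which parts of the URL are carried over.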

The web application at http://someapacheserver.apache.org/webedit/ has the following features:

The implementation of this mini CMS could be done based on


_Look at http://wiki.apache.org/cocoon/CocoonDocumentationSystem and find all my requirements. I'm sure that all six options are good enough but as *I* have to do it, I'll take the road that's the fastest for *me*. Don't forget that learning where I have to add extensions to an enterprise level CMS like Daisy, Hippo or Lenya would take *a lot* of time. And as I have the (maybe illusory) view that I would be rather fast in implementing it on my own, I'll probably go the Forrest, Gianugo or "write my own" way. That's it.

Once again, nothing speaks against taking one of the enterprise level CMSs, and if one of the communities around them does the implementation and takes all requirements listed at http://wiki.apache.org/cocoon/CocoonDocumentationSystem into consideration, I'm the last one who would stop them. One thing to add: I want to see a prototype of the webapp running in 6-8 weeks from now, whoever writes it._ ReinhardPoetz (a reply to a mail of Andreas Kuckartz who asked what would speak against Apache Lenya)


\[1\] If 2.2 is coming soon we should concentrate on 2.2 docs 

Document format

I propose the format that the default configuration of the HTMLCleaningConvertor generates. What is good enough for Daisy should work as well for us (wink). Alternatively we can use plain HTML4. This would have the advantage that the document can be edited using any HTML editor.


I propose HTML as a base format. Forrest is able to render it without problems, and the Incubator website uses it as a source format. - NKB

Currently I think of supporting both HTML (17.html) and "default HTMLCleaningConvertor XML" (17.xml). Only one of the two may exist for a given document, which is checked by Forrest. I think that once the online scenario is running, everybody will use it, even those who like to write a first version with Mozilla Composer, because they can copy the text into the HTMLArea form field afterwards. But until the online scenario is running we need an alternative, and this is where HTML comes in. ReinhardPoetz


In the repository, each document consists of 3 files:

 17.xml ........................ contains the content
 17.meta.xml ................... contains meta information (date, authors, history)
 17.comments.xml ............... contains user comments

Splitting up the docs into three parts has the advantage that the structure of each document is simpler and editors can be used to edit the content only. OpenOffice takes the same approach.


The SimpleContentModel idea might also be good. It uses a slightly different but interesting model, where a document is either a single XML file or a directory containing the main document and additional files - BD

This is what I propose; it does not contain a metadata file. The author and date info is in the source file, while the history is in SVN... no need to replicate stuff. The comments stuff can be added as a Forrest plugin. Maybe putting the comments in a doc.comments directory, each as a separate HTML file, would be nice. - NKB

 17.html ........................ contains the content
 17.comments.xml ...............  contains user comments

For the comments stuff, see my proposal in the Cocoon whiteboard. I simply added it in a custom pipeline - see http://svn.apache.org/repos/asf/cocoon/whiteboard/doc-repos/2.2/src/documentation/sitemap.xmap. About skipping the meta info, I'm not sure. I also want to provide info about status, target audience and keywords, and this for all languages. Putting this information into plain text makes it hard/impossible to use it explicitly in the publishing process or to query it in the future. Here is the structure of a document on the filesystem:

In the repository, each document consists of a directory, which can contain the following files:

 ./content_en.xml ........................ content as cleaned XML
 ./content_en.html ....................... the content as HTML (only one of content_en.xml OR content_en.html is allowed)
 ./content_de.xml ........................ content in German
 ./meta.xml .............................. contains meta information (date, authors, status, target audience, keywords)
 ./comments_en.xml ....................... contains user comments in English
 ./comments_de.xml ....................... contains the user comments on the German version
 ./mimes/ ................................ contains all images and attachments

The changes include the ideas from Bertrand and Nicola. Additionally I added a location for mimes (images, attachments like ZIPs) and the option of having the content available in multiple languages. ReinhardPoetz
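The rule that a language may have its content either as cleaned XML or as HTML, but not both, could be checked along these lines. This is a rough sketch only: the proposal leaves that check to Forrest, and the function name `check_document_dir` and the two-letter language pattern are assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed naming pattern: content_<two-letter language>.xml or .html
CONTENT = re.compile(r"^content_([a-z]{2})\.(xml|html)$")


def check_document_dir(doc_dir):
    """Return a list of problems found in one document directory."""
    formats_by_lang = {}
    for entry in Path(doc_dir).iterdir():
        match = CONTENT.match(entry.name)
        if match:
            lang, fmt = match.groups()
            formats_by_lang.setdefault(lang, set()).add(fmt)

    problems = []
    if not formats_by_lang:
        problems.append("no content_<lang>.xml or content_<lang>.html found")
    # A language must not have both an XML and an HTML version.
    for lang, formats in sorted(formats_by_lang.items()):
        if len(formats) > 1:
            problems.append(f"content_{lang}: both .xml and .html present")
    return problems
```

Files such as meta.xml, the comments files and the mimes directory are ignored by this check.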

Why 'mimes'?? 'attachments' or 'files' sounds less surprising to me - BD


Document identifiers

I think we should move away from speaking (descriptive) names like "custom_components", use plain numbers instead, and put all documents into a flat directory. Speaking identifiers are nearly always a problem in data modelling.

Advantages:

Note on auto-generated docs: every process that generates docs automatically has to use a unique numbering scheme (namespace) so that IDs can't conflict with existing docs.


Apart from the very controversial idea of having numbers as IDs, the most important part is the flat structure (wiki style) of all documents (see http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=110593998230556&w=2 by Stefano). Structuring of content is IMO not a concern of the repository but of the publishing process. Forrest offers everything we need. ReinhardPoetz

After reading others' opinions I withdraw my proposal, and will use flat URLs with speaking identifiers. ReinhardPoetz


Forrest repositories

See http://svn.apache.org/repos/asf/cocoon/whiteboard/doc-repos/ for two examples that work with the latest SVN version of Forrest 0.6.

Published docs

The proposal for new global docs and user docs is available at http://apache.org/~reinhard/cocoon/1.html. Note that the page is generated out of two Forrest repositories.

The global repository is responsible for the tabs "Project" and "Community". The tabs "Getting started" and "Documentation" link to the most recent, editable userdocs. Older (frozen) docs get their links in the second-level pelt in the "Documentation" tab.

What do we have to do?

Step by Step

The good news is that we don't need all the features at once. We can go the path to better docs and a better documentation system step by step:

Who does the work?

Roadmap

Readers' comments

Please use italics to add comments above this line, or write more extensive comments and ideas below.