Apache Harmony is an open source Java implementation, starting with Java SE 5.0, and unfortunately it does not yet have a full implementation of the Java SE API. Swing has several gaps in functionality, and one of the missing pieces is an RTF parser.

The Rich Text Format (RTF) is a document file format developed by Microsoft for cross-platform document interchange. Most word processors are able to read and write RTF documents. The standard Sun implementation of the Java SE API has the class RTFEditorKit, which can read RTF text from a stream and transform it into an implementation of the Document interface, or write the content of a Document to a stream in RTF. Sun's RTF parser implementation is incomplete and does not support the latest versions of the RTF specification [1].
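As an illustration of the two directions described above, here is a minimal sketch that uses the standard javax.swing.text.rtf.RTFEditorKit API. The class name RtfRoundTrip and its helper methods are hypothetical names chosen for this example, and the ISO-8859-1 encoding is an assumption that is adequate for 7-bit RTF input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class; only RTFEditorKit and Document come from the Java SE API.
public class RtfRoundTrip {

    // Reads RTF source into a Document and returns the plain text it contains.
    public static String rtfToPlainText(String rtf)
            throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        // Assumption: the input is 7-bit RTF, so ISO-8859-1 bytes are sufficient.
        kit.read(new ByteArrayInputStream(rtf.getBytes("ISO-8859-1")), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    // Puts plain text into a fresh Document and writes that Document out as RTF.
    public static String plainTextToRtf(String text)
            throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        doc.insertString(0, text, null);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        kit.write(out, doc, 0, doc.getLength());
        return out.toString("ISO-8859-1");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rtfToPlainText("{\\rtf1\\ansi Hello, RTF}"));
        System.out.println(plainTextToRtf("Hello, RTF"));
    }
}
```

The exact RTF produced by the write direction depends on the implementation in use, so the sketch only prints it rather than comparing it with a fixed string.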

The main goal of this project is to write a complete implementation of an RTF parser that can read and write RTF text; in other words, the class RTFEditorKit will be implemented with all of its necessary methods. The implementation will fully cover the functionality of Sun's RTF parser and, if possible, supersede it by adding support for the latest versions of the RTF specification. All required test cases for the RTF parser will be created as well.

[1] http://www.microsoft.com/downloads/details.aspx?FamilyId=DD422B8D-FF06-4207-B476-6B5396A18A2B&displaylang=en

The original idea can be found at http://wiki.apache.org/general/SummerOfCode2008#harmony

Subject ID: harmony-swing-rtf


The RTF Java SE API [1] consists of a single class named RTFEditorKit, which contains methods for reading and writing text in RTF. Because this project aims to implement an RTF parser that at least covers the functionality of Sun's RTF parser, the preliminary stage will consist of black-box tests on Sun's implementation of RTFEditorKit. The aim of these tests is to determine which version of the RTF specification Sun's RTF parser supports. Then a grammar under the Apache license should be created for one of the popular parser generators, together with an RTFEditorKit implementation based on that grammar. As a side benefit, publishing such a grammar under an open source license would help any Java program parse RTF.
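A black-box test of this kind could look like the following sketch: it feeds a small RTF fragment to whichever RTFEditorKit is on the class path and inspects the resulting document attributes, without looking at the parser internals. The class name RtfBlackBoxProbe and the method isBoldHonored are hypothetical; the probe checks only the \b (bold) control word, and a real test suite would repeat the pattern for control words introduced in different versions of the specification.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical probe class; it treats RTFEditorKit strictly as a black box.
public class RtfBlackBoxProbe {

    // Returns true if the RTFEditorKit in use turns \b ... \b0 into a bold run
    // while leaving the preceding text non-bold.
    public static boolean isBoldHonored()
            throws IOException, BadLocationException {
        String rtf = "{\\rtf1\\ansi plain \\b bold\\b0  plain}";
        RTFEditorKit kit = new RTFEditorKit();
        // StyledEditorKit, the superclass, creates a styled document by default.
        StyledDocument doc = (StyledDocument) kit.createDefaultDocument();
        kit.read(new ByteArrayInputStream(rtf.getBytes("ISO-8859-1")), doc, 0);

        String text = doc.getText(0, doc.getLength());
        int boldStart = text.indexOf("bold");
        if (boldStart < 0) {
            return false; // the text itself was lost, so the probe fails
        }
        AttributeSet plainAttrs = doc.getCharacterElement(0).getAttributes();
        AttributeSet boldAttrs = doc.getCharacterElement(boldStart).getAttributes();
        return !StyleConstants.isBold(plainAttrs) && StyleConstants.isBold(boldAttrs);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("\\b honored: " + isBoldHonored());
    }
}
```

Because the probe relies only on the public API, the same test can later be run unchanged against the Harmony implementation to compare its behavior with Sun's.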

A parser generator will be used to implement the RTF parser. Parser generators create a parser from a grammar; the generated parser is the part of the Java application that understands RTF lexemes. Several parser generators could be used for the RTF project: JavaCC [2], ANTLR [3], and LPG [4].

Each of these options has its advantages and disadvantages. For example, the current version of the CSS parser [5] contains an implementation based on JavaCC, while some other parsers used in Harmony were generated using ANTLR. There were also proposals to implement the RTF parser as a custom, hand-written parser. Some research must be undertaken to determine which approach is most suitable for the RTF project.
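To give an idea of the lexemes any of these parsers has to recognize, here is a minimal hand-written tokenizer sketch. It is not the proposed implementation: the class RtfLexerSketch and its token kinds are hypothetical, and the sketch deliberately ignores details such as the hexadecimal argument of \' and binary data introduced by \bin. It recognizes group delimiters, control words with an optional numeric parameter, control symbols, and plain text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified RTF tokenizer used only to illustrate the lexemes.
public class RtfLexerSketch {

    public enum Kind { GROUP_OPEN, GROUP_CLOSE, CONTROL_WORD, CONTROL_SYMBOL, TEXT }

    public static final class Token {
        public final Kind kind;
        public final String value;
        Token(Kind kind, String value) { this.kind = kind; this.value = value; }
        @Override public String toString() { return kind + "(" + value + ")"; }
    }

    public static List<Token> tokenize(String rtf) {
        List<Token> tokens = new ArrayList<Token>();
        StringBuilder text = new StringBuilder();
        int len = rtf.length();
        int i = 0;
        while (i < len) {
            char c = rtf.charAt(i);
            if (c == '{' || c == '}' || c == '\\') {
                flushText(text, tokens); // pending text ends at any special character
            }
            if (c == '{') {
                tokens.add(new Token(Kind.GROUP_OPEN, "{"));
                i++;
            } else if (c == '}') {
                tokens.add(new Token(Kind.GROUP_CLOSE, "}"));
                i++;
            } else if (c == '\\') {
                i++;
                if (i >= len) break; // dangling backslash at end of input: ignore
                if (isLetter(rtf.charAt(i))) {
                    int start = i;
                    while (i < len && isLetter(rtf.charAt(i))) i++;
                    // Optional numeric parameter, possibly negative.
                    if (i + 1 < len && rtf.charAt(i) == '-' && isDigit(rtf.charAt(i + 1))) i++;
                    while (i < len && isDigit(rtf.charAt(i))) i++;
                    tokens.add(new Token(Kind.CONTROL_WORD, rtf.substring(start, i)));
                    if (i < len && rtf.charAt(i) == ' ') i++; // a single space is a delimiter
                } else {
                    tokens.add(new Token(Kind.CONTROL_SYMBOL, String.valueOf(rtf.charAt(i))));
                    i++;
                }
            } else if (c == '\r' || c == '\n') {
                i++; // raw line breaks are not part of the document text
            } else {
                text.append(c);
                i++;
            }
        }
        flushText(text, tokens);
        return tokens;
    }

    private static void flushText(StringBuilder text, List<Token> tokens) {
        if (text.length() > 0) {
            tokens.add(new Token(Kind.TEXT, text.toString()));
            text.setLength(0);
        }
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        for (Token t : tokenize("{\\rtf1\\ansi Hello \\b world\\b0 !}")) {
            System.out.println(t);
        }
    }
}
```

A generated parser would express the same lexical rules declaratively in the grammar file, which is one of the trade-offs to weigh against a hand-written tokenizer like this one.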

Test cases will be created to test all new Java classes. Also, because RTFEditorKit inherits from StyledEditorKit, the test cases for this class must be re-checked, because for some reason they are currently excluded [6].

[1] http://java.sun.com/javase/6/docs/api/javax/swing/text/rtf/package-summary.html

[2] https://javacc.dev.java.net/

[3] http://www.antlr.org/

[4] http://sourceforge.net/projects/lpg/

[5] http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/modules/swing/src/main/java/common/org/apache/harmony/x/swing/text/html/cssparser/

[6] http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/modules/swing/make/exclude.common?view=markup


To be a full implementation of Java SE 5.0, Harmony must implement the full Java API, and that includes an RTF parser. Because an RTF parser may be needed for some tasks, its presence in Harmony will be a benefit.


The RTFEditorKit class. Because RTFEditorKit works with the Document interface, an implementation of this interface for RTF will be created.

RTF parser classes generated by a parser generator (or custom classes created without the help of a parser generator), and the grammar for the parser generator.

Test cases for all created Java classes.



All results, problems, and proposed solutions will be discussed on a public mailing list. I will try to keep all Harmony developers interested in the RTF parser implementation up to date on my progress.


My name is Aleksey Lagoshin. I am a student at the National Technical University of Ukraine "Kyiv Polytechnic Institute", studying for my Master's degree.

I have over three years of experience as a Java developer. I worked at NetCracker [1], where I used technologies such as Swing, EJB2, JSP, JSF, JAXB, Oracle (PL/SQL), and Ant.

I also created a web-based IRC client [2] using Google Web Toolkit (GWT). For example, to join the #harmony channel on Freenode from my web client, use this link [3].

Last year I participated in Google Summer of Code with the Sockets for GWT project [4] and completed it successfully. I have also integrated the module I developed into my IRC client.

[1] http://www.netcracker.com

[2] http://code.google.com/p/webirc/


[4] http://code.google.com/soc/2007/google/appinfo.html?csaid=98704A52B8FEB4B7

AlekseyLagoshin/GSoC2008/harmony-swing-rtf (last edited 2009-09-20 23:35:28 by localhost)