Apache Harmony is an open source Java implementation, starting with Java SE 5.0. Unfortunately, it still does not implement the full Java SE API. Swing has several gaps in functionality, and one of the missing pieces is an RTF parser.
The Rich Text Format (RTF) is a document file format developed by Microsoft for cross-platform document interchange. Most word processors can read and write RTF documents. The standard Sun implementation of the Java SE API has the class RTFEditorKit, which can read RTF text from a stream and transform it into an implementation of the Document interface, or write the content of a Document to a stream in RTF format. The RTF parser implementation from Sun is incomplete and does not support the latest versions of the RTF specification.
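To make the read and write paths described above concrete, here is a minimal sketch that uses the existing `javax.swing.text.rtf.RTFEditorKit` API. The class name `RtfRoundTrip` and its helper methods are hypothetical and exist only for illustration. The sample input is a tiny hand-written RTF string, and ISO-8859-1 is assumed only as a simple byte-for-byte encoding for that ASCII sample.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical helper class, not part of Harmony or the JDK.
final class RtfRoundTrip {

    /** Parses RTF source into a new Document created by the editor kit. */
    static Document read(String rtf) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        Document doc = kit.createDefaultDocument();
        byte[] bytes = rtf.getBytes(StandardCharsets.ISO_8859_1);
        kit.read(new ByteArrayInputStream(bytes), doc, 0);
        return doc;
    }

    /** Serializes the whole Document back to RTF text. */
    static String write(Document doc) throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        kit.write(out, doc, 0, doc.getLength());
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws Exception {
        Document doc = read("{\\rtf1\\ansi Hello, RTF!}");
        // Plain text extracted by the parser.
        System.out.println(doc.getText(0, doc.getLength()));
        // RTF produced by the writer for the same document.
        System.out.println(write(doc));
    }
}
```

A Harmony implementation of RTFEditorKit would need to support the same two entry points, so a round trip like this could serve as a first smoke test.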
The main goal of this project is to write a complete implementation of an RTF parser that can read and write text in RTF. In other words, the class RTFEditorKit will be implemented with all of its necessary methods. The implementation will fully cover the functionality of Sun's RTF parser and, if possible, supersede it by adding support for the latest versions of the RTF specification. All required test cases for the RTF parser will be created as well.
The original idea can be found at http://wiki.apache.org/general/SummerOfCode2008#harmony
Subject ID: harmony-swing-rtf
The RTF part of the Java SE API consists of a single class named RTFEditorKit, which contains methods for reading and writing text in RTF format. As this project aims to implement an RTF parser whose functionality at least covers that of Sun's RTF parser, the preliminary stage will consist of black-box tests on Sun's implementation of RTFEditorKit. The aim of these tests is to determine the version of the RTF specification supported by Sun's RTF parser. Then a grammar under the Apache license should be created for one of the popular parser generators, along with an RTFEditorKit implementation based on that grammar. As a side benefit, publishing such a grammar under an open source license would help any Java program that needs to parse RTF.
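A black-box test of the kind described above could look like the following sketch. It feeds small RTF snippets to the reference RTFEditorKit and inspects the resulting character attributes, without looking at the parser's internals. The class `RtfBlackBoxProbe`, its method names, and the choice of the three control words (`\b`, `\i`, `\ul`) are assumptions made for illustration; a real test suite would cover many more control words from the specification.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

// Hypothetical probe class, not part of Harmony or the JDK.
final class RtfBlackBoxProbe {

    /** Reads the snippet and returns the attributes of its first character. */
    static AttributeSet firstCharAttributes(String rtf)
            throws IOException, BadLocationException {
        RTFEditorKit kit = new RTFEditorKit();
        // StyledEditorKit (the superclass) creates a styled document by default.
        StyledDocument doc = (StyledDocument) kit.createDefaultDocument();
        byte[] bytes = rtf.getBytes(StandardCharsets.US_ASCII);
        kit.read(new ByteArrayInputStream(bytes), doc, 0);
        return doc.getCharacterElement(0).getAttributes();
    }

    /** Maps each probed control word to whether the parser honoured it. */
    static Map<String, Boolean> probeCharacterFormatting()
            throws IOException, BadLocationException {
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("\\b",
                StyleConstants.isBold(firstCharAttributes("{\\rtf1\\ansi\\b x}")));
        results.put("\\i",
                StyleConstants.isItalic(firstCharAttributes("{\\rtf1\\ansi\\i x}")));
        results.put("\\ul",
                StyleConstants.isUnderline(firstCharAttributes("{\\rtf1\\ansi\\ul x}")));
        return results;
    }

    public static void main(String[] args) throws Exception {
        // Prints one line per control word with the observed result.
        probeCharacterFormatting().forEach(
                (word, supported) -> System.out.println(word + " -> " + supported));
    }
}
```

Running the same probes against both implementations would give a direct, per-control-word comparison of coverage.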
A parser generator will be used to implement the RTF parser. Parser generators create a parser from a grammar. The generated parser becomes the part of the Java application that understands RTF lexemes. There are several options that could be used for the RTF project:
- JavaCC 
- ANTLR
- LPG 
- Custom parser
Each of these options has its advantages and disadvantages. For example, the current version of the CSS parser is based on JavaCC, while some other parsers used in Harmony were generated using ANTLR. There were also proposals to implement the RTF parser as a custom, hand-written parser. Some research must be undertaken to determine which option is most suitable for the RTF project.
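To illustrate what the custom parser option involves, here is a minimal sketch of a hand-written tokenizer for the basic RTF lexemes: group delimiters, control words with an optional numeric parameter, control symbols, and plain text. The class `RtfTokenizer` and its token kinds are hypothetical names. The sketch deliberately ignores details a real parser must handle, such as `\'hh` hex escapes, `\bin` binary data, and Unicode escapes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tokenizer, shown only to illustrate the hand-written approach.
final class RtfTokenizer {

    enum Kind { GROUP_START, GROUP_END, CONTROL_WORD, CONTROL_SYMBOL, TEXT }

    static final class Token {
        final Kind kind;
        final String value;

        Token(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + value + ")";
        }
    }

    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static List<Token> tokenize(String rtf) {
        List<Token> tokens = new ArrayList<>();
        int n = rtf.length();
        int i = 0;
        while (i < n) {
            char c = rtf.charAt(i);
            if (c == '{') {
                tokens.add(new Token(Kind.GROUP_START, "{"));
                i++;
            } else if (c == '}') {
                tokens.add(new Token(Kind.GROUP_END, "}"));
                i++;
            } else if (c == '\\') {
                i++; // skip the backslash
                if (i < n && isLetter(rtf.charAt(i))) {
                    int start = i;
                    while (i < n && isLetter(rtf.charAt(i))) i++;
                    // Optional signed numeric parameter, e.g. "rtf1" or "li-720".
                    if (i + 1 < n && rtf.charAt(i) == '-' && isDigit(rtf.charAt(i + 1))) i++;
                    while (i < n && isDigit(rtf.charAt(i))) i++;
                    tokens.add(new Token(Kind.CONTROL_WORD, rtf.substring(start, i)));
                    // A single space after a control word is a delimiter, not text.
                    if (i < n && rtf.charAt(i) == ' ') i++;
                } else if (i < n) {
                    tokens.add(new Token(Kind.CONTROL_SYMBOL, String.valueOf(rtf.charAt(i))));
                    i++;
                }
                // A lone trailing backslash is silently dropped in this sketch.
            } else if (c == '\r' || c == '\n') {
                i++; // raw line breaks carry no meaning in RTF
            } else {
                int start = i;
                while (i < n && "{}\\\r\n".indexOf(rtf.charAt(i)) < 0) i++;
                tokens.add(new Token(Kind.TEXT, rtf.substring(start, i)));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("{\\rtf1\\ansi\\b Hello}")) {
            System.out.println(t);
        }
    }
}
```

A generated parser from JavaCC or ANTLR would replace this loop with lexer rules in the grammar file. The trade-off to evaluate is between the maintainability of a grammar and the tighter control over error recovery that a hand-written scanner allows.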
Test cases will be created for all new Java classes. Also, as RTFEditorKit inherits from StyledEditorKit, the existing test cases for StyledEditorKit must be re-checked, because they are currently excluded for some reason.
To be a full implementation of Java SE 5.0, Harmony must implement the full Java API, and that includes an RTF parser. Since applications may rely on the RTF parser, its presence in Harmony will be a benefit.
The RTFEditorKit class. As RTFEditorKit works with the Document interface, an implementation of this interface for RTF will be created.
RTF parser classes generated by a parser generator (or custom classes created without a parser generator), together with the grammar for the parser generator.
Test cases for all created Java classes.
- April 15 - May 3: Understanding the RTF specification and its different implementations.
- May 4 - May 10: Sun RTF parser black-box tests.
- May 11 - May 25: Parser generators research.
- May 26 - June 7: Creation of design for RTFEditorKit, parser and all additional helper classes.
- June 8 - June 14: Writing grammar for parser generator.
- June 15 - July 7: Writing a working version of the RTF parser with limited functionality. Writing test cases.
- July 7 - July 14: Mid-term evaluation. At this point the RTF parser should be able to read a simple RTF file, display it in a Swing GUI form and, after editing, write it back to an RTF file. Bug fixing.
- July 15 - August 11: Completing the grammar and parser to cover the functionality that Sun's RTF parser provides. Adding support for the latest RTF specification. Writing test cases.
- August 11 - August 17: Bug fixing. Improving documentation.
- August 18 - September 1: Final evaluation.
All results, problems and proposed solutions will be discussed on a public mailing list. I will try to keep all Harmony developers interested in the RTF parser implementation up to date on my progress.
INFORMATION ABOUT ME
My name is Aleksey Lagoshin. I am a student at the National Technical University of Ukraine "Kyiv Polytechnic Institute", studying for my Master's degree.
I have over three years of experience as a Java developer. I worked at NetCracker, where I used technologies such as Swing, EJB2, JSP, JSF, JAXB, Oracle (PL/SQL) and Ant.
I also created a web-based IRC client using Google Web Toolkit (GWT). For example, to join the #harmony channel on Freenode from my web client use this link.
Last year I participated in Google Summer of Code with the Sockets for GWT project and successfully finished it. I have also integrated the module I developed into my IRC client.