Apache Harmony is an Open Source implementation of Java SE 5 platform, which includes class libraries and DRLVM (Dynamic Runtime Layer Virtual Machine). DRLVM has good modularity; the functionality of it is grouped into a limited number of coarse-grained modules with well defined interfaces.
Garbage collection is an important module in DRLVM; generally speaking, it is responsible for the Java objects allocation and heap reclamation. There are several garbage collection approaches in the Java heap management, including stop-the-world collector, concurrent collector etc. Harmony DRLVM GCv5 already has a concurrent GC called Tick, which is a concurrent mark-sweep collector. The benefit of this kind of GC is faster completion of collection, but the drawback is potential fragmentation.
The main purpose of this project is to implement a concurrent copying algorithm in DRLVM GC. It can improve the Java heap management by reduce the fragmentation and improve the data locality.
Idea: Harmony-GC, Concurrent Copying GC of Harmony DRLVM
Copying garbage collectors have several obvious advantages. They permit fast allocation by incrementing a pointer in the allocation space, and compaction of live objects for better locality and to minimize fragmentation.
At the first step of the project, I will create copying heap data structures, and then a basic copying algorithm will be implemented. Concurrent Mark-sweep garbage collection algorithm has been implemented in Tick. It does not move live object and coalesces unused memory into free chunks, which are used later to satisfy allocation requests. This approach is very efficient; however, it may introduce more fragmentation and the allocation time will be linear with respect to free spaces. Initially, I will use a simple semi-space copying algorithm in NOS space, which requires a little more space to swap the live objects. It arranges live objects in a continuous space, and leaving an empty space to satisfy allocation requests in constant time.
Second step is to reduce the long pause time in Stop-the-world-style garbage collection, a shout-pause garbage (or concurrent) collection can use incremental collection, partial collection or hybrid of them to perform the collection work without stopping the mutator in most phases. There is a common rule of tracing algorithm, that is, a scanned object can only contain references to other scanned or unscanned objects, never to unreached objects when the tracing is complete. But a concurrent collection may violate this rule by updating reference during the concurrent collection process if we do not record the reference variation after collection started. To deal with this situation, we can use write barrier, read barrier or transfer barriers. Any of them can intercept behaviors of mutator and perform fix-up operation. Write barriers are the most efficient while read barriers are too expensive and transfer barriers may be produce more floating garbage when there are many object died young. In this project, write barriers will be implemented coordinating with JIT compiler in DRLVM. After this step, the concurrent copying garbage collection can work correct in DRLVM.
The third step is improving the performance of concurrent garbage collector. Several goals are listed as follows:
- Minimizing the floating garbage;
- Mutator stopping time elimination, it may be caused by exhausted heap space etc.
- Reducing the worst situation of heap fragmentation;
- Improving the object data access locality;
- Subsequent allocation will be faster and easier.
As a result, Harmony DRLVM can archive better performance and higher throughput using this concurrent copying garbage collector.
- Source code in C/C++
- Design documents and descriptions of implementation detail
- Performance evaluation results
I’d like to accomplish the project in following milestones:
April 15 - April 30
Understanding gc_gen in DRLVM, design the basic heap structures;
May 1 - May 15 (milestone 1)
Coding for the basic heap structures and implementing a simple copying algorithm;
May 16 - June 15 (milestone 2)
Implementing write barrier for garbage collector, adding certain phase to JIT compiler to generate helper code for write barrier and make the concurrent collection can work across threads;
June 16 – June 30 (milestone 3)
Minimizing the floating garbage;
July 1 – July 8 (milestone 4)
Mutator stopping time elimination; Midterm evaluations.
July 9 – July 20 (milestone 5)
Improving the collection algorithm to reduce the fragmentation, accelerate the allocation time;
July 21 – August 3 (milestone 6)
Improving data access locality of in the copying phase of garbage collection;
August 4 – August 17
Find more opportunities to improve the performance of concurrent copying collector; Write documents for the implementation; doing evaluation and analyzing results data.
Name Simon-Chow (Xun Zhou)
Affiliations and Contracts
I am a second-year graduate student of Parallel Processing Institute, Fudan University, China. My research interest includes programming language, compiler, and virtualization technique.
Email: simon.harmony AT gmail DOT com
IM: probing6 AT hotmail DOT com (msn)
I have been using Java programming language for 6 years, I really love it. I have experiences both in Java SE, EE and ME development. Additionally, I am also familiar with C/C++ programming languages and system programming.
From Oct 2007, I begin to study Harmony in order to complete a research project of combining static compiler with dynamic compilation environment. We are using Open64 as static compiler and aiming to gain higher performance of Java program by coordinating with Harmony.
In the last few months, I have studied most of modules in DRLVM such as VMCore, EM, interpreter, Jitrino.OPT, GC and written several documents about DRLVM in my research group. My basic study approaches include debugging and code review, and then I am familiar with the basic structure of Harmony DRLVM and codes in particular modules, such as VMCore, interpreter, Jitrino.OPT, GC_gen etc.
Open source is a great movement in software development, especially in Java world, which provides numerous resources for both developers and software users. I often use open source code in my projects, which is always flexible and extendable. As a young researcher, I think Harmony is a great platform for Java related research, in which there are many challenge work and opportunities. I love Open Source very much, and I hope I can contribute to Harmony.
As a graduate student, I work in lab from 9:00am to 8:30pm at workdays. I can spend at least 25 hours per week on this project. If this project is assigned to me, it will be part of the final project of my master degree. So I will explore it as deep as possible and try my best to archive it perfect.