Abstract

Apache Harmony is an Open Source implementation of Java SE 5 platform, which includes class libraries and DRLVM (Dynamic Runtime Layer Virtual Machine). DRLVM has good modularity; the functionality of it is grouped into a limited number of coarse-grained modules with well defined interfaces.

Garbage collection is an important module in DRLVM; generally speaking, it is responsible for the Java objects allocation and heap reclamation. There are several garbage collection approaches in the Java heap management, including stop-the-world collector, concurrent collector etc. Harmony DRLVM GCv5 already has a concurrent GC called Tick, which is a concurrent mark-sweep collector. The benefit of this kind of GC is faster completion of collection, but the drawback is potential fragmentation.

The main purpose of this project is to implement a concurrent copying algorithm in DRLVM GC. It can improve the Java heap management by reduce the fragmentation and improve the data locality.

Proposal

Idea: Harmony-GC, Concurrent Copying GC of Harmony DRLVM

Project description

Copying garbage collectors have several obvious advantages. They permit fast allocation by incrementing a pointer in the allocation space, and compaction of live objects for better locality and to minimize fragmentation.

At the first step of the project, I will create copying heap data structures, and then a basic copying algorithm will be implemented. Concurrent Mark-sweep garbage collection algorithm has been implemented in Tick. It does not move live object and coalesces unused memory into free chunks, which are used later to satisfy allocation requests. This approach is very efficient; however, it may introduce more fragmentation and the allocation time will be linear with respect to free spaces. Initially, I will use a simple semi-space copying algorithm in NOS space, which requires a little more space to swap the live objects. It arranges live objects in a continuous space, and leaving an empty space to satisfy allocation requests in constant time.

Second step is to reduce the long pause time in Stop-the-world-style garbage collection, a shout-pause garbage (or concurrent) collection can use incremental collection, partial collection or hybrid of them to perform the collection work without stopping the mutator in most phases. There is a common rule of tracing algorithm, that is, a scanned object can only contain references to other scanned or unscanned objects, never to unreached objects when the tracing is complete. But a concurrent collection may violate this rule by updating reference during the concurrent collection process if we do not record the reference variation after collection started. To deal with this situation, we can use write barrier, read barrier or transfer barriers. Any of them can intercept behaviors of mutator and perform fix-up operation. Write barriers are the most efficient while read barriers are too expensive and transfer barriers may be produce more floating garbage when there are many object died young. In this project, write barriers will be implemented coordinating with JIT compiler in DRLVM. After this step, the concurrent copying garbage collection can work correct in DRLVM.

The third step is improving the performance of concurrent garbage collector. Several goals are listed as follows:

As a result, Harmony DRLVM can archive better performance and higher throughput using this concurrent copying garbage collector.

Deliverables:

Milestones:

I’d like to accomplish the project in following milestones:

Understanding gc_gen in DRLVM, design the basic heap structures;

Coding for the basic heap structures and implementing a simple copying algorithm;

Implementing write barrier for garbage collector, adding certain phase to JIT compiler to generate helper code for write barrier and make the concurrent collection can work across threads;

Minimizing the floating garbage;

Mutator stopping time elimination; Midterm evaluations.

Improving the collection algorithm to reduce the fragmentation, accelerate the allocation time;

Improving data access locality of in the copying phase of garbage collection;

Find more opportunities to improve the performance of concurrent copying collector; Write documents for the implementation; doing evaluation and analyzing results data.

Final evaluation.

About Me

Name Simon-Chow (Xun Zhou)

Affiliations and Contracts

I am a second-year graduate student of Parallel Processing Institute, Fudan University, China. My research interest includes programming language, compiler, and virtualization technique.

Contract information

Email: simon.harmony AT gmail DOT com

IM: probing6 AT hotmail DOT com (msn)

Programming skills:

I have been using Java programming language for 6 years, I really love it. I have experiences both in Java SE, EE and ME development. Additionally, I am also familiar with C/C++ programming languages and system programming.

Related experience:

From Oct 2007, I begin to study Harmony in order to complete a research project of combining static compiler with dynamic compilation environment. We are using Open64 as static compiler and aiming to gain higher performance of Java program by coordinating with Harmony.

In the last few months, I have studied most of modules in DRLVM such as VMCore, EM, interpreter, Jitrino.OPT, GC and written several documents about DRLVM in my research group. My basic study approaches include debugging and code review, and then I am familiar with the basic structure of Harmony DRLVM and codes in particular modules, such as VMCore, interpreter, Jitrino.OPT, GC_gen etc.

Motivation

Open source is a great movement in software development, especially in Java world, which provides numerous resources for both developers and software users. I often use open source code in my projects, which is always flexible and extendable. As a young researcher, I think Harmony is a great platform for Java related research, in which there are many challenge work and opportunities. I love Open Source very much, and I hope I can contribute to Harmony.

As a graduate student, I work in lab from 9:00am to 8:30pm at workdays. I can spend at least 25 hours per week on this project. If this project is assigned to me, it will be part of the final project of my master degree. So I will explore it as deep as possible and try my best to archive it perfect.