Description/design document

Harmony offers a number of different garbage collectors. GSoC proposal harmony-gc-3 calls for a graphical front end that visualizes memory management activity and runtime status: a heap/GC visualization tool. GCSpy[1] is "an architectural framework for the collection, transmission, storage and replay of memory management behaviour", and is already used in the JikesRVM[2] project to observe and study garbage collector behaviour[3, 4].

The GCSpy architecture[5] consists of a server integrated with the VM and a separate visualization client, which communicate over a socket. Integrating the server requires some data collection within the VM, and a driver that maps this data onto the abstractions used by GCSpy. The drivers are collector- and VM-specific. GCSpy is also designed to be performance-neutral: when the visualization client is not connected, the speed of the VM is unaffected. The client itself defines a plugin architecture for visualizations, which gives flexibility for future additions.
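
As a rough illustration of what a driver does, the sketch below maps object extents in a heap space onto fixed-size tiles and skips all work while no client is attached. It is not the GCSpy API: the class names TileStream and SpaceDriverSketch, the method names, and the "used bytes per tile" stream are hypothetical, chosen only to show the shape of the mapping and of the connection guard.

```java
/** Hypothetical stand-in for a GCSpy-style stream: one integer value per tile. */
final class TileStream {
    private final int[] tiles;

    TileStream(int tileCount) {
        tiles = new int[tileCount];
    }

    void reset() {
        java.util.Arrays.fill(tiles, 0);
    }

    void add(int tile, int amount) {
        tiles[tile] += amount;
    }

    int get(int tile) {
        return tiles[tile];
    }

    int size() {
        return tiles.length;
    }
}

/** Hypothetical driver: maps object extents in one heap space onto fixed-size tiles. */
final class SpaceDriverSketch {
    private final long spaceStart;
    private final int tileSize;
    private final TileStream usedBytes;
    // Assumption: the server flips this flag when a client attaches or detaches.
    private volatile boolean clientConnected = false;

    SpaceDriverSketch(long spaceStart, long spaceSize, int tileSize) {
        this.spaceStart = spaceStart;
        this.tileSize = tileSize;
        // Round up so that a partial final tile is still represented.
        int tileCount = (int) ((spaceSize + tileSize - 1) / tileSize);
        this.usedBytes = new TileStream(tileCount);
    }

    void setClientConnected(boolean connected) {
        clientConnected = connected;
    }

    TileStream usedBytes() {
        return usedBytes;
    }

    /** Called by the collector per object; a single flag check when no client is attached. */
    void recordObject(long address, int sizeInBytes) {
        if (!clientConnected) {
            return;
        }
        long offset = address - spaceStart;
        long end = offset + sizeInBytes;
        // An object may straddle tile boundaries, so split its size across tiles.
        while (offset < end) {
            int tile = (int) (offset / tileSize);
            long tileEnd = (long) (tile + 1) * tileSize;
            int chunk = (int) (Math.min(end, tileEnd) - offset);
            usedBytes.add(tile, chunk);
            offset += chunk;
        }
    }
}
```

In this sketch the only cost on the disconnected path is one volatile read per call; whether that is cheap enough for a real collector is exactly the kind of question the performance comparison is meant to answer.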

GCSpy can also operate in an offline manner, by using a specialised server to replay traces. The same visualization client can then connect to the server as if it were attached to a running VM. Further enhancements to GCSpy include: sample-driven client-server communication, incremental stream updates and client-controlled stream update frequency for the framework; hierarchical driver grouping and hierarchical visualisation, zooming, and the ability to define and view relationships between tiles in different spaces for the client[4]. However, not all aspects of GCSpy have yet been updated to support these enhancements, notably the store/replay server.

Harmony has a very flexible, modular design for GC. In addition to documentation on the Memory Manager[6], a full, worked example of creating a garbage collector is available[7], as well as an overview of the state of GC within Harmony[8]. The presence of well-written and extensive documentation is encouraging as to the feasibility of adding drivers such as those required by GCSpy.

It is important to show that this integration does not compromise the performance of Harmony. Accordingly, the methodology described by Georges et al.[9] will be used to compare performance before and after integration, both with and without the visualization client connected.

List of deliverables

Quantifiable results for community

Having a heap/GC visualization tool available is an obvious benefit for the Harmony project. Further, by integrating a visualization framework that is already used with other VMs, it becomes possible to compare garbage collector behaviour not just within a VM, but across VMs. It also means that any further plugins written for the GCSpy visualization client for other projects may also be used for Harmony GC visualization.

Approach

The approach for this project is defined by the goals: to produce a working visualization tool for Harmony GC, and to ensure that it is performance-neutral. The former will involve running a number of benchmark cases to completion (without failure). Accuracy of the visualization will then be checked by the mentor/community, by comparison to expected behaviour. For performance, the techniques described in Georges et al.[9] will be used to benchmark before and after changes. This process will be followed for each driver as work progresses.
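
The before/after comparison could be sketched as follows, in the spirit of the confidence-interval approach to Java performance evaluation described by Georges et al.[9]. This is a simplified illustration, not the full methodology: the class name ConfidenceInterval is hypothetical, the timings in the usage note are made up, and the critical value is passed in by the caller (for example 2.776 for a 95% Student's t interval with 5 samples).

```java
/** Hypothetical helper: confidence interval for the mean of a set of timings. */
final class ConfidenceInterval {
    final double mean;
    final double halfWidth;

    private ConfidenceInterval(double mean, double halfWidth) {
        this.mean = mean;
        this.halfWidth = halfWidth;
    }

    double low() {
        return mean - halfWidth;
    }

    double high() {
        return mean + halfWidth;
    }

    /** Overlapping intervals give no evidence of a difference at the chosen confidence level. */
    boolean overlaps(ConfidenceInterval other) {
        return low() <= other.high() && other.low() <= high();
    }

    /**
     * Builds the interval mean +/- critical * s / sqrt(n), where s is the
     * sample standard deviation (n - 1 in the denominator).
     * The critical value (t or z) must match the sample count and confidence level.
     */
    static ConfidenceInterval of(double[] samples, double critical) {
        int n = samples.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        double mean = sum / n;
        double squaredDeviations = 0.0;
        for (double s : samples) {
            squaredDeviations += (s - mean) * (s - mean);
        }
        double stdDev = Math.sqrt(squaredDeviations / (n - 1));
        return new ConfidenceInterval(mean, critical * stdDev / Math.sqrt(n));
    }
}
```

For example, given five illustrative timings per configuration, ConfidenceInterval.of(before, 2.776).overlaps(ConfidenceInterval.of(after, 2.776)) returning true would mean the measurements give no evidence that the integration changed performance at the 95% level; non-overlapping intervals would indicate a measurable difference worth investigating.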

Approximate schedule

In terms of the timeline suggested by Google, there are three sets of goals:
- Initial project goal:

- Mid-term project goal:

- Final project goal:

Background text

I'm a part-time PhD student at the University of Kent, with my research interests centered around visualization. I have proficiency in Java (8 years), C (8 years) and C++ (6 years) along with some knowledge of Perl, Python, Haskell and some other languages. As part of my PhD, I have written and tested roughly 25k lines of C++ code in an application for modelling star formation[10]. I have lectured and assessed in both Java (including Swing) and C at Canterbury Christ Church University[11]. I am due to submit my PhD thesis by 31st April, and so, with the exception of a day or two for my viva, I have no academic or work commitments over the entire GSoC period. This means that, if selected, I will be able to devote my full attention to the project, which would not have been possible for me in previous years.

[ 1] http://www.cs.kent.ac.uk/projects/gc/gcspy/
[ 2] http://jikesrvm.org/
[ 3] http://doi.acm.org/10.1145/582419.582451
[ 4] http://dx.doi.org/10.1145/1133956.1133972
[ 5] http://research.sun.com/projects/gcspy/
[ 6] http://wiki.apache.org/harmony/MemoryManager
[ 7] http://harmony.apache.org/subcomponents/drlvm/gc-howto.html
[ 8] http://people.apache.org/~xli/docs/harmony_gcv5_overview.pdf
[ 9] http://doi.acm.org/10.1145/1297105.1297033
[10] http://www.cs.kent.ac.uk/people/rpg/rw24
[11] http://www.canterbury.ac.uk