Google Summer of Code 2008 - Project Proposal
aleksey.shipilev AT gmail DOT com
Apache Harmony is an open source implementation of Java SE 5, designed to be modular. The current mainline version of Harmony uses DRLVM as its primary JVM. DRLVM development has passed its peak stage, and all work is now focused on J2SE 5 and J2SE 6 compliance. However, some parts of DRLVM need refactoring from both the architectural and the performance standpoints.
A crucial part of this refactoring is the unification of native memory management in DRLVM. DRLVM components currently use multiple wrappers for memory management: APR pools, STD_MALLOC, PoolManager, MemoryManager, CRT malloc, VirtualAlloc, PortLib pools and others. This number of wrappers complicates debugging, tracing and performance tuning of DRLVM.
The primary goal of this project is to implement a single Unified Memory Manager (UMM) that serves all DRLVM components in place of the multiple wrappers and is equipped with debugging features such as memory leak control and precise management. Upon completion, this project will improve DRLVM architecture and manageability, and will also open the way to performance research (gathering allocation patterns, large pages impact, caching/prefetch strategies, concurrent allocation, etc.).
The suggested plan for the project is as follows:
- Developing the UMM:
- Add new DRLVM component "UMM" – Unified Memory Manager.
- Implement the malloc wrapper in UMM - hymalloc
- Implement the mmap/VirtualAlloc wrapper in UMM - hyvmmap
- Add memory leak detection capability in hymalloc and hyvmmap
- Add precise management capability in hymalloc and hyvmmap
- Extending the interfaces to fit hymalloc/hyvmmap
- Converting VM core to hy* usage
  b. PoolManager
  c. others
- Converting GC to hy* usage
  a. VirtualAlloc
  b. mmap
  c. others
- Converting JIT to hy* usage
  a. MemoryManager
  b. others
- Converting remaining VM components to hy* usage: Interpreter, EM, etc.
- Implement class-based allocation. E.g. reserve some memory for critical consumers such as exception handlers, finalizers, etc.
- Wrapping conventional malloc/free with UMM. This would allow Classlib native code and user native code to use UMM features.
- Provide a way to handle unsuccessful memory allocation.
- Implementing thread-local and other performance-optimized schemes of pooling
- Provide concurrent allocation schemes and check the impact on performance
The plan may change after discussion on the Harmony dev-list.
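To illustrate the first items of the plan (a malloc wrapper with leak detection), here is a minimal sketch in C. It is not the proposed implementation. Only the name hymalloc comes from the plan above; hymalloc_at, hyfree, hy_report_leaks and the hy_header layout are hypothetical names chosen for this example. The sketch assumes a C11 compiler (for max_align_t) and single-threaded use, so it has no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header placed in front of every live block. */
typedef union hy_header {
    struct {
        size_t size;              /* bytes requested by the caller */
        const char *file;         /* allocation site, used in leak reports */
        int line;
        union hy_header *prev;
        union hy_header *next;
    } info;
    max_align_t align;            /* keeps the user pointer maximally aligned */
} hy_header;

static hy_header *hy_live_head = NULL;  /* list of blocks not yet freed */
static size_t hy_live_blocks = 0;
static size_t hy_live_bytes = 0;

/* Allocate `size` bytes and remember where the request came from. */
void *hymalloc_at(size_t size, const char *file, int line)
{
    hy_header *h;

    if (size > SIZE_MAX - sizeof(hy_header)) {
        return NULL;                      /* header + size would overflow */
    }
    h = malloc(sizeof(hy_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->info.size = size;
    h->info.file = file;
    h->info.line = line;
    h->info.prev = NULL;
    h->info.next = hy_live_head;          /* push onto the live list */
    if (hy_live_head != NULL) {
        hy_live_head->info.prev = h;
    }
    hy_live_head = h;
    hy_live_blocks++;
    hy_live_bytes += size;
    return h + 1;                         /* user data starts after the header */
}

/* Release a block obtained from hymalloc_at; NULL is ignored, like free(). */
void hyfree(void *ptr)
{
    hy_header *h;

    if (ptr == NULL) {
        return;
    }
    h = (hy_header *)ptr - 1;
    assert(hy_live_blocks > 0 && hy_live_bytes >= h->info.size);
    if (h->info.prev != NULL) {           /* unlink from the live list */
        h->info.prev->info.next = h->info.next;
    } else {
        hy_live_head = h->info.next;
    }
    if (h->info.next != NULL) {
        h->info.next->info.prev = h->info.prev;
    }
    hy_live_blocks--;
    hy_live_bytes -= h->info.size;
    free(h);
}

/* Print every block that is still live and return how many there are. */
size_t hy_report_leaks(FILE *out)
{
    const hy_header *h;

    for (h = hy_live_head; h != NULL; h = h->info.next) {
        fprintf(out, "leak: %zu bytes allocated at %s:%d\n",
                h->info.size, h->info.file, h->info.line);
    }
    return hy_live_blocks;
}

/* Convenience macro that records the call site automatically. */
#define hymalloc(size) hymalloc_at((size), __FILE__, __LINE__)
```

In this sketch a component would allocate with `hymalloc(64)`, release with `hyfree(p)`, and call `hy_report_leaks(stderr)` at shutdown; a non-zero return value means at least one block was never freed. Recording the call site in the header is what makes the report actionable, at the cost of a few extra bytes per block.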
- New DRLVM component: UMM
  a. source code published in Apache JIRA or committed to trunk
  b. documentation, usage examples
- UMM impact on VM performance.
April 14 - May 5: Getting familiar with DRLVM internals on native memory management, instrumenting and gathering info on existing allocation schemes inside. Gathering the requirements and ideas on dev-list.
May 5 - May 26: Prototyping the UMM, implementing hymalloc/hyvmmap, prototyping features
May 26: Milestone: UMM prototype is ready and working on microtests
May 26 - June 16: First shot: moving VM core to UMM, extending the interfaces and capabilities if needed
June 16: Milestone: UMM prototype is working for VM core
June 16 - June 23: Second shot: moving GC to UMM
June 23 - June 30: Third shot: moving JIT to UMM
June 30 - July 7: Last shot: moving remaining components to UMM
July 7: Milestone: UMM prototype is working for all DRLVM components
July 7 - July 14: Completing mid-term evaluation
July 14 - August 4: Implementing missing features, bugfixing
August 4 - August 18: Stabilizing UMM, contributing to Harmony, solving remaining problems
August 18: Milestone: UMM declared stable.
August 18 - September 3: Bugfixing and performance research
Completing this project will require participation in the Harmony developer community through mailing lists and issue trackers. A discussion with possible mentors for this project (Xiao-Feng Li and Andrey Yakushev) has already started and resulted in this proposal. The implementation will also require the participation of Harmony committers at the time of contribution, as well as wide testing.
I'm a student at Saint-Petersburg University of Information Technologies, Mechanics and Optics, Saint-Petersburg, Russia. I'm completing the 5th year of a 6-year program in Computer Science. My experience includes 5+ years of C/C++, 3+ years of Java, and a number of other useful technical skills such as bash/Perl scripting. I have been participating in the Apache Harmony project for 1 year, focusing on Java performance in general. My recent work includes an epoll()-based j.n.c.Selector, JNI transition improvements, and a number of JIT-side and Classlib patches.
I'm strongly addicted to Open Source Software, both as a developer and as a user. I run Gentoo Linux on all of my PCs and actively contribute there through bugfixing and local and global community support. Still, I feel my contribution to OSS has been sporadic and insufficient, hence my involvement in Harmony. I'm also motivated by the goal of creating an OSS implementation of the JRE that can compete with proprietary products. Since the Java spec is more or less stable and leaves little room for competition, the main point of competition among JREs is performance. Seeking out and eliminating performance problems has therefore become my hobby. Now it's time to go further and implement something solid like UMM.