Google Summer of Code Project - Add new LRU Cache Manager

This is the Wiki page for the Google Summer of Code project to add a new LRU Cache Manager.

Student: Gokul Soundararajan

Mentor: Oystein Grovlen

Overview

In databases, caching plays an important role in performance: to obtain high performance, we need to cache pages from disk in memory to reduce access time. The problem is that the existing replacement algorithm performs poorly, so we need to research new page replacement algorithms and implement one in Apache Derby. The benefit of a better algorithm is improved performance.

In this project, I first have to survey existing work to compare the different replacement algorithms used in databases. From these, I (with my mentor's help) need to decide which one best suits the workloads people run on Apache Derby. After selecting the algorithm, I will implement it in Derby. In addition, I will implement unit tests. Finally, I will perform experiments showing the benefit of the new algorithm over the existing one.

Schedule

Design and Implementation Plan/Thoughts

References

  1. 2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm - Theodore Johnson and Dennis Shasha Link

  2. CAR: Clock with Adaptive Replacement - Sorav Bansal and Dharmendra S. Modha Link

  3. The LRU-K Page Replacement Algorithm For Database Disk Buffering - Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum Link

  4. ARC: A Self-Tuning, Low Overhead Replacement Cache - Nimrod Megiddo and Dharmendra S. Modha Link

  5. CLOCK-Pro: an effective improvement of the CLOCK replacement - Song Jiang, Feng Chen, and Xiaodong Zhang Link

Status/Updates

August 19, 2006

I have done a bit in the last month. I have two different implementations of CacheManager: (1) the PluggableCache and (2) ClockProCache. The ClockProCache is my implementation of the ClockPro algorithm within the Derby CacheManager interface. The PluggableCache is my attempt to separate the cache implementation from the replacement policy. The nice thing is that many policies can easily be tried with this approach.

To conclude my summer work, I have written a report detailing my efforts PDF. In addition, I have placed my code here CODE. There is much more work to do. I will continue working on this addition to Apache Derby.

July 31, 2006

With some simple code changes, I modified the mirrored cache implementation to have almost no overhead (only 1% now). I have posted the results of my Zipf workload, which I execute through the T_CacheService unit test. I'm using an approximation of ClockPro which gives pretty good performance, comparable to my earlier results from simulation. The ClockPro approximation is like 2Q, but the LRU lists are replaced with Clock lists. This avoids the overhead of shifting items to the top of the stack. I'll post more details later.
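For reference, the Clock mechanics that replace the LRU lists can be sketched in a few lines. This is a minimal second-chance (Clock) cache, not the actual Derby or ClockPro code; all names here are illustrative. The key point is that a hit only sets a reference bit, so no list reshuffling is needed:

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal second-chance (Clock) cache sketch; illustrative only. */
public class ClockSketch<K, V> {
    private static final class Entry<K, V> {
        final K key; V value; boolean referenced;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final Entry<K, V>[] slots;
    private final Map<K, Integer> index = new HashMap<>();
    private int hand = 0;

    @SuppressWarnings("unchecked")
    public ClockSketch(int capacity) { slots = new Entry[capacity]; }

    public V get(K key) {
        Integer i = index.get(key);
        if (i == null) return null;
        slots[i].referenced = true;   // hit: set the reference bit, nothing moves
        return slots[i].value;
    }

    public void put(K key, V value) {
        Integer i = index.get(key);
        if (i != null) { slots[i].value = value; slots[i].referenced = true; return; }
        // advance the hand, clearing reference bits, until a victim is found
        while (slots[hand] != null && slots[hand].referenced) {
            slots[hand].referenced = false;
            hand = (hand + 1) % slots.length;
        }
        if (slots[hand] != null) index.remove(slots[hand].key);
        slots[hand] = new Entry<>(key, value);
        index.put(key, hand);
        hand = (hand + 1) % slots.length;
    }
}
```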

See the results of the mirrored cache implementation compared to the original Clock already in Derby.

Graphs: PNG PDF

July 21, 2006

It's been a long month. I have been trying to figure out how to separate the replacement policy from the actual CacheManager design, and I think I finally have the answer. The idea is to maintain the cache's metadata separately, in accordance with the replacement policy (an instantiation of a replacement policy is called a replacement policy engine). Whenever a page is faulted in, we notify the engine. If the engine is full, it removes a page according to the replacement policy. When the CacheManager wants to do an eviction, it asks the policy engine for eviction candidates (the same applies to performWork()). From this list, the CacheManager checks each candidate and evicts the best one that satisfies the CacheManager's constraints, such as not being kept and not being dirty.

There is some overhead in this approach: on every cache access, we must update the engine as well. If the engine is well implemented, this overhead is minimized. The second potential problem is that the engine might have evicted an item that the cache hasn't evicted (perhaps it was kept earlier). Since the engine no longer tracks this item, it won't recommend it for removal again. This can be fixed by having performWork() periodically ask the engine for the eviction status of each item by going through the holders list. This step can be made adaptive by running it only when we exceed maximumSize by some threshold.
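The reconciliation step just described could look something like the sketch below. Everything here is hypothetical (the Engine interface, the holders list shape, and the threshold handling are mine, not Derby's); it only illustrates re-registering items the engine dropped but the cache still holds:

```java
import java.util.List;

/** Hypothetical reconciliation pass for the drift problem described above. */
class Reconciler {
    /** Stand-in for the replacement policy engine's relevant operations. */
    interface Engine {
        boolean contains(Object key);
        void access(Object key);
    }

    /** Runs only when the cache exceeds maximumSize by the given fraction;
     *  re-registers items the engine evicted but the cache still holds,
     *  so they can be recommended for eviction again. Returns the count. */
    static int reconcile(List<Object> holders, Engine engine,
                         int currentSize, int maximumSize, double threshold) {
        if (currentSize <= maximumSize * (1.0 + threshold)) return 0;
        int reRegistered = 0;
        for (Object key : holders) {
            if (!engine.contains(key)) {   // engine dropped it, cache did not
                engine.access(key);        // put it back under policy control
                reRegistered++;
            }
        }
        return reRegistered;
    }
}
```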

The Replacement Policy is implemented based on this interface:

public interface ReplacementPolicy {

    /** Record a cache access inside the engine. */
    public void access(Object key, String params);

    /** Given an item, return how evictable it is:
     *  1.0 => very sure, 0.0 => unsure.
     */
    public double evictionConfidence(Object key);

    /** Return a sorted list of evictable items:
     *  index(0) => most evictable, index(inf) => least evictable.
     */
    public Object[] evictableItems(int numOfItems);

}
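To illustrate how a CacheManager might consume this interface, here is a hypothetical LRU-backed policy engine together with the candidate-checking loop described above. None of this is the actual Derby code; the keptOrDirty predicate stands in for the real per-item constraint checks:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.function.Predicate;

interface ReplacementPolicy {
    void access(Object key, String params);
    double evictionConfidence(Object key);
    Object[] evictableItems(int numOfItems);
}

/** Hypothetical LRU-backed policy engine; illustrative only. */
class LruPolicyEngine implements ReplacementPolicy {
    // access-ordered map: iteration runs from least to most recently used
    private final LinkedHashMap<Object, Boolean> order =
            new LinkedHashMap<>(16, 0.75f, true);

    public void access(Object key, String params) {
        order.put(key, Boolean.TRUE);
    }

    public double evictionConfidence(Object key) {
        // placeholder: an item the engine no longer tracks is surely evictable
        return order.containsKey(key) ? 0.0 : 1.0;
    }

    public Object[] evictableItems(int numOfItems) {
        Object[] out = new Object[Math.min(numOfItems, order.size())];
        Iterator<Object> it = order.keySet().iterator();
        for (int i = 0; i < out.length; i++) out[i] = it.next(); // LRU first
        return out;
    }

    /** CacheManager side: evict the first candidate that is neither kept
     *  nor dirty. keptOrDirty stands in for the real per-item checks. */
    static Object pickVictim(ReplacementPolicy engine, Predicate<Object> keptOrDirty) {
        for (Object key : engine.evictableItems(8)) {
            if (!keptOrDirty.test(key)) return key;
        }
        return null; // no suitable candidate in this batch
    }
}
```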

June 20, 2006

Added new graphs for the 0.8 Zipf workload. I have also included Clock-Pro in my simulation. See PNG and PDF.

Clock-Pro performs the best for this workload, which Oystein pointed out is representative of database workloads. However, the results are not complete, since this workload does not include scans.

June 10, 2006

Commenting on the original thread in Derby Dev Mailing List Link

Thanks to all who commented on my early results. I have added the results of a mixed workload containing both Zipf accesses and scans. I followed the example in the 2Q paper, in which scans of different lengths were tried. I used a 10000-item cache with a 100000-item dataset and ran a mixed workload of 0.8 Zipf with 33% scans of lengths (0, 10, 100, 1000, 10000). The resulting graph is available as PNG and PDF. The results show a significant benefit to using the CART algorithm. Earlier, I was leaning towards the 2Q algorithm, but I found that it has a significant synchronization penalty. The Postgres community implemented the 2Q algorithm (in 8.0) when they found out that ARC was patented by IBM. Since then, they have moved to Clock (in 8.1), mostly due to the contention penalty in 2Q. Since CART is a cousin of Clock, it may have less overhead.
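A workload of this shape can be sketched as follows. This is a hypothetical generator (the seed, method names, and scan handling are mine, not from the actual experiments) that draws Zipf-distributed keys by binary search on a precomputed CDF and interleaves sequential scans:

```java
import java.util.Random;

/** Sketch of a mixed workload: Zipf-distributed point accesses
 *  interleaved with sequential scans. Illustrative only. */
public class MixedWorkload {
    private final double[] cdf;              // cumulative Zipf distribution
    private final Random rnd = new Random(42);
    private int scanPos = -1, scanEnd = -1;  // state of an in-progress scan

    public MixedWorkload(int items, double theta) {
        double norm = 0;
        for (int i = 1; i <= items; i++) norm += 1.0 / Math.pow(i, theta);
        cdf = new double[items];
        double cum = 0;
        for (int i = 1; i <= items; i++) {
            cum += (1.0 / Math.pow(i, theta)) / norm;
            cdf[i - 1] = cum;
        }
    }

    /** One Zipf-distributed key in [0, items), via binary search on the CDF. */
    public int nextZipf() {
        double u = rnd.nextDouble();
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Next access: continue a running scan, start one with the given
     *  probability, or fall back to a Zipf access. */
    public int next(double scanProb, int scanLen) {
        if (scanPos >= 0 && scanPos < scanEnd) return scanPos++;
        if (scanLen > 0 && rnd.nextDouble() < scanProb) {
            scanPos = rnd.nextInt(cdf.length);
            scanEnd = Math.min(cdf.length, scanPos + scanLen);
            return scanPos++;
        }
        return nextZipf();
    }
}
```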

June 04, 2006

Posted graphs for the Zipf 0.5/0.8 access patterns. I used a dataset containing 100000 items and different cache sizes (10, 100, 1000, 5000, 10000) for various Zipf workloads. I'm reporting the Hit Rate (%) after 100000 accesses.

PNG: 0.5 Zipf, 0.8 Zipf

PDF: 0.5 Zipf, 0.8 Zipf

The TwoQueue, CAR, and CART algorithms perform better than LRU and Clock.

June 02, 2006

Created prototypes for different cache algorithms:

  1. FIFO Cache
  2. LRU Cache
  3. Clock Cache
  4. Two Queue Cache
  5. CAR Cache
  6. CART Cache

Running basic experiments to measure performance on a Zipf workload.
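As an illustration of the simplest of these prototypes, an LRU cache can be sketched on top of java.util.LinkedHashMap's access-order mode. This is a plain-Java sketch, not the actual prototype code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache prototype built on LinkedHashMap; illustrative only. */
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true);          // true => access-order iteration
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;        // evict the least recently used entry
    }
}
```

With access order enabled, every get() moves the entry to the back of the iteration order, so the eldest entry is always the least recently used one.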

May, 2006

Gokul is setting up a Derby build of the trunk and reading up on the CacheManager and Clock implementations. He talked with Oystein Grovlen and discussed plans. The immediate plan is to create prototype caches to test different algorithms.

DerbyLruCacheManager (last edited 2009-09-20 22:11:37 by localhost)