Resource Usage in Derby
This page contains results found when investigating Derby's resource usage (DERBY-1961). If you have numbers you want to share, please add them to the Results section at the end.
Environment
- The Derby version tested is 10.2.1.6.
- Hardware: 2.4 GHz Dual-CPU AMD Opterons with 2 GB memory.
- OS: Solaris 10
Configuration: 2 disks, one for log and one for data. Write cache is enabled . derby.storage.pageCacheSize=12500.
- Platform: Java SE 6.0 - build 1.6.0-rc-b102 or later
Load
The testing uses the load clients in DERBY-1961. They are able to perform three kinds of operations:
- Single-record select operations. Each transaction reads one row from a table of 100000 rows (each row is around 100 bytes).
- Single-record update operations. Each transaction updates one row in a table of 100000 rows (the same table as the select operations).
- Join operations. Joins one table of 10000 rows with a table of 1000 rows. The result contains 1000 rows.
All three load types access the rows by primary key.
The tests are run as client/server tests so that the resource usage of the network server can be compared to the amount of resources used by the other parts of Derby. To make sure that the page cache can hold the entire database, increase the page cache size by starting the network server with -Dderby.storage.pageCacheSize=12500.
Modules
When looking at where Derby consumes resources, it helps to divide the code into modules or layers. Since the Derby code already is structured in packages, the modules could be defined in terms of the code packages:
Module |
Derby package |
Network server |
org.apache.derby.drda and org.apache.derby.impl.drda |
JDBC |
org.apache.derby.jdbc and org.apache.derby.impl.jdbc |
Execution |
org.apache.derby.impl.sql |
Access |
org.apache.derby.impl.store.access |
Lock manager |
org.apache.derby.impl.services.locks |
Buffer manager |
org.apache.derby.impl.services.cache |
Logging |
org.apache.derby.impl.store.raw.log |
Data store |
org.apache.derby.impl.store.raw.data |
Transaction control |
org.apache.derby.impl.store.raw.xact |
Results
CPU and wall clock time
These results were obtained using a byte code instrumentation tool. The CPU times include both system and user CPU. The runs were done on a Solaris 10 machine with x86 hardware. Wall clock time were obtained using gethrtime and the CPU times using getrusage.
The network server was run with Java SE 6 (build 104, 64 bit Server VM). TestClient was configured with 30 seconds of warmup and 120 seconds runtime ("steady state").
|
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
|||
Module |
System & user CPU |
Wall clock time |
System & user CPU |
Wall clock time |
System & user CPU |
Wall clock time |
Network server |
14.5% |
47.0% |
10.8% |
6.0% |
1.3% |
1.0% |
JDBC |
4.1% |
1.4% |
1.0% |
0.0% |
2.4% |
1.0% |
Execution |
7.9% |
2.6% |
8.1% |
0.4% |
4.7% |
2.7% |
Access |
18.1% |
7.0% |
13.2% |
0.8% |
32.5% |
24.6% |
Lock manager |
19.7% |
30.9% |
18.8% |
10.0% |
13.5% |
40.6% |
Buffer manager |
6.8% |
2.3% |
5.5% |
0.3% |
1.3% |
0.7% |
Logging |
0.0% |
0.0% |
11.0% |
81.1% |
0.0% |
0.0% |
Data store |
23.6% |
7.2% |
24.4% |
1.1% |
33.4% |
22.2% |
Transaction control |
5.3% |
1.8% |
7.3% |
0.3% |
10.8% |
7.2% |
Object allocations
These results have been found by running the JVM (Java SE 6.0 - build 1.6.0-rc-b102) with the -agentlib:hprof=heap=sites,... option and analysing the stack traces that were dumped. The numbers are not very accurate (in particular the join results), but they should at least give an impression of how much memory each module allocates.
Module |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
|||
Network server |
905 B/tx |
15 objs/tx |
586 B/tx |
13 objs/tx |
68 B/tx |
4 objs/tx |
JDBC |
416 B/tx |
5 objs/tx |
|
|
176 B/tx |
3 objs/tx |
Execution |
1265 B/tx |
21 objs/tx |
2041 B/tx |
33 objs/tx |
1625 B/tx |
24 objs/tx |
Access |
840 B/tx |
25 objs/tx |
921 B/tx |
28 objs/tx |
110756 B/tx |
4093 objs/tx |
Lock manager |
579 B/tx |
23 objs/tx |
707 B/tx |
29 objs/tx |
42874 B/tx |
1774 objs/tx |
Buffer manager |
|
|
|
|
|
|
Logging |
|
|
93 B/tx |
3 objs/tx |
|
|
Data store |
1099 B/tx |
30 objs/tx |
1327 B/tx |
36 objs/tx |
92658 B/tx |
22960 objs/tx |
Transaction control |
80 B/tx |
3 objs/tx |
136 B/tx |
6 objs/tx |
80 B/tx |
3 objs/tx |
Derby1961Objects has more details about what sorts of objects are allocated, the size of the allocated objects, and the number of allocations.
Garbage collection
The next table shows how much of the CPU is spent on garbage collection. Platform: Java SE 6.0 - build 1.6.0-rc-b103, Solaris 10, dual CPU. The numbers were collected using the GC probes in DTrace.
VM |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
Server VM |
3.8% |
0.05% |
0.04% |
Client VM |
5.8% |
2.7% |
1.4% |
CPU spent on GC when the tests were run on a single-CPU machine:
VM |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
Server VM |
13% |
(not tested) |
(not tested) |
Client VM |
11% |
3.7% |
2.4% |
System calls
Results found by logging all system calls on the database server with DTrace.
Module |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
Network server |
4.0 scl/tx |
4.0 scl/tx |
4.2 scl/tx |
JDBC |
|
|
|
Execution |
|
|
|
Access |
|
|
3.7 scl/tx |
Lock manager |
1.9 scl/tx |
0.6 scl/tx |
26.4 scl/tx |
Buffer manager |
|
|
1.2 scl/tx |
Logging |
|
2.0 scl/tx |
|
Data store |
|
0.4 scl/tx |
0.5 scl/tx |
Transaction control |
|
|
|
Unknown |
|
|
7.2 scl/tx |
Context switches
The results in the following table have been found by logging all context switches on the database server with DTrace. There are two kinds of context switches:
- voluntary context switches, which means the thread gives the CPU to another thread voluntarily (e.g., blocking reads, calls to
Object.wait() or attempts to lock a monitor which is locked by another thread)
- involuntary context switches, which means that a thread is preempted by the scheduler (because its time quantum has expired)
All the numbers are context switches per transaction.
Module |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
|||
|
Voluntary |
Involuntary |
Voluntary |
Involuntary |
Voluntary |
Involuntary |
Network server |
2.0 |
0.0 |
2.0 |
0.0 |
2.0 |
0.0 |
JDBC |
|
|
|
|
|
|
Execution |
|
|
|
|
0.0 |
0.6 |
Access |
|
|
|
|
0.0 |
0.5 |
Lock manager |
0.4 |
0.1 |
0.3 |
0.1 |
7.3 |
2.9 |
Buffer manager |
|
|
|
|
0.1 |
0.2 |
Logging |
|
|
1.6 |
0.3 |
|
|
Data store |
|
|
|
|
0.0 |
0.6 |
Transaction control |
|
|
|
|
0.0 |
0.1 |
Unknown |
|
|
0.1 |
0.0 |
0.8 |
0.0 |
Disk Access
The results in the following table have been found by logging all disk accesses on the database server with DTrace. There are two types of disk access (No reads were observed. Disk accesses for select and join have been omitted):
- Asynchronous write
- Synchronous write
All the numbers are per transaction.
Module |
Single-record update (20 clients) |
|
|
Async write |
Sync write |
Logging |
5.5E-6 |
0.30 |
Data store |
0.06 |
6.7E-5 |
Monitors and synchronization
The next table shows the number of times the test clients tried to enter a contended monitor per transaction (as reported by the hotspot:::monitor-contended-enter probe in DTrace), and the number of times the clients slept on a synchronization object per transaction (as reported by the sched:::sleep probe in DTrace).
Platform: Java SE 6.0 - build 1.6.0-rc-b103, Solaris 10, dual CPU.
Module |
Single-record select (10 clients) |
Single-record update (20 clients) |
Join (4 clients) |
|||
|
Contention |
Sleep |
Contention |
Sleep |
Contention |
Sleep |
Network server |
|
2.0 |
|
2.0 |
|
2.0 |
JDBC |
|
|
|
|
|
|
Execution |
|
|
|
|
|
|
Access |
|
|
0.1 |
|
|
|
Lock manager |
0.3 |
1.0 |
0.5 |
0.7 |
2.7 |
6.8 |
Buffer manager |
0.1 |
|
0.1 |
|
0.1 |
0.1 |
Logging |
|
|
0.2 |
1.9 |
|
|
Data store |
|
|
|
|
|
|
Transaction control |
0.1 |
|
0.1 |
|
|
|
Unknown |
|
|
|
|
|
0.8 |
Profiling Results
Running the test client for 900 sec with the embedded driver and 2 clients. Only the two working threads have been profiled. Only methods that use more than 1% User CPU or have a "significant" Sync Wait Time or Sync Wait Count, have been included.
Profiling was done using Sun Studio11 collect and analyzer
Method Calls
Derby1961MethodCalls shows which methods are called in a transaction and how often they are called.