Table of Contents

1. Goal

The patch implements caching of the isolation level and the current schema name in the client driver. The caching lets the client driver execute getTransactionIsolation() and a new method called getCurrentSchemaName() (which is not part of the java.sql.Connection interface), without a round-trip to the server.

It would also let the setTransactionIsolation() be "short-circuited" (and avoid a round-trip), in the case where the new isolation level is equal to that cached in the client. The latter optimization is not part of this patch since it would mean that setTransactionIsolation() no longer would be guaranteed to commit the current transaction. Including it would have made the client driver behave the same way as the embedded driver in this respect, but unfortunately it would also mean that a number of tests expecting the old behavior would fail.

2. Motivation

One would ordinarily not expect the performance of getTransactionIsolation() or getCurrentSchemaName() to be critical. However, the use of connection pools (commonly found in application server environments), invalidates this assumption.

A connection pool works by keeping a limitied number of (open) connections in a pool and letting application (threads) share the connections in the pool. The JDBC spec mandates that the connection pool must make sure that all attributes of a connection obtained from a connection pool must have the same default values as a connection obatined from a call to DriverManager.getConnection() (or DataSource.getConnection()).

The connection pool manager can either unconditionally set the isolation level back to its default value when a connection is returned to the pool (alternatively just before it is returned from the pool), or it can query the connection for its isolation level setting and just change it if it is different from the default. In either case, a driver that doesn't cache the isolation level will be forced to make an additional round-trip to the server each time a connection enters (or leaves) the connection pool.

A cheap way to get the current schema for a connection is important when implementing a [https://issues.apache.org/jira/browse/DERBY-3313 statement cache] on the client side, since PreparedStatement objects only can be re-used if the statement text is identical, and the schema used during compiliation and the current schema are identical.

3. The Stale Cache Problem

Caching the isolation level and current schema in the context of JDBC is simple, and prior to DERBY-1148 the client driver would cache the isolation level and only update its cache whenever setTransactionIsolation() was called. As explained in DERBY-1148 this is not sufficient, as the isolation level can change without setTransactionIsolation() being called (XA, SQL, stored procedures, and functions). The current schema can not be modified from JDBC, but it can be changed both from SQL and in stored procedures and functions.

In all of theses cases the cached values become stale (different from the actual isolation level and current schema in the embedded connection used by the NetworkServer).

4. Solutions to the Stale Cache Problem

One approach to solving the problem would be to scan all activity on the connection (including statements and prepared statements) to detect if a new isolation level is being set. This would be able to detect changes done in SQL, but not those coming from XA state changes, or those happening in stored procedures. A defensive approach would be to assume that any stored procedure or XA state change may have changed the isolation level so that a refresh from the server is required. This approach seems to limiting to be truly useful.

4.1. Implementing Piggy-backing

The discussion at DERBY-3192 showed that it is possible to add new product-specific code points which can be used to implement piggy-backing. This idea is the basis for the derby-3192-mark2.*.diff series of patches, and is described in the following paragraphs.

4.1.1. Getting session data from the embedded driver

In order for piggy-backing to be reasonably efficient the NetworkServer needs a cheap (and preferably convenient) way of obtaining session data from its EmbedConnection. The proposed patch achieves this by adding a new method, EngineConnection.getCurrentSchemaName() that is implemented in EmbedConnection by simply forwarding to the LanguageConnectionContext.

It is probably possible to do this by executing VALUES CURRENT SCHEMA on the embedded connection. Preparing the query on the server would give the benefit a shared, cached query plan, but this would still be considerably more costly than direct access, especially since there would be substantial contention for the shared query plan. This query would, after all, need to be executed once for each statement which has the potential to modify session data.

One way of reducing the number of calls is to use a callback to notify the NetworkServer layer about changes to session data. This approach would probably be the most efficient, but it would also require more invasive changes to the embedded driver. It also has the potential to introduce unwanted coupling between the embedded driver and the NetworkServer.

4.1.2. Identifying session data changes

Given that the proposed solution does not use callbacks, it must be possible to compare the last piggy-backed session data values to those returned from the current embedded connection. This means that there must be some way of storing the last piggy-backed values in the NetworkServer layer. This simple idea is complicated by a number of factors.

The values stored within the NetworkServer need to be initialized to values which lets them be compared to values from the current embedded connection, even if no piggy-backing has happened yet. It may seem reasonable to use each session attribute's default value for this, but doing so would mean that the initially stored values would compare equal to those in the embedded connection until until an actual change was made. As explained in the client section further down, the client starts with an empty cache, so that if no piggy-backing has happened yet, the client falls back to sending queries as before. That would, in turn, mean that any request for a session data attribute which happens before the client has changed it from its default value, will result in a query being sent. And since relying on the defaults is common, this would largely defeat the purpose of session data caching. In the proposed patch this is remedied by initializing the stored values to a special value which is different from anything that can be returned from the embedded connection. The actual implementation keeps the stored session attribute values in an object and lets null function as the special value.

An alternative solution would have been to let the client initialize its cache to the default values, but this would introduce a brittle dependency between the the client and the server code that would be hard to enforce and detect. The client would also be forced to trust (based on the version information about the server) that it actually will piggy-back changes, whereas with the proposed approach the client can be defensive and assume that it needs to use queries to get session data until it actually sees session data arrive from the server.

The comparison also needs to consider the state of the current embedded connection. The following special conditions are equivalent to a comparison that evaluates to true, so no piggy-backing should be done when:

4.1.3. Encoding and decoding session data

The goal is to piggy-back all changes to the isolation level and the current schema on the first message going from the server to client after the change has been made. To avoid having to change every reply (and every parsing method in the client) only replies to commands that can change either attribute will include the changed values. The following DRDA commands appears to be able to modify either the isolation level or the current schema, (the server was equipped with an ASSERT to verify that session data was not modified by other commands):

The SYNCCTL command is used in conjunction with XA, and the changes caused by this command must also be piggy-backed.

An initial idea was to add the session data to existing DSS variables in the replies. However, closer examination showed that the replies were structured rather differently, contained different types of DSS variables, and that some of the candidate DSSes were not always included in the reply. This meant that modifying existing variables to include piggy-backing would require messy changes to many write* methods on the server, and especially to the corresponding parse* methods in the client.

Using a separate DSS which can be included without depending on other elements of the reply seemed to offer a much cleaner solution. Three new code points in the product-specific range (0xC000-0xffff) have been added to implement the "separate DSS" solution:

Code point

Code point name

Payload

0xC000

PBSD

One or both of PBSD_ISO and PBSD_SCHEMA

0xC001

PBSD_ISO

1 byte, JDBC isolation level

0xC002

PBSD_SCHEMA

UTF-8 encoded string, current schema name

4.1.4. Maintaining cached session data on the client side

As mentioned in a previous section, the client starts with and "empty" cache where the cached session attributes have been initialized to illegal values. Any attempt to access such an illegal value is trapped an causes the client to fall back to sending a query to the server.

Whenever session data arrives piggy-backed on a message, parsePBSD() gets called, and will extract the modified session data and call ConnectionCallbackIinterface.completePiggyBackIsolation(int) or ConnectionCallbackIinterface.completePiggyBackSchema(String) as appropriate to update the cached session data inside the am.Connection object.

4.2. XA Transactions

There is really no reason to treat the XA commands differently from other DRDA commands which modify session data. Piggy-backing has (in the latest version of the patch, after some gentle nudging by the reviewers) been added to the reply to the SYCNCCTL command. This way every XA state change which modifies session data also updates the client-side cache. Hence there is no longer necessary to clear the client cache on each XA state transition (as was done in earlier versions of the patch).

4.3. Overhead

Piggy-backing will necessarily add some overhead to the replies being sent to the client, but there is no overhead on the requests from the client to server (unlike in the previous patch proposal). The exact size of the overhead depends on the data being sent, as shown in the following table:

DRDA VARIABLE

SIZE (BYTES)

DSS (PBSD)

6

PBSD_ISO

4 + 1

PBSD_SCHEMA

4 + bytelen(UTF8(schemaname))

TOTAL

11/15 + bytelen(UTF8(schemaname))

When no session parameter has changed (the common case) no overhead is introduced. If only the isolation level has changed 11 additional bytes are added to the next reply. If only the schema has been modified 10 bytes + the byte length of the UTF-8 representation of the schema name are added. In the worst case, when both attributes have changed, 15 bytes + the byte length of the schema name will be added.

4.4. Compatibility

A change which introduces new product-specific code points needs to consider compatibility with other drivers (DB2 Universal Driver and IBM's ODBC driver) and older versions of the DerbyNetClient driver. Specifically the patch needs to ensure that the new code points only gets sent to DerbyNetClient drivers with version 10.4 or higher. This is achieved by testing both the client type and client version before piggybacking session data:

        if (appRequester.getClientType() != AppRequester.DNC_CLIENT ||
                !appRequester.greaterThanOrEqualTo(10, 4, 0)) {
            return;
        }

On the client side compatibility comes "for free" since the client will never cache session data unless it has received them through piggy-backing. But to be on the safe side DatabaseMetaData has been extended with predicates for determining if the server supports session data caching through piggy-backing. This allows the client to ASSERT that the server is piggy-backing changes before it uses its cached session data.

Derby3192Writeup (last edited 2009-09-20 22:11:17 by localhost)