NOTE: This wiki page describes the initial implementation of replication and may not accurately describe how it works in current versions of Derby. For an up to date description, see the Replicating databases topic in the Derby Server and Administration Guide for your Derby version (link to 10.9 manual).

Table of Contents

1. Definition

Database replication is the creation and maintenance of multiple copies of the same database.

2. Classification of Replication forms

Replication is classified based on

2.1. When update propagation takes place

2.1.1. Eager (synchronous)

Updates propagate within the boundaries of the transaction, the user does not receive commit notification until sufficient number of copies of the database have been updated

Adv

Guarantees straightforward consistency

Disadv

Expensive because of message overhead and response time.

2.1.2. Lazy (Asynchronous)

Update a local copy commit it and only sometime after the commit the propagation of the changes takes place. Directly after the execution of the transaction the local server sends the response back to the client.

Adv

Wide variety of optimizations since it allows to bundle changes of different transactions and propagate updates on an interval basis to reduce communication overhead.

Disadv

Inconsistencies might result because of the delay in the propagation of the update information.

2.2. Who can perform updates

2.2.1. Primary copy

Update at the master copy or primary copy and then propagate the changes to the other copies. All clients contact the same server for updates.

Adv

Simplifies the replica control, since updates take place in the primary and the replicas apply only the changes that are propagated.

Disadv

The master copy is a bottleneck since all the transactions result in updates here.

2.2.2. Update Anywhere

Update any server for updates.

Adv

Speeds up access since multiple masters can be updated simultaneously.

Disadv

Establishing agreement among different replicas in complex, since, conflicting transactions might have executed on different replicas and might result in transactions not only being stale but inconsistent too. Reconciliation to decide which transactions are winners and which must be undone will be expensive.

3. Derby Replication

3.1. Principle

Derby implements an asynchronous, primary copy replication scheme. The replication implementation builds on Derby's existing capability to recover from a crash by starting with a backup and then rolling forward archived and active log files. This capability is used to fashion a slave implementation.

The master forwards logs to the slave. In the slave mode in contrast to the situation during normal crash recovery instead of proceeding to rollback unfinished transactions from when it has last seen the log, derby will instead wait for new log. When no more log is forthcoming derby will finish the recovery and the slave database is booted and can start serving transactions (failover).

Until failover happens only the master will serve transactions, so there is no load-sharing by the slave. Since the replication is asynchronous, some transactions may be lost upon failover, but the failed-over database will be transaction consistent when it is booted.

Related JIRA Issue(s)

Derby-2872

3.2. Basic Design

A simple block diagram for the replication implementation can be found below

ReplicationDesign.gif

The solid arrows indicate the flow of the log records through the system.

The dotted arrows indicate that the different modules interact through the master controller.

Explanation of individual modules

3.2.1. Master Controller

The derby module that implements the master mode of replication. It boots the other master sub-systems viz. Replication Log Buffer, Replication Log Shipper and the Replication Message Transmitter and acts as the central co-ordinator between them.

Related JIRA Issue(s)

Derby-2977

3.2.2. Replication Log Buffer

All log records that are written to the local log file are also appended to the Replication log buffer. The Derby logging system passes the log records to the Replication log buffer through the master controller. The log buffer is needed for the following reasons

1) A buffer is needed because the log records should not be shipped one at a time.

2) Writing to the log buffer instead of transmitting them immediately removes the network communitcation from the

Related JIRA Issue(s)

Derby-2926

Derby-3051

3.2.3. Replication Log Shipper

Does asynchronous shipping of log records from the master to the slave being replicated to. The implementation does not ship log records as soon as they become available in the log buffer (synchronously), instead it does log shipping in the following three-fold scenarios

1) Periodically (i.e.) at regular intervals of time.

2) when a request is sent from the master controller (force flushing of the log buffer).

3) when a notification is received from the log shipper about a log buffer element becoming full and the load on

Related JIRA Issue(s)

Derby-3064

Derby-3359

3.2.4. Replication Message Transmitter

Used to send replication messages to the slave. Called by the Master controller to transmit replication messages to the receiver.

Related JIRA Issue(s)

Derby-2921

3.2.5. Replication Message Receiver

Receives the message from the master and performs appropriate action depending on the type of the message.

Related JIRA Issue(s)

Derby-2921

3.2.6. Slave Controller

The Derby module that implements the slave mode of replication. It starts the Replication log receiver and writes the log records that are received from the master to the derby logs. It also initiates recovery upon receiving the failover command resulting in the eventual booting of the slave database.

Related JIRA Issue(s)

Derby-2978

3.3. How to get your first replication trial running?

What follows below is the series of steps taken to configure replication on a local machine in a client/server environment.

Assumptions:

1) The user knows how to start and stop the derby network server

2) The user knows to use ij (the command line jdbc tool that comes with derby).

Step 1.

Create a master and the slave directories. (Assuming that the master directory is called "master" and the slave directory is called "slave".

Step 2.

Start the derby network server that will house the replication master. Let this network server run on port 1527.

java -Dderby.system.home=$REPLICATION_SOURCE/master org.apache.derby.drda.NetworkServerControl -noSecurityManager -p $portno start &

Where REPLICATION_SOURCE will point to the directory that contains the master and the slave directory. portno in this case will be 1527.

Step 3.

start ij

java org.apache.derby.tools.ij

Step 4.

Create the database that will be used during replication. Let this database be called replicationdb.

ij> connect 'jdbc:derby://localhost:1527/replicationdb;create=true';

Step 5.

Freeze this database

call SYSCS_UTIL.SYSCS_FREEZE_DATABASE();

Step 6.

Open another terminal, move to the slave directory, copy the master database from the master directory.

cd slave cp -rf ../master/replicationdb ./

Step 7.

Start the network server that will house the replication slave. Let this network server run on port 1528.

java -Dderby.system.home=$REPLICATION_SOURCE/slave org.apache.derby.drda.NetworkServerControl -noSecurityManager -p $portno start &

Where REPLICATION_SOURCE will point to the directory that contains the master and the slave directory. portno in this case will be 1528.

Step 8.

start ij

java org.apache.derby.tools.ij

Step 9.

Start the slave.

connect 'jdbc:derby://localhost:1528/replicationdb;startSlave=true;slaveHost=localhost;slavePort=8001';

The connection attempt will hang until a successful connection from the master has been received.

Step 10.

Move to the terminal in which the master database has been frozen and start the master

connect 'jdbc:derby://localhost:1527/replicationdb;startMaster=true;slaveHost=localhost;slavePort=8001';

The slave would throw an error message similar to this

ERROR XRE08: DERBY SQL error: SQLCODE: -1, SQLSTATE: XRE08, SQLERRMC: Replication slave mode started successfully for database 'replicationdb'. Connection refused because the database is in replication slave mode.

Step 11.

Excecute a few inserts and updates on the master

Step 12.

Attempt a failover on the master now

connect 'jdbc:derby://localhost:1527/replicationdb;failover=true';

The following error message is thrown.

ERROR XRE20: DERBY SQL error: SQLCODE: -1, SQLSTATE: XRE20, SQLERRMC: Failover performed successfully for database 'replicationdb', the database has been shutdown.

Indicating that the master database has been shutdown after a failover attempt.

Step 13.

Now try connecting to the slave database (failed-over database, or the new master database)

connect 'jdbc:derby://localhost:1528/replicationdb';

Step 14.

Verify that the transacations performed on the master were reflected on the slave also.

ReplicationWriteup (last edited 2013-01-23 07:48:44 by 177)