Proposal for Saving and Restoring the AXIS2 Message Context

There is an AXIS2 requirement to save the message context at some point in the handler chain, store it away, and restore it later. This requirement also includes the need to let handlers manage message-specific data when the message context is saved and restored.

Refer to http://wiki.apache.org/ws/Axis2/Requirements

In particular, this feature can be used by a WS Reliable Messaging implementation to persist and re-send messages and to recover from a server restart. The idea is to save a message context, restore it later, and resume message processing at the point at which the message context was saved (that is, without having to restart the message from the beginning).

Another consideration is WS Security handlers, which may have message-specific security data, such as certificates or tokens, that needs customized processing when a message context is saved and restored.

When considering design options for saving the message context, it is important to understand what the AXIS2 message context covers. As shown in the figure below, the AXIS2 message context is composed of a complex, inter-connected graph of objects. Some of these objects represent runtime data and some represent static, deployment-time data for the AXIS2 engine.

[Figure: messagecontext.jpg, the AXIS2 message context object graph]

Because of the complexity of the AXIS2 message context, there are two important design points:

1. Save runtime data but not deployment or configuration data.

2. Have an object save/restore its own state.

This approach results in having the following objects as major participants in the message context serialization:

  1. Message Context
  2. Operation Context
  3. Service Context
  4. Service Group Context
  5. Session Context
  6. Options

In general, the message context saves objects that hold pertinent runtime data. It does not save objects that hold static deployment data; instead, it saves some identifying information about them. When the message context is restored, it "plugs itself back in" to the set of deployment objects that exist on that engine.

How to Save/Restore a Message Context

Sample code to save a Message Context

Sample code to restore a Message Context

Limitations/Restrictions on the Message Context Serialization Support

Java Serialization Background

Object serialization is the process of saving an object's state and rebuilding the object from the saved information. The serialization mechanism provides a way to write an object to, and read it back from, a raw byte stream. Serializing an object involves traversing the graph created by the object's references to other objects and primitives. The point is to collect the information needed to preserve the object and allow the object to regain its state when it is de-serialized. In other words, all pertinent data in the object (including other objects that are referenced by the object) is saved. There are three ways to perform object serialization: using the default mechanism (java.io.Serializable), customizing the default mechanism, and using a customized protocol (java.io.Externalizable).

An object that implements the java.io.Serializable interface uses the default protocol in Java to control the serialization. All non-transient instance fields of the object must themselves be serializable. If a field in a class is marked as transient, the field is skipped when Java serializes the object. The only things that are saved from the object are the non-static and non-transient instance data. Class definitions are not saved and must be available when the object is de-serialized.
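The default mechanism can be sketched as follows. LoginSession and DefaultSerializationDemo are hypothetical names used only for illustration; they are not part of AXIS2. The sketch round-trips an object through a byte array and shows that the transient field is not restored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, for illustration only.
class LoginSession implements Serializable {
    private static final long serialVersionUID = 1L;

    static int created;          // static: never serialized
    String user;                 // non-static, non-transient: serialized
    transient String password;   // transient: skipped by the default mechanism

    LoginSession(String user, String password) {
        this.user = user;
        this.password = password;
        created++;
    }
}

class DefaultSerializationDemo {
    // Writes the session to an in-memory byte stream and reads it back.
    static LoginSession roundTrip(LoginSession session) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LoginSession) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LoginSession copy = roundTrip(new LoginSession("alice", "secret"));
        System.out.println(copy.user);      // alice
        System.out.println(copy.password);  // null
    }
}
```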

An object that implements the java.io.Serializable interface can customize the default serialization mechanism by providing two private methods for writing and reading the object: writeObject(ObjectOutputStream) and readObject(ObjectInputStream).
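A minimal sketch of such a pair of methods follows. SecurityToken and CustomSerializationDemo are hypothetical names, and the XOR mask is a toy stand-in for whatever customized processing a handler would really need; it is not encryption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, for illustration only.
class SecurityToken implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int MASK = 0x5A;   // toy obfuscation, NOT real encryption

    private String owner;                   // handled by the default mechanism
    private transient char[] secret;        // handled manually below

    SecurityToken(String owner, String secret) {
        this.owner = owner;
        this.secret = secret.toCharArray();
    }

    String owner() { return owner; }
    String secret() { return new String(secret); }

    // Called by ObjectOutputStream instead of the default field writer.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();           // writes the non-transient fields
        out.writeInt(secret.length);
        for (char c : secret) {
            out.writeChar(c ^ MASK);
        }
    }

    // Must read back exactly what writeObject wrote, in the same order.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        secret = new char[in.readInt()];
        for (int i = 0; i < secret.length; i++) {
            secret[i] = (char) (in.readChar() ^ MASK);
        }
    }
}

class CustomSerializationDemo {
    static SecurityToken roundTrip(SecurityToken token) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(token);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SecurityToken) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecurityToken copy = roundTrip(new SecurityToken("alice", "s3cret"));
        System.out.println(copy.owner() + " / " + copy.secret());
    }
}
```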

The object can also explicitly declare the specific variables or fields to be saved by defining a private static final ObjectStreamField[] variable named serialPersistentFields.
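A minimal sketch of declaring the saved fields explicitly through serialPersistentFields, the standard Java mechanism for this. Account and PersistentFieldsDemo are hypothetical names. Only the listed field is saved, even though the other field is neither static nor transient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical class, for illustration only.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    // Overrides the default choice of "all non-static, non-transient fields".
    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("owner", String.class)
    };

    String owner;     // listed above: saved
    String scratch;   // not listed: not saved

    Account(String owner, String scratch) {
        this.owner = owner;
        this.scratch = scratch;
    }
}

class PersistentFieldsDemo {
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("bob", "temporary"));
        System.out.println(copy.owner);    // bob
        System.out.println(copy.scratch);  // null
    }
}
```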

An object uses the java.io.Externalizable interface when the object needs to control its own serialization and re-constitution process. This interface allows the object to encode the serialized data, to select which fields are serialized, and to write any additional information to the serialization stream.

An Externalizable object must implement two methods, writeExternal(ObjectOutput) and readExternal(ObjectInput), and have a public constructor that takes no arguments.

In the writeExternal() method, the object must manually write each piece of data to be saved to the ObjectOutput. Note that the implementation must also account for all superclass information.
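As a sketch of these requirements, the hypothetical Circle class below extends a non-serializable superclass (Shape) and therefore writes the inherited field by hand. The class names are assumptions made for illustration and do not come from AXIS2.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical superclass; it is not serializable in any way.
class Shape {
    protected String name;
}

// Hypothetical Externalizable class, for illustration only.
class Circle extends Shape implements Externalizable {
    private static final long serialVersionUID = 1L;
    private double radius;

    public Circle() { }                    // required public no-argument constructor

    Circle(String name, double radius) {
        this.name = name;
        this.radius = radius;
    }

    double radius() { return radius; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(name);                // superclass state is written by hand
        out.writeDouble(radius);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        name = in.readUTF();               // same order and types as written
        radius = in.readDouble();
    }
}

class ExternalizableDemo {
    static Circle roundTrip(Circle circle) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(circle);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Circle) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Circle copy = roundTrip(new Circle("wheel", 2.5));
        System.out.println(copy.name + " " + copy.radius());
    }
}
```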

If a class definition changes, then the serialization of the class is affected. Versioning of the class is important to identify class definitions. For example, if an older version of a class is deserialized into a newer version of the class, it is important to determine what state that will leave the object in. The serialization mechanism uses an identification value to keep track. The identifier is called the serialVersionUID (SUID); if the class does not declare one, it is computed from the structure of the class as a 64-bit hash. A class declares the SUID that it is compatible with in a static final long field named serialVersionUID.
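A minimal sketch of an SUID declaration follows. Invoice and SuidDemo are hypothetical names, and the value 1L is an arbitrary choice for illustration rather than a computed hash. ObjectStreamClass.lookup() is used only to read the declared value back.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class, for illustration only.
class Invoice implements Serializable {
    // The version of the serialized form that this class is compatible with.
    private static final long serialVersionUID = 1L;

    String number;
}

class SuidDemo {
    // Returns the SUID the serialization runtime associates with the class.
    static long suidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(suidOf(Invoice.class));  // 1
    }
}
```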

Refer to the "serialver" utility to compute an SUID for a class.

Serialized data is written to the stream in FIFO order; de-serialization reads it back in the same order.

Attempting to serialize an object that does not implement the Serializable or Externalizable interface results in a java.io.NotSerializableException.
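The failure can be reproduced with a short sketch. Engine, Car, and NotSerializableDemo are hypothetical names. The exception is raised for the non-serializable object reached through a field, not only for the top-level object.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that does not implement Serializable.
class Engine { }

// Hypothetical serializable class holding a non-serializable field.
class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    Engine engine = new Engine();   // non-transient, so it must be serializable
}

class NotSerializableDemo {
    // Returns the exception message if the write fails with
    // NotSerializableException, or null if the write succeeds.
    static String failureMessage(Object candidate) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(failureMessage(new Car()));   // names the Engine class
        System.out.println(failureMessage("plain text")); // null
    }
}
```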

Considerations for Using or Implementing Serialization

1. Sending the same object again: if you write the same object more than once to the same output stream, its state is sent only the first time. Subsequent writes (even with modified fields) send only a reference back to the first copy. Either close the stream after a write or reset the stream to clear the entire object cache. Alternatively, clone the object, modify the clone, and send the clone to the stream.

2. Performance: the default serialization/de-serialization mechanism in Java is not the best performer.

3. Special cases such as circular references and multiple references to a single object must be preserved, so that re-creating the object does not create new objects where a reference to an existing object in the graph is required.

4. Use the javadoc @serial tag to document serializable fields when using java.io.Serializable.

5. When serializing a Collection, store the number of objects in the Collection and use the number to read them back when de-serializing.

6. The JVM may limit the class depth. The class depth refers to the number of objects in the object tree graph for the object to be serialized.

7. Unexpected constructor firing: if a superclass of a serialized object does not implement java.io.Serializable, the no-argument constructor of that superclass (and not the serialized object's constructor) will fire when the object is de-serialized. The simplest way to prevent the superclass's constructor from firing is to have the superclass implement java.io.Serializable. This behavior is by design: the fields of a non-serializable superclass are not written to the stream, so its no-argument constructor must run to initialize them.

8. To correctly externalize an object, the field data must be read back in the same order and type as it was written. Consider using a revision number, in addition to the serialVersionUID.

9. Incompatible changes while evolving a class include:

  1. Deleting fields
  2. Moving classes up or down in the hierarchy
  3. Changing a non-static field to static or a non-transient field to transient
  4. Changing the declared type of a primitive field
  5. Changing the class from Serializable to Externalizable
  6. Removing Serializable or Externalizable from the class
  7. Changing a class from a non-enum type to an enum type or vice versa

10. If your strings might exceed the 64K limit of the writeUTF() method (65535 bytes in its modified UTF-8 encoding), you will have to customize your serialization. Refer to the java.io.DataOutput interface and the writeUTF() method.
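Consideration 1 above (sending the same object again) can be observed with a short sketch. Flag and ResendDemo are hypothetical names. The same object is written twice, with a field change in between, once without and once with a call to ObjectOutputStream.reset().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical class, for illustration only.
class Flag implements Serializable {
    private static final long serialVersionUID = 1L;
    int value;
}

class ResendDemo {
    // Writes the same Flag twice (value 1, then value 2) and returns the
    // values seen by the reader for the first and second object.
    static int[] sendTwice(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Flag flag = new Flag();
            flag.value = 1;
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(flag);
                flag.value = 2;
                if (resetBetweenWrites) {
                    out.reset();           // forget previously written objects
                }
                out.writeObject(flag);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Flag first = (Flag) in.readObject();
                Flag second = (Flag) in.readObject();
                return new int[] { first.value, second.value };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sendTwice(false)));  // [1, 1]
        System.out.println(Arrays.toString(sendTwice(true)));   // [1, 2]
    }
}
```

Without the reset, the second write carries only a back-reference, so the reader never sees the modified value.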