Serialization Patterns
The Serialization Builder
This pattern avoids lock-in to a serialized form by taking the responsibility for serialization away from your class and placing it in the hands of a Serialization Builder.
It is best used when you have a number of package-private classes that implement a common interface or extend a common abstract superclass.
When you take the responsibility for serialization away from your class and give it to a builder, the builder becomes responsible for all forms of construction of your class, so your constructors should be package-private.
It is of utmost importance that your Builder implementation doesn't become part of your public API and remains package-private. To achieve this you need an abstract superclass with a static method that retrieves your builder implementation.
The major benefits of this pattern:
- Decoupling of the serialized form from the implementation, allowing the use of final fields and preventing deserialization attacks.
- Only uses constructors to create instances, instead of implicit creation by deserialization.
- Allows the developer to change the serialized form, by replacing the builder with another implementation.
- By preserving the existing builder, but no longer utilising it, old serial data can still be deserialized, allowing an upgrade path.
- The Builder can even produce a different instance if you refactor your implementation, so you can remove old classes.
- You can have as many serial forms as you like in the form of old builders.
- By using a codebase, new builders and implementations can be dynamically downloaded by other nodes using earlier versions.
- Many implementations can share the same builder.
- Migration: During a live upgrade, object data can be moved to new or different implementation classes by using serialization to transfer state to another JVM. After the upgrade, the old builders can be removed if desired. The original instance's class doesn't need to exist on the target, provided the original builder is modified to produce a different instance.
Caveats:
- Not for use with classes that are extended by clients.
- The circularity issue caused by readResolve() needs to be considered. For example, if the Map contains a copy of itself, or the Objects in the Map contain references to the Map, then the circularity issue exists.
Generics have been omitted for clarity.
The Abstract Builder
import java.util.Map;

public abstract class Builder {
    // Static factory: the only public way to obtain a builder implementation.
    public static Builder create(){
        return new SerializationBuilderImp();
    }
    public abstract Builder put(Object key, Object value);
    public abstract Builder putAll(Map map);
    public abstract Map build();
}
The Serialization Builder Implementation
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

class SerializationBuilderImp extends Builder implements Serializable {
    private Map mutableMap = new HashMap();
    public Builder put(Object key, Object value){
        mutableMap.put(key, value);
        return this;
    }
    public Builder putAll(Map map){
        mutableMap.putAll(map);
        return this;
    }
    public Map build(){
        return new ImmutableMap(mutableMap);
    }
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
    }
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
    }
    // Replaces the deserialized builder with the object it builds.
    private Object readResolve() throws ObjectStreamException {
        return build();
    }
}
Package private Map implementation
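import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ImmutableMap extends AbstractMap implements Map, Serializable {
    private final Map immutable;
    ImmutableMap( Map map ){
        // Defensive copy, so later changes to the caller's map are not visible.
        Map newMap = new HashMap(map.size());
        newMap.putAll(map);
        immutable = Collections.unmodifiableMap(newMap);
    }
    public Set entrySet(){
        return immutable.entrySet();
    }
    private Object writeReplace() {
        // returns a Builder instead of this class.
        return Builder.create().putAll(immutable);
    }
    // Rejects any stream that tries to deserialize this class directly.
    private void readObject(ObjectInputStream stream)
            throws InvalidObjectException{
        throw new InvalidObjectException("Builder required");
    }
}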
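As a minimal sketch of the pattern in use, the following condenses the three classes above into a single file and round-trips a map through an in-memory stream. The RoundTripDemo class and its roundTrip helper are illustrative names, not part of the pattern. The classes are package private and carry Object type parameters only so that the file compiles on its own without warnings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Condensed copies of the classes above.
abstract class Builder {
    static Builder create() { return new SerializationBuilderImp(); }
    abstract Builder put(Object key, Object value);
    abstract Builder putAll(Map<?, ?> map);
    abstract Map<Object, Object> build();
}

class SerializationBuilderImp extends Builder implements Serializable {
    private static final long serialVersionUID = 1L;
    private Map<Object, Object> mutableMap = new HashMap<>();

    @Override Builder put(Object key, Object value) { mutableMap.put(key, value); return this; }
    @Override Builder putAll(Map<?, ?> map) { mutableMap.putAll(map); return this; }
    @Override Map<Object, Object> build() { return new ImmutableMap(mutableMap); }

    // After deserialization the builder swaps itself for a freshly built map.
    private Object readResolve() { return build(); }
}

class ImmutableMap extends AbstractMap<Object, Object> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Map<Object, Object> immutable;

    ImmutableMap(Map<?, ?> map) {
        immutable = Collections.unmodifiableMap(new HashMap<Object, Object>(map));
    }

    @Override public Set<Map.Entry<Object, Object>> entrySet() { return immutable.entrySet(); }

    // The stream receives a builder, never this class's own fields.
    private Object writeReplace() { return Builder.create().putAll(immutable); }

    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Builder required");
    }
}

final class RoundTripDemo {
    // Serializes obj into a byte array and reads it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Map<Object, Object> original = Builder.create().put("a", 1).put("b", 2).build();
        Object copy = roundTrip(original);
        System.out.println(copy.getClass().getSimpleName()); // ImmutableMap, not the builder
        System.out.println(original.equals(copy));           // true
    }
}
```

The caller only ever sees the ImmutableMap: the builder exists in the stream, is constructed during deserialization, and is replaced by readResolve() before readObject() returns.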
Circularity Issue
The Serialization Builder takes advantage of the readResolve() method to replace itself with a newly built object after the builder is deserialized. The Java serialization specification[2] describes an issue with circular references:
Note - The readResolve method is not invoked on the object until the object is fully constructed, so any references to this object in its object graph will not be updated to the new object nominated by readResolve. However, during the serialization of an object with the writeReplace method, all references to the original object in the replacement object's object graph are replaced with references to the replacement object. Therefore in cases where an object being serialized nominates a replacement object whose object graph has a reference to the original object, deserialization will result in an incorrect graph of objects. Furthermore, if the reference types of the object being read (nominated by writeReplace) and the original object are not compatible, the construction of the object graph will raise a ClassCastException.
Bob Lee is credited with the following workaround:
Have the builder implement the same interface and have it delegate to a transient copy of the built object.
So the Serialization Builder above would need the following modifications:
The Abstract Builder with readResolve workaround
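import java.util.AbstractMap;
import java.util.Map;

// Extends AbstractMap so that a builder is type compatible with the Map it replaces.
public abstract class Builder extends AbstractMap{
    public static Builder create(){
        return new SerializationBuilderImp();
    }
    public abstract Map build();
}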
The Serialization Builder Implementation with readResolve workaround
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class SerializationBuilderImp extends Builder implements Serializable {
    // Serial fields
    private Map mutableMap = new HashMap();
    // Built target object, used to work around the readResolve issue.
    private transient volatile Map serialBuilt = null;
    // Map interface method, delegates if constructed with serialization.
    public Object put(Object key, Object value){
        if ( serialBuilt != null ) return serialBuilt.put(key, value);
        return mutableMap.put(key, value);
    }
    // Map interface method, delegates if constructed with serialization.
    public Set entrySet(){
        if ( serialBuilt != null ) return serialBuilt.entrySet();
        return mutableMap.entrySet();
    }
    public Map build(){
        return new ImmutableMap(mutableMap);
    }
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
    }
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
    }
    // If this builder was itself deserialized, serialize the built object
    // instead: its state may have changed since, for example if another type
    // of Map, apart from ImmutableMap, uses the same builder.
    private Object writeReplace() {
        if ( serialBuilt != null ) return serialBuilt;
        return this;
    }
    private Object readResolve() throws ObjectStreamException {
        serialBuilt = build();
        return serialBuilt;
    }
}
Package private Map implementation
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ImmutableMap extends AbstractMap implements Map, Serializable {
    private final Map immutable;
    ImmutableMap( Map map ){
        Map newMap = new HashMap(map.size());
        newMap.putAll(map);
        immutable = Collections.unmodifiableMap(newMap);
    }
    public Set entrySet(){
        return immutable.entrySet();
    }
    private Object writeReplace() {
        // returns a Builder instead of this class.
        Map builder = Builder.create();
        builder.putAll(immutable);
        return builder; // note build() method is not called until deserialisation.
    }
    private void readObject(ObjectInputStream stream)
            throws InvalidObjectException{
        throw new InvalidObjectException("Builder required");
    }
}
The examples here are relatively simplistic; it is possible to have a number of Serialization Builder implementation classes. It is also likely that the static factory method Builder.create() might accept parameters or may choose different builder implementations under different conditions.
You could also, for instance, have two serialized forms, one old and one new. A configuration setting could dictate that only the old serialized form is written until your environment has been completely upgraded, after which the configuration is changed to write only the new serialized form. Any old marshalled objects (in serial form) can still be deserialized, but no more are created, so their number shrinks as they are deserialised until none remain.
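One way such a configuration switch might look is sketched below. Everything in it is an assumption made for illustration: the class names (ConfigurableBuilder, OldFormBuilder, NewFormBuilder, FrozenMap, SerialFormDemo), the system property example.serial.form, and the shape of the two serial forms. The readObject guard on the map class is left out for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

abstract class ConfigurableBuilder {
    // Hypothetical property name; "old" is assumed until the upgrade completes.
    static final String FORM_PROPERTY = "example.serial.form";

    static ConfigurableBuilder create() {
        boolean useNew = "new".equals(System.getProperty(FORM_PROPERTY, "old"));
        return useNew ? new NewFormBuilder() : new OldFormBuilder();
    }
    abstract ConfigurableBuilder putAll(Map<?, ?> map);
    abstract Map<Object, Object> build();
}

// Old serial form: a single HashMap field.
class OldFormBuilder extends ConfigurableBuilder implements Serializable {
    private static final long serialVersionUID = 1L;
    private Map<Object, Object> mutableMap = new HashMap<>();
    @Override ConfigurableBuilder putAll(Map<?, ?> map) { mutableMap.putAll(map); return this; }
    @Override Map<Object, Object> build() { return new FrozenMap(mutableMap); }
    private Object readResolve() { return build(); }
}

// New serial form: keys and values flattened into one list.
class NewFormBuilder extends ConfigurableBuilder implements Serializable {
    private static final long serialVersionUID = 1L;
    private List<Object> flat = new ArrayList<>();
    @Override ConfigurableBuilder putAll(Map<?, ?> map) {
        for (Map.Entry<?, ?> e : map.entrySet()) {
            flat.add(e.getKey());
            flat.add(e.getValue());
        }
        return this;
    }
    @Override Map<Object, Object> build() {
        Map<Object, Object> copy = new HashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            copy.put(flat.get(i), flat.get(i + 1));
        }
        return new FrozenMap(copy);
    }
    private Object readResolve() { return build(); }
}

class FrozenMap extends AbstractMap<Object, Object> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Map<Object, Object> immutable;
    FrozenMap(Map<?, ?> map) {
        immutable = Collections.unmodifiableMap(new HashMap<Object, Object>(map));
    }
    @Override public Set<Map.Entry<Object, Object>> entrySet() { return immutable.entrySet(); }
    // Whichever builder the configuration selects defines the serial form written.
    private Object writeReplace() { return ConfigurableBuilder.create().putAll(immutable); }
}

final class SerialFormDemo {
    static byte[] write(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static Object read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<Object, Object> map = new FrozenMap(Collections.singletonMap("k", "v"));
        System.setProperty(ConfigurableBuilder.FORM_PROPERTY, "old");
        byte[] oldBytes = write(map);
        System.setProperty(ConfigurableBuilder.FORM_PROPERTY, "new");
        byte[] newBytes = write(map);
        // Both forms stay readable while both builder classes remain available.
        System.out.println(map.equals(read(oldBytes)) && map.equals(read(newBytes)));
    }
}
```

Because the choice is made only when writing, readers never consult the setting: each stream names the builder class that wrote it, and that builder rebuilds the current implementation class.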
The Serialization Builder pattern is inspired by the Serialization Proxy pattern[1], where the proxy has been decoupled using an abstract builder.
Discussion
If circular references are possible, make sure to also implement compatible equals and hashCode methods. In the case above AbstractMap does so on our behalf; otherwise equals and hashCode need to be implemented in a common superclass.
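The effect of the workaround on a circular graph can be sketched as follows. The names DelegatingBuilder, DelegatingBuilderImp, SnapshotMap, Node and CircularDemo are stand-ins invented for this example, condensed from the workaround classes above so that the file compiles on its own. Node is a hypothetical element type that holds a reference back to the map containing it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractMap;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Stand-in for the abstract Builder with the readResolve workaround.
abstract class DelegatingBuilder extends AbstractMap<Object, Object> {
    static DelegatingBuilder create() { return new DelegatingBuilderImp(); }
    abstract Map<Object, Object> build();
}

class DelegatingBuilderImp extends DelegatingBuilder implements Serializable {
    private static final long serialVersionUID = 1L;
    private Map<Object, Object> mutableMap = new HashMap<>();
    private transient volatile Map<Object, Object> serialBuilt;

    @Override public Object put(Object key, Object value) {
        Map<Object, Object> built = serialBuilt;
        return built != null ? built.put(key, value) : mutableMap.put(key, value);
    }
    @Override public Set<Map.Entry<Object, Object>> entrySet() {
        Map<Object, Object> built = serialBuilt;
        return built != null ? built.entrySet() : mutableMap.entrySet();
    }
    @Override Map<Object, Object> build() { return new SnapshotMap(mutableMap); }
    private Object writeReplace() { return serialBuilt != null ? serialBuilt : this; }
    private Object readResolve() { serialBuilt = build(); return serialBuilt; }
}

class SnapshotMap extends AbstractMap<Object, Object> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Map<Object, Object> immutable;
    SnapshotMap(Map<?, ?> map) {
        immutable = Collections.unmodifiableMap(new HashMap<Object, Object>(map));
    }
    @Override public Set<Map.Entry<Object, Object>> entrySet() { return immutable.entrySet(); }
    private Object writeReplace() {
        DelegatingBuilder builder = DelegatingBuilder.create();
        builder.putAll(immutable);
        return builder;
    }
}

// Hypothetical element type whose back-reference closes the cycle.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;
    Map<Object, Object> owner;
}

final class CircularDemo {
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Node node = new Node();
        DelegatingBuilder builder = DelegatingBuilder.create();
        builder.put("node", node);
        node.owner = builder.build(); // the map's value now refers back to the map

        Object copy = roundTrip(node.owner);
        Node restored = (Node) ((Map<?, ?>) copy).get("node");
        // The back-reference is the deserialized builder, which delegates to the rebuilt map.
        System.out.println(restored.owner == copy);      // false
        System.out.println(restored.owner.equals(copy)); // true
    }
}
```

After deserialization the back-reference is not identical to the rebuilt map, which is why equals and hashCode matter here: code that compares the two with equals sees the same map, while code that compares with == does not.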
Care needs to be taken to ensure implementation classes remain decoupled from their serialized form implementations.
References
[1] Joshua Bloch "Effective Java Second Edition" Item 78, Page 312. ISBN-10: 0-321-35668-3