ItemState management

Note: This documentation has been created based on jackrabbit-core-1.6.1.

The JCR API revolves around Nodes and Properties (collectively called Items). Internally, these Items have an ItemState which may or may not be persistent. Modifications to Items through the JCR Session act on the ItemState, and saving the JCR Session persists the touched ItemStates to the persistent store and makes the changes visible to other JCR Sessions. The management of ItemState instances across concurrent sessions is an important responsibility of the Jackrabbit core. The following picture shows some of the relevant components in the management of ItemStates.

ISM Components.png

Core components and their responsibilities:

Miscellaneous concepts:

Remarks:

The next sections show collaboration diagrams for various use cases that read and/or modify content.
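At the JCR API level, the interactions driving these use cases look roughly like the following sketch (the repository reference, the credentials and the node path "/A" are assumptions for illustration):

    import javax.jcr.Node;
    import javax.jcr.Repository;
    import javax.jcr.Session;
    import javax.jcr.SimpleCredentials;

    public class ItemStateRoundTrip {
        public static void demo(Repository repository) throws Exception {
            // Logging in creates a Session with its own transient space.
            Session session = repository.login(
                    new SimpleCredentials("admin", "admin".toCharArray()));
            try {
                // Reading a node materializes its ItemState (use case 1).
                Node nodeA = session.getRootNode().getNode("A");

                // Modifying the node creates transient ItemStates in this Session (use case 2).
                nodeA.setProperty("prop A", "value");

                // Saving pushes the transient states to the shared states and the
                // persistent store, making them visible to other Sessions (use case 3).
                session.save();
            } finally {
                session.logout();
            }
        }
    }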

Use case 1: simple read

The following is a collaboration diagram of what happens w.r.t. ItemState management in the Jackrabbit core when a Session reads a single existing node just after startup (i.e., Jackrabbit caches are empty).

Session load.png

  1. The client of the API has a Session already and asks it to get the node at path "/A".

  2. The Session delegates this to its ItemManager.

  3. The ItemManager does not have ItemData in its cache (a plain Map) for the requested item. It asks its SessionItemStateManager for the ItemState.

  4. The SessionItemStateManager does not have the item in its caches and delegates to the LocalItemStateManager.

  5. The LocalItemStateManager does not have the item in its caches and delegates to the SharedItemStateManager.

  6. The SharedItemStateManager does not have the item in its caches and delegates to the PersistenceManager which returns the "shared" ItemState after reading it from the persistent store (database).

  7. The SharedItemStateManager puts the item in its cache and returns it to the calling LocalItemStateManager.

  8. The LocalItemStateManager creates a new "local" ItemState based on the returned (shared) ItemState. The former refers to the latter as the "overlayed state". This is basically a copy-on-read action (a simplified sketch of this chain follows this list).

  9. This local ItemState is put in the cache of the LocalItemStateManager and returned to the SessionItemStateManager which returns it to the ItemManager.

  10. The ItemManager creates an ItemData instance and puts it in its cache,

  11. then it creates a new NodeImpl instance based on the ItemData and returns that to the Session which gives it to the client.
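The read path above is essentially a chain of caches with copy-on-read. The following is a hypothetical, heavily simplified sketch of that delegation; the class and method names only loosely mirror the real Jackrabbit classes:

    // Hypothetical, heavily simplified sketch of the copy-on-read chain; not the real Jackrabbit code.
    import java.util.HashMap;
    import java.util.Map;

    class State {
        State overlayed;                   // the state this copy is based on, if any
    }

    class SharedLayer {                    // stands in for the SharedItemStateManager
        private final Map<String, State> cache = new HashMap<String, State>();

        State get(String id) {
            State shared = cache.get(id);
            if (shared == null) {
                shared = loadFromPersistenceManager(id);   // step 6: read from the persistent store
                cache.put(id, shared);                     // step 7: cache the shared state
            }
            return shared;
        }

        private State loadFromPersistenceManager(String id) {
            return new State();            // stand-in for the PersistenceManager
        }
    }

    class LocalLayer {                     // stands in for the LocalItemStateManager
        private final SharedLayer shared;
        private final Map<String, State> cache = new HashMap<String, State>();

        LocalLayer(SharedLayer shared) {
            this.shared = shared;
        }

        State get(String id) {
            State local = cache.get(id);
            if (local == null) {
                State sharedState = shared.get(id);        // step 5: delegate on a cache miss
                local = new State();                       // step 8: copy-on-read
                local.overlayed = sharedState;             // the shared state becomes the overlayed state
                cache.put(id, local);                      // step 9: cache the local state
            }
            return local;
        }
    }

The SessionItemStateManager sits on top of this and only creates its transient copies when items are modified (use case 2).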

Remarks:

Use case 2: simple write

Consider the situation in which we have just executed use case 1. I.e., the client has a reference to a NodeImpl. The following shows what happens when the client adds a single property to that node.

Session modify.png

  1. The client invokes setProperty("prop A", "value") on the NodeImpl.

  2. The NodeImpl delegates the creation of a transient property state to the SessionItemStateManager.

  3. The SessionItemStateManager creates a new PropertyState,

  4. puts it in the transient store cache and returns it to the NodeImpl.

  5. The NodeImpl now needs to create a Property item instance and calls the ItemManager with the new PropertyState.

  6. The ItemManager creates new property data based on the given state, puts it in its local cache, and

  7. creates a new PropertyImpl instance which it returns to the NodeImpl.

  8. The NodeImpl must still modify the NodeState to which the property has been added and delegates the creation of a transient state to the SessionItemStateManager.

  9. The SessionItemStateManager creates a new NodeState,

  10. puts it in its transient store and returns it to the NodeImpl, which records the name of the new property in its (now transient) state and returns the created PropertyImpl instance.
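From the client's point of view, the net result of these steps is purely transient. A small sketch of that, assuming the node and property names from above:

    import javax.jcr.Node;
    import javax.jcr.Session;

    public class TransientWrite {
        static void addProperty(Session session) throws Exception {
            Node nodeA = session.getRootNode().getNode("A");
            nodeA.setProperty("prop A", "value");

            // The new PropertyState and the modified NodeState only exist in this
            // Session's transient space; nothing has been persisted yet.
            assert session.hasPendingChanges();
            assert nodeA.hasProperty("prop A");
        }
    }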

Remarks:

Now suppose that another Session also adds a property to the same node. The following picture shows the relations of the various item state managers and the existing item states in the caches of the unsaved sessions.

Cache hierarchy.png

The SharedItemStateManager has the shared state of the persistent node. It has two listeners: the LocalItemStateManagers of the existing Sessions. Each LocalItemStateManager has a local copy of the shared state (the local state, which has the shared state as its overlayed state) and has a SessionItemStateManager as a listener. Each SessionItemStateManager has a transient copy of its local state (with the local state as overlayed state) that contains the modification (the addition of a property), as well as a transient state for the new property.
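Expressed through the JCR API, the situation in the picture could be set up as follows (node path and property names are illustrative):

    import javax.jcr.Node;
    import javax.jcr.Session;

    public class TwoSessionsUnsaved {
        static void demo(Session session1, Session session2) throws Exception {
            // Each session adds its own property to the same node "/A"; neither saves.
            Node a1 = session1.getRootNode().getNode("A");
            a1.setProperty("prop 1", "value 1");

            Node a2 = session2.getRootNode().getNode("A");
            a2.setProperty("prop 2", "value 2");

            // Each session only sees its own transient modification; the shared state
            // underneath is still unchanged.
            assert a1.hasProperty("prop 1") && !a1.hasProperty("prop 2");
            assert a2.hasProperty("prop 2") && !a2.hasProperty("prop 1");
        }
    }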

Use case 3: save

Consider the situation in which we have just executed use case 2. I.e., the client has added a single property to an existing node. The transient space contains two items: the new PropertyState and the modified NodeState of the Node to which the new property has been added. The following shows what happens when the session is saved. Note that only the modified NodeState instance is taken into account, because otherwise the picture becomes even more unreadable. The handling of the new PropertyState, however, is approximately the same.

Session save.png

  1. The client calls Session.save().

  2. The SessionImpl retrieves the root node from the ItemManager.

  3. The SessionImpl calls the save method on the ItemImpl which represents the root node.

  4. The root NodeImpl retrieves the transient ItemStates that are descendants of itself (in this case the states of the modified node and the new property).

  5. The root NodeImpl does some validations on the transient space (checks that nodetype constraints are satisfied, that the set of dirty items is self-contained, etc.).

  6. The root NodeImpl starts an edit operation by calling edit on the SessionItemStateManager.

  7. The SessionItemStateManager delegates this to the LocalItemStateManager,

  8. which resets its ChangeLog instance.

  9. The root NodeImpl retrieves the first transient item from the ItemManager (assume that this is the modified node),

  10. and calls makePersistent on it.

  11. The NodeImpl instance of node "A" copies data from the transient NodeState to the local NodeState (i.e., to the overlayed state of the transient state).

  12. The NodeImpl instance calls store on the SessionItemStateManager with the local NodeState as argument.

  13. The SessionItemStateManager delegates this to the LocalItemStateManager which

  14. calls modified on its ChangeLog instance with the local state as argument. The ChangeLog records the local state as modified and

  15. disconnects the local state from its overlayed state (the shared state which is contained by the SharedItemStateManager).

  16. The NodeImpl instance of node "A" now asks the SessionItemStateManager to disconnect the transient state and it

  17. delegates this call to the state itself. Now the transient state of node "A" has no overlayed state.
  18. The NodeImpl instance of node "A" now uses the local state for its NodeData.

  19. The NodeImpl instance of the root node now disposes the transient item state of node "A" and

  20. the SessionItemStateManager calls discard on the transient NodeState of node "A" and removes it from its cache.

  21. The NodeImpl instance of the root node now calls update on the SessionItemStateManager which

  22. delegates this to the LocalItemStateManager, which in turn

  23. delegates this to the SharedItemStateManager with the local ChangeLog as argument.

  24. The SharedItemStateManager reconnects the local state to the shared state managed by itself, merges changes from the local state and the shared state, and

  25. adds the shared state to the shared ChangeLog.

  26. The shared ChangeLog disconnects the shared state from its overlayed state (which does nothing because the shared state has no overlayed state).

  27. The SharedItemStateManager now pushes the changes from the local ChangeLog to the shared states.

  28. The local ChangeLog calls push on the local NodeState contained in its set of modified states.

  29. The local NodeState copies its own data to the overlayed shared state (which is contained in the shared ChangeLog); a simplified sketch of this push mechanism follows this list.

  30. The SharedItemStateManager stores the ChangeLog to the persistent store and

  31. calls persisted on its shared ChangeLog. (Important note: this invokes callbacks that make the changes visible to other sessions. This is modeled in the next view.)

  32. After control is returned to the LocalItemStateManager, it also calls persisted on its local ChangeLog to invoke ItemStateListener callbacks.
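Steps 27-29 revolve around copying data from a state to its overlayed counterpart ("push"); the reverse direction ("pull") appears in the next use case. A hypothetical sketch of that relationship (the real ItemState classes carry far more data and bookkeeping):

    // Hypothetical sketch of the overlayed-state relationship; illustrative only.
    class SketchState {
        SketchState overlayed;    // the shared (or local) state this state is connected to
        short modCount;           // bumped when the state is persisted; used for staleness detection
        Object data;              // stand-in for the actual node/property data

        // push: copy this state's data onto its overlayed counterpart (steps 28-29 above)
        void push() {
            if (overlayed != null) {
                overlayed.data = this.data;
            }
        }

        // pull: copy the overlayed counterpart's data into this state
        // (used when another session's save is propagated, as in the next use case)
        void pull() {
            if (overlayed != null) {
                this.data = overlayed.data;
                this.modCount = overlayed.modCount;
            }
        }
    }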

Remarks:

Consider the situation in which we have just executed use case 2. I.e., the client has added a single property to an existing node. The transient space contains two items: the new PropertyState and the modified NodeState of the Node to which the new property has been added. Now suppose that there is another Session which also has added a property to that existing node. The following shows what happens when the session is saved (focus on callbacks from the SharedItemStateManager to the transient states in the second session).

Session save callback.png

  1. The API client saves the session.
  2. The update operation on the LocalItemStateManager is called (the local ChangeLog has already been constructed and the local states in that ChangeLog have been disconnected from their shared counterparts).

  3. The SharedItemStateManager is called to process the update in the given local ChangeLog.

  4. The states in the local ChangeLog are connected to their shared counterparts and merged (step 4') if they are stale. This is necessary because between steps 22 and 23 another session on another thread may have been saved, changing the shared counterparts of the states in the local ChangeLog (which are disconnected from their shared states at this point). Whether a state is stale is detected via a modification counter. If the merge fails, a StaleItemStateException is thrown and the save fails (a client-side view of this failure is sketched after this list).

  5. When the merge succeeds the shared state is added to the shared ChangeLog which

  6. disconnects the shared state (which effectively does nothing as the shared state has no overlayed state).
  7. The changes in the local ChangeLog are pushed to the shared ChangeLog by

  8. invoking push on each local state in the local ChangeLog which

  9. copies the local state to the shared state.
  10. Then the shared ChangeLog is given to the PersistenceManager to persist.

  11. Now the changes are pushed down to other sessions. This is started by invocation of the persisted method on the shared ChangeLog.

  12. The shared ChangeLog calls notifyStateUpdated on each of its modified states.

  13. The modified shared state calls stateModified with itself as argument on its container, the SharedItemStateManager.

  14. The SharedItemStateManager calls stateModified with the shared state as argument on each of its listeners. These are at least the LocalItemStateManagers of all sessions in the workspace.

  15. The SharedItemStateManager calls stateModified with the shared state as argument on the LocalItemStateManager of the other session.

  16. The LocalItemStateManager pulls in the changes by invoking pull on the local state (if it exists!).

  17. The state copies the information from its shared counterpart.
  18. The LocalItemStateManager now calls its listeners with the local state as argument.

  19. The SessionItemStateManager sees that a local state which has a transient counterpart has been changed. It invokes the NodeStateMerger to try to merge the changes into the transient state.

  20. The NodeStateMerger tries to merge the local state with the transient state by, e.g., merging the child node entries.

  21. ...
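From the JCR client's perspective, the failure path of step 4 (stale states that cannot be merged) typically surfaces as an exception on save. A sketch, assuming an existing property "prop A" on node "/A"; the exact exception and merge behaviour may vary by version:

    import javax.jcr.InvalidItemStateException;
    import javax.jcr.Session;

    public class ConflictingSaves {
        static void demo(Session session1, Session session2) throws Exception {
            // Both sessions modify the same property of the same node.
            session1.getRootNode().getNode("A").setProperty("prop A", "from session 1");
            session2.getRootNode().getNode("A").setProperty("prop A", "from session 2");

            session1.save();          // succeeds; the shared states are updated

            try {
                session2.save();      // this session's disconnected states may now be stale
            } catch (InvalidItemStateException e) {
                // The merge failed; discard the transient changes and redo them.
                session2.refresh(false);
            }
        }
    }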

Remarks:

Staleness detection, merging and synchronization considerations

Since different Sessions can be used concurrently (i.e., from different threads) it is possible that two Sessions modify the same node and save at the same moment. There is essentially a single lock that serializes save calls: the ISMLocking instance in the SharedItemStateManager. This is a Jackrabbit-specific read-write lock which supports write locks for ChangeLog instances and read locks for individual items. The read lock is acquired when the SharedItemStateManager needs to read from its ItemStateReferenceCache (typically when Sessions access items: the LocalItemStateManagers create a local copy of the shared state). The write lock is acquired in two places:
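As a rough analogy only (ISMLocking is Jackrabbit's own abstraction with its own implementations), the read/write semantics described here resemble a java.util.concurrent read-write lock used along these lines:

    // Rough analogy using java.util.concurrent; Jackrabbit uses its own ISMLocking abstraction.
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    class SharedStateAccessSketch {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        Object readItemState(String id) {
            lock.readLock().lock();       // many sessions may read shared states concurrently
            try {
                return lookupInReferenceCache(id);
            } finally {
                lock.readLock().unlock();
            }
        }

        void update(Object changeLog) {
            lock.writeLock().lock();      // processing a ChangeLog is serialized
            try {
                // connect/merge, push, persist, notify listeners ...
            } finally {
                lock.writeLock().unlock();
            }
        }

        private Object lookupInReferenceCache(String id) {
            return null;                  // stand-in for the ItemStateReferenceCache lookup
        }
    }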

Thus, two threads can concurrently build their local ChangeLogs, but they need exclusive access to the SharedItemStateManager for processing their ChangeLog. During this processing the saving thread invokes callbacks on the ItemStateListener interface (LocalItemStateManager and SessionItemStateManager implement that interface) in other open sessions (via the persisted call on the shared ChangeLog in step 11). The intention of this listener structure is that open sessions get immediate access to the saved changes. Of course, there can be collisions, which must be detected. Therefore, every ItemState instance has the following fields:

The isStale method on the ItemState is used for determining whether an ItemState can be saved to the persistent store. The method of staleness detection depends on whether the state is transient. If it is, staleness is determined by checking the status field. If it is not, staleness is determined by comparing the state's modCount with the modCount of its overlayed state. It is used in three places:
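A simplified sketch of the staleness check itself, with field and constant names that only approximate the real ItemState class:

    // Simplified sketch of the staleness check described above; not the exact Jackrabbit code.
    class ItemStateSketch {
        static final int STATUS_STALE_MODIFIED = 1;    // underlying state was modified externally
        static final int STATUS_STALE_DESTROYED = 2;   // underlying state was removed externally

        boolean isTransient;          // does this state live in a session's transient space?
        int status;                   // lifecycle status of the state
        short modCount;               // incremented each time the state is persisted
        ItemStateSketch overlayed;    // the state this one is based on (local or shared)

        boolean isStale() {
            if (isTransient) {
                // Transient states are flagged stale via their status field.
                return status == STATUS_STALE_MODIFIED || status == STATUS_STALE_DESTROYED;
            } else {
                // Non-transient states compare their modification counter with that
                // of their overlayed state.
                return overlayed != null && modCount != overlayed.modCount;
            }
        }
    }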

Concurrency issues:

Clustering (external updates)

In clustered environments, potentially every cluster node may write to the repository. Because the JCR spec requires a certain level of consistency, the synchronized access to the local SharedItemStateManager alone is not enough to make things work. The global idea of the clustering mechanism is the following:
