Differences between revisions 12 and 13
Revision 12 as of 2006-10-09 22:24:24
Size: 4785
Editor: RaymondFeng
Comment:
Revision 13 as of 2009-09-20 22:48:47
Size: 4797
Editor: localhost
Comment: converted to 1.6 markup

Objective

It is necessary to support the flow of any data type that is supported by both the client and the provider. With the ability to attach data transformation mediations to wires, this actually becomes a requirement to support any data type that can be mapped from client to provider and back again.

In any interchange there are just two things that are defined: the format of data that will be supplied by the client and the format of data that will be consumed by (delivered to) the provider. Neither client nor provider needs to be aware of the format of data on the other end or of what gyrations the fabric went through in order to make the connection. As part of making the connection, it is the fabric's job to make the connection as efficient as possible, factoring in the semantic meaning of the data, the policies that need to be applied, and what the different containers support.

All this flexibility just about requires that we use the most generic type possible to hold the data being exchanged: a java.lang.Object or a (void*), depending on the runtime. The actual instance used would depend on the actual wire. Some examples from Java land:

* POJO (for local pass by reference)

* SDO (when supplied by the application)

* Axiom OMElement (for the Axis2 binding)

* StAX XMLStreamReader (for streamed access to an XML infoset)

* ObjectInputStream (for cross-classloader serialization)

and so forth.

Each container and transport binding just needs to declare which data formats it can support for each endpoint it manages. The wiring framework needs to know about these formats and about what transformations can be engaged in the wire pipeline.

For example, the Axis2 transport may declare that it can support Axiom and StAX for a certain port and the Java container may declare that it can only handle SDOs for an implementation that expects to be passed a DataObject. The wiring framework can resolve this by adding a StAX->SDO transform into the pipeline.
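As a rough illustration of the Axis2 example above, the resolution step could be sketched as follows. The class name WireResolver, its methods, and the use of plain strings as format names are all assumptions made for this sketch; none of them are Tuscany APIs. It only handles a shared format or a single registered transform, not multi-hop pipelines.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: decides which transform (if any) a wire needs.
final class WireResolver {
    // Registered point-to-point transforms, stored as "source->target".
    private final Set<String> transforms = new HashSet<>();

    void registerTransform(String from, String to) {
        transforms.add(from + "->" + to);
    }

    // offered:  formats declared by the supplying end of the wire
    // accepted: formats declared by the consuming end of the wire
    // Returns the pipeline steps to insert: an empty list when both ends
    // share a format, one step when a direct transform is registered,
    // or Optional.empty() when the wire cannot be resolved.
    Optional<List<String>> resolve(List<String> offered, List<String> accepted) {
        for (String format : offered) {
            if (accepted.contains(format)) {
                return Optional.of(Collections.<String>emptyList());
            }
        }
        for (String from : offered) {
            for (String to : accepted) {
                String step = from + "->" + to;
                if (transforms.contains(step)) {
                    return Optional.of(Collections.singletonList(step));
                }
            }
        }
        return Optional.empty();
    }
}
```

With a "StAX" to "SDO" transform registered, resolving an endpoint that offers Axiom and StAX against one that accepts only SDO yields the single step "StAX->SDO".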

The limitation here is whether a transformation can be constructed to match the formats on either end. If one exists then great, but as the number of formats increases, developing n-squared point-to-point transforms becomes impractical. A better approach would be to pick the most common formats and require bindings and containers to support those at a minimum, with other point-to-point transforms being added as warranted. (Source: http://mail-archives.apache.org/mod_mbox/ws-tuscany-dev/200603.mbox/%3c4418A53D.2080606@apache.org%3e)

Draft Document

Tuscany-DataBinding.ppt

Implementation Updates

Hi,

I created a prototype to demonstrate how we can support multiple databindings and transform data across them.

Prototype: tuscany-databinding.zip

The initial drop contains some basic transformer implementations for the following databindings:

1) XML String

2) DOM Node

3) StAX XMLStreamReader

4) SDO DataObject/XMLDocument

5) XMLBeans XmlObject

6) JAXB Object

It also comes with a test case that shows the following transformations via multiple hops:

1) XMLBeans --> SDO

2) SDO --> DOM

3) SDO --> JAXB
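The prototype's test case is not reproduced here. As a stand-in, the following minimal sketch shows the multi-hop idea with two databindings from the list that need only the JDK: XML String --> StAX XMLStreamReader --> DOM Node. The class and method names are hypothetical, and the second hop relies on the JDK's identity transformer accepting a StAXSource.

```java
import java.io.StringReader;
import java.util.function.Function;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Node;

// Hypothetical sketch: two single-hop transformers composed into one chain.
final class MultiHopSketch {

    // Hop 1: XML String --> StAX XMLStreamReader
    static XMLStreamReader stringToStax(String xml) {
        try {
            return XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        } catch (Exception e) {
            throw new IllegalStateException("XML String --> StAX failed", e);
        }
    }

    // Hop 2: StAX XMLStreamReader --> DOM Node (identity transform)
    static Node staxToDom(XMLStreamReader reader) {
        try {
            DOMResult result = new DOMResult();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new StAXSource(reader), result);
            return result.getNode(); // the resulting DOM Document
        } catch (Exception e) {
            throw new IllegalStateException("StAX --> DOM failed", e);
        }
    }

    // The two hops chained: XML String --> StAX --> DOM
    static Node stringToDomViaStax(String xml) {
        Function<String, XMLStreamReader> hop1 = MultiHopSketch::stringToStax;
        Function<XMLStreamReader, Node> hop2 = MultiHopSketch::staxToDom;
        return hop1.andThen(hop2).apply(xml);
    }
}
```

Each hop knows only its own source and target type, so a longer chain is just a longer composition of such functions.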

The transformers are registered and selected using the following algorithm.

1) The data transformation capabilities for various databindings can be nicely modeled as a weighted, directed graph with the following rules. (Illustrated in the attached diagram).

[Attached diagram: databindings_graph.jpg]

* Each databinding is mapped to a vertex.

* If databinding A can be transformed to databinding B, then an edge will be added from vertex A to vertex B.

* The weight of the edge is the cost of the transformation from the source to the sink.

2) In the data mediator/interceptor on the wire, if we find that the data needs to be transformed from databinding A to databinding E, we can apply Dijkstra's shortest path algorithm to the graph and find the lowest-cost path. It can be A-->E or A-->C-->E, depending on the weights. If no path can be found, then the data cannot be mediated.
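The selection step described above can be sketched as follows. This is not the prototype's code: the class name TransformerGraph, its methods, the string vertex names, and the integer costs are assumptions for illustration. Costs are assumed to be non-negative, which Dijkstra's algorithm requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.AbstractMap.SimpleEntry;

// Hypothetical sketch: databindings as vertices, transformers as weighted edges.
final class TransformerGraph {
    // source databinding -> (target databinding -> transformation cost)
    private final Map<String, Map<String, Integer>> edges = new HashMap<>();

    void addTransformer(String source, String target, int cost) {
        edges.computeIfAbsent(source, k -> new HashMap<>()).put(target, cost);
    }

    // Returns the cheapest chain of databindings from source to target,
    // or an empty list if the data cannot be mediated.
    List<String> shortestPath(String source, String target) {
        Map<String, Integer> dist = new HashMap<>();
        Map<String, String> prev = new HashMap<>();
        // Queue entries pair a vertex with its distance when it was queued.
        PriorityQueue<Map.Entry<String, Integer>> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));
        dist.put(source, 0);
        queue.add(new SimpleEntry<>(source, 0));

        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> head = queue.poll();
            String u = head.getKey();
            if (head.getValue() > dist.get(u)) {
                continue; // stale entry: a cheaper route to u was already found
            }
            if (u.equals(target)) {
                break; // the cheapest route to the target is now final
            }
            Map<String, Integer> out = edges.getOrDefault(u, Collections.<String, Integer>emptyMap());
            for (Map.Entry<String, Integer> e : out.entrySet()) {
                int candidate = head.getValue() + e.getValue();
                if (candidate < dist.getOrDefault(e.getKey(), Integer.MAX_VALUE)) {
                    dist.put(e.getKey(), candidate);
                    prev.put(e.getKey(), u);
                    queue.add(new SimpleEntry<>(e.getKey(), candidate));
                }
            }
        }

        if (!dist.containsKey(target)) {
            return Collections.emptyList();
        }
        LinkedList<String> path = new LinkedList<>();
        for (String v = target; v != null; v = prev.get(v)) {
            path.addFirst(v);
        }
        return path;
    }
}
```

For instance, with edges A-->E at cost 10, A-->C at cost 2, and C-->E at cost 3, the two-hop route A-->C-->E (total cost 5) is chosen over the direct A-->E.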

I also found some interesting issues:

1) In some cases, the target side prefers to provide a callback to receive data pushed from the source side. The SAX ContentHandler is a good example.

2) How do we supply context data for the transformations? For example, JAXB requires either the Java package names or the classes.

3) How do we pipe the data from one output to another input, for example, connecting an OutputStream to an InputStream?

4) How do we integrate the framework with the Tuscany invocation chain (message handler and interceptor)?
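For issue 3, one option worth considering is the JDK's piped streams, where a producer thread writes to a PipedOutputStream while the consumer reads from the connected PipedInputStream. The sketch below only illustrates that mechanism; the class and method names are hypothetical, and whether a per-wire thread is acceptable in the runtime is an open question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: connects a transformer's output to the next one's input.
final class PipeSketch {

    // Writes the payload on a producer thread and reads it back through the pipe.
    static String pipe(String payload) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out);

            // Producer side: stands in for a transformer writing to an OutputStream.
            Thread producer = new Thread(() -> {
                try (OutputStream o = out) {
                    o.write(payload.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            // Consumer side: stands in for a transformer reading from an InputStream.
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (InputStream i = in) {
                byte[] buffer = new byte[1024];
                int n;
                while ((n = i.read(buffer)) != -1) {
                    sink.write(buffer, 0, n);
                }
            }
            producer.join();
            return new String(sink.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The producer must run on its own thread: the pipe has a bounded buffer, so writing and reading from the same thread can deadlock.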

Thanks, Raymond

Tuscany/TuscanyJava/SCA_Java/DataMediation (last edited 2009-09-20 22:48:47 by localhost)