Proposal to enhance existing XML functionality in Derby in the following ways:
- Add support for setting/getting XML values from a JDBC 3.0 app (without requiring explicit XMLPARSE and XMLSERIALIZE operators). This includes returning meaningful metadata and defining the proper getXXX/setXXX support.
- Slight modifications to existing XML operators to comply with SQL/XML.
- New XMLQUERY operator for retrieving XML query results.
Specification initially posted to DERBY-688. Relevant discussion is buried in different threads, found here:
Please continue any discussion on the derby-dev mailing list; this page will be updated to summarize the discussion.
Discussion: JDBC 4.0
While JDBC 4.0 defines support for a "java.sql.SQLXML" class (see SQLXML), implementation of that class and associated APIs will not be part of this proposal. I will try to avoid making changes/decisions for this proposal that might interfere with future JDBC 4.0 support for SQLXML, but I will not be making changes that are specific to JDBC 4.0; any such changes will have to be part of a separate effort.
Discussion: JDBC metadata and getXXX/setXXX methods
After reading through the relevant parts of the above-mentioned thread, I plan to code the following behavior for XML in JDBC 3.0:
The JDBC 3.0 type for XML will be java.sql.Types.OTHER, since Types.SQLXML isn't defined until JDBC 4.0. In other words, a call to ResultSetMetaData.getColumnType() on an XML column will return Types.OTHER.
The Java API defines Types.OTHER as "the constant in the Java programming language that indicates that the SQL type is database-specific and gets mapped to a Java object that can be accessed via the methods getObject and setObject". HOWEVER, getObject() and setObject() will not actually be allowed on a Derby XML value for JDBC 3.0. The reason is that there is no standard object type to return (java.sql.SQLXML isn't defined until JDBC 4.0), and returning some other type (such as java.sql.Clob) would lead to incompatibility issues down the road, when JDBC 4.0 support is added. Thus, an attempt to call getObject() on an XML value will result in Derby error 22005: "An attempt was made to get a data value of type 'java.lang.Object' from a data value of type 'XML'"; a corresponding error will be thrown for setObject(). As a corollary, calling ResultSetMetaData.getColumnClassName() on an XML column will also throw an error.
NOTE: That said, other relevant getXXX/setXXX methods will be allowed on XML values, namely: get/setString(), get/setAsciiStream(), get/setCharacterStream(), and get/setClob().
The SQL type name that will be returned from metadata (ex. ResultSetMetaData.getColumnTypeName()) will be "XML".
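The planned behavior above could be used from application code roughly as follows. This is only a sketch: the class and method names (XmlColumnSketch, looksLikeXml, readXmlAsString) are made up for illustration and are not part of Derby, and the check simply assumes the metadata values described in this section (Types.OTHER plus a type name of "XML").

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper, not part of Derby: shows how a JDBC 3.0 app might
// recognize an XML column and read it without getObject().
class XmlColumnSketch {

    // True if the metadata values match what this proposal plans to report
    // for an XML column: Types.OTHER with an SQL type name of "XML".
    public static boolean looksLikeXml(int jdbcType, String typeName) {
        return jdbcType == Types.OTHER && "XML".equalsIgnoreCase(typeName);
    }

    // Reads the column as text. getString() is used because, under this
    // proposal, getObject() on an XML value would fail with SQLState 22005.
    public static String readXmlAsString(ResultSet rs, int column) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        if (!looksLikeXml(md.getColumnType(column), md.getColumnTypeName(column))) {
            throw new SQLException("Column " + column + " is not an XML column");
        }
        return rs.getString(column);
    }
}
```

The same pattern would apply to getAsciiStream(), getCharacterStream(), and getClob(); only getObject()/setObject() and getColumnClassName() are expected to fail.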
Discussion: Xerces and Xalan dependencies
As part of my work on these XML enhancements, I'm removing the explicit dependency on Xerces that exists in 10.1. Instead, the XML parser will be picked up from the JVM through the standard JAXP API in the javax.xml.parsers package, which is available in JVMs at the JDBC 3.0 level (J2SE 1.4 and later). If no such parser is found, we will generate an error message saying that a valid parser is required for use of XML. My reason for removing the hard-coded dependency is that different JVMs ship with different XML parsers, and we don't want to force Derby users to download a specific parser if we can avoid it. The Sun JVM versus the IBM JVM is an example of why this is useful: the former comes with Crimson, the latter with Xerces; by using the JAXP API, we ensure that XML will work with both JVMs without requiring a specific parser in the user's classpath.
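To illustrate what "picking up the parser from the JVM" means, here is a minimal sketch using only javax.xml.parsers. The class name JaxpParserSketch and the error text are illustrative assumptions; this is not the actual Derby code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch, not Derby code: parse XML text with whatever JAXP
// parser the JVM (or the classpath) provides, without naming Xerces or Crimson.
class JaxpParserSketch {

    public static Document parse(String xmlText) {
        DocumentBuilderFactory factory;
        try {
            // JAXP locates a parser implementation at runtime.
            factory = DocumentBuilderFactory.newInstance();
        } catch (FactoryConfigurationError e) {
            // No usable parser was found in the JVM or on the classpath.
            throw new IllegalStateException("A valid JAXP parser is required for use of XML", e);
        }
        factory.setNamespaceAware(true);
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xmlText)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML value: " + e.getMessage(), e);
        }
    }

    // Name of the factory class that JAXP selected; differs between JVMs.
    public static String parserFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }
}
```

Calling parserFactoryName() on different JVMs is a quick way to see which parser implementation would actually be used.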
As for Xalan, I plan to retain Derby's dependency on Xalan for the following reasons:
- the standard API available at the JDBC 3.0 level (JAXP) for evaluating XPath/XQuery expressions is limited to the javax.xml.transform.* packages, which (in my experience) make it difficult to process the full results of an XPath/XQuery expression and are also a bit slow;
- Xalan provides a lower-level API for evaluating XPath expressions that, based on some very simple tests I've run, provides much better evaluation performance than the transform API, and allows easier manipulation of the results;
- both Sun and IBM JVMs come with Xalan embedded, which means two of the most commonly used JVMs will work with Xalan-dependent code as they are, requiring no additional downloads.
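To make the first reason above concrete, the sketch below evaluates an XPath expression using nothing but the javax.xml.transform API. The expression has to be wrapped in a generated stylesheet, and the result comes back as serialized text rather than as nodes that can be manipulated. The class name TransformXPathSketch and the stylesheet are my own illustration, not code from Derby or Xalan.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: XPath evaluation through the transform API only.
class TransformXPathSketch {

    public static String evaluate(String xmlText, String xpath) {
        // Minimal escaping so the expression can sit inside an attribute value.
        String escaped = xpath.replace("&", "&amp;")
                              .replace("<", "&lt;")
                              .replace("\"", "&quot;");
        // The XPath expression must be embedded in a throwaway XSLT stylesheet.
        String stylesheet =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:copy-of select=\"" + escaped + "\"/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xmlText)),
                                  new StreamResult(out));
            // All matching nodes arrive concatenated in one serialized string.
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("XPath evaluation failed: " + e.getMessage(), e);
        }
    }
}
```

Compiling a stylesheet for every expression and re-parsing the serialized output to get at individual result nodes is the kind of overhead that a lower-level XPath API avoids.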
Xalan is an XPath processor, and it is my hope that at some point in the near future we will be able to find a full XQuery processor to use for (and perhaps embed within?) Derby. Thus, our dependency on Xalan is hopefully not a permanent one.
For JVMs that predate JDBC 3.0 and/or JVMs that do not include Xalan (ex. do j9 and Apple JVMs include Xalan?), users who wish to use Derby XML will have to put the missing jar files (a JAXP parser implementation and/or Xalan) in their classpaths. Once that is done, everything else should work as normal.
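A user who is unsure whether a given JVM already has what Derby XML needs could run a quick check along these lines. This is a sketch only: the class XmlClasspathCheck is hypothetical, and it assumes that the presence of org.apache.xpath.XPathAPI (a class from the Xalan distribution) is a reasonable indicator that Xalan is visible on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.FactoryConfigurationError;

// Hypothetical diagnostic, not part of Derby: reports whether a JAXP parser
// and Xalan's XPath classes are visible to the current JVM.
class XmlClasspathCheck {

    // True if the named class can be loaded from the current classpath.
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // True if JAXP can locate some DocumentBuilderFactory implementation.
    public static boolean hasJaxpParser() {
        try {
            DocumentBuilderFactory.newInstance();
            return true;
        } catch (FactoryConfigurationError e) {
            return false;
        }
    }

    // Assumption: XPathAPI is used here as a marker class for Xalan.
    public static boolean hasXalan() {
        return isClassAvailable("org.apache.xpath.XPathAPI");
    }

    public static void main(String[] args) {
        System.out.println("JAXP parser available: " + hasJaxpParser());
        System.out.println("Xalan available:       " + hasXalan());
    }
}
```

If either line prints false, the corresponding jar file has to be added to the classpath before Derby XML can be used.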