Nonvalidating Strongly Typed Wrapper Over JSR173 Parser
A wrapper over the JSR173 XMLStreamReader that provides ways of getting strongly typed Java values (like int, double, Date, QName) from text and attributes. It will primarily be used in the fast unmarshaler.
Assumptions
The implementation does not need to validate in order to keep track of the current schema type.
It has to provide fast access to typed data; for a back end that already keeps typed data, like the XMLBeans store, it should avoid printing values to text and parsing them again.
Interface
extends XMLStreamReader
plus provides methods like:
-
public int getIntValue();
public Calendar getCalendarValue();
...
-
Calling getIntValue() should work like XMLStreamReader's getElementText(), i.e.:
It reads the content of a text-only element; an exception is thrown if this is not a text-only element. Regardless of the value of javax.xml.stream.isCoalescing, this method always returns coalesced content (for example, for <a> 1<!--comment-->0 </a>, getIntValue() returns an int with value 10).
Precondition: the current event is START_ELEMENT.
Postcondition: the current event is the corresponding END_ELEMENT.
There is one difference: if an inner element is encountered, the stream is consumed up to the corresponding END_ELEMENT and then an exception is thrown.
If the text is not lexically valid for the implied schema type, an InvalidLexicalValueException is thrown. To be consistent across all methods, the exception contains the Location of the error.
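The contract above can be sketched on top of a plain XMLStreamReader. This is only an illustration under stated assumptions: the class TypedTextSketch and the helpers readIntValue and intOfRoot are hypothetical names, a plain XMLStreamException stands in for InvalidLexicalValueException, and Integer.parseInt plus String.trim only approximate the xsd:int lexical rules.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class TypedTextSketch {

    /** Hypothetical stand-in for getIntValue(); not the proposed implementation. */
    static int readIntValue(XMLStreamReader r) throws XMLStreamException {
        // Precondition: the current event is START_ELEMENT.
        if (r.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new XMLStreamException("current event is not START_ELEMENT", r.getLocation());
        }
        StringBuilder text = new StringBuilder();
        boolean innerElement = false;
        int depth = 1;
        // Consume events up to the END_ELEMENT that matches the starting element,
        // even when an inner element shows up on the way.
        while (depth > 0) {
            switch (r.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    innerElement = true;
                    depth++;
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    depth--;
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                case XMLStreamConstants.SPACE:
                    if (depth == 1) {
                        text.append(r.getText()); // coalesce all text pieces
                    }
                    break;
                default:
                    break; // comments and processing instructions are skipped
            }
        }
        // Postcondition: the current event is the corresponding END_ELEMENT.
        if (innerElement) {
            throw new XMLStreamException("element is not text-only", r.getLocation());
        }
        try {
            return Integer.parseInt(text.toString().trim());
        } catch (NumberFormatException e) {
            // Assumption: a plain XMLStreamException replaces InvalidLexicalValueException here.
            throw new XMLStreamException(
                    "invalid lexical value for xsd:int: '" + text + "'", r.getLocation(), e);
        }
    }

    /** Demo helper: positions a reader on the root element and reads it as an int. */
    static int intOfRoot(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        r.nextTag(); // move from START_DOCUMENT to the root START_ELEMENT
        return readIntValue(r);
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(intOfRoot("<a> 1<!--comment-->0 </a>")); // prints 10
    }
}
```

The depth counter is what implements the one difference from getElementText(): on an inner element the loop keeps reading until the matching END_ELEMENT and only then reports the error, so the caller is left at a predictable position.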
Methods and implied built-in schema types
for xsd:string and derived types
-
/** Returns the value as a {@link String}. */ String getStringValue() throws XMLStreamException;
for xsd:boolean
-
/** Returns the value as a boolean. */ boolean getBooleanValue() throws XMLStreamException;
for xsd:byte
-
/** Returns the value as a byte. */ public byte getByteValue() throws XMLStreamException;
for xsd:short and derived types
-
/** Returns the value as a short. */ public short getShortValue() throws XMLStreamException;
for xsd:int and derived types
-
/** Returns the value as an int. */ public int getIntValue() throws XMLStreamException;
for xsd:long and derived types
-
/** Returns the value as a long. */ public long getLongValue() throws XMLStreamException;
for xsd:integer and derived types
-
/** Returns the value as a {@link java.math.BigInteger}. */ public BigInteger getBigIntegerValue() throws XMLStreamException;
for xsd:decimal and derived types
-
/** Returns the value as a {@link java.math.BigDecimal}. */ public BigDecimal getBigDecimalValue() throws XMLStreamException;
for xsd:float
-
/** Returns the value as a float. */ public float getFloatValue() throws XMLStreamException;
for xsd:double
-
/** Returns the value as a double. */ public double getDoubleValue() throws XMLStreamException;
for xsd:hexBinary
-
/** Returns the decoded hexbinary value as an InputStream. */ InputStream getHexBinaryValue() throws XMLStreamException;
for xsd:base64Binary
-
/** Returns the decoded base64 value as an InputStream. */ InputStream getBase64Value() throws XMLStreamException;
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay (for some of them defaults are used, as in GDateSpecification.getCalendar())
-
/** Returns the value as a {@link java.util.Calendar}. */ Calendar getCalendarValue() throws XMLStreamException;
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay (for some of them defaults are used, as in GDateSpecification.getDate())
/** Returns the value as a {@link java.util.Date}. */ Date getDateValue() throws XMLStreamException;
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay
-
/** Returns the value as a {@link org.apache.xmlbeans.GDate}. */ GDate getGDateValue() throws XMLStreamException;
for xsd:duration
-
/** Returns the value as a {@link org.apache.xmlbeans.GDuration}. */ GDuration getGDurationValue() throws XMLStreamException;
for xsd:QName
/** Returns the value as a {@link javax.xml.namespace.QName}. */ QName getQNameValue() throws XMLStreamException;
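Of the methods listed, getQNameValue() is the one that needs more than the text itself, because an xsd:QName value such as p:item has to be resolved against the namespaces in scope at the current position. A minimal sketch of that step, assuming a plain XMLStreamReader underneath; the class QNameValueSketch and the helper readQNameValue are hypothetical names, and a plain XMLStreamException again stands in for InvalidLexicalValueException:

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class QNameValueSketch {

    /** Hypothetical stand-in for getQNameValue(); not the proposed implementation. */
    static QName readQNameValue(XMLStreamReader r) throws XMLStreamException {
        // getElementText() leaves the reader on the matching END_ELEMENT.
        String lexical = r.getElementText().trim();
        int colon = lexical.indexOf(':');
        String prefix = colon < 0 ? "" : lexical.substring(0, colon);
        String local = lexical.substring(colon + 1);
        if (local.isEmpty() || local.indexOf(':') >= 0) {
            throw new XMLStreamException("invalid QName: '" + lexical + "'", r.getLocation());
        }
        // Resolve the prefix against the namespaces in scope at the current position.
        String uri = r.getNamespaceURI(prefix);
        if (uri == null) {
            uri = "";
        }
        if (!prefix.isEmpty() && uri.isEmpty()) {
            throw new XMLStreamException("unbound prefix: '" + prefix + "'", r.getLocation());
        }
        return new QName(uri, local, prefix);
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root xmlns:p='urn:example'><a> p:item </a></root>";
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        r.nextTag(); // <root>
        r.nextTag(); // <a>
        QName q = readQNameValue(r);
        System.out.println(q.getNamespaceURI() + " " + q.getLocalPart());
    }
}
```

The example declares the prefix on an ancestor element so that it is certainly still in scope when the text has been read; whether declarations made on the element itself remain visible at its END_ELEMENT is left to the underlying parser.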
Attributes
Because of the way XMLStreamReader is designed, the stream can be positioned on attributes only when a substream represents the content of an element, so that the attributes have no element to hang on. For attributes of this kind, the above methods should work.
For attributes in a usual document, the interface provides two more sets of methods, in the same manner as XMLStreamReader's getAttributeValue(int index) and getAttributeValue(String uri, String local)
-
public int getAttributeIntValue(int index) throws XMLStreamException;
public int getAttributeIntValue(String uri, String local) throws XMLStreamException;
The same pattern applies to the rest of the types.
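As an illustration of the two attribute forms, the following sketch layers them over XMLStreamReader's own getAttributeValue methods. The class AttributeValueSketch and its static helpers are hypothetical names, a plain XMLStreamException stands in for InvalidLexicalValueException, and passing null as the namespace URI relies on XMLStreamReader not checking the namespace in that case.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AttributeValueSketch {

    /** Hypothetical stand-in for getAttributeIntValue(int index). */
    static int attributeIntValue(XMLStreamReader r, int index) throws XMLStreamException {
        return parseInt(r.getAttributeValue(index), r);
    }

    /** Hypothetical stand-in for getAttributeIntValue(String uri, String local). */
    static int attributeIntValue(XMLStreamReader r, String uri, String local)
            throws XMLStreamException {
        return parseInt(r.getAttributeValue(uri, local), r);
    }

    private static int parseInt(String lexical, XMLStreamReader r) throws XMLStreamException {
        if (lexical == null) {
            throw new XMLStreamException("attribute not found", r.getLocation());
        }
        try {
            return Integer.parseInt(lexical.trim());
        } catch (NumberFormatException e) {
            // Assumption: a plain XMLStreamException replaces InvalidLexicalValueException here.
            throw new XMLStreamException(
                    "invalid lexical value for xsd:int: '" + lexical + "'", r.getLocation(), e);
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader("<item count=' 42 '/>"));
        r.nextTag(); // position on <item>, where attributes are available
        System.out.println(attributeIntValue(r, 0));             // by index
        System.out.println(attributeIntValue(r, null, "count")); // by name
    }
}
```

Unlike the element-text methods, these do not move the stream: the reader stays on the START_ELEMENT, so several attributes can be read one after another.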
White space
For easier access to extended or xsd:list values, getStringValue() and getAttributeStringValue() have one more form where one can pass in the white space style to be applied.
The three white space styles correspond to the values of the XML Schema whiteSpace facet: WS_PRESERVE, WS_REPLACE and WS_COLLAPSE.
-
public String getStringValue(int wsStyle) throws XMLStreamException;
public String getAttributeStringValue(int index, int wsStyle) throws XMLStreamException;
public String getAttributeStringValue(String uri, String local, int wsStyle) throws XMLStreamException;
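What the three styles do to a value can be sketched as follows, following the XML Schema whiteSpace facet: preserve leaves the text alone, replace turns every tab, line feed and carriage return into a space, and collapse additionally merges runs of spaces and strips leading and trailing ones. The class WhiteSpaceSketch, the method applyWhiteSpace and the numeric values of the constants are assumptions made for the illustration; the text above only names the constants.

```java
import java.util.Arrays;

class WhiteSpaceSketch {

    // Assumed values; only the constant names come from the interface description.
    static final int WS_PRESERVE = 1;
    static final int WS_REPLACE = 2;
    static final int WS_COLLAPSE = 3;

    /** Hypothetical helper applying one of the three white space styles. */
    static String applyWhiteSpace(CharSequence text, int wsStyle) {
        switch (wsStyle) {
            case WS_PRESERVE:
                return text.toString();
            case WS_REPLACE:
            case WS_COLLAPSE:
                break;
            default:
                throw new IllegalArgumentException("unknown white space style: " + wsStyle);
        }
        StringBuilder out = new StringBuilder(text.length());
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean ws = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (wsStyle == WS_REPLACE) {
                out.append(ws ? ' ' : c);
            } else if (ws) {
                // Collapse: remember a separator only once something was already emitted,
                // which drops leading white space and merges runs.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                out.append(c);
            }
        }
        return out.toString(); // a trailing pending space is never written
    }

    public static void main(String[] args) {
        String raw = " 1\t 2\n3 ";
        System.out.println("[" + applyWhiteSpace(raw, WS_REPLACE) + "]");
        // After collapsing, an xsd:list value splits cleanly on single spaces.
        System.out.println(Arrays.toString(applyWhiteSpace(raw, WS_COLLAPSE).split(" ")));
    }
}
```

This is why the extra form helps with xsd:list values: once the text is collapsed, the caller can split it on single spaces and parse each item itself.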
Notes
Given that an implementation of this interface will not run validation on the stream, I will not include support for lists, enumerations and unions. They could be introduced, but the user would have to push the correct schema type. Because back ends are usually not list/enumeration/union aware, there would be no performance win from this. If interest arises, support can be added later.
Implementation optimization: because all the number-parsing code in the JDK takes a String as its input parameter, an implementation will have to implement those parsing methods over the CharSequence interface to avoid creating all these Strings. The white space collapsing should be done at the same time.
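A minimal sketch of such a parser for the int case, assuming the hypothetical names CharSequenceIntParser and parseInt: it skips leading and trailing XML white space, handles an optional sign and detects overflow, all in a single pass over the CharSequence and without creating an intermediate String on the success path. It throws NumberFormatException; a real implementation would presumably map that to InvalidLexicalValueException.

```java
class CharSequenceIntParser {

    private static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** Hypothetical helper: parses an xsd:int lexical value directly from a CharSequence. */
    static int parseInt(CharSequence cs) {
        int start = 0;
        int end = cs.length();
        // White space collapsing for a single token: trim both ends in place.
        while (start < end && isXmlSpace(cs.charAt(start))) start++;
        while (end > start && isXmlSpace(cs.charAt(end - 1))) end--;
        if (start == end) {
            throw new NumberFormatException("empty value");
        }
        boolean negative = false;
        char first = cs.charAt(start);
        if (first == '-' || first == '+') {
            negative = first == '-';
            start++;
            if (start == end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        // Accumulate as a negative number so that Integer.MIN_VALUE is representable.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int multMin = limit / 10;
        int result = 0;
        for (int i = start; i < end; i++) {
            int digit = cs.charAt(i) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            if (result < multMin) {
                throw new NumberFormatException("value out of range for int");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("value out of range for int");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }

    public static void main(String[] args) {
        // A StringBuilder stands in for a parser buffer exposed as a CharSequence.
        CharSequence buffer = new StringBuilder(" \n 10\t");
        System.out.println(parseInt(buffer)); // prints 10
    }
}
```

The same shape carries over to the other numeric types. Java 9 and later also provide Integer.parseInt(CharSequence, int, int, int), but that method does not skip white space, so the trimming step would still be needed.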