Nonvalidating Strongly Typed Wrapper Over JSR173 Parser
A wrapper over the JSR173 XMLStreamReader that provides ways of getting strongly typed Java values (like int, double, Date, QName) from text and attributes. It will primarily be used in the fast unmarshaler. Please send comments to cezar.andrei at bea.com or xmlbeans-dev at xml.apache.org.
Assumptions
- The implementation doesn't need to validate in order to keep track of the current schema type.
- It has to provide fast access to typed data. With a back end that already keeps typed data, like the XMLBeans store, it should avoid printing the value to text and parsing it again.
Interface
extends XMLStreamReader
- plus provides methods like:
{{{ public int getIntValue();
public Calendar getCalendarValue();
... }}}
Calling getIntValue() should work like XMLStreamReader's getElementText(), i.e.:
It reads the content of a text-only element; an exception is thrown if this is not a text-only element. Regardless of the value of javax.xml.stream.isCoalescing, this method always returns coalesced content (e.g. for <a> 1<!--comment-->0 </a>, getIntValue() will return an int with value 10).
- Precondition: the current event is START_ELEMENT.
- Postcondition: the current event is the corresponding END_ELEMENT.
- There is one difference: in case of an inner element, the stream will be consumed up to the corresponding END_ELEMENT and then an exception is thrown.
If the text inside is not lexically correct for the implied schema type, an InvalidLexicalValueException will be thrown. To be consistent across all methods, the exception will contain the Location of the error.
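The contract above can be approximated on top of a plain XMLStreamReader. The following is only a sketch under stated assumptions: the class name IntValueSketch and the helper readInt are hypothetical, XMLStreamException stands in for the proposed InvalidLexicalValueException, and Integer.parseInt is a looser check than the xsd:int lexical rules. It does not consume the stream up to END_ELEMENT when an inner element is found, which is the one difference listed above.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper that emulates the proposed getIntValue() on a plain XMLStreamReader. */
final class IntValueSketch {

    /**
     * Precondition: the reader is on START_ELEMENT.
     * Postcondition: the reader is on the corresponding END_ELEMENT.
     */
    static int getIntValue(XMLStreamReader reader) throws XMLStreamException {
        // getElementText() returns coalesced text, skips comments, and
        // throws XMLStreamException if it meets an inner element.
        String text = reader.getElementText();
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // Stand-in for InvalidLexicalValueException; it carries the Location.
            throw new XMLStreamException(
                    "invalid xsd:int value: '" + text + "'", reader.getLocation(), e);
        }
    }

    /** Parses a document whose root element holds an int and returns that int. */
    static int readInt(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        reader.nextTag(); // move from START_DOCUMENT to the root START_ELEMENT
        int value = getIntValue(reader);
        if (reader.getEventType() != XMLStreamConstants.END_ELEMENT) {
            throw new IllegalStateException("expected END_ELEMENT after reading the value");
        }
        reader.close();
        return value;
    }
}
```

With the example document from the text, IntValueSketch.readInt("<a> 1<!--comment-->0 </a>") should return 10, because the comment is skipped and the surrounding white space is trimmed before parsing.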
Methods and implied built-in schema types
{{{ /** Returns the value as a {@link String}. */
String getStringValue() throws XMLStreamException; }}}
for xsd:string and derived types
{{{ /** Returns the value as a boolean. */
boolean getBooleanValue() throws XMLStreamException; }}}
for xsd:boolean
{{{ /** Returns the value as a byte. */
public byte getByteValue() throws XMLStreamException; }}}
for xsd:byte
{{{ /** Returns the value as a short. */
public short getShortValue() throws XMLStreamException; }}}
for xsd:short and derived types
{{{ /** Returns the value as an int. */
public int getIntValue() throws XMLStreamException; }}}
for xsd:int and derived types
{{{ /** Returns the value as a long. */
public long getLongValue() throws XMLStreamException; }}}
for xsd:long and derived types
{{{ /** Returns the value as a {@link java.math.BigInteger}. */
public BigInteger getBigIntegerValue() throws XMLStreamException; }}}
for xsd:integer and derived types
{{{ /** Returns the value as a {@link java.math.BigDecimal}. */
public BigDecimal getBigDecimalValue() throws XMLStreamException; }}}
for xsd:decimal and derived types
{{{ /** Returns the value as a float. */
public float getFloatValue() throws XMLStreamException; }}}
for xsd:float
{{{ /** Returns the value as a double. */
public double getDoubleValue() throws XMLStreamException; }}}
for xsd:double
{{{ /** Returns the decoded hexBinary value as an InputStream. */
InputStream getHexBinaryValue() throws XMLStreamException; }}}
for xsd:hexBinary
{{{ /** Returns the decoded base64 value as an InputStream. */
InputStream getBase64Value() throws XMLStreamException; }}}
for xsd:base64Binary
{{{ /** Returns the value as a {@link java.util.Calendar}. */
Calendar getCalendarValue() throws XMLStreamException; }}}
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay (for some of them defaults are used, the same as in GDateSpecification.getCalendar()). Open question: should this method return XmlCalendar, which extends GregorianCalendar?
{{{ /** Returns the value as a {@link java.util.Date}. */
Date getDateValue() throws XMLStreamException; }}}
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay (for some of them defaults are used, the same as in GDateSpecification.getDate())
{{{ /** Returns the value as a {@link org.apache.xmlbeans.GDate}. */
GDate getGDateValue() throws XMLStreamException; }}}
for all date-related schema types: dateTime, time, date, gYearMonth, gYear, gMonth, gDay
{{{ /** Returns the value as a {@link org.apache.xmlbeans.GDuration}. */
GDuration getGDurationValue() throws XMLStreamException; }}}
for xsd:duration
{{{ /** Returns the value as a {@link javax.xml.namespace.QName}. */
QName getQNameValue() throws XMLStreamException; }}}
for xsd:QName
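Each method implies a lexical space that differs from what the JDK parsers accept. As an illustration only, the sketch below shows two such conversions: xsd:boolean, whose lexical forms are true, false, 1 and 0 (Boolean.parseBoolean does not treat 1 as true), and xsd:hexBinary decoded into an InputStream. The class and method names are hypothetical, and IllegalArgumentException stands in for the proposed InvalidLexicalValueException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Hypothetical lexical-to-value conversions for two of the built-in schema types. */
final class LexicalSketch {

    /** xsd:boolean accepts exactly "true", "false", "1" and "0" after white space collapsing. */
    static boolean parseXsdBoolean(String lexical) {
        switch (lexical.trim()) {
            case "true":
            case "1":
                return true;
            case "false":
            case "0":
                return false;
            default:
                throw new IllegalArgumentException("invalid xsd:boolean value: '" + lexical + "'");
        }
    }

    /** Decodes an xsd:hexBinary value (pairs of hex digits) into a stream of bytes. */
    static InputStream decodeHexBinary(String lexical) {
        String hex = lexical.trim();
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hexBinary needs an even number of digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = hexDigit(hex.charAt(2 * i));
            int lo = hexDigit(hex.charAt(2 * i + 1));
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid hexBinary value: '" + lexical + "'");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return new ByteArrayInputStream(out);
    }

    /** Returns the value of an ASCII hex digit, or -1 if the character is not one. */
    private static int hexDigit(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

A real implementation would read the characters from the stream directly instead of taking a String, in line with the goal of avoiding intermediate Strings.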
Attributes
Because of the way XMLStreamReader is designed, the stream can be positioned on attributes only when a substream represents the content of an element, so the attributes don't have an element to hang on. For this kind of attribute the above methods should work.
For attributes in a regular document, the interface provides two more sets of methods, in the same manner as XMLStreamReader's getAttributeValue(int index) and getAttributeValue(String uri, String local):
{{{ public int getAttributeIntValue(int index) throws XMLStreamException;
public int getAttributeIntValue(String uri, String local) throws XMLStreamException; }}}
The same pattern applies to the rest of the types.
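As a rough illustration of the two attribute forms, the sketch below layers them over XMLStreamReader's existing getAttributeValue methods. The class AttributeIntSketch and its readAttribute helpers are hypothetical, the methods are static only so that the example is self-contained, and XMLStreamException again stands in for InvalidLexicalValueException.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical static versions of the two getAttributeIntValue forms. */
final class AttributeIntSketch {

    /** By index; the reader must be on START_ELEMENT. */
    static int getAttributeIntValue(XMLStreamReader reader, int index) throws XMLStreamException {
        return parse(reader, reader.getAttributeValue(index));
    }

    /** By namespace URI and local name; a null uri means the namespace is not checked. */
    static int getAttributeIntValue(XMLStreamReader reader, String uri, String local)
            throws XMLStreamException {
        String text = reader.getAttributeValue(uri, local);
        if (text == null) {
            throw new XMLStreamException("no attribute '" + local + "'", reader.getLocation());
        }
        return parse(reader, text);
    }

    private static int parse(XMLStreamReader reader, String text) throws XMLStreamException {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new XMLStreamException(
                    "invalid xsd:int value: '" + text + "'", reader.getLocation(), e);
        }
    }

    /** Reads an int attribute of the root element by local name. */
    static int readAttribute(String xml, String local) throws XMLStreamException {
        return getAttributeIntValue(openOnRoot(xml), null, local);
    }

    /** Reads an int attribute of the root element by index. */
    static int readAttribute(String xml, int index) throws XMLStreamException {
        return getAttributeIntValue(openOnRoot(xml), index);
    }

    private static XMLStreamReader openOnRoot(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        reader.nextTag(); // position on the root START_ELEMENT
        return reader;
    }
}
```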
White space
For easier access to extended or xsd:list values, the get*StringValue() methods have one more form that takes the white space style to be applied.
The three white space styles correspond to the XML Schema whiteSpace facet: WS_PRESERVE, WS_REPLACE and WS_COLLAPSE.
{{{ public String getStringValue(int wsStyle) throws XMLStreamException;
public String getAttributeStringValue(int index, int wsStyle) throws XMLStreamException;
public String getAttributeStringValue(String uri, String local, int wsStyle) throws XMLStreamException; }}}
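The three styles follow the XML Schema whiteSpace facet: preserve leaves the text unchanged, replace turns each tab, line feed and carriage return into a space, and collapse additionally merges runs of spaces and strips leading and trailing ones. A minimal sketch follows, assuming hypothetical numeric values for the constants (the text above does not define them) and a hypothetical helper named applyWhitespace.

```java
/** Hypothetical implementation of the three white space styles. */
final class WhitespaceSketch {
    // The numeric values are assumptions; only the names come from the text.
    static final int WS_PRESERVE = 1;
    static final int WS_REPLACE = 2;
    static final int WS_COLLAPSE = 3;

    static String applyWhitespace(CharSequence text, int wsStyle) {
        if (wsStyle == WS_PRESERVE) {
            return text.toString();
        }
        StringBuilder sb = new StringBuilder(text.length());
        if (wsStyle == WS_REPLACE) {
            // Each tab, line feed and carriage return becomes a single space.
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                sb.append(c == '\t' || c == '\n' || c == '\r' ? ' ' : c);
            }
            return sb.toString();
        }
        if (wsStyle != WS_COLLAPSE) {
            throw new IllegalArgumentException("unknown white space style: " + wsStyle);
        }
        // Collapse: emit one space between tokens, none at the start or the end.
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With WS_COLLAPSE, an xsd:list value such as "  1 \t 2\n3 " becomes "1 2 3", which can then be split on single spaces.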
Notes
- Given that an implementation of this interface will not run validation on the stream, I will not include support for lists, enumerations and unions. They might be possible to introduce, but the user would have to push the correct schema type. Because the back ends are usually not list/enumeration/union aware, there would be no performance win from this. If interest arises, support can be added later.
Implementation optimization: because all numeric parsing code in the JDK takes a String as the input parameter, an implementation will have to implement those parsing methods on the CharSequence interface to avoid creating all these Strings. The whitespace collapsing should be done at the same time.
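The optimization can be sketched for the xsd:int case as follows. This is an assumption about how such a parser might look, not the actual implementation: the class and method names are hypothetical, and NumberFormatException stands in for InvalidLexicalValueException. For a numeric type, collapsing reduces to skipping leading and trailing white space, since any inner white space makes the value invalid.

```java
/** Hypothetical xsd:int parser that reads a CharSequence without creating a String. */
final class CharSequenceIntSketch {

    static int parseInt(CharSequence cs) {
        int start = 0;
        int end = cs.length();
        // Skip leading and trailing white space in place instead of trimming a copy.
        while (start < end && isXmlSpace(cs.charAt(start))) start++;
        while (end > start && isXmlSpace(cs.charAt(end - 1))) end--;
        if (start == end) {
            throw new NumberFormatException("empty value");
        }
        boolean negative = false;
        char first = cs.charAt(start);
        if (first == '-' || first == '+') {
            negative = first == '-';
            start++;
        }
        if (start == end) {
            throw new NumberFormatException("sign without digits");
        }
        long acc = 0; // a long accumulator keeps the range check simple
        for (int i = start; i < end; i++) {
            char c = cs.charAt(i);
            if (c < '0' || c > '9') {
                throw new NumberFormatException("unexpected character at index " + i);
            }
            acc = acc * 10 + (c - '0');
            if (acc > 2147483648L) { // magnitude already beyond -Integer.MIN_VALUE
                throw new NumberFormatException("value out of range for xsd:int");
            }
        }
        if (negative) {
            acc = -acc;
        } else if (acc > Integer.MAX_VALUE) {
            throw new NumberFormatException("value out of range for xsd:int");
        }
        return (int) acc;
    }

    private static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }
}
```

Because the method only calls length() and charAt(), it can run directly over a buffer-backed CharSequence supplied by the parser.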