StAX

1. What is StAX?

StAX (Streaming API for XML) is a standard pull-parser API for Java that covers both directions: reading XML documents and writing them.

StAX is an alternative way of parsing XML documents, based on stream processing. It lets an application read and modify an XML stream as a sequence of events. SAX is also event-based, but a SAX parser pushes events to application callbacks, whereas with StAX the application pulls events one at a time, when it is ready for them.
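
To make the pull model concrete, here is a minimal sketch that asks the parser for one event at a time and stops as soon as it finds the element it wants. It uses XMLStreamReader, one of the interfaces described in the sections below. The class name PullModelSketch, the method firstElementText, and the sample document are assumptions made for this illustration only.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class and method names, used only for this illustration.
class PullModelSketch {

    // Returns the text of the first element with the given local name,
    // or null if the document contains no such element.
    static String firstElementText(String xml, String localName) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    // The application asks for the next event; nothing is pushed to it.
                    int event = r.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && localName.equals(r.getLocalName())) {
                        // Stop pulling as soon as we have what we need.
                        // getElementText() expects a text-only element.
                        return r.getElementText();
                    }
                }
                return null;
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<article><title>StAX</title><body>Text</body></article>";
        System.out.println(firstElementText(xml, "title"));
    }
}
```

Because the loop belongs to the application, it can return early and leave the rest of the document unread, which is awkward to express with SAX callbacks.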

The StAX API consists of two API sets: a cursor API and an iterator API.

2. Cursor API

The cursor API is a low-level API that lets an application work with XML as a stream of events. The application can check the state of the parser, get information about the most recent event, and advance to the next one. The main cursor interfaces are XMLStreamReader and XMLStreamWriter:

public interface XMLStreamReader {
    int next() throws XMLStreamException;
    boolean hasNext() throws XMLStreamException;
    String getText();
    String getLocalName();
    String getNamespaceURI();
    ... 
}

 

public interface XMLStreamWriter {
    void writeStartElement(String localName) throws XMLStreamException;
    void writeEndElement() throws XMLStreamException;
    void writeCharacters(String text) throws XMLStreamException;
    ... 
}
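
The writing half of the cursor API can be sketched as follows. This is a minimal example, assuming an in-memory StringWriter as the output; the class name CursorWriterSketch, the method writeArticle, and the element names are made up for the illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name; the element names are made up for illustration.
class CursorWriterSketch {

    // Builds a small <article> document in memory and returns it as a string.
    static String writeArticle(String title, String body) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            try {
                w.writeStartDocument();
                w.writeStartElement("article");
                w.writeStartElement("title");
                w.writeCharacters(title);  // characters such as & and < are escaped
                w.writeEndElement();       // closes <title>
                w.writeStartElement("body");
                w.writeCharacters(body);
                w.writeEndElement();       // closes <body>
                w.writeEndElement();       // closes <article>
                w.writeEndDocument();
                w.flush();
            } finally {
                w.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeArticle("StAX", "Pull parsing & writing"));
    }
}
```

Note that writeEndElement() takes no argument: the writer keeps track of which elements are open and closes the innermost one.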

2.1. javax.xml.stream.XMLStreamReader Example

Let's look at a simple example that prints the content of a small XML document:

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;

public class MyStax1 {
    public static void main(String[] args) throws XMLStreamException, IOException {
        File xmlFile = new File("article.xml");
        Source xmlSource = new StreamSource(xmlFile);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader r = factory.createXMLStreamReader(xmlSource);
        try {
            int event = r.getEventType();
            while (true) {
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        System.out.println("<" + r.getName() + ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        System.out.println("</" + r.getName() + ">");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (r.isWhiteSpace())
                            break;
                        System.out.println(r.getText());
                        break;
                }
                if (!r.hasNext())
                    break;

                event = r.next();
            }
        } finally {
            r.close();
        }
    }
}

3. Iterator API

The iterator API is a high-level API that represents an XML document stream as a sequence of discrete event objects. The core of the iterator API consists of the XMLEventReader and XMLEventWriter interfaces:

public interface XMLEventReader extends Iterator {
    XMLEvent nextEvent() throws XMLStreamException;
    boolean hasNext();
    XMLEvent peek() throws XMLStreamException;
    String getElementText() throws XMLStreamException;
    XMLEvent nextTag() throws XMLStreamException;
    Object getProperty(java.lang.String name) throws java.lang.IllegalArgumentException;
    void close() throws XMLStreamException;
}

 

public interface XMLEventWriter {
    void flush() throws XMLStreamException;
    void close() throws XMLStreamException;
    void add(XMLEvent e) throws XMLStreamException;
    void add(XMLEventReader reader) throws XMLStreamException;
    String getPrefix(String uri) throws XMLStreamException;
    void setPrefix(String prefix, String uri) throws XMLStreamException;
    void setDefaultNamespace(String uri) throws XMLStreamException;
    void setNamespaceContext(NamespaceContext context) throws XMLStreamException;
    NamespaceContext getNamespaceContext();
}
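
As a rough sketch of the writing side of the iterator API, the example below creates event objects with XMLEventFactory and passes them to an XMLEventWriter. The class name EventWriterSketch, the method writeNote, and the note element with its lang attribute are assumptions made for this illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical class name; the <note> element is made up for illustration.
class EventWriterSketch {

    // Writes <note lang="en">text</note> by adding one event object at a time.
    static String writeNote(String text) {
        StringWriter out = new StringWriter();
        XMLEventFactory events = XMLEventFactory.newInstance();
        try {
            XMLEventWriter w = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            try {
                w.add(events.createStartDocument());
                // Arguments are prefix, namespace URI, local name (no namespace here).
                w.add(events.createStartElement("", "", "note"));
                // An Attribute is itself an XMLEvent; it attaches to the open start tag.
                w.add(events.createAttribute("lang", "en"));
                w.add(events.createCharacters(text));
                w.add(events.createEndElement("", "", "note"));
                w.add(events.createEndDocument());
                w.flush();
            } finally {
                w.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeNote("hello"));
    }
}
```

Compared with the cursor writer, every call here passes a self-contained event object, which is what makes it possible to buffer, reorder, or forward events between components.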

3.1. javax.xml.stream.XMLEventReader Example

import javax.xml.stream.*;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;

public class MyStax2 {
    public static void main(String[] args) throws XMLStreamException, IOException {
        File xmlFile = new File("article.xml");
        Source xmlSource = new StreamSource(xmlFile);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader r = factory.createXMLEventReader(xmlSource);
        try {
            while (r.hasNext()) {
                XMLEvent event = r.nextEvent();
                switch (event.getEventType()) {
                    case XMLStreamConstants.START_ELEMENT:
                        StartElement startElement = event.asStartElement();
                        System.out.println("<" + startElement.getName().getLocalPart() + ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        EndElement endElement = event.asEndElement();
                        System.out.println("</" + endElement.getName().getLocalPart() + ">");
                        break;
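                    case XMLStreamConstants.CHARACTERS:
                        Characters characters = event.asCharacters();
                        // Skip whitespace-only text between elements and print the text data itself.
                        if (!characters.isWhiteSpace())
                            System.out.println(characters.getData());
                        break;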
                }
            }
        } finally {
            r.close();
        }
    }
}

4. Cursor vs. Iterator API

In general, using the iterator API is recommended because it is more flexible, extensible and object-oriented.

Benefits of the Iterator API:

  • Objects created from the XMLEvent subclasses are immutable, and can be used in arrays, lists and maps.
  • It is possible to create subtypes of XMLEvent that are either completely new information items or extensions of existing items but with additional methods.
  • Events from an XML event stream can be added and removed in much simpler ways than with the cursor API.
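
The first benefit can be illustrated with a short sketch that stores events in a list and inspects them after the reader has been closed. The class name EventListSketch and its two methods are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical class and method names, used only for this illustration.
class EventListSketch {

    // Reads the whole document and keeps every event object in a list.
    static List<XMLEvent> readAll(String xml) {
        List<XMLEvent> events = new ArrayList<>();
        try {
            XMLEventReader r = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    events.add(r.nextEvent());
                }
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return events;
    }

    // Works on the stored events only; the reader is already closed at this point.
    static List<String> startElementNames(List<XMLEvent> events) {
        List<String> names = new ArrayList<>();
        for (XMLEvent event : events) {
            if (event.isStartElement()) {
                names.add(event.asStartElement().getName().getLocalPart());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<XMLEvent> events = readAll("<a><b/><c>text</c></a>");
        System.out.println(startElementNames(events));
    }
}
```

With the cursor API the same approach would require copying the data out of the reader at each step, because the cursor only exposes the current position.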

Prefer the cursor API:

  • If the code runs in a particularly memory-constrained environment, such as Java ME.
  • If performance is the highest priority.

Prefer the iterator API:

  • If it is necessary to create XML processing pipelines.
  • If it is necessary to modify the event stream.
  • If the application should be able to handle pluggable processing of the event stream.
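
A small processing pipeline that modifies the event stream might look like the sketch below: an EventFilter removes comment events, and the filtered reader is passed as a whole to an XMLEventWriter. The class name EventPipelineSketch and the method stripComments are assumptions for this example, and dropping comments is just one arbitrary choice of modification.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;

// Hypothetical class and method names, used only for this illustration.
class EventPipelineSketch {

    // Copies the document from a reader to a writer, dropping comment events on the way.
    static String stripComments(String xml) {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        StringWriter out = new StringWriter();
        try {
            XMLEventReader source = inputFactory.createXMLEventReader(new StringReader(xml));
            // Stage 1: a pluggable filter that accepts everything except comments.
            EventFilter noComments =
                    event -> event.getEventType() != XMLStreamConstants.COMMENT;
            XMLEventReader filtered = inputFactory.createFilteredReader(source, noComments);
            // Stage 2: the writer consumes whatever events the filtered reader delivers.
            XMLEventWriter sink = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            try {
                sink.add(filtered);
                sink.flush();
            } finally {
                sink.close();
                filtered.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not process XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("<a><!-- internal note --><b>text</b></a>"));
    }
}
```

Further stages can be added by wrapping the filtered reader in another filtered reader, since each stage only sees an XMLEventReader.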

5. Advantages of StAX

StAX offers a simpler programming model than SAX and more efficient memory management than DOM. Unlike SAX, StAX can write XML data as well as read it.

Sources: "StAX'ing up XML, Part 1: An introduction to Streaming API for XML (StAX)"; "Describe the functions and capabilities of the APIs included within JAXP"

See also JAXP, SAX, DOM, XSLT


