StAX Parser

1. What is StAX?

StAX (Streaming API for XML) is a standard that defines pull-parser interfaces for both reading and writing XML documents.

StAX is an alternative way of parsing XML documents, based on streaming. It exposes an XML stream as a sequence of events that can be read and written. SAX is also event-based, but it is a push parser: the parser drives the process and calls back into the application's handlers. StAX is a pull parser: the application asks for the next event when it is ready, so it can process events one by one and stop whenever it wants.

The StAX API consists of two API sets: a cursor API and an iterator API.

2. Cursor API

The cursor API is a low-level API that treats XML as a stream of events. The application can check the state of the parser, get information about the current event, and move on to the next one. The main cursor interfaces are XMLStreamReader and XMLStreamWriter:

public interface XMLStreamReader {
    int next() throws XMLStreamException;
    boolean hasNext() throws XMLStreamException;
    String getText();
    String getLocalName();
    String getNamespaceURI();
    ... 
}

 

public interface XMLStreamWriter {
    void writeStartElement(String localName) throws XMLStreamException;
    void writeEndElement() throws XMLStreamException;
    void writeCharacters(String text) throws XMLStreamException;
    ... 
}
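The writer side of the cursor API can be sketched in the same spirit. This is a minimal example, not a definitive implementation: the class name StaxWriterSketch, the method writeArticle, and the element names are made up for illustration, and the document is written to an in-memory StringWriter.

```java
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import java.io.StringWriter;

class StaxWriterSketch {

    // Writes <article><title>...</title></article> into a string.
    // The element names are hypothetical, chosen only for this sketch.
    static String writeArticle(String title) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        try {
            w.writeStartDocument();
            w.writeStartElement("article");
            w.writeStartElement("title");
            w.writeCharacters(title);   // characters such as & are escaped by the writer
            w.writeEndElement();        // closes <title>
            w.writeEndElement();        // closes <article>
            w.writeEndDocument();
            w.flush();
        } finally {
            w.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeArticle("StAX & friends"));
    }
}
```

Note that writeEndElement() takes no argument: the writer keeps track of the open elements and closes the innermost one.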

2.1. javax.xml.stream.XMLStreamReader Example

Let's look at a simple example that prints out the content of an XML document:

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;

public class MyStax1 {
    public static void main(String[] args) throws XMLStreamException, IOException {
        File xmlFile = new File("article.xml");
        Source xmlSource = new StreamSource(xmlFile);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader r = factory.createXMLStreamReader(xmlSource);
        try {
            int event = r.getEventType();
            while (true) {
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        System.out.println("<" + r.getName() + ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        System.out.println("</" + r.getName() + ">");
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (r.isWhiteSpace())
                            break;
                        System.out.println(r.getText());
                        break;
                }
                if (!r.hasNext())
                    break;

                event = r.next();
            }
        } finally {
            r.close();
        }
    }
}
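Because the application drives the loop, it can also stop as soon as it has what it needs, which is the main practical difference from SAX. The sketch below assumes a document that contains a title element (a made-up name, as are the class FirstTitleSketch and the method firstTitle) and returns after the first match without reading the rest of the input.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import java.io.StringReader;

class FirstTitleSketch {

    // Returns the text of the first <title> element, or null if there is none.
    static String firstTitle(String xml) throws XMLStreamException {
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (r.hasNext()) {
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(r.getLocalName())) {
                    // Reads the text content and moves to the matching end tag
                    return r.getElementText();
                }
            }
            return null;
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<article><title>StAX basics</title><body>ignored</body></article>";
        System.out.println(firstTitle(xml)); // prints "StAX basics"
    }
}
```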

3. Iterator API

The iterator API is a high-level API that represents the XML document stream as a set of discrete event objects. The core of the iterator API consists of the XMLEventReader and XMLEventWriter interfaces.

public interface XMLEventReader extends Iterator {
    XMLEvent nextEvent() throws XMLStreamException;
    boolean hasNext();
    XMLEvent peek() throws XMLStreamException;
    String getElementText() throws XMLStreamException;
    XMLEvent nextTag() throws XMLStreamException;
    Object getProperty(java.lang.String name) throws java.lang.IllegalArgumentException;
    void close() throws XMLStreamException;
}

 

public interface XMLEventWriter extends XMLEventConsumer {
    void flush() throws XMLStreamException;
    void close() throws XMLStreamException;
    void add(XMLEvent e) throws XMLStreamException;
    void add(XMLEventReader reader) throws XMLStreamException;
    String getPrefix(String uri) throws XMLStreamException;
    void setPrefix(String prefix, String uri) throws XMLStreamException;
    void setDefaultNamespace(String uri) throws XMLStreamException;
    void setNamespaceContext(NamespaceContext context) throws XMLStreamException;
    NamespaceContext getNamespaceContext();
}
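Writing with the iterator API means creating event objects and adding them to the writer. A minimal sketch follows, assuming events are produced with XMLEventFactory; the class name EventWriterSketch, the method writeArticle, and the element names are hypothetical.

```java
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import java.io.StringWriter;

class EventWriterSketch {

    // Builds <article><title>...</title></article> from discrete event objects.
    static String writeArticle(String title) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter w = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();
        try {
            w.add(events.createStartDocument());
            // Arguments are prefix, namespace URI, local name; no namespace is used here
            w.add(events.createStartElement("", "", "article"));
            w.add(events.createStartElement("", "", "title"));
            w.add(events.createCharacters(title));
            w.add(events.createEndElement("", "", "title"));
            w.add(events.createEndElement("", "", "article"));
            w.add(events.createEndDocument());
            w.flush();
        } finally {
            w.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeArticle("StAX basics"));
    }
}
```

Unlike the cursor writer, every end tag is an explicit event here, so the application is responsible for adding end elements in the right order.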

3.1. javax.xml.stream.XMLEventReader Example

import javax.xml.stream.*;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;

public class MyStax2 {
    public static void main(String[] args) throws XMLStreamException, IOException {
        File xmlFile = new File("article.xml");
        Source xmlSource = new StreamSource(xmlFile);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLEventReader r = factory.createXMLEventReader(xmlSource);
        try {
            while (r.hasNext()) {
                XMLEvent event = r.nextEvent();
                switch (event.getEventType()) {
                    case XMLStreamConstants.START_ELEMENT:
                        StartElement startElement = event.asStartElement();
                        System.out.println("<" + startElement.getName().getLocalPart() + ">");
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        EndElement endElement = event.asEndElement();
                        System.out.println("</" + endElement.getName().getLocalPart() + ">");
                        break;
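                    case XMLStreamConstants.CHARACTERS:
                        Characters characters = event.asCharacters();
                        // Skip whitespace-only text, as in the cursor example
                        if (!characters.isWhiteSpace()) {
                            System.out.println(characters.getData());
                        }
                        break;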
                }
            }
        } finally {
            r.close();
        }
    }
}

4. Cursor vs. Iterator API

In general, using the iterator API is recommended because it is more flexible, extensible, and object-oriented.

Benefits of the Iterator API:

  • Objects created from the XMLEvent subclasses are immutable and can be used in arrays, lists, and maps.
  • It is possible to create subtypes of XMLEvent that are either completely new information items or extensions of existing items but with additional methods.
  • Events from an XML event stream can be added and removed in much simpler ways than with the cursor API.
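
The first benefit can be illustrated with a short sketch: events are collected into a list and inspected after the reader has been closed. The class name BufferedEventsSketch, the method startElementNames, and the sample document are assumptions made for this example.

```java
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BufferedEventsSketch {

    // Buffers all events, then returns the local names of the start elements.
    static List<String> startElementNames(String xml) throws XMLStreamException {
        XMLEventReader r = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<XMLEvent> buffered = new ArrayList<>();
        try {
            while (r.hasNext()) {
                buffered.add(r.nextEvent());
            }
        } finally {
            r.close();
        }
        // The event objects stay usable after the reader is closed
        List<String> names = new ArrayList<>();
        for (XMLEvent e : buffered) {
            if (e.isStartElement()) {
                names.add(e.asStartElement().getName().getLocalPart());
            }
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<article><title>T</title><body>B</body></article>";
        System.out.println(startElementNames(xml)); // prints [article, title, body]
    }
}
```

With the cursor API the same task would require copying each value out of the reader before calling next(), because the cursor only exposes the current event.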

It is better to use the cursor API:

  • If it is used in a particularly memory-constrained environment, like Java ME.
  • If performance is the highest priority.

It is better to use the iterator API:

  • If it is necessary to create XML processing pipelines.
  • If it is necessary to modify the event stream.
  • If the application should be able to handle the pluggable processing of the event stream.
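
A small processing pipeline of this kind can be sketched as follows. It is only an illustration under stated assumptions: the stage that drops comments is an arbitrary choice, and the names FilterPipelineSketch and stripComments are hypothetical. The sketch relies on XMLInputFactory.createFilteredReader and on XMLEventWriter.add(XMLEventReader) to connect the stages.

```java
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import java.io.StringReader;
import java.io.StringWriter;

class FilterPipelineSketch {

    // Copies the document to a string, leaving out all comment events.
    static String stripComments(String xml) throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLEventReader source = inputFactory.createXMLEventReader(new StringReader(xml));

        // Pipeline stage: accept every event except comments
        EventFilter noComments = event -> event.getEventType() != XMLStreamConstants.COMMENT;
        XMLEventReader filtered = inputFactory.createFilteredReader(source, noComments);

        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        try {
            writer.add(filtered);   // drains the reader into the writer
            writer.flush();
        } finally {
            writer.close();
            filtered.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<article><!-- draft --><title>T</title></article>";
        System.out.println(stripComments(xml));
    }
}
```

Further stages could be added by wrapping the filtered reader in another filter, which is the kind of pluggable processing the list above refers to.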

5. Advantages of StAX

StAX offers a simpler programming model than SAX and more efficient memory management than DOM. Unlike SAX, which can only read XML, StAX can also write it.
