SAX vs DOM parser

In the previous article in this series, “Create and read XML files using DOM Parser”, we explained the DOM parser. In today’s article we will discuss the SAX parsing model and look at the differences between the SAX and DOM parsers.

Simple API for XML (SAX) parsing differs from DOM in that it parses the XML file step by step and follows an event-based model. The SAX parser triggers an event whenever it encounters a structural part of the document, such as the start or end of an element (attributes are delivered together with the opening tag) or character data.
Unlike the DOM parser, the SAX parser is advisable for parsing large XML documents because it does not load the complete XML file into memory. It parses node by node, so it can read large XML files in smaller parts.

Let’s take a look at the difference between SAX and DOM parsers.

DOM | SAX
Tree model parser (tree of nodes) | Event-based parser (sequence of events)
DOM loads the file into memory and then parses it | SAX parses the file as it reads it, i.e. node by node
Has memory constraints since it loads the whole XML file before parsing | Low memory footprint as it does not store the XML content in memory
DOM is read and write (can insert or delete a node) | SAX is read only, i.e. it can’t insert or delete a node
Prefer the DOM parser if the XML content is small | Use the SAX parser when the XML content is large
Backward and forward search is possible for finding tags and evaluating the information inside them, which makes navigation easy | SAX reads the XML file from top to bottom and backward navigation is not possible
Slower at runtime | Faster at runtime
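
To make the navigation and memory rows of the comparison concrete, here is a minimal sketch that processes the same small document both ways. The class name DomVsSaxSketch, its method names and the sample XML are hypothetical and exist only for this illustration; the parser calls are the standard JAXP ones.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class DomVsSaxSketch {
    // Hypothetical sample document, used only for this illustration.
    static final String XML =
        "<library><book><title>A</title></book><book><title>B</title></book></library>";

    // DOM: the whole tree is built in memory first, so we can walk
    // backwards from a node to its parent.
    static String parentOfFirstTitleWithDom(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Node firstTitle = doc.getElementsByTagName("title").item(0);
        return firstTitle.getParentNode().getNodeName();
    }

    // SAX: we only react to events as they arrive; nothing is retained
    // unless the handler stores it (here, a single counter).
    static int countElementsWithSax(String xml, String name) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                        String qName, Attributes attributes) {
                    if (qName.equals(name)) {
                        count[0]++;
                    }
                }
            });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parentOfFirstTitleWithDom(XML)); // prints "book"
        System.out.println(countElementsWithSax(XML, "book")); // prints "2"
    }
}
```

The DOM method relies on the tree being fully available, while the SAX method keeps only the state it chooses to keep, which is why it scales better to large files.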

How does SAX parsing work?
The flowchart below gives a good overview of SAX API usage and of how SAX parsing works.

SAX Parsing

Create a SAXParserFactory instance, which enables applications to obtain a SAXParser.

SAXParserFactory factory = SAXParserFactory.newInstance();

An InputStream object supplies the file data from the target path.

InputStream inputData = new FileInputStream(targetPath);

The XML parsing is done through the parse() method of the SAXParser instance. The SAXParser API wraps an org.xml.sax.XMLReader implementation class. It is the XMLReader that carries on the conversation with the SAX event handlers.
A SaxHandler class is created as a subclass of org.xml.sax.helpers.DefaultHandler, which is the base class for SAX2 event handlers.

SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new SaxHandler();
saxParser.parse(inputData, handler);
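
Putting the three snippets above together, a minimal end-to-end sketch might look like the following. The class name SaxParseSketch, the method elementNames and the temporary sample file are assumptions made for this illustration, and an anonymous DefaultHandler stands in for SaxHandler so the block is self-contained.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxParseSketch {
    // Parses the file at targetPath and returns the element names
    // in the order in which their opening tags are encountered.
    static List<String> elementNames(Path targetPath) throws Exception {
        final List<String> names = new ArrayList<>();

        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();

        // Stand-in for SaxHandler: records each opening tag.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        // try-with-resources closes the stream even if parsing fails.
        try (InputStream inputData = Files.newInputStream(targetPath)) {
            saxParser.parse(inputData, handler);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample file written to a temporary location.
        Path sample = Files.createTempFile("sample", ".xml");
        Files.write(sample, "<a><b/><c/></a>".getBytes(StandardCharsets.UTF_8));
        System.out.println(elementNames(sample)); // prints "[a, b, c]"
        Files.delete(sample);
    }
}
```

Wrapping the stream in try-with-resources is a design choice of this sketch rather than something the SAX API requires; it simply makes sure the file handle is released.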

DefaultHandler provides default implementations for all the callbacks in the four core SAX2 handler interfaces: ContentHandler, ErrorHandler, DTDHandler and EntityResolver. ContentHandler is the most important interface in the SAX parser. It declares a number of methods that the SAX parser invokes in response to various parsing events. The major event-handling methods are startDocument, endDocument, startElement, endElement and characters. This means that the SAX parser iterates over the XML document and calls the corresponding method when it meets a structural part of the XML: the start of the document, the start of an element, character data, the end of an element and the end of the document. These event-driven methods can be overridden in the child class SaxHandler.

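import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SaxHandler extends DefaultHandler {
	// Called once, before any other event.
	@Override
	public void startDocument() {
		System.out.println("Start document: ");
	}

	// Called once, after the last event.
	@Override
	public void endDocument() {
		System.out.println("End document: ");
	}

	// Called for every opening tag; attributes arrive with it.
	@Override
	public void startElement(String uri, String localName, String qName,
			Attributes attributes) throws SAXException {
		System.out.println("start element    : " + qName);
	}

	// Called for every closing tag.
	@Override
	public void endElement(String uri, String localName, String qName)
			throws SAXException {
		System.out.println("end element      : " + qName);
	}

	// Called for text content; the parser may deliver one text node in several chunks.
	@Override
	public void characters(char[] c, int start, int length)
			throws SAXException {
		System.out.println("characters       : "
				+ new String(c, start, length));
	}
}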

It is the responsibility of the DefaultHandler subclass (here SaxHandler) to extract the values from the XML using the event-handling methods mentioned above.
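
As one possible way to do that extraction, the sketch below collects the text of every title element. The class name TitleExtractor, the element name title and the sample XML are hypothetical. The text is accumulated in a StringBuilder because a parser is allowed to deliver the content of a single element through several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TitleExtractor extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean insideTitle;

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
        if ("title".equals(qName)) {
            insideTitle = true;
            text.setLength(0); // start collecting fresh text
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (insideTitle) {
            text.append(ch, start, length); // may be called more than once
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName)) {
            titles.add(text.toString().trim());
            insideTitle = false;
        }
    }

    // Convenience method for this sketch: parse an XML string
    // and return the extracted titles.
    static List<String> extractTitles(String xml) throws Exception {
        TitleExtractor handler = new TitleExtractor();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<books><book><title>SAX Basics</title></book>"
            + "<book><title>DOM Basics</title></book></books>";
        System.out.println(extractTitles(xml)); // prints "[SAX Basics, DOM Basics]"
    }
}
```

Because SAX keeps no tree, the handler itself has to remember where it is in the document; the insideTitle flag is the simplest form of that state.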

Please keep visiting this space for more interesting articles.

© 2014 IdioTechie. All rights reserved.