XML
Disclosure Examples derived from Sun Microsystems, Inc. Copyright (c) 2006
Resources
Overview
The eXtensible Markup Language (XML) is a language for defining languages; we will use it to define a hierarchical data structure. Many languages used for data interchange are defined in XML; examples include SOAP, RSS, and XHTML.
Many programming languages and applications, such as the IE Web browser, can parse XML to build a hierarchical tree, the Document Object Model (DOM), that corresponds to the XML. A program such as IE uses a parser to first construct the tree, then references nodes of the tree to access each element of the XML.
The problem with this approach for mobile devices is that the DOM tree grows in proportion to the size of the XML document; for large XML documents, the memory requirements may be too great.
SAX is an alternative, event-driven approach to parsing XML: the parser invokes call-backs for the start and end of each element as it reads the document, often requiring memory for only a single element rather than the entire document. The result is often faster parsing and smaller memory requirements.
XML (eXtensible Markup Language)
A language for defining and representing languages.
- XHTML is defined in XML. HTML looks similar but is not itself XML; for example, HTML permits some end tags to be omitted.
- One key use is to define a data structure and data within the structure, useful for exchanging data between different applications and computer systems.
- XML gives semantic meaning to data. For example, <Zip>47150</Zip> can mean the Zip code is 47150.
- Because XML is written as text that humans can read and write, it is also neutral with respect to computer architecture; for example, it does not depend on a specific bit-size representation of floating point numbers. XML is commonly used to communicate data over the Internet without regard to the characteristics of the sending or receiving machine.
- Many database systems such as Oracle can generate entries from a database in XML form for use by other applications that understand XML without regard to the computer system word size, etc.
- XML combined with XSL can separate implementation of the user interface from the data being presented. More on XSL later.
An example of XML use is to supply structured data to an application. Information for shipping a package from one Zip code to another can be specified in XML with a hierarchy tree as:
<package>
  <To Signature="Yes">47150</To>
  <From>47165</From>
  <Weight>17.0</Weight>
  <Rate>27.50</Rate>
</package>
Basic XML Rules
- Hierarchical element structure - XML documents must have a strictly hierarchical tag structure. That is, every start tag must have an exactly corresponding end tag. In XML vocabulary, a pair of start and end tags is called an element. Every element except the single root element must be completely nested within another.
- Well-formed - <To>47150</To> is well-formed because there is a <To> start tag and a </To> end tag.
- Case sensitive - XML is case sensitive. <to>47150</To> is not well-formed because the start and end tag names differ in case.
- Empty - An empty element can be written as <To></To> or equivalently as <To/>.
- Text - A non-empty element can enclose other elements or text. <To>47150</To> encloses text of 47150.
- Attribute - <To Signature="Yes">47150</To> has one attribute, Signature, with the value "Yes".
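The rules above can be checked with a parser. The following is a minimal sketch, assuming the standard javax.xml.parsers DOM API; the class name XmlRulesDemo and the helper parseOrNull are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original examples.
final class XmlRulesDemo {

    /** Returns the parsed document, or null when the text is not well-formed XML. */
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.setErrorHandler(new DefaultHandler()); // rethrow fatal errors without printing them
            return db.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException notWellFormed) {
            return null;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Text and attribute of a well-formed element.
        Element to = parseOrNull("<To Signature=\"Yes\">47150</To>").getDocumentElement();
        System.out.println("text = " + to.getTextContent());
        System.out.println("Signature = " + to.getAttribute("Signature"));

        // Tag names that differ only in case do not match.
        System.out.println("<to>47150</To> well-formed: " + (parseOrNull("<to>47150</To>") != null));

        // Both spellings of an empty element produce an element with no children.
        System.out.println("<To/> has children: "
                + parseOrNull("<To/>").getDocumentElement().hasChildNodes());
        System.out.println("<To></To> has children: "
                + parseOrNull("<To></To>").getDocumentElement().hasChildNodes());
    }
}
```

The sketch treats any SAXException from the parser as "not well-formed", which is sufficient for these small inputs.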
Parsing in Java
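The DOM and SAX parsers are part of the Java ME (Java Mobile) SDK. The packages used in the examples below (javax.xml.parsers, org.w3c.dom, and org.xml.sax) are also included in the standard Java SE JDK. Install an SDK for access to the DOM and SAX packages.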
XML for parsing using DOM and SAX.
<?xml version='1.0' encoding='utf-8'?>
<!-- A SAMPLE set of slides -->
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
  <!-- TITLE SLIDE -->
  <slide type="all">
    <title>Wake up to WW!</title>
    <item>WW Explained</item>
  </slide>
  <!-- OVERVIEW -->
  <slide type="all">
    <title>Overview</title>
    <item>Why <em>WW</em> are great</item>
    <item/>
    <item>Who <em>buys</em> WW</item>
  </slide>
</slideshow>
Parsing using DOM
The example is written for Java ME to be executed at the command prompt.
To compile and execute under a normal Java ME installation:
Copy the XML and parser code to files Echo.xml and EchoDOM.java respectively.
javac EchoDOM.java
java EchoDOM < Echo.xml
The example:
1) Constructs a DOM parser.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
2) Parses XML input from System.in.
Document doc = db.parse( new InputSource(System.in) );
doc.getDocumentElement().normalize();
3) Builds a list of all the <slide> element nodes.
NodeList nodeList = doc.getElementsByTagName( "slide" );
4) For a single <slide> node (index i in the list), displays the first <title> and <item> element values; the code for <title> is:
Node node = nodeList.item(i);
Element fstElmnt = (Element) node;
NodeList titleList = fstElmnt.getElementsByTagName("title");
Element titleElement = (Element) titleList.item(0);
titleList = titleElement.getChildNodes();
System.out.println("Title = "+ ((Node) titleList.item(0)).getNodeValue());
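The four steps can be combined into one small program. This is only a sketch under stated assumptions: the original EchoDOM.java is not reproduced here, the class name SlideTitles is hypothetical, and the XML is an abbreviated copy of the slideshow read from a string rather than System.in so the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch of the four steps, not the original EchoDOM.java.
final class SlideTitles {

    // Abbreviated copy of the sample slideshow.
    static final String SLIDES =
          "<slideshow title=\"Sample Slide Show\">"
        + "<slide type=\"all\"><title>Wake up to WW!</title><item>WW Explained</item></slide>"
        + "<slide type=\"all\"><title>Overview</title><item>Why <em>WW</em> are great</item></slide>"
        + "</slideshow>";

    /** Returns the text of the first title element of every slide element. */
    static List<String> slideTitles(String xml) {
        try {
            // 1) construct a DOM parser
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // 2) parse the XML input
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize();
            // 3) list all slide elements
            NodeList slides = doc.getElementsByTagName("slide");
            // 4) read the first title of each slide
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < slides.getLength(); i++) {
                Element slide = (Element) slides.item(i);
                NodeList titleList = slide.getElementsByTagName("title");
                if (titleList.getLength() > 0) {
                    titles.add(titleList.item(0).getTextContent());
                }
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String title : slideTitles(SLIDES)) {
            System.out.println("Title = " + title);
        }
    }
}
```

The sketch uses getTextContent() instead of getChildNodes().item(0).getNodeValue(); the former returns an empty string for an empty element, where the latter would fail on a missing child node.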
Example - iTunes Top 10
iTunes publishes the ranked listing of downloaded tunes as RSS, an XML data structure commonly used for news feeds, weather, etc.
http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wpa/MRSS/topsongs/limit=10/rss.xml
The class parses the RSS for a specific sequence of element names, title|item|title, to locate the title of each tune. The relevant parts of the RSS feed are listed below.
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:itms="http://phobos.apple.com/rss/1.0/modules/itms/">
  <channel>
    <title>iTunes Top 10 Songs</title>
    <link>http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewTop?id=1&amp;popId=1</link>
    <description>iTunes Store: Today's Top 10 Songs</description>
    <image>
      <url>/images/rss/badge.gif</url>
      <link>http://www.apple.com/itunes/</link>
      <title>iTunes Music Store</title>
      <height>31</height><width>88</width>
    </image>
    <item>
      <title>1. My Life Would Suck Without You - Kelly Clarkson</title>
      :
Copy the parser code to file iTunes.java
javac iTunes.java
java iTunes
iTunes.java
The main function parses elements from the RSS feed in the order:
title
|
item
/ / | \ \
title title title title title
When these elements occur in sequence, the characters of the title element are displayed.
<title>1. My Life Would Suck Without You - Kelly Clarkson</title>
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class iTunes {
    public static void main(String[] args) throws Exception {
        try {
            URL url = new URL(
                "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wpa/MRSS/topsongs/limit=10/rss.xml");
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document doc = db.parse(new InputSource(url.openStream()));
            doc.getDocumentElement().normalize();

            NodeList nodeList = doc.getElementsByTagName("title");     // Locate first title
            Element titleElement = (Element) nodeList.item(0);
            nodeList = titleElement.getChildNodes();
            System.out.println("Title = " + nodeList.item(0).getNodeValue());

            nodeList = doc.getElementsByTagName("item");               // then item
            for (int i = 0; i < nodeList.getLength(); i++) {
                Node node = nodeList.item(i);
                Element fstElmnt = (Element) node;
                NodeList nameList = fstElmnt.getElementsByTagName("title");  // then title
                Element nameElement = (Element) nameList.item(0);
                nameList = nameElement.getChildNodes();
                System.out.println(nameList.item(0).getNodeValue());
            }
        } catch (Exception e) {
            System.out.println("XML Parsing Exception = " + e);
        }
    }
}
Output
Title = iTunes Top 10 Songs
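Because the feed URL may not always be reachable, the same title|item|title traversal can be tried against a local copy of the feed. The following is a minimal sketch, assuming an abbreviated, hand-written sample feed; the class name ITunesDomOffline, its methods, and SAMPLE_FEED are hypothetical. It also illustrates why the search for title is scoped to each item: the channel and image elements carry title elements of their own.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical offline variant of the iTunes example.
final class ITunesDomOffline {

    // Abbreviated, hand-written sample in the shape of the RSS feed.
    static final String SAMPLE_FEED =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<rss version=\"2.0\"><channel>"
        + "<title>iTunes Top 10 Songs</title>"
        + "<image><title>iTunes Music Store</title></image>"
        + "<item><title>2. Someone Like You - ADELE</title></item>"
        + "<item><title>3. Pumped Up Kicks - Foster the People</title></item>"
        + "</channel></rss>";

    /** Builds the DOM tree; url.openStream() could be passed here instead of a local stream. */
    static Document parse(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            doc.getDocumentElement().normalize();
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** First title in document order, which is the channel title. */
    static String firstTitle(Document doc) {
        return doc.getElementsByTagName("title").item(0).getTextContent();
    }

    /** First title inside each item element; channel and image titles are skipped. */
    static List<String> itemTitles(Document doc) {
        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            NodeList names = ((Element) items.item(i)).getElementsByTagName("title");
            if (names.getLength() > 0) {
                titles.add(names.item(0).getTextContent());
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        Document doc = parse(new ByteArrayInputStream(SAMPLE_FEED.getBytes(StandardCharsets.UTF_8)));
        System.out.println("Title = " + firstTitle(doc));
        for (String title : itemTitles(doc)) {
            System.out.println(title);
        }
    }
}
```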
iTunes DOM parsing in Android
The primary difference from the previous example is that Android requires Internet access to occur on a thread separate from the UI thread, with the UI updated via a Handler.
Parsing is performed on the same thread as the Internet access.
Each title is stored in the instance variable array titleString.
When parsing and examining the DOM completes, each element of titleString is set as the text of a TextView object that is added to the activity's view.
AndroidManifest.xml requires:
<uses-permission android:name="android.permission.INTERNET"></uses-permission>
package edu.ius.rwisman.iTunesDOM;

import android.app.Activity;
import android.os.Bundle;
import android.widget.LinearLayout;
import android.widget.TextView;
import android.os.Handler;
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ITunesDOMActivity extends Activity {
    LinearLayout layout;
    TextView titleView[] = new TextView[10];
    String titleString[] = new String[10];
    ITunesDOMActivity activity = this;                  // this is an Activity
    final Handler handler = new Handler();

    final Runnable updateUI = new Runnable() {           // executed on the UI thread
        public void run() {
            for (int i = 0; i < 10; i++) {
                titleView[i] = new TextView( activity ); // new view object
                titleView[i].setText( titleString[i] );
                layout.addView( titleView[i] );          // add to layout
            }
        }
    };

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        layout = new LinearLayout( this );               // Create a new layout to display the view
        layout.setOrientation( LinearLayout.VERTICAL );
        setContentView( layout );                        // Set the layout view to display
        iTunesDisplay();
    }

    public void iTunesDisplay() {
        new Thread(new Runnable() {                      // Internet access and parsing off the UI thread
            public void run() {
                try {
                    URL url = new URL(
                        "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wpa/MRSS/topsongs/limit=10/rss.xml");
                    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                    DocumentBuilder db = dbf.newDocumentBuilder();
                    Document doc = db.parse(new InputSource(url.openStream()));
                    doc.getDocumentElement().normalize();

                    NodeList nodeList = doc.getElementsByTagName("title");
                    Element titleElement = (Element) nodeList.item(0);
                    nodeList = titleElement.getChildNodes();
                    System.out.println("Title = " + nodeList.item(0).getNodeValue());

                    nodeList = doc.getElementsByTagName("item");
                    // Stop at the array length in case the feed holds more than 10 items
                    for (int i = 0; i < nodeList.getLength() && i < titleString.length; i++) {
                        Node node = nodeList.item(i);
                        Element fstElmnt = (Element) node;
                        NodeList nameList = fstElmnt.getElementsByTagName("title");
                        Element nameElement = (Element) nameList.item(0);
                        nameList = nameElement.getChildNodes();
                        titleString[i] = nameList.item(0).getNodeValue();
                    }
                } catch (Exception e) {
                    System.out.println("XML Parsing Exception = " + e);
                }
                handler.post( updateUI );                // Completed, call-back to UI thread
            }
        }).start();
    }
}
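The thread hand-off used by the activity can be tried without an Android device. The sketch below is a plain-Java analogue, not Android code: a single-thread ExecutorService stands in for the UI thread and its Handler, and ui.execute(...) plays the role of handler.post(updateUI). The class name BackgroundParseSketch, its methods, and the sample XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical desktop analogue of the Android thread/Handler pattern.
final class BackgroundParseSketch {

    /** Slow work: parse the feed and collect the title of each item. */
    static List<String> parseItemTitles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                NodeList t = ((Element) items.item(i)).getElementsByTagName("title");
                if (t.getLength() > 0) titles.add(t.item(0).getTextContent());
            }
        } catch (Exception e) {
            System.out.println("XML Parsing Exception = " + e);
        }
        return titles;
    }

    /** Parses on a worker thread, then "displays" the titles on the stand-in UI thread. */
    static List<String> loadThenShow(String xml) {
        ExecutorService ui = Executors.newSingleThreadExecutor(); // stand-in for UI thread + Handler
        List<String> shown = new CopyOnWriteArrayList<>();        // stand-in for the TextViews
        CountDownLatch done = new CountDownLatch(1);

        new Thread(() -> {
            List<String> titles = parseItemTitles(xml);           // off the "UI" thread
            ui.execute(() -> {                                    // like handler.post(updateUI)
                shown.addAll(titles);
                done.countDown();
            });
        }).start();

        try {
            done.await();                                         // wait only so the demo can return a result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        ui.shutdown();
        return shown;
    }

    public static void main(String[] args) {
        String xml = "<rss><channel><title>Top</title>"
                + "<item><title>10. Paradise - Coldplay</title></item></channel></rss>";
        System.out.println(loadThenShow(xml));
    }
}
```

In a real activity the UI thread would not block waiting for the worker; the latch exists here only so the sketch can return what was displayed.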
Parsing using SAX
Below are two examples that parse and echo a well-formed XML document using SAX.
The first, for simplicity, is written for command line execution; the second for Android.
SAX Example using Java ME
The next example is written for Java ME to be executed at the command prompt. To compile and execute under a normal Java ME installation:
Copy the parser code to file iTunesSAX.java
javac iTunesSAX.java
java iTunesSAX
The parser locates an item element, then displays the title elements that follow it in the iTunes hierarchy of the RSS feed.
item
/ / | \ \
title title title title title
Parser
Override DefaultHandler methods:
- public void startDocument() throws SAXException - Called when the document starts.
- public void endDocument() throws SAXException - Called when the document ends.
- public void startElement(String namespaceURI, String sName, String qName, Attributes attrs) throws SAXException - Called when a start tag is parsed; attributes arrive in attrs.
- public void endElement(String namespaceURI, String sName, String qName ) throws SAXException - Called when an end tag is parsed.
- public void characters(char[] buf, int offset, int len) throws SAXException - Called with the character data of an element; the text of one element may be delivered in several calls.
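To see the order in which these call-backs arrive, the following minimal sketch records each event for a tiny document. It assumes the standard javax.xml.parsers SAX API; the class name SaxTrace and the event labels are hypothetical and chosen only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records the sequence of SAX call-backs.
final class SaxTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    /** Parses the XML text and returns the recorded events in order. */
    static List<String> trace(String xml) {
        try {
            SaxTrace handler = new SaxTrace();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    @Override public void startDocument() { events.add("startDocument"); }

    @Override public void endDocument() { events.add("endDocument"); }

    @Override public void startElement(String uri, String sName, String qName, Attributes attrs) {
        text.setLength(0);                    // start collecting text for this element
        events.add("start " + qName);
    }

    @Override public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len);        // may be called more than once per element
    }

    @Override public void endElement(String uri, String sName, String qName) {
        String collected = text.toString().trim();
        if (!collected.isEmpty()) events.add("text " + collected);
        text.setLength(0);
        events.add("end " + qName);
    }

    public static void main(String[] args) {
        for (String event : trace("<item><title>Paradise</title></item>")) {
            System.out.println(event);
        }
    }
}
```

For the input in main, the recorded sequence is startDocument, start item, start title, text Paradise, end title, end item, endDocument. No tree is built; only the text of the current element is held in memory.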
import java.net.URL;
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;

public class iTunesSAX extends DefaultHandler {
    boolean itemFound = false;       // set once the first item element is seen
    String element = "";             // characters collected for the current element

    public static void main(String args[]) throws Exception {
        URL url = new URL(
            "http://ax.phobos.apple.com.edgesuite.net/WebObjects/MZStore.woa/wpa/MRSS/topsongs/limit=10/rss.xml");
        new iTunesSAX(url.openStream());
    }

    public iTunesSAX( InputStream in ) {
        SAXParserFactory factory = SAXParserFactory.newInstance();  // Default (non-validating) parser
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( in, this);                             // Parse the input
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }

    /* SAX DocumentHandler methods */
    public void startDocument() throws SAXException {}

    public void endDocument() throws SAXException {}

    public void startElement(String namespaceURI,
                             String sName,        // simple name
                             String qName,        // qualified name
                             Attributes attrs) throws SAXException {
        // The simple name may be empty when the parser is not namespace aware,
        // so fall back to the qualified name.
        String name = sName.isEmpty() ? qName : sName;
        if( name.equals("item") ) itemFound = true;
        element = "";
    }

    public void endElement(String namespaceURI,
                           String sName,          // simple name
                           String qName           // qualified name
                           ) throws SAXException {
        String name = sName.isEmpty() ? qName : sName;
        if( itemFound && name.equals("title") ) System.out.println(element);
    }

    public void characters(char[] buf, int offset, int length) throws SAXException {
        if( length <= 0) return;
        element = element + new String(buf, offset, length);
    }
}
Output
1. Moves Like Jagger (Studio Recording from "The Voice" Performance) [feat. Christina Aguilera] - Maroon 5
2. Someone Like You - ADELE
3. Pumped Up Kicks - Foster the People
4. Stereo Hearts (feat. Adam Levine) - Gym Class Heroes
5. You Make Me Feel... (feat. Sabi) - Cobra Starship
6. Party Rock Anthem (feat. Lauren Bennett & GoonRock) - LMFAO
7. Without You (feat. Usher) - David Guetta & Usher
8. Yoü and I - Lady GaGa
9. Cheers (Drink to That) - Rihanna
10. Paradise - Coldplay
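The item-then-title logic can also be exercised without a network connection. The sketch below is an assumption-laden variant of the parser, run against an abbreviated, hand-written sample feed: the class name ITunesSaxOffline, the helper name(), and SAMPLE_FEED are hypothetical. Unlike the example above, it clears the flag at the end of each item, and it includes a title containing &amp; to show why characters() output is accumulated rather than printed directly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical offline variant of the SAX example.
final class ITunesSaxOffline extends DefaultHandler {

    // Abbreviated, hand-written sample in the shape of the RSS feed.
    static final String SAMPLE_FEED =
          "<rss version=\"2.0\"><channel>"
        + "<title>iTunes Top 10 Songs</title>"
        + "<image><title>iTunes Music Store</title></image>"
        + "<item><title>6. Party Rock Anthem (feat. Lauren Bennett &amp; GoonRock) - LMFAO</title></item>"
        + "<item><title>10. Paradise - Coldplay</title></item>"
        + "</channel></rss>";

    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inItem = false;

    /** Returns the title of each item; channel and image titles are ignored. */
    static List<String> itemTitles(String xml) {
        try {
            ITunesSaxOffline handler = new ITunesSaxOffline();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Uses the simple name when the parser supplies one, otherwise the qualified name. */
    private static String name(String sName, String qName) {
        return (sName == null || sName.isEmpty()) ? qName : sName;
    }

    @Override public void startElement(String uri, String sName, String qName, Attributes attrs) {
        if (name(sName, qName).equals("item")) inItem = true;
        text.setLength(0);
    }

    @Override public void characters(char[] buf, int offset, int length) {
        text.append(buf, offset, length);     // text around an entity may arrive in pieces
    }

    @Override public void endElement(String uri, String sName, String qName) {
        String name = name(sName, qName);
        if (inItem && name.equals("title")) titles.add(text.toString());
        if (name.equals("item")) inItem = false;
    }

    public static void main(String[] args) {
        for (String title : itemTitles(SAMPLE_FEED)) {
            System.out.println(title);
        }
    }
}
```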
SAX Example using Android
A necessary complication is the requirement that Internet operations be performed in a separate thread from the main activity.
In an independent thread, parsing builds an array of the 10 title strings. When parsing is completed, a message is posted to a handler.
The Runnable posted to the handler runs on the UI thread and reads the title array of the parser to update the UI.
Keep in mind that the UI thread creates the ITunesSAX object, but we want the Internet operations and parsing to run on a separate thread.
One minor bit of trickiness is required to have the parser execute its call-backs within the thread used for downloading from the Internet:
- class ITunesSAX extends DefaultHandler - defines a parser class.
- final ITunesSAX iTunesSAX = this; - references the parser object, this.
- saxParser.parse( in, iTunesSAX); - specifies the parser object to use for parsing, executed on the Internet thread.
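class ITunesSAX extends DefaultHandler {
    public ITunesSAX(final URL url, final Handler handler, final Runnable updateUI) {
        final ITunesSAX iTunesSAX = this;               // this is an ITunesSAX object
        new Thread(new Runnable() {
            public void run() {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                try {
                    InputStream in = url.openStream();
                    SAXParser saxParser = factory.newSAXParser();
                    saxParser.parse( in, iTunesSAX);    // Parse the input
                } catch (Throwable t) {
                    t.printStackTrace();
                }
                handler.post( updateUI );               // Completed, call-back to UI thread
            }
        }).start();
    }
    // The startElement, endElement, and characters call-backs are omitted from this excerpt.
}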
Now the
AndroidManifest.xml requires:
<uses-permission android:name="android.permission.INTERNET"></uses-permission>
package edu.ius.rwisman.iTunesSAX;