XML Utils module

Copyright © 1995-2005 Opera Software AS. All rights reserved. This file is part of the Opera web browser. It may not be distributed under any circumstances.

Introduction

The XML Utils module provides various utility classes and interfaces for parsing and serializing XML and handling XML names and namespaces.

Current information about the XML Utils module.

API documentation

For detailed information on the module's public API, please refer to the API documentation. The API documentation must first be generated with Doxygen.

Short HOWTOs

The following are some very brief suggestions on how to complete some tasks that the XML Utils module might (or might not) help you with. Use it as a list of shortcuts into the API documentation, if you will.

Parse XML (note)
Write a class that inherits XMLTokenHandler, create an XMLParser object by passing an object of your class to the XMLParser::Make function, and then start parsing using either XMLParser::Load (for loading a URL) or XMLParser::Parse (for parsing text directly).
Write a class that inherits XMLLanguageParser, create an XMLTokenHandler object using the XMLLanguageParser::MakeTokenHandler function, and then use that XMLTokenHandler object the same way you used your own in the previous method.
Create an XMLFragment object, and call one of the XMLFragment::Parse functions. There is one for parsing a string, one for parsing undecoded data in a ByteBuffer, and one for loading and parsing the contents of a file (OpFileDescriptor).
Parse XML into a tree of HTML_Element objects
XML Utils does not do this for you, but two interfaces in the logdoc module may help: OpElementCallback and OpTreeCallback. Using them typically means subclassing one of them, creating an XMLTokenHandler object with OpElementCallback::MakeTokenHandler or OpTreeCallback::MakeTokenHandler, and then parsing XML using that token handler and an XMLParser object that you create and use yourself. See the logdoc module's documentation for details. The information here may become inaccurate if the logdoc module's APIs change.
Generate XML
If you have a tree/subtree of HTML_Element objects, create an XMLSerializer object using the XMLSerializer::MakeToStringSerializer function, and serialize the tree into XML source code using the XMLSerializer::Serialize function.
If you don't have a tree/subtree of HTML_Element objects, but something else, create an XMLFragment object, build a tree using the functions XMLFragment::OpenElement, XMLFragment::SetAttribute and XMLFragment::AddText (and possibly others), and then generate the XML source code using the XMLFragment::GetXML function (or XMLFragment::GetEncodedXML if you want the result encoded using some specific encoding).
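As a rough illustration of the second approach, the sketch below mimics the call sequence with a small stand-in class. SketchFragment is hypothetical and is not XMLFragment: its method names follow the functions named above, but the signatures, the CloseElement function and the use of std::string are assumptions made for this example, and it writes the XML text directly instead of building a tree first. Refer to the API documentation for the real interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in used only for illustration; NOT the real XMLFragment.
class SketchFragment {
public:
    SketchFragment() : start_tag_open_(false) {}

    // Starts a new element as a child of the element that is currently open.
    void OpenElement(const std::string& name) {
        CloseStartTag();
        out_ += "<" + name;
        open_.push_back(name);
        start_tag_open_ = true;
    }

    // Adds an attribute to the element opened most recently.
    void SetAttribute(const std::string& name, const std::string& value) {
        assert(start_tag_open_);  // attributes must come before any content
        out_ += " " + name + "=\"" + Escape(value, true) + "\"";
    }

    // Adds character data to the element that is currently open.
    void AddText(const std::string& text) {
        CloseStartTag();
        out_ += Escape(text, false);
    }

    // Ends the element that is currently open (assumed name, see lead-in).
    void CloseElement() {
        assert(!open_.empty());
        if (start_tag_open_) {
            out_ += "/>";  // no content was added: emit an empty-element tag
            start_tag_open_ = false;
        } else {
            out_ += "</" + open_.back() + ">";
        }
        open_.pop_back();
    }

    // Returns the text generated so far; complete once every element is closed.
    const std::string& GetXML() const { return out_; }

private:
    void CloseStartTag() {
        if (start_tag_open_) {
            out_ += ">";
            start_tag_open_ = false;
        }
    }

    // Replaces characters that cannot appear literally in content or attributes.
    static std::string Escape(const std::string& s, bool in_attribute) {
        std::string r;
        for (std::string::size_type i = 0; i < s.size(); ++i) {
            switch (s[i]) {
            case '&': r += "&amp;"; break;
            case '<': r += "&lt;"; break;
            case '>': r += "&gt;"; break;
            case '"': if (in_attribute) r += "&quot;"; else r += '"'; break;
            default: r += s[i];
            }
        }
        return r;
    }

    std::string out_;
    std::vector<std::string> open_;  // names of the elements still open
    bool start_tag_open_;            // true while attributes may still be set
};
```

Opening an element, setting its attributes, adding text and closing it, in that order, is the whole pattern. The main point of the sketch is that the caller passes unescaped values and the generator takes care of escaping.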
Handle XML names
If you have a string that is supposed to be a QName (qualified name, "prefix:localpart") and should be handled as an expanded name ({namespace URI, localpart} pair), use the class XMLCompleteNameN (which has a constructor that takes a string of the form "prefix:localpart"), and then the function XMLNamespaceDeclaration::ResolveName to add a namespace URI to it. For that to work, you need an XMLNamespaceDeclaration object that represents the namespace declarations in scope. What that means depends on the situation. If you are parsing XML using the XMLLanguageParser interface, the function XMLLanguageParser::GetCurrentNamespaceDeclaration is probably what you want to use. If you're dealing with an attribute from an HTML_Element, you should probably use the function XMLNamespaceDeclaration::PushAllInScope with that HTML_Element as the argument. Keep in mind that if the element is currently being parsed, there might be unprocessed namespace declarations on that element that change the meaning of your QName; you might need to delay your processing until all attributes on the element have been added!
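To make the idea of resolving a QName against the declarations in scope more concrete, here is a minimal standalone sketch. All types and functions in it (NamespaceDecl, ExpandedName, FindDeclaration, ResolveQName) are hypothetical stand-ins written for this example; they are not XMLCompleteNameN or XMLNamespaceDeclaration, and the real classes may well behave differently in the details. The use_default parameter is also an assumption: it models the rule from the Namespaces in XML recommendation that the default namespace applies to unprefixed element names but not to unprefixed attribute names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for one namespace declaration in a chain of
// declarations in scope; NOT the real XMLNamespaceDeclaration.
struct NamespaceDecl {
    std::string prefix;             // empty string means the default namespace
    std::string uri;
    const NamespaceDecl* previous;  // declaration that was in scope before this one
};

// Hypothetical expanded name: a {namespace URI, localpart} pair.
struct ExpandedName {
    std::string uri;        // empty string means "no namespace"
    std::string localpart;
};

// Returns the innermost declaration for prefix, or nullptr if none is in scope.
inline const NamespaceDecl* FindDeclaration(const NamespaceDecl* scope,
                                            const std::string& prefix) {
    for (const NamespaceDecl* d = scope; d; d = d->previous)
        if (d->prefix == prefix)
            return d;
    return nullptr;
}

// Resolves "prefix:localpart" or "localpart" into an expanded name.
// Returns false for a malformed QName or an undeclared prefix.
inline bool ResolveQName(const std::string& qname, const NamespaceDecl* scope,
                         bool use_default, ExpandedName& result) {
    const std::string::size_type colon = qname.find(':');
    if (colon == std::string::npos) {
        if (qname.empty())
            return false;
        // Unprefixed name: only picks up the default namespace if asked to.
        const NamespaceDecl* d = use_default ? FindDeclaration(scope, "") : nullptr;
        result.uri = d ? d->uri : std::string();
        result.localpart = qname;
        return true;
    }
    const std::string prefix = qname.substr(0, colon);
    const std::string local = qname.substr(colon + 1);
    if (prefix.empty() || local.empty() || local.find(':') != std::string::npos)
        return false;
    const NamespaceDecl* d = FindDeclaration(scope, prefix);
    if (!d)
        return false;  // the prefix is not declared in this scope
    result.uri = d->uri;
    result.localpart = local;
    return true;
}
```

Because the lookup walks from the innermost declaration outwards, a declaration added later on the same element can change the result. That is why the advice above about waiting until all attributes have been added matters.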

Note on "Parse XML"

The three methods described for parsing XML are listed in order of increasing convenience and decreasing efficiency and power. For instance, only the first method allows you to process comments or to avoid allocating one big chunk of text for every character data node (it lets you receive character data split into appropriately sized chunks directly from the parser). On the other hand, the XMLFragment method is by far the most convenient for simple parsing and generation tasks. Its main disadvantage for parsing XML is that it parses the entire XML fragment into a data structure at once, whereas the other methods allow more incremental parsing.
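The benefit of receiving character data in chunks can be sketched without the real parser. In the example below, ChunkHandler, WordCounter and DeliverInChunks are hypothetical names invented for this illustration; they are not part of XML Utils, and DeliverInChunks merely simulates a parser handing over one character data node piece by piece. The handler keeps only a counter and a flag between calls, so it never needs the whole text in one buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical callback interface; NOT the real XMLTokenHandler.
class ChunkHandler {
public:
    virtual ~ChunkHandler() {}
    virtual void HandleCharacterData(const char* data, std::size_t length) = 0;
};

// Counts whitespace-separated words without storing the text itself.
class WordCounter : public ChunkHandler {
public:
    WordCounter() : words_(0), in_word_(false) {}

    virtual void HandleCharacterData(const char* data, std::size_t length) {
        for (std::size_t i = 0; i < length; ++i) {
            const char c = data[i];
            const bool space = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (!space && !in_word_)
                ++words_;       // a new word starts here
            in_word_ = !space;  // remembered across chunk boundaries
        }
    }

    std::size_t words() const { return words_; }

private:
    std::size_t words_;
    bool in_word_;
};

// Simulates a parser delivering one character data node in pieces of at
// most chunk_size characters.
inline void DeliverInChunks(const std::string& text, std::size_t chunk_size,
                            ChunkHandler& handler) {
    assert(chunk_size > 0);
    for (std::size_t pos = 0; pos < text.size(); pos += chunk_size) {
        std::size_t n = text.size() - pos;
        if (n > chunk_size)
            n = chunk_size;
        handler.HandleCharacterData(text.data() + pos, n);
    }
}
```

The result does not depend on where the chunk boundaries fall, since the in_word_ flag carries the state from one call to the next. A handler that needs the full text can still append each chunk to a buffer of its own, which is roughly the work the more convenient methods do on your behalf.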