XML Utils module
Copyright © 1995-2005 Opera Software AS. All rights reserved.
This file is part of the Opera web browser. It may not be distributed
under any circumstances.
Introduction
The XML Utils module provides various utility classes and
interfaces for parsing and serializing XML and handling XML
names and namespaces.
Current information about the XML Utils module.
API documentation
For detailed information on the module's public API, please refer to
the API documentation.
The documentation needs to be generated by Doxygen.
Short HOWTOs
The following are some very brief suggestions on how to complete
some tasks that the XML Utils module might (or might not) help you
with. Use it as a list of shortcuts into the API documentation, if
you will.
- Parse XML (note)
  - Write a class that inherits XMLTokenHandler, create an XMLParser
    object, passing an object of your class to the XMLParser::Make
    function, and then start parsing using either XMLParser::Load
    (for loading a URL) or XMLParser::Parse (for parsing text
    directly).
  - Write a class that inherits XMLLanguageParser, create an
    XMLTokenHandler object using the
    XMLLanguageParser::MakeTokenHandler function, and then use that
    XMLTokenHandler object the same way you used your own in the
    previous method.
  - Create an XMLFragment object, and call one of the
    XMLFragment::Parse functions. There is one for parsing a string,
    one for parsing undecoded data in a ByteBuffer, and one for loading
    and parsing the contents of a file (OpFileDescriptor).
- Parse XML into a tree of HTML_Element objects
  - XML Utils does not do this for you, but two interfaces in the
    logdoc module may help: OpElementCallback and OpTreeCallback.
    Using them typically means subclassing one of them, calling
    OpElementCallback::MakeTokenHandler or
    OpTreeCallback::MakeTokenHandler to create an XMLTokenHandler
    object, and then parsing XML using that token handler and an
    XMLParser object that you create and use yourself. See the logdoc
    module's documentation for details. The information here may
    become inaccurate if the logdoc module's APIs change.
- Generate XML
  - If you have a tree/subtree of HTML_Element objects, create an
    XMLSerializer object using the
    XMLSerializer::MakeToStringSerializer function, and serialize the
    tree into XML source code using the XMLSerializer::Serialize
    function.
  - If you don't have a tree/subtree of HTML_Element objects, but
    something else, create an XMLFragment object, build a tree using
    the functions XMLFragment::OpenElement, XMLFragment::SetAttribute
    and XMLFragment::AddText (and possibly others), and then generate
    the XML source code using the XMLFragment::GetXML function (or
    XMLFragment::GetEncodedXML if you want the result encoded using
    some specific encoding).
- Handle XML names
  - If you have a string that is supposed to be a QName (qualified
    name, "prefix:localpart") and should be handled as an expanded
    name ({namespace URI, localpart} pair), use the class
    XMLCompleteNameN (which has a constructor that takes a string of
    the form "prefix:localpart"), and then the function
    XMLNamespaceDeclaration::ResolveName to add a namespace URI to it.
    For that to work, you need an XMLNamespaceDeclaration object that
    represents the namespace declarations in scope. What that means
    depends on the situation. If you are parsing XML using the
    XMLLanguageParser interface, the function
    XMLLanguageParser::GetCurrentNamespaceDeclaration is probably what
    you want to use. If you are dealing with an attribute from an
    HTML_Element, you should probably use the function
    XMLNamespaceDeclaration::PushAllInScope with that HTML_Element as
    the argument. Keep in mind that if the element is currently being
    parsed, there might be unprocessed namespace declarations on that
    element that change the meaning of your QName; you might need to
    delay your processing until all attributes on the element have
    been added.
The three methods described for parsing XML are listed in order of
increasing convenience and decreasing efficiency and power. For
instance, only the first method allows you to process comments or to
avoid allocating one big chunk of text for every character data node
(it allows you to receive character data split into appropriately
sized chunks directly from the parser). On the other hand, the
XMLFragment method is by far the most convenient for simple parsing
and generation tasks. Its main disadvantage for parsing XML is that
it parses the entire XML fragment into a data structure at once,
whereas the other methods allow more incremental parsing.