Here is one example of how to use the extension, XqUSEme:
Let's say I want to grab my own details for this extension at https://addons.mozilla.org/en-US/firefox/addon/5515. While I have that page open (this is necessary, as described below), I go to Tools->Perform XQuery.
Looking at the page's source code in the bottom window (or having viewed the source before opening this window), I see that the page has a <div> with the id "addon-info", which contains all of the paragraphs describing my extension. I'll use that information in the query.
Since XQuery allows plain XPath as well as more complicated loops, etc., I might try something as simple as the following. The body tags surrounding the heading and the results are necessary because the result must be a well-formed document with exactly one root node; the result will therefore be well-formed HTML but not valid HTML.
Note that without the DTD, I do not need to declare a default element namespace, because Firefox's DOM strips the xmlns attribute from the root 'html' element when the document is not served as true XHTML.
<body>
<h1>My extension details:</h1>
{
doc()//div[@id='addon-info']
}
</body>
However, this query returns no results for the details if the DTD is still present, that is, if I have changed the default preference of automatically stripping DTDs (for HTML) and have not stripped the DTD myself. Since the XHTML DTD declares 'xmlns' as a fixed attribute, the XML parser used by Saxon adds that attribute back to the document representation, so we must define a namespace in our query (see below).
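One way the namespace requirement might be met is sketched here. This is only a sketch under stated assumptions, not necessarily the exact query I would run: it assumes the page still carries its XHTML DTD, it reuses the argument-less doc() call from the example above as-is, it assumes an h1 element for the heading, and it declares the standard XHTML namespace as the default element namespace so that unprefixed element names in the path (and in the constructed wrapper) resolve to XHTML.

```xquery
(: Sketch: make unprefixed element names resolve to the XHTML namespace. :)
(: Assumes the DTD was NOT stripped, so the parsed document's elements   :)
(: are in the XHTML namespace.                                           :)
declare default element namespace "http://www.w3.org/1999/xhtml";

<body>
<h1>My extension details:</h1>
{
(: 'div' now means {http://www.w3.org/1999/xhtml}div; the unprefixed :)
(: attribute name 'id' is unaffected by the default element namespace. :)
doc()//div[@id='addon-info']
}
</body>
```

A side effect of this choice is that the constructed body and h1 elements also end up in the XHTML namespace, which is usually what one wants for XHTML output.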
Also, if I want to query a remote document, I can grab it by a direct URL reference, as long as it is well-formed (it need not even be true XHTML served as application/xhtml+xml). However, as in the case above, I must reference its XHTML tags with a namespace, since the DTD declares 'xmlns' as a fixed attribute on <html>. Grabbing URLs directly is generally not recommended for XHTML, though: it not only bypasses the chance to clean up poorly formed HTML, but also taxes the W3C servers, which must serve the DTDs live. (Unfortunately, I am not aware of any means to turn off DTD checking in Saxon.)
<body>
<h1>My extension details:</h1>
{
doc("https://addons.mozilla.org/en-US/firefox/addon/5515")//div[@id='addon-info']
}
</body>
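To make the namespace requirement for the remote case concrete, here is a sketch of the same query with an explicit prefix bound to the XHTML namespace. The prefix name xhtml is my own choice for illustration (any prefix would do), the h1 heading element is an assumption, and whether the query succeeds still depends on the remote page being well-formed and its DTD being retrievable.

```xquery
(: Sketch: bind a prefix (hypothetical name 'xhtml') to the XHTML namespace. :)
declare namespace xhtml = "http://www.w3.org/1999/xhtml";

<body>
<h1>My extension details:</h1>
{
(: Only the source element is namespace-qualified here; the 'id' :)
(: attribute is in no namespace and stays unprefixed.            :)
doc("https://addons.mozilla.org/en-US/firefox/addon/5515")//xhtml:div[@id='addon-info']
}
</body>
```

With a prefix rather than a default element namespace, the constructed body and h1 wrapper stay in no namespace, while the copied div keeps its XHTML namespace. Which of the two forms is preferable depends on what is done with the result afterwards.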
In an earlier version of this same site, before XHTML was used, I got an error about the entity HTML.version when I did not remove the DTD. (For those familiar with XML or SGML, this should give a clue that the problem with the query lay in the DTD.) Since the XML parser employed by Saxon doesn't accept the old SGML-style comments used in the referenced HTML DTD, it reported an error. The solution for such files is simply to click the Strip DTD button and try again (or to ensure the preferences remain set to strip the DTD).