Overview

Nux vomica refers to a tree native to the East Indies, as well as its nut-like seeds. In homeopathy, it is one of the most commonly prescribed remedies.

Besides that, Nux is also a small, straightforward, and surprisingly effective open-source extension of the XOM XML library. It is geared towards versatile embedded integration and interchange, in particular in high-throughput server container environments (e.g. large-scale peer-to-peer messaging infrastructures over high-bandwidth networks, scalable message-oriented middleware, etc.). Its simplicity also makes it useful for client-side XML query and transformation pipelines. Features include:

  • Seamless W3C XQuery support for XOM.
  • Efficient and flexible pools and factories for XQueries, XSL Transforms, as well as Builders that validate against various schema languages, including W3C XML Schemas, DTDs, RELAX NG, Schematron, etc.
  • For simple and complex continuous queries and/or transformations over very large or infinitely long XML input, a convenient streaming path filter API combines full XQuery support with straightforward filtering.
  • Glue for integration with JAXB and for queries over ill-formed HTML.
  • All this is rock-solid, dependable, well documented, and ships in a jar file that weighs just 60 KB.

Motivation

Have you ever tried to run queries and/or transformations over XML data sources? Chances are that manual SAX/DOM processing was cumbersome at best, that XPath was not powerful or flexible enough, that XSLT was perhaps too complicated, and that most related APIs had a steep learning curve and contained quite a few bugs.

This is where the power and simplicity of XQuery come in. Nux provides seamless XQuery support for XOM, leveraging the standards compliance, efficiency and maturity of the Saxon engine. Since XQuery is a superset of XPath 2.0, plain XPath expressions can also be used as queries. The implementation follows the W3C Working Draft of 23 July 2004 and passes several exhaustive test suites.

Have you ever tried to build an XML system that is straightforward, works correctly and processes thousands of small XML messages per second in non-trivial ways? Chances are you've encountered lots of non-obvious obstacles down that path. For that scenario, Nux couples the simplicity and correctness of XOM with efficient and flexible pools and factories for XQueries and XSL Transforms, as well as for Builders that validate against various schema languages, including W3C XML Schemas (leveraging Xerces) and RELAX NG, Schematron, etc. (leveraging MSV). Glue for integration with JAXB and for queries over ill-formed HTML is also provided.
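
To make the pooling idea concrete: in a high-throughput setting the expensive step is usually compiling a query or stylesheet, so it pays to compile once and reuse the result for every message. The sketch below illustrates that with plain JAXP rather than with Nux's own pool classes, whose API is not shown here. The class name `TemplatesCache` and its methods are hypothetical. It relies on the JAXP rule that a compiled `Templates` object may be shared across threads, while each `Transformer` is used by one thread at a time.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper (not part of Nux): compiles each stylesheet once and reuses it. */
public class TemplatesCache {

    // Compiled stylesheets, keyed by their source text.
    private final ConcurrentMap<String, Templates> cache = new ConcurrentHashMap<>();
    private final TransformerFactory factory = TransformerFactory.newInstance();

    /** Returns the compiled form of the stylesheet, compiling it on first use only. */
    public Templates get(String stylesheet) {
        return cache.computeIfAbsent(stylesheet, this::compile);
    }

    private Templates compile(String stylesheet) {
        try {
            // TransformerFactory is not guaranteed to be thread-safe, so guard it.
            synchronized (factory) {
                return factory.newTemplates(new StreamSource(new StringReader(stylesheet)));
            }
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Cannot compile stylesheet", e);
        }
    }

    /** Transforms one message; a fresh, cheap Transformer is created per call. */
    public String transform(String stylesheet, String xml) throws TransformerException {
        Transformer transformer = get(stylesheet).newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }
}
```

A server would hold one `TemplatesCache` for its lifetime and call `transform` for each incoming message, so the compilation cost is paid once per distinct stylesheet rather than once per message.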

Example Usage

    import java.io.File;
    import nu.xom.Builder;
    import nu.xom.Document;
    import nu.xom.Nodes;
    import nux.xom.pool.BuilderPool;
    import nux.xom.xquery.XQuery;

    // find the links of all images in an XHTML-like document
    XQuery xquery = new XQuery("//*:img/@src", null);

    // find the links of all JPG images in an XHTML-like document via regular expression
    // (the dot is escaped and the pattern anchored so that only a ".jpg" suffix matches)
    // XQuery xquery = new XQuery("//*:img/@src[matches(., '\\.jpg$')]", null);

    Builder builder = BuilderPool.GLOBAL_POOL.getBuilder(false); // non-validating Builder
    Document doc = builder.build(new File("/tmp/test.xml"));
    Nodes results = xquery.execute(doc).toNodes();
    for (int i = 0; i < results.size(); i++) {
        System.out.println("node " + i + ": " + results.get(i).toXML());
        // System.out.println("node " + i + ": " + XOMUtil.toPrettyXML(results.get(i)));
    }

    <bib>
    {
      for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>

    for $i in doc("items.xml")//item_tuple
    let $b := doc("bids.xml")//bid_tuple[itemno = $i/itemno]
    where contains($i/description, "Bicycle")
    order by $i/itemno
    return
      <item_tuple>
        { $i/itemno }
        { $i/description }
        <high_bid>{ max($b/bid) }</high_bid>
      </item_tuple>

Querying Nasty HTML

If you'd like to query non-XML documents such as the typical HTML that lives out there, you can combine Nux with TagSoup, which is a "SAX-compliant parser that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short". TagSoup plugs into XOM and makes ill-formed HTML appear as well-formed XML. Just add tagsoup.jar (included, for example, in the XOM download) to the classpath and try this:

    import nu.xom.Builder;
    import nu.xom.Document;
    import nu.xom.Nodes;
    import nux.xom.xquery.XQuery;
    import org.xml.sax.XMLReader;

    // find the links of all images in an ill-formed HTML document
    XQuery xquery = new XQuery("//*:img/@src", null);
    XMLReader parser = new org.ccil.cowan.tagsoup.Parser(); // tagsoup parser
    Document doc = new Builder(parser).build("http://www.yahoo.com");
    Nodes results = xquery.execute(doc).toNodes();
    for (int i = 0; i < results.size(); i++) {
        System.out.println("node " + i + ": " + results.get(i).toXML());
        // System.out.println("node " + i + ": " + XOMUtil.toPrettyXML(results.get(i)));
    }

Streaming Path Filters for Very Large XML Inputs

For simple and complex continuous queries and/or transformations over very large or infinitely long XML input documents, we provide a convenient streaming path filter API, combining full XQuery support with straightforward filtering.
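
The core idea behind a streaming path filter is that only the elements matching a location path are materialized, one at a time, and each is handed to a callback and then discarded, so memory use stays bounded however long the input is. The sketch below illustrates that idea with the JDK's StAX pull parser only. It is not Nux's streaming API, which is not shown in this section, and it does not run XQuery over the matches. The class `StreamingPathSketch`, the method `forEachMatch` and the simple `/a/b/c` path syntax are assumptions made for illustration.

```java
import java.io.Reader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical illustration (not part of Nux) of path-based streaming over XML. */
public class StreamingPathSketch {

    /**
     * Streams over the input and passes the text content of every element whose
     * absolute path equals locationPath (e.g. "/books/book/title") to the handler.
     * Element names are compared by local name; namespaces are ignored.
     */
    public static void forEachMatch(Reader xml, String locationPath, Consumer<String> handler)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTD processing in this sketch
        XMLStreamReader reader = factory.createXMLStreamReader(xml);

        Deque<String> path = new ArrayDeque<>(); // names from the root down to the current element
        StringBuilder text = null;               // non-null only while inside a matching element
        int matchDepth = -1;                     // depth at which the current match started

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    path.addLast(reader.getLocalName());
                    if (text == null && ("/" + String.join("/", path)).equals(locationPath)) {
                        text = new StringBuilder();
                        matchDepth = path.size();
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    if (text != null) {
                        text.append(reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (text != null && path.size() == matchDepth) {
                        handler.accept(text.toString()); // process the match, then let it go
                        text = null;
                    }
                    path.removeLast();
                    break;
                default:
                    break;
            }
        }
        reader.close();
    }
}
```

For example, calling `forEachMatch(reader, "/books/book/title", System.out::println)` prints one title per matching element while holding only the current title in memory. Elements elsewhere in the tree, such as a `title` under a different parent, are skipped.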

Related Information

A GUI XQuery editor helps you learn the query language and quickly try out queries during early development. Such editors include Oxygen (commercial but with a free trial, including an Eclipse plugin) and Stylus Studio (commercial but with a free trial). Prototyping queries in such a GUI before deploying them in your Nux-based production application can speed up development and early testing. Incidentally, these GUIs also use the Saxon XQuery engine internally, just like Nux.

Easy-to-read tutorials and other material about XQuery include:


© 2003-2004, Lawrence Berkeley National Laboratory