13.8. XQuery 1.0 Support

The Virtuoso Server supports the XQuery 1.0 XML Query Language specification. This specification is currently a working draft of the W3C XML Query Working Group, which works in collaboration with the W3C XSL Working Group. Both the syntax and the semantics of XQuery will probably vary from version to version.

In addition to the XQuery 1.0 standard, which describes the language, the XQuery 1.0 and XPath 2.0 Functions and Operators Version 1.0 specification describes a set of built-in functions. As with all W3C in-progress efforts, there is a list of open issues detailing problems and unresolved areas; where these affect Virtuoso's implementation, they are noted below.

This chapter is not an XQuery textbook and does not replace XQuery-related specifications of W3C. Only Virtuoso-specific extensions and differences are described here.

The most important deviation from the standard is that Virtuoso does not provide full type information about data values. As a consequence, "typeswitch" and automatic type conversions are not implemented.

13.8.1. Types of XQuery Expressions

The current draft of XQuery lists 10 groups of XQuery expressions; the subsections below cover them in turn.

Not all groups of expressions are implemented. In some groups, not all kinds of clauses are implemented.

In addition to the standard, Virtuoso supports special cases of FLWR expressions to deal with XML views; these are described in the subsection on FOR clause expressions with the xmlview() function.

13.8.1.1. Primary Expressions

The XQuery processor uses 32-bit integers on 32-bit platforms and 64-bit integers on 64-bit platforms. Similarly, the scale and precision of floating-point operations may vary from platform to platform.

Note that string literals are handled differently in XPath 1.0 and XQuery. "Ben &amp; Jerry&apos;s" denotes the string "Ben & Jerry's" in XQuery and the string "Ben &amp; Jerry&apos;s" in XPath.
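
The difference can be modelled outside Virtuoso. The following Python sketch is only an illustration, not Virtuoso code: both helper names are hypothetical, and html.unescape stands in for the entity expansion that XQuery applies to string literals (it also expands entities beyond the five predefined XML ones).

```python
import html

def xquery_literal_value(literal_body):
    """Model XQuery: entity references inside a string literal are expanded."""
    return html.unescape(literal_body)

def xpath1_literal_value(literal_body):
    """Model XPath 1.0: the literal is taken verbatim, with no expansion."""
    return literal_body

body = "Ben &amp; Jerry&apos;s"
print(xquery_literal_value(body))  # expanded form
print(xpath1_literal_value(body))  # unchanged text
```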


13.8.1.2. Path Expressions

Any XPath 1.0 expression is a valid XQuery 1.0 path expression, and the Virtuoso XQuery processor supports them. When invoked from the XQuery context, the XPath processor works in accordance with XSLT rules. There are two major differences between standalone and XQuery/XSLT path expressions. First, a non-qualified name used as a NameTest criterion has a different meaning, as described below. Second, the data type used for attributes varies. In XSLT or XQuery mode, a value calculated by an attribute:: axis is of type attribute entity; in standalone XPath, the string value of the attribute is used instead.

As specified in the XQuery 1.0 standard, a node-set returned by an XPath expression may be used as a sequence of items, where every node of that node-set becomes an item of the sequence. The opposite is not true, however. Not every sequence may be converted into a node-set, even if it is a sequence of nodes. If XPath starts from a function which returns a sequence, an error message "Context node is not an entity" is returned. Fortunately, a variable of type sequence may be used as a node-set if all items of the sequence are nodes.

Obsolete drafts of the W3C specification contain a description of the "pointer operator". Virtuoso continues to support this operator for backward compatibility. The XQuery processor needs DTD data associated with the XML document in question to distinguish ID attributes from other kinds of attributes and to bookmark elements that have ID locations. For more details, see the description of the id() XPath function. This function uses the same DTD data for the same purposes, so for any given document, either both id() and the "pointer operator" are applicable or neither works.

Sometimes the "Context node is not an entity" error is signalled if the beginning of the XPath expression is surrounded by parentheses, even if the expression works fine without these parentheses. This happens because "(...)" is an "append" operator in XQuery, not just a way to group subexpressions. "append" converts a node-set into a sequence even when it is called with a single argument, that is, without commas inside "(...)". This sequence cannot be used as input for the rest of the XPath expression.

As a syntax extension, a special notation for QNames is added and can be used, e.g., in a NameTest. An expanded name can be surrounded by the delimiters (! and !), as in (!http://www.example.com:MyTag!); this syntax allows names that contain otherwise prohibited characters. It is also useful when the text of the query is generated by software.

Note that the NameTest that consists of an unqualified name has different meanings in Virtuoso XPath and in XQuery. In XPath, NameTest "sample-tag" means "any element whose local-name is equal to sample-tag". In XQuery, the same test means "any element without namespace-uri whose local-name is equal to sample-tag".
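
The two readings of an unqualified NameTest can be compared with a small model. This is a hedged sketch in Python using the standard xml.etree.ElementTree module, not Virtuoso code; the helper names and the sample namespace URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def split_tag(tag):
    """Split an ElementTree tag of the form '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

def matches_xpath_style(elem, name):
    """Virtuoso XPath reading: any element whose local-name equals name."""
    return split_tag(elem.tag)[1] == name

def matches_xquery_style(elem, name):
    """XQuery reading: local-name equals name and there is no namespace-uri."""
    uri, local = split_tag(elem.tag)
    return uri == "" and local == name

doc = ET.fromstring(
    '<root xmlns:x="http://www.example.com/ns">'
    "<sample-tag>plain</sample-tag>"
    "<x:sample-tag>qualified</x:sample-tag>"
    "</root>"
)
xpath_hits = [e.text for e in doc.iter() if matches_xpath_style(e, "sample-tag")]
xquery_hits = [e.text for e in doc.iter() if matches_xquery_style(e, "sample-tag")]
print(xpath_hits)   # both elements match
print(xquery_hits)  # only the element without a namespace matches
```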


13.8.1.3. Sequence Expressions

XQuery sequences are supported not only in XQuery but can also be handled in XPath and XSLT. When the XQuery processor is invoked from SQL and a sequence is returned to the caller, the sequence is automatically converted into a vector of its elements.

Virtuoso supports all sequence operations listed in the current W3C draft, plus the deprecated operations BEFORE and AFTER.

The sequence concatenation operator is available in XPath and XSLT as the append() function. In addition, the tuple() function is available to get the first items of every given argument sequence and return the sequence of these items.
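
As a rough model of what these two functions compute, the Python sketch below treats sequences as lists. It is not Virtuoso code: the function names append_sequences and first_items are hypothetical, and this chapter does not state how tuple() treats an empty argument, so the sketch simply skips empty sequences.

```python
from itertools import chain

def append_sequences(*sequences):
    """Model of append(): concatenate the argument sequences in order."""
    return list(chain.from_iterable(sequences))

def first_items(*sequences):
    """Model of tuple(): take the first item of every argument sequence.

    Assumption: empty argument sequences contribute nothing.
    """
    return [seq[0] for seq in sequences if seq]

print(append_sequences([1, 2], [3], []))
print(first_items(["a", "b"], ["c"], ["d", "e", "f"]))
```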

XQuery operators UNION, INTERSECT, EXCEPT are available in XPath and XSLT as functions union(), intersect() and except().


13.8.1.4. Arithmetic, Comparison and Logical operations

Virtuoso shares the implementation of basic arithmetic and comparison operations between the XPath, XQuery, XSLT, SQL and Virtuoso/PL processors, so type casting, scale and precision of calculated values are identical across the system. All operators are available in XQuery. In addition, the << and >> operators are available in XPath and XSLT as the is_before() and is_after() built-in functions.


13.8.1.5. Element Constructors

Virtuoso XQuery supports all XQuery 1.0 direct constructors. Previous versions of the W3C draft contained a syntax for placing calculated content into the opening tag of a direct element constructor, such as <{concat("calculated-", "element-name")} {concat("calculated-", "attribute-name")}={concat("calculated-", "attribute-value")}>...</> . Thus the name of an element or attribute, or the value of an attribute, can be calculated dynamically. This syntax is still supported. The create-element() XPath function is implemented to make this functionality available in XPath. Additionally, the special function create-attribute() may be used to create a new dynamic attribute entity whose name and value are calculated; it works similarly to the xsl:attribute XSLT instruction.

Similarly, create-comment(), create-element() and create-pi() mimic other XQuery direct constructors in XPath and XSLT.
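
The effect of computing an element name, an attribute name and an attribute value at run time can be illustrated outside Virtuoso. The sketch below uses Python's standard xml.etree.ElementTree module as a stand-in; build_dynamic_element is a hypothetical helper, not a Virtuoso function, and it only mirrors the concat() calls of the calculated constructor shown earlier in this section.

```python
import xml.etree.ElementTree as ET

def build_dynamic_element(name_part, attr_name_part, attr_value_part):
    """Build an element whose name, attribute name and value are computed."""
    name = "calculated-" + name_part            # like concat("calculated-", ...)
    attr_name = "calculated-" + attr_name_part
    attr_value = "calculated-" + attr_value_part
    elem = ET.Element(name)
    elem.set(attr_name, attr_value)             # comparable to a dynamic attribute
    return elem

elem = build_dynamic_element("element-name", "attribute-name", "attribute-value")
print(ET.tostring(elem, encoding="unicode"))
```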

The XQuery specification states that when a sequence of atomic values is converted into the content of an element constructor, a whitespace character is inserted between adjacent values.

Unlike previous versions of Virtuoso, the current XQuery syntax allows "pure XML notation" inside element constructors. There is therefore no strict need to write a 'constant' expression such as

<emp empid="12345"><name>John Smith</name><job>Bubble sorter</job></emp>

as dynamically calculated text, like

<emp empid="12345"><name>{'John Smith'}</name><job>{'Bubble sorter'}</job></emp>

It may still be useful to write 'constant' expressions in the old way. This artificial restriction simplifies finding syntax errors, because some syntactically wrong expressions are still correct "pure XML notation". Alternatively, CDATA sections may be used to make it obvious that the string is a constant, not an expression with forgotten braces around it:

<emp empid="12345"><name><![CDATA[John Smith]]></name><job><![CDATA[Bubble sorter]]></job></emp>

The current version of Virtuoso does not support the new XQuery syntax for dynamic constructors.


13.8.1.6. FLWR Expressions

FLWR expressions are fully supported by Virtuoso XQuery. Moreover, for() and let() XPath functions are implemented to make this functionality available in XPath and XSLT. In addition, assign() and progn() functions are available to deal with extension functions, especially when extension functions are called for their side effects.

A special xmlview() function allows very efficient access to SQL data from XML views.

Previous XQuery specifications used "sort by" instead of "order by". The difference is that "sort by" applied to the final results produced by the RETURN clause of the FLWR statement, whereas "order by" reorders the input data for RETURN. Thus, "order by" can sort output using data that does not appear in the final result. E.g., an expression can collect items, "order" them by category and title, but output only title and price. This was much harder in previous versions of XQuery: it was necessary to prepare an intermediate result containing title, price and category, "sort" it by category and title, and then use one more FLWR expression to form a result free of the redundant category data.

Nevertheless, Virtuoso supports both "sort by" and "order by" for backward compatibility. Moreover, the "sort by" operator can be used freely, with no relation to any FLWR subexpression. A typical use of this simplified notation is <hit-list>{//track[@rating] sort by (@rating descending)}</hit-list> instead of the portable <hit-list>{for $t in //track[@rating] order by $t/@rating descending return $t}</hit-list>
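
The category/title/price case described above can be modelled in a few lines. This is a Python sketch of the two evaluation orders, not XQuery and not Virtuoso code; the sample items and both function names are made up for illustration.

```python
items = [
    {"category": "rock", "title": "B-side", "price": 7.5},
    {"category": "jazz", "title": "Blue", "price": 9.0},
    {"category": "jazz", "title": "Alone", "price": 8.0},
]

def order_by_style(rows):
    """'order by': reorder the input first, then return only title and price."""
    ordered = sorted(rows, key=lambda r: (r["category"], r["title"]))
    return [(r["title"], r["price"]) for r in ordered]

def sort_by_style(rows):
    """'sort by': only returned results can be sorted, so three steps are needed."""
    # Step 1: the intermediate result has to carry the category as well.
    intermediate = [(r["title"], r["price"], r["category"]) for r in rows]
    # Step 2: sort the intermediate result by category, then title.
    intermediate.sort(key=lambda t: (t[2], t[0]))
    # Step 3: one more pass removes the redundant category.
    return [(title, price) for title, price, _ in intermediate]

print(order_by_style(items))
print(sort_by_style(items))
```

Both functions produce the same list; the second one needs the extra intermediate result and the extra pass that "order by" makes unnecessary.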


13.8.1.7. Ordered and Unordered Expressions

The current version of Virtuoso does not use ordered/unordered hints. Everything is calculated ordered. This will change in the future but it is not advisable to place "unordered" hints for future use because there's no way to validate these hints. It is better to place appropriate comments but not hints.


13.8.1.8. Control Expressions

The if() special function mimics the XQuery operator for use in XPath and XSLT. Functions and() and or() are also control expressions because they calculate arguments in strict left-to-right order and may omit the calculation of some results.


13.8.1.9. Quantified Expressions

Both the SOME and EVERY operators are implemented. The some() and every() XPath functions are implemented to make this functionality available in XPath and XSLT.


13.8.1.10. Expressions That Test or Modify Data types

The operators IS, CASTABLE, CAST, TREAT, TYPESWITCH and VALIDATE are not implemented.


13.8.1.11. FOR Clause Expressions With xmlview() Function

XML views can be queried using FOR Clause from FLWR expressions. The xmlview() function allows XML views to be accessed as if they were XML documents. XPath expressions beginning with the xmlview() function will be translated into SQL statements to avoid redundant data access and to avoid creating a whole XML tree.



13.8.2. Details of XQuery Syntax

Virtuoso XQuery uses some syntax extensions. The most visible is the additional notation for qualified names described above, where the name is surrounded by "(!...!)" delimiters. An earlier implementation allowed single-line comments that started with "#" or "--" and continued to the end of the line; this syntax is now obsolete.

The "default namespace declaration" clause is not currently supported, to make the text of XQuery unambiguous. If used, default namespaces must extend element names but not attribute names. Extension function names must be extended as they have non-default namespace prefixes but the names of basic functions should not be extended by the default namespace. Finally, Virtuoso will not preserve any information about used namespace prefixes, so default namespaces will be converted into non-default when the resulting XML entity is printed.


13.8.3. Pre-compilation of XPath and XQuery Expressions

Virtuoso compiles XPath and XQuery expressions as early as possible. E.g., if the first argument of xquery_eval() is a string constant, the SQL compiler invokes the XQuery compiler to avoid on-demand compilation of this text.

This feature significantly enhances the performance of XQuery expressions embedded in SQL. For a simple search on an XML document of average size, the compilation time can be three times greater than the execution time. In addition, the sql:column() special XQuery function can be used only when pre-compilation can be done by the SQL compiler.

Pre-compilation is impossible if the text of the expression is not a constant. The typical case is passing an XQuery expression as parameter to a function. In this case the expression is compiled during the call of xquery_eval() and stored for future use. If the same string is passed again to the same invocation of xquery_eval() then a stored compiled expression is used.

Only partial pre-compilation is possible if the XQuery expression refers to not-yet-defined extension functions or to external resources. Partial pre-compilation gives little gain in speed, but it allows the use of sql:column().

The most important fact about pre-compilation is that passing parameters into an XQuery statement is much more efficient than printing them into the text of the query. This is similar to SQL queries.

Good and Poor Coding Practices

GOOD The expression is compiled once when SQL query is compiled:

select xquery_eval('count(//abstract)', SOURCE_XML) from LIB..ARTICLES;

GOOD The expression is compiled once when SQL query is compiled:

select xquery_eval('count(//article[@id=$main_id]/abstract)', SOURCE_XML, 1, vector('main_id', MAIN_ID))
  from LIB..ARTICLES;

POOR The expression is compiled once per data row. In addition, a hard-to-find error will occur if a value of MAIN_ID contains a double quote or a backslash character.

select xquery_eval(sprintf('count(//article[@id="%s"]/abstract)', MAIN_ID), SOURCE_XML)
  from LIB..ARTICLES;
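
Why the printed form is both slower and fragile can be seen from the query texts alone. The following Python sketch models only the text that would reach the XQuery compiler; it is not Virtuoso code, the helper names are hypothetical, and the quote check is a deliberately crude stand-in for a real parser.

```python
# The parameterised text is one constant, whatever MAIN_ID is.
PARAM_QUERY = 'count(//article[@id=$main_id]/abstract)'

def printed_query(main_id):
    """Mimic sprintf(): print the value into the text of the query."""
    return 'count(//article[@id="%s"]/abstract)' % main_id

def distinct_query_texts(ids, printed):
    """Count how many different texts the XQuery compiler would be given."""
    if printed:
        return len({printed_query(i) for i in ids})
    return len({PARAM_QUERY for _ in ids})

def quotes_balanced(query_text):
    """Crude check: double quotes in the text must come in pairs."""
    return query_text.count('"') % 2 == 0

ids = ["a1", "a2", 'x"y']
print(distinct_query_texts(ids, printed=True))   # one text per distinct value
print(distinct_query_texts(ids, printed=False))  # a single constant text
print(quotes_balanced(printed_query('x"y')))     # the embedded quote breaks the text
```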

GOOD The XQuery expression is compiled once per execution of the SQL query. The SQL compiler pays special attention to queries that import external resources, because the content and availability of these resources may differ from call to call. In addition, importing an external resource is usually not possible during SQL compilation because of the danger of deadlock, so compilation is postponed until run time; the cost of this is acceptable. Even in this sophisticated case, the XQuery can contain calls of sql:column().

select xquery_eval('
    namespace tools="http://www.example.com/lib/tools/"
    import define "http://www.example.com/lib/tools/common.xqr"
    tools:extract-keywords(//abstract)',
  SOURCE_XML)
from LIB..ARTICLES;

GOOD Two XQuery expressions are compiled during SQL compilation.

select
  case
    when SOURCE_IS_DOCBOOK then xquery_eval ('//formalpara[title="See Also"]/para', SOURCE_XML)
    else xquery_eval ('//p[@style="seealso"]', SOURCE_XML)
  end
from LIB..ARTICLES;

POOR Virtuoso cannot pre-compile the XQuery expressions. Moreover, only one pre-compiled expression is cached per occurrence of xquery_eval() in the SQL statement, so the XQuery compiler may run once per data row.

select
  xquery_eval (
    case
      when SOURCE_IS_DOCBOOK then '//formalpara[title="See Also"]/para'
      else '//p[@style="seealso"]'
    end,
  SOURCE_XML)
from LIB..ARTICLES;
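
The cost of this last pattern follows from the one-entry cache described above. The Python sketch below is a simplified model of that behaviour under the stated assumption of a single cached expression per occurrence of xquery_eval(); the CallSite class and count_compilations function are hypothetical and do not reflect Virtuoso's actual implementation.

```python
class CallSite:
    """Models one occurrence of xquery_eval() with a one-entry compile cache."""

    def __init__(self):
        self.cached_text = None
        self.compilations = 0

    def evaluate(self, query_text):
        # Recompile only when the text differs from the cached one.
        if query_text != self.cached_text:
            self.compilations += 1
            self.cached_text = query_text

DOCBOOK_QUERY = '//formalpara[title="See Also"]/para'
HTML_QUERY = '//p[@style="seealso"]'

def count_compilations(rows_are_docbook):
    """Run one call site over the rows and report how often it compiled."""
    site = CallSite()
    for is_docbook in rows_are_docbook:
        site.evaluate(DOCBOOK_QUERY if is_docbook else HTML_QUERY)
    return site.compilations

alternating = [True, False] * 3
grouped = [True] * 3 + [False] * 3
print(count_compilations(alternating))  # every row replaces the cached entry
print(count_compilations(grouped))      # one compilation per run of equal rows
```

In this model, rows that alternate between the two document kinds force a compilation on every row, whereas the two separate xquery_eval() calls of the previous example would each keep their own compiled expression.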