Thursday, July 14, 2011

XPath

What is XPath?

• XPath is a syntax for defining parts of an XML document
• XPath uses path expressions to navigate in XML documents
• XPath contains a library of standard functions
• XPath is a major element in XSLT
• XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values.
In XPath, there are seven kinds of nodes: element, attribute, text, namespace, processing-instruction, comment, and document nodes.
Selecting nodes
XPath uses path expressions to select nodes in an XML document. A node is selected by following a path, that is, a sequence of steps. The most useful path expressions are listed below.
Expression   Description
nodename     Selects all child elements of the current node that are named nodename
/            Selects from the root node
//           Selects matching nodes at any depth below the starting point (the whole document when the path starts with //)
.            Selects the current node
..           Selects the parent of the current node
@            Selects attributes

AAA -- selects all the AAA elements that are children of the current node
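
The expressions in the table can be tried out with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree implements a subset of XPath, its paths are always relative to an element (so .//title stands in for //title), and the bookstore document below is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; element and attribute names are illustrative only.
doc = ET.fromstring(
    "<bookstore>"
    "<book><title lang='eng'>Harry Potter</title><price>29.99</price></book>"
    "<book><title lang='eng'>Learning XML</title><price>39.95</price></book>"
    "</bookstore>"
)

# nodename: child elements of the context node that have that name
books = doc.findall("book")

# //: matching elements at any depth below the context node
titles = doc.findall(".//title")

# . is the context node itself
same = doc.find(".")

# ..: the parent of each selected node
parents = doc.findall(".//title/..")

# @: ElementTree accepts @ only inside predicates and returns elements,
# so the attribute value is read with get()
langs = [t.get("lang") for t in doc.findall(".//title[@lang]")]

print([t.text for t in titles], langs)
```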

XPath as filesystem addressing
The basic XPath syntax is similar to filesystem addressing. If the path starts with a slash (/), it represents an absolute path to the required element.
/xxx -- selects the root element xxx

AAA/xxx -- selects all the xxx elements that are children of AAA

Start with //
If the path starts with //, then all elements in the document that fulfill the criteria that follow are selected.
//BBB -- Selects all BBB elements no matter where they are in the document
DDD//BBB -- Selects all BBB elements that are descendants of the DDD element, no matter where they are under the DDD element
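
A minimal sketch of the two // forms, using Python's xml.etree.ElementTree. The tree is invented and only reuses the AAA, BBB and DDD names from the examples; the id attributes are there just to make the results readable.

```python
import xml.etree.ElementTree as ET

# Made-up tree; only the element names come from the examples above.
root = ET.fromstring(
    "<AAA>"
    "<BBB id='b1'/>"
    "<CCC><BBB id='b2'/></CCC>"
    "<DDD><BBB id='b3'/><EEE><BBB id='b4'/></EEE></DDD>"
    "</AAA>"
)

# //BBB: every BBB element, at any depth, in document order.
# ElementTree paths are relative, so .//BBB is used from the root element.
all_bbb = [e.get("id") for e in root.findall(".//BBB")]

# DDD//BBB: only the BBB elements somewhere below a DDD child
under_ddd = [e.get("id") for e in root.findall("DDD//BBB")]

print(all_bbb)
print(under_ddd)
```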

Attributes are specified by the @ prefix.
//@aks -- Selects all attributes that are named aks
//BBB[@id] -- Selects all BBB elements that have an id attribute
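
The attribute examples could be tried along these lines with Python's xml.etree.ElementTree. Note the assumption: ElementTree supports [@id] as a predicate but cannot return attribute nodes, so //@aks is approximated here by collecting the aks values by hand. The sample tree is made up.

```python
import xml.etree.ElementTree as ET

# Made-up tree; attribute values are illustrative only.
root = ET.fromstring(
    "<AAA><BBB id='b1'/><BBB aks='x'/><CCC aks='y' id='c1'/></AAA>"
)

# //BBB[@id]: BBB elements that carry an id attribute
with_id = root.findall(".//BBB[@id]")

# //@aks (approximation): the value of every aks attribute in the document
aks_values = [e.get("aks") for e in root.iter() if "aks" in e.attrib]

print([e.get("id") for e in with_id], aks_values)
```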

Selecting unknown nodes
XPath wildcards can be used to select unknown XML elements.
Wildcard     Description
*            Matches any element node
@*           Matches any attribute node
node()       Matches any node of any kind

/title/* -- Selects all the child elements of the title element
//* -- Selects all elements in the document
//title[@*] -- Selects all title elements which have any attribute
//title[not(@*)] -- Selects all title elements without any attributes
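
A rough illustration of the wildcards in Python's xml.etree.ElementTree. The * wildcard is supported directly; [@*] and not(@*) are not part of ElementTree's XPath subset, so they are emulated below by checking whether an element's attribute dictionary is empty. The catalog document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document.
root = ET.fromstring(
    "<catalog>"
    "<title lang='eng'><main>XPath</main><sub>Basics</sub></title>"
    "<title><main>XSLT</main></title>"
    "</catalog>"
)

# title/*: all child elements of each title child of the context node
title_children = [e.tag for e in root.findall("title/*")]

# //*: all elements in the document; iter() also yields the root element
all_elements = [e.tag for e in root.iter()]

# //title[@*] and //title[not(@*)], emulated in plain Python
with_attr = [t for t in root.iter("title") if t.attrib]
without_attr = [t for t in root.iter("title") if not t.attrib]

print(title_children, len(with_attr), len(without_attr))
```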

Selecting Several Paths
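By using the | operator in an XPath expression, you can select several paths at once. The result is the union of the node-sets selected by each path:
pathexpr1 | pathexpr2
This selects every node matched by pathexpr1 or by pathexpr2. (The keyword or is a separate boolean operator and does not combine node-sets.)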
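
Python's xml.etree.ElementTree has no | operator, so the sketch below emulates the union with a small helper. The helper name union and the sample document are assumptions made for illustration; the point is that a node matched by several paths appears only once and the result stays in document order.

```python
import xml.etree.ElementTree as ET

# Made-up sample document.
root = ET.fromstring(
    "<bookstore>"
    "<book><title>A</title><price>10</price></book>"
    "<book><title>B</title><price>20</price></book>"
    "</bookstore>"
)

def union(context, *paths):
    """Hypothetical helper: union of several paths, without duplicates,
    returned in document order."""
    wanted = {id(e) for p in paths for e in context.findall(p)}
    return [e for e in context.iter() if id(e) in wanted]

# Equivalent in spirit to //title | //price
nodes = union(root, ".//title", ".//price")
print([(e.tag, e.text) for e in nodes])
```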
Location Path Expression
A location path can be absolute or relative.
An absolute location path starts with a slash ( / ) and a relative location path does not. In both cases the location path consists of one or more steps, each separated by a slash:
An absolute location path:

/step/step/...
A relative location path:

step/step/...
Each step is evaluated against the nodes in the current node-set.
A step consists of:
• an axis (defines the tree-relationship between the selected nodes and the current node)
• a node-test (identifies a node within an axis)
• zero or more predicates (to further refine the selected node-set)
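
In the unabbreviated syntax, the three parts of a step are written as axisname::nodetest[predicate], and the child axis is used when no axis is given. The following sketch splits such a step into its parts. The function parse_step is a hypothetical helper written for this post, not part of any XPath library, and it deliberately handles only simple steps: no abbreviations such as @ or .., and no nested brackets inside predicates.

```python
import re

# Simplified grammar for one step: optional "axis::", a node test,
# then any number of [predicate] groups without nested brackets.
STEP_RE = re.compile(
    r"^(?:(?P<axis>[\w-]+)::)?(?P<test>[^\[\]]+?)(?P<preds>(?:\[[^\[\]]*\])*)$"
)

def parse_step(step):
    """Hypothetical helper: return (axis, node_test, predicates) for a simple step."""
    m = STEP_RE.match(step)
    if m is None:
        raise ValueError(f"not a simple step: {step!r}")
    axis = m.group("axis") or "child"  # child is the default axis
    predicates = re.findall(r"\[([^\[\]]*)\]", m.group("preds"))
    return axis, m.group("test"), predicates

print(parse_step("child::book[price>35.00][1]"))
print(parse_step("descendant-or-self::node()"))
print(parse_step("title"))
```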
Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.

In the table below we have listed some path expressions with predicates and the result of each expression:
Path Expression                      Result
/bookstore/book[1]                   Selects the first book element that is the child of the bookstore element.
/bookstore/book[last()]              Selects the last book element that is the child of the bookstore element.
/bookstore/book[last()-1]            Selects the last but one book element that is the child of the bookstore element.
/bookstore/book[position()<3]        Selects the first two book elements that are children of the bookstore element.
//title[@lang]                       Selects all the title elements that have an attribute named lang.
//title[@lang='eng']                 Selects all the title elements that have an attribute named lang with a value of 'eng'.
/bookstore/book[price>35.00]         Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00.
/bookstore/book[price>35.00]/title   Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00.
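
The predicates in the table can be approximated with Python's xml.etree.ElementTree. As far as its documented XPath subset goes, positional predicates ([1], [last()], [last()-1]) and attribute predicates are supported, while position()<3 and numeric comparisons such as price>35.00 are not, so those are emulated in plain Python here. The bookstore data is made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, evaluated with the bookstore element as context.
bookstore = ET.fromstring(
    "<bookstore>"
    "<book><title lang='eng'>Harry Potter</title><price>29.99</price></book>"
    "<book><title lang='eng'>Learning XML</title><price>39.95</price></book>"
    "<book><title lang='fr'>Le Petit Prince</title><price>12.50</price></book>"
    "<book><title>XQuery Kick Start</title><price>49.99</price></book>"
    "</bookstore>"
)

# Supported directly by ElementTree
first = bookstore.find("book[1]")               # /bookstore/book[1]
last = bookstore.find("book[last()]")           # /bookstore/book[last()]
second_last = bookstore.find("book[last()-1]")  # /bookstore/book[last()-1]
with_lang = bookstore.findall(".//title[@lang]")
english = bookstore.findall(".//title[@lang='eng']")

# Emulated: /bookstore/book[position()<3]
first_two = bookstore.findall("book")[:2]

# Emulated: /bookstore/book[price>35.00] and .../title
expensive = [b for b in bookstore.findall("book")
             if float(b.findtext("price")) > 35.00]
expensive_titles = [b.findtext("title") for b in expensive]

print(first.findtext("title"), last.findtext("title"), expensive_titles)
```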