saxon:analyze-string
Analyzes a string using a regular expression.
analyze-string($select as xs:string, $regex as xs:string, $matching as function(*), $nonMatching as function(*)) ➔ item()*
Arguments | |||
| $select | xs:string | The input string |
| $regex | xs:string | The regular expression |
| $matching | function(*) | Function to be called for each matching substring |
| $nonMatching | function(*) | Function to be called for each non-matching substring |
Result | item()* |
analyze-string($select as xs:string, $regex as xs:string, $matching as function(*), $nonMatching as function(*), $flags as xs:integer) ➔ item()*
Arguments | |||
| $select | xs:string | The input string |
| $regex | xs:string | The regular expression |
| $matching | function(*) | Function to be called for each matching substring |
| $nonMatching | function(*) | Function to be called for each non-matching substring |
| $flags | xs:integer | The regular expression flags |
Result | item()* |
Namespace
http://saxon.sf.net/
Notes on the Saxon implementation
This function is obsolescent, because XPath 3.0 provides a standard analyze-string() function. Instead of using higher-order function callbacks, the XPath 3.0 function returns the analyzed string as a marked-up XML document.
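For comparison, here is a minimal sketch of the standard XPath 3.0 function used to bracket every "e" in a string. The input string and the variable names are illustrative assumptions. The fn:match and fn:non-match element names come from the XPath 3.0 function library, and the fn prefix is predeclared in XQuery.

```xquery
xquery version "3.0";

(: Illustrative input; any string would do. :)
let $input := "There was a young fellow called Marlowe"

(: fn:analyze-string returns an fn:analyze-string-result element whose
   children are fn:match and fn:non-match elements, in document order. :)
let $parts :=
  for $part in analyze-string($input, "e")/*
  return
    if ($part/self::fn:match)
    then concat("[", string($part), "]")
    else string($part)

return <out>{ string-join($parts, "") }</out>
```

Because the result is ordinary XML, it is processed with path expressions and a FLWOR expression rather than with callback functions.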
Details
The action of this function is analogous to the xsl:analyze-string instruction in XSLT 2.0. It is
provided to give XQuery users access to equivalent functionality.
The first argument defines the string to be analyzed. The second argument is the regular expression.
The third and fourth arguments are function items, called the matching and non-matching
functions respectively. The matching function is called once for each substring of the
input string that matches the regular expression; the non-matching function is called
once for each substring that does not match. These functions may return any sequence.
The final result of the saxon:analyze-string function is the result of concatenating these
sequences in order.
The matching function takes two arguments. The first argument is the substring that was
matched. The second argument is a sequence, containing the matched subgroups within this
substring. The first item in this sequence corresponds to the value $1 as supplied to the
replace() function, the second item to $2, and so on.
The non-matching function takes a single argument, namely the substring that was not matched.
The detailed rules follow xsl:analyze-string. The regex must not match a zero-length string,
and neither the matching nor the non-matching function will ever be called to process a
zero-length string.
The following example is a "multiple match" example. It takes input like this:
  <doc>There was a young fellow called Marlowe</doc>
and produces output like this:
  <out>Th[e]r[e] was a young f[e]llow call[e]d Marlow[e]</out>
The XQuery code to achieve this is:
  declare namespace f="f.uri";
  (: Called for each match: wrap the matched substring in square brackets. :)
  declare function f:match ($c, $gps) { concat("[", $c, "]") };
  (: Called for each non-matching substring: return it unchanged. :)
  declare function f:non-match ($c) { $c };
  <out> { string-join(saxon:analyze-string(doc, "e", f:match#2, f:non-match#1), "") } </out>
The following example is a "single match" example. Here the regex matches the entire
input, and the matching function uses the subgroups to rearrange the result. The input
in this case is the document <doc>12 April 2004</doc> and the output is
<doc>2004 April 12</doc>. Here is the query:
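The query itself does not appear in this text. The following is a sketch of what such a query might look like, not the original. The regular expression, the inline input element, and the bodies of both callbacks are assumptions chosen to fit the description above. The f:match and f:non-match names follow the earlier example.

```xquery
xquery version "3.0";
declare namespace saxon = "http://saxon.sf.net/";
declare namespace f = "f.uri";

(: $gps holds the captured groups in order: day, month name, year.
   The groups are re-emitted in reverse order inside a new doc element. :)
declare function f:match($c, $gps) {
  <doc>{ string-join(($gps[3], $gps[2], $gps[1]), " ") }</doc>
};

(: Not expected to be called here, since the regex is meant to match
   the whole input; it simply passes the text through. :)
declare function f:non-match($c) { $c };

(: Assumed input, supplied inline so that the sketch is self-contained. :)
let $input := <doc>12 April 2004</doc>
return
  saxon:analyze-string(
    string($input),
    "([0-9]+) ([A-Za-z]+) ([0-9]+)",   (: assumed regex: day, month, year :)
    f:match#2,
    f:non-match#1)
```

The sketch needs a Saxon edition that supports both higher-order functions and this extension function.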
This particular example could be achieved using the replace() function: the difference is that
saxon:analyze-string can insert markup into the result, which replace() cannot do.
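To illustrate the difference, the sketch below applies both approaches to the same date string. The regular expression and the date, year, month, and day element names are hypothetical, chosen only for illustration.

```xquery
xquery version "3.0";
declare namespace saxon = "http://saxon.sf.net/";
declare namespace f = "f.uri";

(: Hypothetical matching function: builds markup from the captured groups. :)
declare function f:match($c, $gps) {
  <date>
    <year>{ $gps[3] }</year>
    <month>{ $gps[2] }</month>
    <day>{ $gps[1] }</day>
  </date>
};

(: Non-matching text is passed through unchanged. :)
declare function f:non-match($c) { $c };

let $input := "12 April 2004"
let $regex := "([0-9]+) ([A-Za-z]+) ([0-9]+)"   (: assumed regex :)
return (
  (: Plain-text rearrangement: replace() is sufficient. :)
  <doc>{ replace($input, $regex, "$3 $2 $1") }</doc>,

  (: Structured result: the callback constructs elements, which a
     replacement string cannot express. :)
  saxon:analyze-string($input, $regex, f:match#2, f:non-match#1)
)
```

The replace() call can only produce a string, whereas the callback is free to return any sequence, including element nodes.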