Package net.sf.saxon.value
Class Whitespace
java.lang.Object
    net.sf.saxon.value.Whitespace
public class Whitespace extends java.lang.Object
This class provides helper methods and constants for handling whitespace.
-
-
Nested Class Summary
Modifier and Type    Class
    Description
static class Whitespace.Tokenizer
    An iterator that splits a string on whitespace boundaries, corresponding to the XPath 3.1 function tokenize#1.
-
Field Summary
Modifier and Type    Field
    Description
static int ALL
static int COLLAPSE
static int IGNORABLE
static int NONE
    The values NONE, IGNORABLE, and ALL identify which kinds of whitespace text node should be stripped when building a source tree.
static int PRESERVE
    The values PRESERVE, REPLACE, and COLLAPSE represent the three options for whitespace normalization.
static int REPLACE
static int TRIM
static int UNSPECIFIED
static int XSLT
-
Method Summary
Modifier and Type    Method
    Description
static UnicodeString applyWhitespaceNormalization(int action, UnicodeString value)
    Apply schema-defined whitespace normalization to a string.
static java.lang.String collapse(java.lang.CharSequence in)
static UnicodeString collapse(UnicodeString in)
static java.lang.String collapseWhitespace(java.lang.String in)
    Collapse whitespace as defined in XML Schema.
static UnicodeString collapseWhitespace(UnicodeString in)
    Collapse whitespace as defined in XML Schema.
static boolean containsWhitespace(IntIterator codePoints)
    Determine if a string contains any whitespace.
static boolean isAllWhite(UnicodeString content)
    Determine if a string is all-whitespace.
static boolean isWhite(int c)
    Determine if a character is whitespace.
static java.lang.String normalize(java.lang.CharSequence in)
static UnicodeString normalize(UnicodeString in)
static UnicodeString normalizeWhitespace(UnicodeString input)
    Normalize whitespace as defined in XML Schema.
static java.lang.String removeAllWhitespace(java.lang.String value)
    Remove all whitespace characters from a string.
static UnicodeString removeLeadingWhitespace(UnicodeString value)
    Remove leading whitespace characters from a string.
static java.lang.String trim(java.lang.String in)
    Trim whitespace: return the input string with leading and trailing whitespace removed.
static UnicodeString trim(UnicodeString in)
    Trim whitespace: return the input string with leading and trailing whitespace removed.
static long trimmedEnd(UnicodeString in)
    Get the codepoint offset of the first whitespace character in trailing whitespace in the string.
static long trimmedStart(UnicodeString in)
    Get the codepoint offset of the first non-whitespace character in the string.
-
-
-
Field Detail
-
PRESERVE
public static final int PRESERVE
The values PRESERVE, REPLACE, and COLLAPSE represent the three options for whitespace normalization. They are deliberately chosen in ascending strength order; given a number of whitespace facets, only the strongest needs to be carried out. The option TRIM is used instead of COLLAPSE when all valid values have no interior whitespace; trimming leading and trailing whitespace is then equivalent to the action of COLLAPSE, but faster.
See Also:
    Constant Field Values
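Because the three normalization options are in ascending strength order, combining several whitespace facets can be reduced to taking the maximum. The following is a minimal sketch of that idea only. The class FacetStrengthSketch, the method strongest, and the numeric values are hypothetical stand-ins; only the relative order PRESERVE < REPLACE < COLLAPSE is taken from the description above.

```java
// Hypothetical stand-ins for Whitespace.PRESERVE, REPLACE and COLLAPSE.
// Only their relative order matters; the real constant values are not assumed.
final class FacetStrengthSketch {
    static final int PRESERVE = 0;
    static final int REPLACE = 1;
    static final int COLLAPSE = 2;

    // Return the strongest action among the supplied facets.
    // With no facets at all, fall back to the weakest action.
    static int strongest(int... actions) {
        int result = PRESERVE;
        for (int action : actions) {
            result = Math.max(result, action);
        }
        return result;
    }
}
```

With real code the constants declared on Whitespace would be used directly, and the result could be passed as the action argument of applyWhitespaceNormalization.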
-
REPLACE
public static final int REPLACE
See Also:
    Constant Field Values
-
COLLAPSE
public static final int COLLAPSE
See Also:
    Constant Field Values
-
TRIM
public static final int TRIM
See Also:
    Constant Field Values
-
NONE
public static final int NONE
The values NONE, IGNORABLE, and ALL identify which kinds of whitespace text node should be stripped when building a source tree. UNSPECIFIED indicates that no particular request has been made. XSLT indicates that whitespace should be stripped as defined by the xsl:strip-space and xsl:preserve-space declarations in the stylesheet.
See Also:
    Constant Field Values
-
IGNORABLE
public static final int IGNORABLE
See Also:
    Constant Field Values
-
ALL
public static final int ALL
See Also:
    Constant Field Values
-
UNSPECIFIED
public static final int UNSPECIFIED
See Also:
    Constant Field Values
-
XSLT
public static final int XSLT
See Also:
    Constant Field Values
-
-
Method Detail
-
applyWhitespaceNormalization
public static UnicodeString applyWhitespaceNormalization(int action, UnicodeString value)
Apply schema-defined whitespace normalization to a string.
Parameters:
    action - the action to be applied: one of PRESERVE, REPLACE, or COLLAPSE
    value - the value to be normalized
Returns:
    the value after normalization
-
removeAllWhitespace
public static java.lang.String removeAllWhitespace(java.lang.String value)
Remove all whitespace characters from a string.
Parameters:
    value - the string from which whitespace is to be removed
Returns:
    the string without its whitespace
-
removeLeadingWhitespace
public static UnicodeString removeLeadingWhitespace(UnicodeString value)
Remove leading whitespace characters from a string.
Parameters:
    value - the string whose leading whitespace is to be removed
Returns:
    the string with leading whitespace removed. This may be the original string if there was no leading whitespace.
-
containsWhitespace
public static boolean containsWhitespace(IntIterator codePoints)
Determine if a string contains any whitespace.
Parameters:
    codePoints - the string to be tested, as a codepoint iterator
Returns:
    true if the string contains a character that is XML whitespace, that is tab, newline, carriage return, or space
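The XML definition of whitespace quoted above covers exactly four characters. The following plain-Java sketch illustrates that definition without depending on Saxon. The class XmlWhitespaceSketch is hypothetical, and its containsWhitespace takes a CharSequence rather than the IntIterator used by the real method.

```java
// Illustrative only: a stand-alone restatement of the XML whitespace rule.
final class XmlWhitespaceSketch {
    // XML whitespace: tab (U+0009), newline (U+000A),
    // carriage return (U+000D) and space (U+0020).
    static boolean isWhite(int c) {
        return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
    }

    // True if any codepoint of the input is XML whitespace.
    static boolean containsWhitespace(CharSequence s) {
        return s.codePoints().anyMatch(XmlWhitespaceSketch::isWhite);
    }
}
```

Under this rule a no-break space (U+00A0) does not count as whitespace, so a string such as "a\u00A0b" is reported as containing none.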
-
isAllWhite
public static boolean isAllWhite(UnicodeString content)
Determine if a string is all-whitespace.
Parameters:
    content - the string to be tested
Returns:
    true if the supplied string contains no non-whitespace characters. (So the result is true for a zero-length string.)
-
isWhite
public static boolean isWhite(int c)
Determine if a character is whitespace.
Parameters:
    c - the character or codepoint to be tested
Returns:
    true if the character is a whitespace character
-
normalizeWhitespace
public static UnicodeString normalizeWhitespace(UnicodeString input)
Normalize whitespace as defined in XML Schema. Note that this is not the same as the XPath normalize-space() function, which is supported by the collapseWhitespace(net.sf.saxon.str.UnicodeString) method.
Parameters:
    input - the string to be normalized
Returns:
    a copy of the string in which any whitespace character is replaced by a single space character
-
collapseWhitespace
public static UnicodeString collapseWhitespace(UnicodeString in)
Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
Parameters:
    in - the string whose whitespace is to be collapsed
Returns:
    the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character
-
collapseWhitespace
public static java.lang.String collapseWhitespace(java.lang.String in)
Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
Parameters:
    in - the string whose whitespace is to be collapsed
Returns:
    the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character
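The difference between normalizeWhitespace (each whitespace character becomes one space) and collapseWhitespace (runs are merged and the ends trimmed) is easy to miss. The following plain-Java sketch contrasts the two behaviours as they are described above. The class NormalizeVsCollapseSketch and its methods are hypothetical illustrations that work on java.lang.String, not Saxon's implementation.

```java
// Illustrative only: contrasts "replace" and "collapse" normalization.
final class NormalizeVsCollapseSketch {
    static boolean isWhite(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    // Replace: every whitespace character becomes a single space,
    // so the length of the string is unchanged.
    static String replaceEach(String in) {
        StringBuilder sb = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            sb.append(isWhite(c) ? ' ' : c);
        }
        return sb.toString();
    }

    // Collapse: drop leading and trailing whitespace and reduce each
    // interior run of whitespace to one space.
    static String collapse(String in) {
        StringBuilder sb = new StringBuilder(in.length());
        boolean pendingSpace = false;
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (isWhite(c)) {
                // Only remember a space once some content has been emitted.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

A pending space is written only when another non-whitespace character follows it, which is why trailing whitespace disappears without a separate trimming pass.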
-
trimmedStart
public static long trimmedStart(UnicodeString in)
Get the codepoint offset of the first non-whitespace character in the string.
Parameters:
    in - the input string
Returns:
    the index of the first non-whitespace character; or -1 if the string consists entirely of whitespace (including the case where the string is zero-length)
-
trimmedEnd
public static long trimmedEnd(UnicodeString in)
Get the codepoint offset of the first whitespace character in trailing whitespace in the string.
Parameters:
    in - the input string
Returns:
    the index of the last non-whitespace character plus one; or zero if the string consists entirely of whitespace
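The return conventions of trimmedStart and trimmedEnd differ for all-whitespace input (-1 versus zero). The following plain-Java sketch mirrors the documented conventions. The class TrimOffsetsSketch is hypothetical and takes a java.lang.String, converting it to an array of codepoints so that the offsets are codepoint offsets rather than char indexes.

```java
// Illustrative only: mirrors the documented return conventions.
final class TrimOffsetsSketch {
    static boolean isWhite(int c) {
        return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
    }

    // Offset of the first non-whitespace codepoint, or -1 if there is none.
    static long trimmedStart(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = 0; i < cps.length; i++) {
            if (!isWhite(cps[i])) {
                return i;
            }
        }
        return -1;
    }

    // Offset just past the last non-whitespace codepoint, or 0 if there is none.
    static long trimmedEnd(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            if (!isWhite(cps[i])) {
                return i + 1;
            }
        }
        return 0;
    }
}
```

A caller that uses the two offsets to take a substring needs to check for -1 first, since a range starting at -1 is not a valid substring range.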
-
trim
public static UnicodeString trim(UnicodeString in)
Trim whitespace: return the input string with leading and trailing whitespace removed.
Parameters:
    in - the input string
Returns:
    the input string with leading and trailing whitespace removed
-
trim
public static java.lang.String trim(java.lang.String in)
Trim whitespace: return the input string with leading and trailing whitespace removed. Note that this differs from String.trim() because the definition of whitespace is different.
Parameters:
    in - the input string
Returns:
    the input string with leading and trailing whitespace removed
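The difference from String.trim() comes from the set of characters treated as whitespace: String.trim() strips every character up to and including U+0020, whereas XML whitespace is only tab, newline, carriage return, and space. The following plain-Java sketch shows the effect on a form feed. The class XmlTrimSketch is a hypothetical illustration of trimming under the XML rule, not Saxon's code.

```java
// Illustrative only: trimming with the XML definition of whitespace.
final class XmlTrimSketch {
    static boolean isWhite(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    // Remove leading and trailing XML whitespace, leaving other
    // control characters (such as form feed) in place.
    static String trim(String in) {
        int start = 0;
        int end = in.length();
        while (start < end && isWhite(in.charAt(start))) {
            start++;
        }
        while (end > start && isWhite(in.charAt(end - 1))) {
            end--;
        }
        return in.substring(start, end);
    }
}
```

For the input "\f a \n", this sketch keeps the leading form feed and the space after it and returns "\f a", while "\f a \n".trim() returns "a".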
-
collapse
public static UnicodeString collapse(UnicodeString in)
-
collapse
public static java.lang.String collapse(java.lang.CharSequence in)
-
normalize
public static UnicodeString normalize(UnicodeString in)
-
normalize
public static java.lang.String normalize(java.lang.CharSequence in)
-
-