Package net.sf.saxon.serialize.charcode
Class UTF16CharacterSet
- java.lang.Object
  - net.sf.saxon.serialize.charcode.UTF16CharacterSet
- All Implemented Interfaces:
- CharacterSet
public class UTF16CharacterSet extends java.lang.Object implements CharacterSet
A class holding static constants and methods for processing UTF-16 and surrogate pairs.
-
-
Field Summary
Modifier and Type    Field
static int           NONBMP_MAX
static int           NONBMP_MIN
static char          SURROGATE1_MAX
static char          SURROGATE1_MIN
static char          SURROGATE2_MAX
static char          SURROGATE2_MIN
-
Method Summary
Modifier and Type   Method   Description
static int combinePair(char high, char low)
    Return the non-BMP character corresponding to a given surrogate pair.
static boolean containsSurrogates(java.lang.CharSequence s)
    Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).
static int firstInvalidChar(java.lang.CharSequence chars, java.util.function.IntPredicate predicate)
    Test whether all the characters in a CharSequence are valid XML characters.
java.lang.String getCanonicalName()
    Get the preferred Java name of the character set.
static UTF16CharacterSet getInstance()
    Get the singular instance of this class.
static char highSurrogate(int ch)
    Return the high surrogate of a non-BMP character.
boolean inCharset(int c)
    Determine if a character is present in the character set.
static boolean isHighSurrogate(int ch)
    Test whether the given character is a high surrogate.
static boolean isLowSurrogate(int ch)
    Test whether the given character is a low surrogate.
static boolean isSurrogate(int c)
    Test whether a given character is a surrogate (high or low).
static char lowSurrogate(int ch)
    Return the low surrogate of a non-BMP character.
static void main(java.lang.String[] args)
-
-
-
Field Detail
-
NONBMP_MIN
public static final int NONBMP_MIN
- See Also:
- Constant Field Values
-
NONBMP_MAX
public static final int NONBMP_MAX
- See Also:
- Constant Field Values
-
SURROGATE1_MIN
public static final char SURROGATE1_MIN
- See Also:
- Constant Field Values
-
SURROGATE1_MAX
public static final char SURROGATE1_MAX
- See Also:
- Constant Field Values
-
SURROGATE2_MIN
public static final char SURROGATE2_MIN
- See Also:
- Constant Field Values
-
SURROGATE2_MAX
public static final char SURROGATE2_MAX
- See Also:
- Constant Field Values
-
-
Method Detail
-
getInstance
public static UTF16CharacterSet getInstance()
Get the singular instance of this class.
- Returns:
- the singular instance of this class
-
inCharset
public boolean inCharset(int c)
Description copied from interface: CharacterSet
Determine if a character is present in the character set.
- Specified by:
- inCharset in interface CharacterSet
-
getCanonicalName
public java.lang.String getCanonicalName()
Description copied from interface: CharacterSet
Get the preferred Java name of the character set. Note that Java in many cases also supports a "historic name".
- Specified by:
- getCanonicalName in interface CharacterSet
-
combinePair
public static int combinePair(char high, char low)
Return the non-BMP character corresponding to a given surrogate pair.
- Parameters:
- high - The high surrogate.
- low - The low surrogate.
- Returns:
- the Unicode codepoint represented by the surrogate pair
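The arithmetic behind combining a surrogate pair can be sketched as follows. This is a minimal illustration, not Saxon's implementation: the class SurrogatePairSketch, the method combine, and the three constants are hypothetical names, and the constant values are the standard Unicode ones, which are assumed (not confirmed by this page) to correspond to SURROGATE1_MIN, SURROGATE2_MIN and NONBMP_MIN.

```java
final class SurrogatePairSketch {
    // Standard Unicode values; assumed to match the constants of this class.
    static final int HIGH_MIN = 0xD800;          // first high surrogate
    static final int LOW_MIN = 0xDC00;           // first low surrogate
    static final int SUPPLEMENTARY_MIN = 0x10000; // first non-BMP codepoint

    // Hypothetical helper: the high surrogate carries the top 10 bits of the
    // offset above 0x10000, the low surrogate carries the bottom 10 bits.
    static int combine(char high, char low) {
        return ((high - HIGH_MIN) << 10) + (low - LOW_MIN) + SUPPLEMENTARY_MIN;
    }
}
```

For a well-formed pair the result should agree with the JDK's java.lang.Character.toCodePoint(high, low).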
-
highSurrogate
public static char highSurrogate(int ch)
Return the high surrogate of a non-BMP character.
- Parameters:
- ch - The Unicode codepoint of the non-BMP character to be divided.
- Returns:
- the first character in the surrogate pair
-
lowSurrogate
public static char lowSurrogate(int ch)
Return the low surrogate of a non-BMP character.
- Parameters:
- ch - The Unicode codepoint of the non-BMP character to be divided.
- Returns:
- the second character in the surrogate pair
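Dividing a non-BMP codepoint into its two surrogates is the inverse of combining a pair. The sketch below shows the standard UTF-16 arithmetic under the assumption that the input lies in the range 0x10000 to 0x10FFFF; the class and method names are hypothetical and the code is not taken from Saxon.

```java
final class SurrogateSplitSketch {
    // Hypothetical helper: top 10 bits of the offset above 0x10000.
    static char high(int codePoint) {
        return (char) (((codePoint - 0x10000) >> 10) + 0xD800);
    }

    // Hypothetical helper: bottom 10 bits of the offset above 0x10000.
    static char low(int codePoint) {
        return (char) (((codePoint - 0x10000) & 0x3FF) + 0xDC00);
    }
}
```

For codepoints in that range the results should match java.lang.Character.highSurrogate and java.lang.Character.lowSurrogate. Behaviour for BMP input is deliberately left undefined in this sketch.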
-
isSurrogate
public static boolean isSurrogate(int c)
Test whether a given character is a surrogate (high or low).
- Parameters:
- c - the character to test
- Returns:
- true if the character is the high or low half of a surrogate pair
-
isHighSurrogate
public static boolean isHighSurrogate(int ch)
Test whether the given character is a high surrogate.
- Parameters:
- ch - The character to test.
- Returns:
- true if the character is the first character in a surrogate pair
-
isLowSurrogate
public static boolean isLowSurrogate(int ch)
Test whether the given character is a low surrogate.
- Parameters:
- ch - The character to test.
- Returns:
- true if the character is the second character in a surrogate pair
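The three surrogate tests amount to range checks. The following sketch uses the standard Unicode surrogate ranges, which are assumed (not stated on this page) to be the values of SURROGATE1_MIN through SURROGATE2_MAX; the class and method names are hypothetical.

```java
final class SurrogateTestSketch {
    // High (leading) surrogates occupy 0xD800..0xDBFF.
    static boolean isHigh(int ch) {
        return ch >= 0xD800 && ch <= 0xDBFF;
    }

    // Low (trailing) surrogates occupy 0xDC00..0xDFFF.
    static boolean isLow(int ch) {
        return ch >= 0xDC00 && ch <= 0xDFFF;
    }

    // The two ranges are contiguous, so one comparison pair covers both.
    static boolean isSurrogate(int ch) {
        return ch >= 0xD800 && ch <= 0xDFFF;
    }
}
```

Explicit range comparisons are used here rather than bit masks because the parameter is an int: a mask applied to the low 16 bits would wrongly accept values above 0xFFFF.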
-
containsSurrogates
public static boolean containsSurrogates(java.lang.CharSequence s)
Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).
- Parameters:
- s - the string to be tested
- Returns:
- true if the sequence contains any surrogates
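A scan of this kind can be sketched with JDK calls alone. The class name ContainsSurrogatesSketch is hypothetical, and the loop is an assumption about how such a test might be written, not Saxon's code.

```java
final class ContainsSurrogatesSketch {
    // Returns true as soon as any UTF-16 code unit in the surrogate range is seen.
    static boolean containsSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```

Note that this sketch also reports true for an unpaired surrogate, which is not a well-formed non-BMP character.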
-
firstInvalidChar
public static int firstInvalidChar(java.lang.CharSequence chars, java.util.function.IntPredicate predicate)
Test whether all the characters in a CharSequence are valid XML characters, where validity is decided by the supplied predicate.
- Parameters:
- chars - the character sequence to be tested
- predicate - the predicate that all characters must satisfy
- Returns:
- the codepoint of the first invalid character in the character sequence (according to the supplied predicate); or -1 if all characters in the character sequence are valid
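The contract described above (return the first codepoint rejected by the predicate, or -1) can be sketched as follows. The class FirstInvalidCharSketch and both of its methods are hypothetical. The predicate shown encodes the Char production of XML 1.0 as one plausible choice, and the surrogate handling relies on java.lang.Character.codePointAt rather than on anything known about Saxon's implementation.

```java
final class FirstInvalidCharSketch {
    // Walks the sequence one codepoint at a time, so a surrogate pair is
    // presented to the predicate as a single non-BMP codepoint.
    static int firstInvalid(CharSequence chars, java.util.function.IntPredicate predicate) {
        int i = 0;
        while (i < chars.length()) {
            int cp = Character.codePointAt(chars, i);
            if (!predicate.test(cp)) {
                return cp;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    // Example predicate: the Char production of XML 1.0.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

A caller would pass the predicate as a method reference, for example firstInvalid(text, FirstInvalidCharSketch::isXml10Char), and treat any result other than -1 as the offending codepoint.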
-
main
public static void main(java.lang.String[] args)
-
-