System Programming Interfaces
Strings
Extensive changes have been made to the internal representation of strings. There are a number of motivations
        for this, including better Unicode support, capitalizing on improvements to string handling in Java 9, and most
        notably, efficiency on the .NET platform: Saxon's previous extensive use of the Java CharSequence interface 
            appeared to translate to very inefficient code on .NET.
Most uses of CharSequence have been replaced by a new class net.sf.saxon.str.UnicodeString (which also replaces the old class
            net.sf.saxon.regex.UnicodeString). The UnicodeString class has a number of implementations.
            All of them are designed to be codepoint-addressable: they expose an indexable array of 32-bit codepoint values, and
            never use surrogate pairs. The implementations of UnicodeString include:
- Twine8: a string consisting entirely of codepoints in the range 1-255, held in an array with one byte per character.
- Twine16: a string consisting entirely of codepoints in the range 1-65535, held in an array with two bytes per character.
- Twine24: a string of arbitrary codepoints, held in an array with three bytes per character.
- Slice8: a sub-range of an array using one byte per character.
- Slice16: a sub-range of an array using two bytes per character.
- Slice24: a sub-range of an array using three bytes per character.
- BMPString: a wrapper around a Java/C# string known to contain no surrogate pairs.
- ZenoString: a composite string held as a list of segments, each of which is itself a UnicodeString. The name derives from the algorithm used to combine segments, which results in segments having progressively decreasing lengths towards the end of the string.
- StringView: a wrapper around an arbitrary Java/C# string. (This stores the string both in its native Java/C# form and in a "real" codepoint-addressable implementation of UnicodeString, which is constructed lazily when it is first required.)
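The difference between UTF-16 code units and codepoint addressing can be illustrated with the JDK alone. The following is a minimal sketch, not Saxon's implementation: the class CodepointDemo and its methods are hypothetical names, and the width selection simply mirrors the one-, two- and three-byte ranges listed above.

```java
import java.util.Arrays;

final class CodepointDemo {
    /** Expands a UTF-16 Java string into one int per Unicode codepoint. */
    static int[] toCodepoints(String s) {
        return s.codePoints().toArray();
    }

    /** Smallest width (8, 16 or 24 bits) able to hold every codepoint in the array. */
    static int bitsPerChar(int[] codepoints) {
        int max = Arrays.stream(codepoints).max().orElse(0);
        if (max <= 0xFF) {
            return 8;
        }
        if (max <= 0xFFFF) {
            return 16;
        }
        return 24;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (held as a surrogate pair in UTF-16), 'b'
        String s = "a\uD83D\uDE00b";
        int[] cps = toCodepoints(s);
        System.out.println(s.length());                   // 4 UTF-16 code units
        System.out.println(cps.length);                   // 3 codepoints
        System.out.println(Integer.toHexString(cps[1]));  // 1f600
        System.out.println(bitsPerChar(cps));             // 24
    }
}
```

With the codepoint array, index 1 addresses the whole character U+1F600, whereas `s.charAt(1)` on the native string would return only the high surrogate.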
The interface to the UnicodeString class is future-proofed to accommodate strings containing more than 2^31
        characters. That's not to say that Saxon can now support such long strings everywhere (for example, the regular expression
        engine cannot handle such strings); but the groundwork has been laid.
The method Item.getStringValueCS(), which returned the string value of an item as a CharSequence,
            has been dropped. It is replaced by a new method Item.getUnicodeStringValue(), which returns the value as a
        UnicodeString.
The effect of toString() on atomic values has changed: it now returns the result of casting the value
        to a string (which is the same as the result of getStringValue()). To obtain the previous effect of the 
            toString() method, use the new show() method.
Unicode normalization of strings (for example in the fn:normalize-unicode() function) now uses the JDK class
            java.text.Normalizer rather than code derived from the Unicode Consortium's implementation. This appears
            to be substantially faster.
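For reference, the JDK class can be called directly as shown below. This is a small sketch of java.text.Normalizer itself, not of Saxon's internal call path; the class name NormalizeDemo and the helper nfc() are illustrative.

```java
import java.text.Normalizer;

final class NormalizeDemo {
    /** Normalizes a string to Unicode Normalization Form C (canonical composition). */
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        String decomposed = "e\u0301";
        String composed = nfc(decomposed);
        System.out.println(composed.length());          // 1
        System.out.println(composed.equals("\u00E9"));  // true
    }
}
```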
Sequence and SequenceIterator
The SequenceIterator.next() method no longer throws the checked exception
            XPathException. Instead, if a dynamic error occurs, an UncheckedXPathException is thrown. This change makes the
            SequenceIterator class play better with modern Java facilities such as streams and functional interfaces.
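The benefit for streams and lambdas can be sketched with stand-in types. Everything declared in the block below (DynamicError, UncheckedDynamicError, PullIterator, countTo, drain) is hypothetical and defined locally; none of it is Saxon's API. The sketch assumes only what the text states: next() declares no checked exception, and failures surface as an unchecked wrapper.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class UncheckedIterationDemo {
    /** Hypothetical stand-in for a checked dynamic-error exception. */
    static class DynamicError extends Exception {
        DynamicError(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for an unchecked wrapper around the checked exception. */
    static class UncheckedDynamicError extends RuntimeException {
        UncheckedDynamicError(DynamicError cause) {
            super(cause);
        }
    }

    /** Hypothetical pull iterator: next() returns null at the end and throws nothing checked. */
    interface PullIterator<T> {
        T next();
    }

    /** Yields 1..limit, failing with an unchecked error when it reaches failAt. */
    static PullIterator<Integer> countTo(int limit, int failAt) {
        int[] state = {0};
        return () -> {
            if (state[0] >= limit) {
                return null;
            }
            int n = ++state[0];
            if (n == failAt) {
                throw new UncheckedDynamicError(new DynamicError("failed at item " + n));
            }
            return n;
        };
    }

    /** Because next() is unchecked, it can be called inside a lambda without try/catch. */
    static <T> List<T> drain(PullIterator<T> iter) {
        return Stream.iterate(iter.next(), Objects::nonNull, previous -> iter.next())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(drain(countTo(3, -1)));  // [1, 2, 3]
        try {
            drain(countTo(3, 2));
        } catch (UncheckedDynamicError e) {
            // The original checked exception is still available as the cause.
            System.out.println(e.getCause().getMessage());  // failed at item 2
        }
    }
}
```

Had next() declared a checked exception, the lambda passed to Stream.iterate would not compile without wrapping every call in its own try/catch.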
The SequenceIterator.getProperties() method is dropped. Instead, to determine whether a SequenceIterator supports look-ahead, first test whether it is an
            instance of LookaheadIterator, then cast it to
            LookaheadIterator and call the supportsHasNext() method; and similarly for
            GroundedIterator and LastPositionFinder. For example:
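The test-then-cast pattern can be sketched as follows. The interfaces in this block are simplified stand-ins declared locally so that the example compiles on its own; only the names LookaheadIterator and supportsHasNext() come from the text above, while hasNext(), ArrayIterator and canLookAhead() are assumptions made for illustration.

```java
final class LookaheadDemo {
    /** Stand-in for an iterator whose next() returns null at the end of the sequence. */
    interface SequenceIterator {
        String next();
    }

    /** Stand-in for the optional look-ahead capability. */
    interface LookaheadIterator extends SequenceIterator {
        boolean supportsHasNext();

        boolean hasNext();
    }

    /** An array-backed iterator that can always answer hasNext(). */
    static final class ArrayIterator implements LookaheadIterator {
        private final String[] items;
        private int position = 0;

        ArrayIterator(String... items) {
            this.items = items;
        }

        @Override
        public String next() {
            return position < items.length ? items[position++] : null;
        }

        @Override
        public boolean supportsHasNext() {
            return true;
        }

        @Override
        public boolean hasNext() {
            return position < items.length;
        }
    }

    /** First test the type, then cast and ask whether look-ahead is actually supported. */
    static boolean canLookAhead(SequenceIterator iter) {
        return iter instanceof LookaheadIterator
                && ((LookaheadIterator) iter).supportsHasNext();
    }

    public static void main(String[] args) {
        SequenceIterator plain = () -> null;  // no look-ahead capability
        SequenceIterator array = new ArrayIterator("a", "b");
        System.out.println(canLookAhead(plain));  // false
        System.out.println(canLookAhead(array));  // true
        if (canLookAhead(array)) {
            System.out.println(((LookaheadIterator) array).hasNext());  // true
        }
    }
}
```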
NodeInfo
The method NodeInfo.iterateAxis(int axisNumber, Predicate<? super NodeInfo> nodeTest)
        is replaced by NodeInfo.iterateAxis(int axisNumber, NodePredicate predicate).
The reason for this is to facilitate conversion of the source code from Java to C#. In Java, a functional interface can be satisfied both by a lambda expression and by a concrete implementation class; in C#, classes and delegates are not interchangeable in the same way. The introduction of NodePredicate solves this by providing a concrete implementation that allows a lambda expression to be supplied as an argument. Similar changes have been made in some other, less visible, areas.
Miscellaneous
A number of internal changes have been made to facilitate conversion of the source code from Java to C#. These should only affect applications that use very low-level interfaces within Saxon.
For example, in some data objects such as ParseOptions, some properties were maintained
            as three-valued fields of type java.lang.Boolean (true, false, or null, meaning unspecified). C# booleans
            do not have a null in their value space, so the representation has changed, typically to Optional<Boolean>.
            The same applies to enumeration types where there was a need for a "null" in the value space; in some cases an extra
            enumeration constant such as UNKNOWN has been added.
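A three-valued property held as Optional<Boolean> might look like the sketch below. The Options class, its lineNumbering property and the effective() helper are hypothetical and are not taken from ParseOptions; the point is only that "unspecified" becomes Optional.empty() instead of null.

```java
import java.util.Optional;

final class TriStateDemo {
    /** Hypothetical options holder with one three-valued property. */
    static final class Options {
        // Optional.empty() means "unspecified"; no null is ever stored.
        private Optional<Boolean> lineNumbering = Optional.empty();

        void setLineNumbering(boolean value) {
            lineNumbering = Optional.of(value);
        }

        Optional<Boolean> getLineNumbering() {
            return lineNumbering;
        }
    }

    /** Resolves the property, falling back to a default when it is unspecified. */
    static boolean effective(Options options, boolean defaultValue) {
        return options.getLineNumbering().orElse(defaultValue);
    }

    public static void main(String[] args) {
        Options options = new Options();
        System.out.println(options.getLineNumbering().isPresent());  // false
        System.out.println(effective(options, true));                // true
        options.setLineNumbering(false);
        System.out.println(effective(options, true));                // false
    }
}
```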
Code designed to implement or use JAXP interfaces was previously scattered around the product rather liberally. Because JAXP interfaces exist in Java but not in C#, this code has often been moved into separate modules that are platform-specific.
Tracing and Diagnostics
A new class SystemLogger simplifies the task of sending all Saxon
            progress messages (as well as <xsl:message> output) to a supplied
            java.util.logging.Logger. This can be achieved using a call such as: