Class ArrayBasedUnicodeEscaper


  • @Beta
    @GwtCompatible
    public abstract class ArrayBasedUnicodeEscaper
    extends UnicodeEscaper
    A UnicodeEscaper that uses an array to quickly look up replacement characters for a given code point. An additional safe range is provided that determines whether code points without specific replacements are to be considered safe and left unescaped or should be escaped in a general way.

    A good example of usage of this class is for HTML escaping where the replacement array contains information about the named HTML entities such as &amp; and &quot; while escapeUnsafe(int) is overridden to handle general escaping of the form &#NNNNN;.
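    The lookup order described above (specific replacement first, then the safe range, then a general escape) can be illustrated with a minimal, dependency-free sketch. The class ArrayLookupSketch and its htmlLike() factory are hypothetical names introduced for illustration; this is not Guava's implementation and does not reproduce its API. Its private escapeUnsafe(int) only plays the same role as the overridable method of that name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in that mimics the lookup strategy described in the text.
// It is NOT Guava's ArrayBasedUnicodeEscaper and does not reproduce its API.
final class ArrayLookupSketch {
    // Indexed by code point; a null entry means "no specific replacement".
    private final char[][] replacements;
    private final int safeMin;
    private final int safeMax;

    ArrayLookupSketch(Map<Character, String> map, int safeMin, int safeMax) {
        int max = -1;
        for (char c : map.keySet()) {
            max = Math.max(max, c);
        }
        // The table must reach the highest code point that has a replacement.
        this.replacements = new char[max + 1][];
        for (Map.Entry<Character, String> e : map.entrySet()) {
            char key = e.getKey();
            this.replacements[key] = e.getValue().toCharArray();
        }
        this.safeMin = safeMin;
        this.safeMax = safeMax;
    }

    // Assumed example mapping: a few named entities, printable ASCII treated as safe.
    static ArrayLookupSketch htmlLike() {
        Map<Character, String> map = new LinkedHashMap<>();
        map.put('&', "&amp;");
        map.put('"', "&quot;");
        map.put('<', "&lt;");
        map.put('>', "&gt;");
        return new ArrayLookupSketch(map, 0x20, 0x7E);
    }

    // General fallback: a decimal numeric reference of the form &#NNNNN;.
    private static String escapeUnsafe(int codePoint) {
        return "&#" + codePoint + ";";
    }

    String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            // Unlike the real class, this sketch does not reject unpaired surrogates.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp < replacements.length && replacements[cp] != null) {
                out.append(replacements[cp]);   // 1. specific replacement from the array
            } else if (cp >= safeMin && cp <= safeMax) {
                out.appendCodePoint(cp);        // 2. inside the safe range: left unescaped
            } else {
                out.append(escapeUnsafe(cp));   // 3. everything else: general escaping
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a &amp; &quot;b&quot; &#233;
        System.out.println(htmlLike().escape("a & \"b\" \u00e9"));
    }
}
```

    The array lookup is a single bounds check plus an index operation, which is why it is fast for the common case of low-valued code points.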

    The size of the data structure used by ArrayBasedUnicodeEscaper is proportional to the highest-valued code point that requires escaping. For example, a replacement map containing the single character '\u1000' will require approximately 16K of memory. If you need to create multiple escaper instances that share the same character replacement mapping, consider using ArrayBasedEscaperMap.
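    The 16K figure can be reproduced with a rough back-of-the-envelope sketch. It assumes the table holds one slot per code point up to the highest mapped character, and that each slot is a 4-byte reference (typical for a JVM with compressed references; other JVM configurations may use 8 bytes). ReplacementArraySketch and its methods are hypothetical names, not part of the library.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper showing why the table size tracks the highest mapped character.
final class ReplacementArraySketch {

    // Builds a table indexed by character value; unmapped slots stay null.
    static char[][] createReplacementArray(Map<Character, String> map) {
        if (map.isEmpty()) {
            return new char[0][];
        }
        int max = 0;
        for (char c : map.keySet()) {
            max = Math.max(max, c);
        }
        char[][] table = new char[max + 1][];
        for (Map.Entry<Character, String> e : map.entrySet()) {
            char key = e.getKey();
            table[key] = e.getValue().toCharArray();
        }
        return table;
    }

    // Rough estimate of the slot storage only; ignores headers and the replacement strings.
    static long approxTableBytes(char[][] table, int bytesPerReference) {
        return (long) table.length * bytesPerReference;
    }

    public static void main(String[] args) {
        Map<Character, String> map = Collections.singletonMap('\u1000', "x");
        char[][] table = createReplacementArray(map);
        System.out.println(table.length);               // prints 4097
        System.out.println(approxTableBytes(table, 4)); // prints 16388
    }
}
```

    Because the cost is paid per table rather than per mapping entry, building the table once and sharing it between escaper instances avoids repeating that allocation.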

    Since:
    15.0
    • Method Summary

      Modifier and Type Method Description
      java.lang.String escape(java.lang.String s)
      Returns the escaped form of a given literal string.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • escape

        public final java.lang.String escape(java.lang.String s)
        Description copied from class: UnicodeEscaper
        Returns the escaped form of a given literal string.

        If you are escaping input in arbitrary successive chunks, then it is not generally safe to use this method. If an input string ends with an unmatched high surrogate character, then this method will throw IllegalArgumentException. You should ensure your input is valid UTF-16 before calling this method.
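        One way to guard against the unmatched-surrogate failure described above is to check the input before escaping, and, when processing chunks, to hold back a trailing high surrogate until the next chunk arrives. The sketch below uses only java.lang.Character; SurrogateCheckSketch and its method names are hypothetical and not part of the library.

```java
// Hypothetical helpers for validating UTF-16 input before handing it to an escaper.
final class SurrogateCheckSketch {

    // Returns true if every high surrogate is immediately followed by a low
    // surrogate and no low surrogate appears on its own.
    static boolean isWellFormedUtf16(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false; // unmatched high surrogate
                }
                i++; // skip the low surrogate that completes the pair
            } else if (Character.isLowSurrogate(c)) {
                return false; // low surrogate without a preceding high surrogate
            }
        }
        return true;
    }

    // For chunked input: the number of leading chars that can be escaped now.
    // A trailing high surrogate is held back so it can be joined with the next chunk.
    static int safeChunkLength(CharSequence chunk) {
        int n = chunk.length();
        if (n > 0 && Character.isHighSurrogate(chunk.charAt(n - 1))) {
            return n - 1;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedUtf16("abc"));        // prints true
        System.out.println(isWellFormedUtf16("abc\uD83D"));  // prints false
        System.out.println(safeChunkLength("abc\uD83D"));    // prints 3
    }
}
```

        A caller splitting a stream into chunks would escape only the first safeChunkLength(chunk) chars and prepend the remainder to the following chunk.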

        Note: When implementing an escaper it is a good idea to override this method for efficiency by inlining the implementation of UnicodeEscaper.nextEscapeIndex(CharSequence, int, int) directly. Doing this for PercentEscaper more than doubled the performance for unescaped strings (as measured by CharEscapersBenchmark).

        Overrides:
        escape in class UnicodeEscaper
        Parameters:
        s - the literal string to be escaped
        Returns:
        the escaped form of s