Class HyphenationTree

  • All Implemented Interfaces:
    java.lang.Cloneable, PatternConsumer

    public class HyphenationTree
    extends TernaryTree
    implements PatternConsumer
    This tree structure stores the hyphenation patterns in an efficient way for fast lookup. It provides the method to hyphenate a word. This class has been taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/) and slightly modified.
    • Constructor Summary

      Constructors 
      Constructor Description
      HyphenationTree()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void addClass​(java.lang.String chargroup)
      Add a character class to the tree.
      void addException​(java.lang.String word, java.util.ArrayList<java.lang.Object> hyphenatedword)
      Add an exception to the tree.
      void addPattern​(java.lang.String pattern, java.lang.String ivalue)
      Add a pattern to the tree.
      java.lang.String findPattern​(java.lang.String pat)  
      Hyphenation hyphenate​(char[] w, int offset, int len, int remainCharCount, int pushCharCount)
      Hyphenate the word held in a char array and return a Hyphenation object.
      Hyphenation hyphenate​(java.lang.String word, int remainCharCount, int pushCharCount)
      Hyphenate word and return a Hyphenation object.
      void loadPatterns​(java.io.File f)
      Read hyphenation patterns from an XML file.
      void loadPatterns​(org.xml.sax.InputSource source)
      Read hyphenation patterns from an XML file.
      void printStats​(java.io.PrintStream out)  
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • HyphenationTree

        public HyphenationTree()
    • Method Detail

      • loadPatterns

        public void loadPatterns​(java.io.File f)
                          throws java.io.IOException
        Read hyphenation patterns from an XML file.
        Parameters:
        f - the XML file containing the hyphenation patterns
        Throws:
        java.io.IOException - In case the parsing fails
      • loadPatterns

        public void loadPatterns​(org.xml.sax.InputSource source)
                          throws java.io.IOException
        Read hyphenation patterns from an XML file.
        Parameters:
        source - the InputSource for the file
        Throws:
        java.io.IOException - In case the parsing fails
      • findPattern

        public java.lang.String findPattern​(java.lang.String pat)
      • hyphenate

        public Hyphenation hyphenate​(java.lang.String word,
                                     int remainCharCount,
                                     int pushCharCount)
        Hyphenate word and return a Hyphenation object.
        Parameters:
        word - the word to be hyphenated
        remainCharCount - Minimum number of characters allowed before the hyphenation point.
        pushCharCount - Minimum number of characters allowed after the hyphenation point.
        Returns:
        a Hyphenation object representing the hyphenated word or null if word is not hyphenated.
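
        The effect of remainCharCount and pushCharCount can be pictured as a filter over candidate hyphenation points. The following is a minimal sketch under that reading, not the class's actual code; the class name BoundaryFilterSketch and the method filter are hypothetical, and a hyphenation point is assumed to be the number of characters that precede the break.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HyphenationTree.
final class BoundaryFilterSketch {

    /**
     * Keeps only the candidate points that leave at least remainCharCount
     * characters before the break and at least pushCharCount after it.
     * A point is assumed to be the count of characters before the break.
     */
    static List<Integer> filter(int wordLength, int[] candidates,
                                int remainCharCount, int pushCharCount) {
        List<Integer> kept = new ArrayList<>();
        for (int point : candidates) {
            int before = point;              // characters left on the current line
            int after = wordLength - point;  // characters pushed to the next line
            if (before >= remainCharCount && after >= pushCharCount) {
                kept.add(point);
            }
        }
        return kept;
    }
}
```

        With a word of length 11, candidates 1, 3, 6 and 10, remainCharCount 2 and pushCharCount 3, only 3 and 6 survive. If no candidate survives, that corresponds to the case where the method returns null.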
      • hyphenate

        public Hyphenation hyphenate​(char[] w,
                                     int offset,
                                     int len,
                                     int remainCharCount,
                                     int pushCharCount)
        Hyphenate the word held in a char array and return a Hyphenation object.
        Parameters:
        w - char array that contains the word
        offset - Offset to first character in word
        len - Length of word
        remainCharCount - Minimum number of characters allowed before the hyphenation point.
        pushCharCount - Minimum number of characters allowed after the hyphenation point.
        Returns:
        a Hyphenation object representing the hyphenated word or null if word is not hyphenated.
      • addClass

        public void addClass​(java.lang.String chargroup)
        Add a character class to the tree. PatternParser uses this method as a callback to add character classes. Character classes define the valid word characters for hyphenation; a word that contains a character not defined in any class is not hyphenated. A class also defines how characters are normalized so that they can be compared with the stored patterns. Pattern files usually use only lowercase characters, so a class for the letter 'a', for example, should be defined as "aA", with the first character being the normalization character.
        Specified by:
        addClass in interface PatternConsumer
        Parameters:
        chargroup - character group
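
        The normalization rule described for addClass can be sketched with a plain map. This is an illustration only: the class CharClassSketch and its normalize method are hypothetical, and a HashMap stands in for whatever structure the tree really uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mimics the documented character-class behavior.
final class CharClassSketch {
    // Maps every character of a class to that class's first character.
    private final Map<Character, Character> classMap = new HashMap<>();

    /** Registers a group such as "aA"; the first character is the normal form. */
    void addClass(String chargroup) {
        if (chargroup.isEmpty()) {
            return;
        }
        char normal = chargroup.charAt(0);
        for (int i = 0; i < chargroup.length(); i++) {
            classMap.put(chargroup.charAt(i), normal);
        }
    }

    /** Returns the normalized word, or null if a character belongs to no class. */
    String normalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            Character normal = classMap.get(word.charAt(i));
            if (normal == null) {
                return null; // unknown character: the word would not be hyphenated
            }
            sb.append(normal.charValue());
        }
        return sb.toString();
    }
}
```

        After registering "aA" and "bB", the word "AbBa" normalizes to "abba", while "Ab1" yields null because '1' is in no class.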
      • addException

        public void addException​(java.lang.String word,
                                 java.util.ArrayList<java.lang.Object> hyphenatedword)
        Add an exception to the tree. PatternParser uses this method as a callback to store hyphenation exceptions.
        Specified by:
        addException in interface PatternConsumer
        Parameters:
        word - normalized word
        hyphenatedword - a list of alternating strings and hyphen objects.
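
        To make the shape of the hyphenatedword argument concrete, the sketch below walks a list of alternating strings and hyphen markers and derives the break offsets. It is a simplification: the constant HYPHEN is a hypothetical placeholder for the library's hyphen object, and the class ExceptionListSketch is not part of the API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the alternating string / hyphen layout.
final class ExceptionListSketch {
    /** Placeholder for the library's hyphen object (an assumption). */
    static final Object HYPHEN = new Object();

    /** Returns, for each hyphen marker, the number of characters that precede it. */
    static List<Integer> breakOffsets(List<Object> hyphenatedword) {
        List<Integer> offsets = new ArrayList<>();
        int consumed = 0; // characters seen so far
        for (Object item : hyphenatedword) {
            if (item instanceof String) {
                consumed += ((String) item).length();
            } else {
                offsets.add(consumed); // any non-string entry is treated as a hyphen
            }
        }
        return offsets;
    }
}
```

        For the list "ta", HYPHEN, "ble" the only break offset is 2, that is, between "ta" and "ble".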
      • addPattern

        public void addPattern​(java.lang.String pattern,
                               java.lang.String ivalue)
        Add a pattern to the tree. This method is intended mainly as a callback that PatternParser uses to add a pattern to the tree.
        Specified by:
        addPattern in interface PatternConsumer
        Parameters:
        pattern - the hyphenation pattern
        ivalue - interletter weight values indicating the desirability and priority of hyphenating at a given point within the pattern. It should contain only digit characters ('0' to '9').
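
        How interletter values express desirability and priority can be shown with a small Liang-style scoring sketch: every matching pattern contributes its values, the highest value at each position wins, and an odd result permits a break while an even one forbids it. This is not the class's implementation. A HashMap replaces the ternary tree, and the class LiangScoringSketch, its method breakPoints, and the sample patterns are assumptions; ivalue is assumed to hold one digit per gap, so it is one character longer than the pattern.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; a HashMap is swapped in for the ternary tree.
final class LiangScoringSketch {
    private final Map<String, String> patterns = new HashMap<>();

    /** Stores a pattern with its values; ivalue has pattern.length() + 1 digits. */
    void addPattern(String pattern, String ivalue) {
        patterns.put(pattern, ivalue);
    }

    /** Returns break points as the number of characters before each break. */
    List<Integer> breakPoints(String word) {
        String w = "." + word + ".";            // '.' marks the word boundaries
        int[] score = new int[w.length() + 1];  // score[i] is the gap before w[i]
        for (int i = 0; i < w.length(); i++) {
            for (int j = i + 1; j <= w.length(); j++) {
                String values = patterns.get(w.substring(i, j));
                if (values == null) {
                    continue;
                }
                for (int k = 0; k < values.length(); k++) {
                    // The highest value seen at a gap takes priority.
                    score[i + k] = Math.max(score[i + k], values.charAt(k) - '0');
                }
            }
        }
        List<Integer> points = new ArrayList<>();
        for (int p = 1; p < word.length(); p++) {
            if (score[p + 1] % 2 == 1) {        // odd permits, even forbids
                points.add(p);
            }
        }
        return points;
    }
}
```

        With the single pattern "hyph" and values "00300", the word "hyphen" gets one break point at 2 ("hy-phen"). Adding a pattern "yp" with values "040" raises the same gap to 4, an even value, which suppresses the break.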
      • printStats

        public void printStats​(java.io.PrintStream out)
        Overrides:
        printStats in class TernaryTree