Class CompactLabelToOrdinal


  • public class CompactLabelToOrdinal
    extends LabelToOrdinal
    This is a very efficient LabelToOrdinal implementation that uses a CharBlockArray to store all labels and a configurable number of HashArrays to reference the labels.

    Since the HashArrays don't handle collisions, a CollisionMap is used to store the colliding labels.

    This data structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds loadFactor * LabelToOrdinal.getMaxOrdinal(). Growing also reinserts all colliding labels into the HashArrays, which may reduce the number of collisions. To set the loadFactor, see CompactLabelToOrdinal(int, float, int).
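
    The layout and growth rule described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the Lucene implementation: the names ToyCompactMap, HashArray, tryHashArrays and grow are hypothetical, labels are stored as plain Strings rather than in a CharBlockArray, each new hash array is assumed to be twice the size of the previous one, and ordinals are assumed to be non-negative so that -1 can signal "not found".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of "hash arrays without collision handling plus a collision map". */
class ToyCompactMap {
    /** One fixed-size table; a slot holds at most one label and never chains or probes. */
    private static final class HashArray {
        final String[] labels;
        final int[] ordinals;

        HashArray(int capacity) {
            labels = new String[capacity];
            ordinals = new int[capacity];
        }

        int slot(String label) {
            return Math.floorMod(label.hashCode(), labels.length);
        }
    }

    private final List<HashArray> hashArrays = new ArrayList<>();
    private Map<String, Integer> collisionMap = new HashMap<>();
    private final float loadFactor;
    private int maxOrdinal = 0; // largest ordinal seen so far (assumption of this sketch)

    /** initialCapacity must be positive in this sketch. */
    ToyCompactMap(int initialCapacity, float loadFactor) {
        this.loadFactor = loadFactor;
        hashArrays.add(new HashArray(initialCapacity));
    }

    void add(String label, int ordinal) {
        if (get(label) != -1) {
            return; // label already present
        }
        maxOrdinal = Math.max(maxOrdinal, ordinal);
        if (!tryHashArrays(label, ordinal)) {
            // Every candidate slot is taken, so the label goes to the collision map.
            collisionMap.put(label, ordinal);
            if (collisionMap.size() > loadFactor * maxOrdinal) {
                grow();
            }
        }
    }

    /** Returns the ordinal of the label, or -1 if it is unknown. */
    int get(String label) {
        for (HashArray a : hashArrays) {
            int s = a.slot(label);
            if (label.equals(a.labels[s])) {
                return a.ordinals[s];
            }
        }
        Integer ordinal = collisionMap.get(label);
        return ordinal == null ? -1 : ordinal;
    }

    int hashArrayCount() {
        return hashArrays.size();
    }

    /** Stores the label in the first hash array whose slot for it is empty. */
    private boolean tryHashArrays(String label, int ordinal) {
        for (HashArray a : hashArrays) {
            int s = a.slot(label);
            if (a.labels[s] == null) {
                a.labels[s] = label;
                a.ordinals[s] = ordinal;
                return true;
            }
        }
        return false;
    }

    /** Adds one more hash array and reinserts the colliding labels. */
    private void grow() {
        int lastCapacity = hashArrays.get(hashArrays.size() - 1).labels.length;
        hashArrays.add(new HashArray(lastCapacity * 2));
        Map<String, Integer> old = collisionMap;
        collisionMap = new HashMap<>();
        for (Map.Entry<String, Integer> e : old.entrySet()) {
            if (!tryHashArrays(e.getKey(), e.getValue())) {
                collisionMap.put(e.getKey(), e.getValue()); // still colliding
            }
        }
    }
}
```

    In this sketch a lookup checks one slot per hash array and falls back to the collision map only when none of them holds the label, which is why keeping the collision map small matters.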

    This data structure has a much lower memory footprint (~30%) than a Java HashMap<String, Integer>. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. For 3M unique labels, ingestion was also ~50% faster than with a HashMap.

    • Field Detail

      • DefaultLoadFactor

        public static final float DefaultLoadFactor
        Default maximum load factor.
        See Also:
        Constant Field Values
    • Constructor Detail

      • CompactLabelToOrdinal

        public CompactLabelToOrdinal(int initialCapacity,
                                     float loadFactor,
                                     int numHashArrays)
        Sole constructor.
    • Method Detail

      • sizeOfMap

        public int sizeOfMap()
        Returns the number of labels in this map.
      • addLabel

        public void addLabel(FacetLabel label,
                             int ordinal)
        Description copied from class: LabelToOrdinal
        Adds a new label if it is not yet in the table. Throws an IllegalArgumentException if the same label was previously added to this table with a different ordinal.
        Specified by:
        addLabel in class LabelToOrdinal
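
        The contract stated for addLabel can be sketched with a plain HashMap. This is only an illustration of the documented behavior, not how CompactLabelToOrdinal stores labels: the class LabelTable and its size() method are hypothetical, and labels are plain Strings here instead of FacetLabel instances.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in that mimics the documented addLabel contract. */
class LabelTable {
    private final Map<String, Integer> ordinals = new HashMap<>();

    /**
     * Adds the label if it is absent. Adding it again with the same ordinal is a no-op;
     * adding it again with a different ordinal is rejected.
     */
    void addLabel(String label, int ordinal) {
        Integer existing = ordinals.putIfAbsent(label, ordinal);
        if (existing != null && existing != ordinal) {
            throw new IllegalArgumentException(
                "Label '" + label + "' already has ordinal " + existing + ", cannot use " + ordinal);
        }
    }

    int size() {
        return ordinals.size();
    }
}
```

        Under this contract, repeating an identical addLabel call is harmless, while a conflicting ordinal fails fast instead of silently overwriting the earlier mapping.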