Class LogMergePolicy

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, java.lang.Cloneable
    Direct Known Subclasses:
    LogByteSizeMergePolicy, LogDocMergePolicy

    public abstract class LogMergePolicy
    extends MergePolicy

    This class implements a MergePolicy that tries to merge segments into levels of exponentially increasing size, where each level has fewer segments than the value of the merge factor. Whenever extra segments (beyond the merge factor upper bound) are encountered, all segments within the level are merged. You can get or set the merge factor using getMergeFactor() and setMergeFactor(int) respectively.

    This class is abstract and requires a subclass to define the MergePolicy.size(org.apache.lucene.index.SegmentCommitInfo) method which specifies how a segment's size is determined. LogDocMergePolicy is one subclass that measures size by document count in the segment. LogByteSizeMergePolicy is another subclass that measures size as the total byte size of the file(s) for the segment.
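
    The level structure described above can be illustrated with a small, self-contained simulation. This is a minimal sketch and not the Lucene implementation: it assumes every flush produces a segment of size 1 and treats segments of exactly equal size as one level. The class and method names (LogMergeSketch, addSegment, simulate) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of log-level merging (hypothetical names, not Lucene API).
final class LogMergeSketch {

    /** Appends a newly flushed segment, then merges whenever a level fills up. */
    static void addSegment(List<Long> segments, long size, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        segments.add(size);
        // Cascade: keep merging while the last mergeFactor segments share one size.
        while (segments.size() >= mergeFactor) {
            int from = segments.size() - mergeFactor;
            long first = segments.get(from);
            long total = 0;
            boolean sameLevel = true;
            for (int i = from; i < segments.size(); i++) {
                long s = segments.get(i);
                sameLevel &= (s == first);
                total += s;
            }
            if (!sameLevel) {
                break;
            }
            // Replace the full level with a single merged segment.
            segments.subList(from, segments.size()).clear();
            segments.add(total);
        }
    }

    /** Flushes the given number of size-1 segments and returns the resulting sizes. */
    static List<Long> simulate(int flushes, int mergeFactor) {
        List<Long> segments = new ArrayList<>();
        for (int i = 0; i < flushes; i++) {
            addSegment(segments, 1L, mergeFactor);
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(simulate(123, 10)); // prints [100, 10, 10, 1, 1, 1]
    }
}
```

    In this model each level always holds fewer than mergeFactor segments, and segment sizes grow by a factor of mergeFactor from one level to the next.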

    • Field Detail

      • LEVEL_LOG_SPAN

        public static final double LEVEL_LOG_SPAN
        Defines the allowed range of log(size) for each level. A level is computed by taking the max segment log size, minus LEVEL_LOG_SPAN, and finding all segments falling within that range.
        See Also:
        Constant Field Values
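
        The level computation can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the constant value 0.75 and the choice of the merge factor as the logarithm base are assumptions made for the example, and the class LevelSpanSketch and its methods are hypothetical. The actual implementation may differ, for example in how it treats segment adjacency.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of selecting the top level by log(size).
final class LevelSpanSketch {

    // Assumed value for illustration; see Constant Field Values for the real one.
    static final double LEVEL_LOG_SPAN = 0.75;

    /** Log of the segment size, using the merge factor as the base (an assumption). */
    static double logSize(long size, int mergeFactor) {
        return Math.log(Math.max(1L, size)) / Math.log(mergeFactor);
    }

    /** Returns the indices of segments within LEVEL_LOG_SPAN of the largest log size. */
    static List<Integer> topLevel(long[] sizes, int mergeFactor) {
        double max = Double.NEGATIVE_INFINITY;
        for (long s : sizes) {
            max = Math.max(max, logSize(s, mergeFactor));
        }
        double levelBottom = max - LEVEL_LOG_SPAN;
        List<Integer> level = new ArrayList<>();
        for (int i = 0; i < sizes.length; i++) {
            if (logSize(sizes[i], mergeFactor) >= levelBottom) {
                level.add(i);
            }
        }
        return level;
    }

    public static void main(String[] args) {
        long[] sizes = {1000, 900, 200, 100, 10};
        System.out.println(topLevel(sizes, 10)); // prints [0, 1, 2]
    }
}
```

        With a merge factor of 10, the log sizes are roughly 3.0, 2.95, 2.30, 2.0 and 1.0, so only the first three segments fall within 0.75 of the maximum.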
      • DEFAULT_MERGE_FACTOR

        public static final int DEFAULT_MERGE_FACTOR
        Default merge factor, which is how many segments are merged at a time.
        See Also:
        Constant Field Values
      • DEFAULT_MAX_MERGE_DOCS

        public static final int DEFAULT_MAX_MERGE_DOCS
        Default maximum segment size, measured in documents. A segment of this size or larger will never be merged. See setMaxMergeDocs(int).
        See Also:
        Constant Field Values
    • Constructor Detail

      • LogMergePolicy

        public LogMergePolicy()
        Sole constructor. (For invocation by subclass constructors, typically implicit.)
    • Method Detail

      • getMergeFactor

        public int getMergeFactor()

        Returns the number of segments that are merged at once. This value also controls the total number of segments allowed to accumulate in the index.

      • setMergeFactor

        public void setMergeFactor(int mergeFactor)
        Determines how often segment indices are merged by addDocument(). With smaller values, less RAM is used while indexing and searches are faster, but indexing is slower. With larger values, more RAM is used during indexing and searches are slower, but indexing is faster. Thus larger values (> 10) are best for batch index creation, and smaller values (< 10) for indices that are interactively maintained.
      • setCalibrateSizeByDeletes

        public void setCalibrateSizeByDeletes(boolean calibrateSizeByDeletes)
        Sets whether the segment size should be calibrated by the number of deletes when choosing segments for merge.
      • getCalibrateSizeByDeletes

        public boolean getCalibrateSizeByDeletes()
        Returns true if the segment size should be calibrated by the number of deletes when choosing segments for merge.
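
        One way to picture the effect of this setting is shown below. This is a hedged sketch, not the library's code: it assumes that calibration subtracts deleted documents from a document-count size and prorates a byte size by the fraction of live documents. The class CalibratedSizeSketch and its methods are hypothetical.

```java
// Hypothetical sketch of size calibration by deletes (not Lucene API).
final class CalibratedSizeSketch {

    /** Document-count size: deleted documents are excluded when calibrating. */
    static long sizeDocs(int maxDoc, int delCount, boolean calibrateSizeByDeletes) {
        if (delCount < 0 || delCount > maxDoc) {
            throw new IllegalArgumentException("delCount out of range: " + delCount);
        }
        return calibrateSizeByDeletes ? (long) maxDoc - delCount : maxDoc;
    }

    /** Byte size: scaled by the fraction of live documents when calibrating. */
    static long sizeBytes(long byteSize, int maxDoc, int delCount,
                          boolean calibrateSizeByDeletes) {
        if (!calibrateSizeByDeletes || maxDoc <= 0) {
            return byteSize;
        }
        double delRatio = (double) delCount / maxDoc;
        return (long) (byteSize * (1.0 - delRatio));
    }

    public static void main(String[] args) {
        System.out.println(sizeDocs(1000, 250, true));        // prints 750
        System.out.println(sizeBytes(4000L, 1000, 250, true)); // prints 3000
    }
}
```

        Under this model, a segment with many deletes looks smaller to the policy and therefore lands in a lower level than its raw size would suggest.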
      • close

        public void close()
        Description copied from class: MergePolicy
        Release all resources for the policy.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Specified by:
        close in class MergePolicy
      • findForcedMerges

        public MergePolicy.MergeSpecification findForcedMerges(SegmentInfos infos,
                                                               int maxNumSegments,
                                                               java.util.Map<SegmentCommitInfo,java.lang.Boolean> segmentsToMerge)
                                                        throws java.io.IOException
        Returns the merges necessary to merge the index down to a specified number of segments. This respects the maxMergeSizeForForcedMerge setting. By default, and assuming maxNumSegments=1, only one segment will be left in the index; that segment has neither pending deletions nor separate norms, and it is in compound file format if the current useCompoundFile setting is true. This method returns multiple merges (mergeFactor at a time) so that the MergeScheduler in use can run them concurrently.
        Specified by:
        findForcedMerges in class MergePolicy
        Parameters:
        infos - the total set of segments in the index
        maxNumSegments - requested maximum number of segments in the index (currently this is always 1)
        segmentsToMerge - contains the specific SegmentCommitInfo instances that must be merged away. This may be a subset of all SegmentInfos. If the value is true for a given segment, that segment was an original segment present in the to-be-merged index; otherwise, it was produced by a cascaded merge.
        Throws:
        java.io.IOException
      • findForcedDeletesMerges

        public MergePolicy.MergeSpecification findForcedDeletesMerges(SegmentInfos segmentInfos)
                                                               throws java.io.IOException
        Finds merges necessary to force-merge all deletes from the index. We simply merge adjacent segments that have deletes, up to mergeFactor at a time.
        Specified by:
        findForcedDeletesMerges in class MergePolicy
        Parameters:
        segmentInfos - the total set of segments in the index
        Throws:
        java.io.IOException
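
        The grouping rule described for this method can be sketched with plain arrays. This is a minimal illustration assuming each segment is represented only by its delete count, and that a run of adjacent segments with deletes is cut into merges of at most mergeFactor segments. The class ForcedDeletesSketch and the half-open [start, end) range representation are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: group adjacent segments that have deletes.
final class ForcedDeletesSketch {

    /** Returns half-open index ranges [start, end), one per proposed merge. */
    static List<int[]> groupMerges(int[] delCounts, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<int[]> merges = new ArrayList<>();
        int runStart = -1; // start of the current run of segments with deletes
        for (int i = 0; i < delCounts.length; i++) {
            if (delCounts[i] > 0) {
                if (runStart == -1) {
                    runStart = i;
                } else if (i - runStart == mergeFactor) {
                    // The run already holds mergeFactor segments: close it.
                    merges.add(new int[] {runStart, i});
                    runStart = i;
                }
            } else if (runStart != -1) {
                // A segment without deletes ends the current run.
                merges.add(new int[] {runStart, i});
                runStart = -1;
            }
        }
        if (runStart != -1) {
            merges.add(new int[] {runStart, delCounts.length});
        }
        return merges;
    }

    public static void main(String[] args) {
        for (int[] m : groupMerges(new int[] {0, 3, 1, 2, 0, 5}, 2)) {
            System.out.println(m[0] + ".." + m[1]);
        }
    }
}
```

        For delete counts {0, 3, 1, 2, 0, 5} and a merge factor of 2, the sketch proposes the ranges [1, 3), [3, 4) and [5, 6). Note that a lone segment with deletes still forms its own merge, since rewriting it is what removes the deleted documents.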
      • setMaxMergeDocs

        public void setMaxMergeDocs(int maxMergeDocs)

        Determines the largest segment (measured by document count) that may be merged with other segments. Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

        The default value is Integer.MAX_VALUE.

        The LogByteSizeMergePolicy subclass also allows you to set this limit by net size (in MB) of the segment, using LogByteSizeMergePolicy.setMaxMergeMB(double).

      • getMaxMergeDocs

        public int getMaxMergeDocs()
        Returns the largest segment (measured by document count) that may be merged with other segments.
        See Also:
        setMaxMergeDocs(int)
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object