Class AggregateSummaryStatistics

  • All Implemented Interfaces:
    java.io.Serializable, StatisticalSummary

    public class AggregateSummaryStatistics
    extends java.lang.Object
    implements StatisticalSummary, java.io.Serializable

    An aggregator for SummaryStatistics from several data sets or data set partitions. In its simplest usage mode, the client creates an instance via the zero-argument constructor, then uses createContributingStatistics() to obtain a SummaryStatistics for each individual data set / partition. The per-set statistics objects are used as normal, and at any time the aggregate statistics for all the contributors can be obtained from this object.

    Clients with specialized requirements can use alternative constructors to control the statistics implementations and initial values used by the contributing and the internal aggregate SummaryStatistics objects.

    A static aggregate(Collection) method is also included that computes aggregate statistics directly from a Collection of SummaryStatistics instances.

    When createContributingStatistics() is used to create SummaryStatistics instances that will be aggregated concurrently, each created instance's SummaryStatistics.addValue(double) method must synchronize on the aggregating instance maintained by this class. In multithreaded environments, if the functionality provided by aggregate(Collection) is adequate, prefer that method to avoid unnecessary computation and synchronization delays.
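
    The synchronization contract described above can be illustrated with a minimal, library-free sketch. The names AggregateSketch, Contributor, and createContributor() are hypothetical stand-ins, not part of this API, and the sketch tracks only a count and a sum. It is meant to show the cost the paragraph warns about: every addValue call on a contributor takes a lock on the shared aggregate.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an aggregate; it tracks only count and sum.
// This is an illustration of the locking pattern, not the library's code.
final class AggregateSketch {
    private long n;
    private double sum;

    /** Hypothetical contributor: keeps local totals and forwards each value. */
    final class Contributor {
        private long localN;
        private double localSum;

        void addValue(double value) {
            // Local state is only touched by the thread that owns this contributor.
            localN++;
            localSum += value;
            // Every update also locks the shared aggregate.
            synchronized (AggregateSketch.this) {
                n++;
                sum += value;
            }
        }

        long getN() { return localN; }
        double getSum() { return localSum; }
    }

    Contributor createContributor() { return new Contributor(); }

    synchronized long getN() { return n; }
    synchronized double getSum() { return sum; }

    public static void main(String[] args) throws InterruptedException {
        AggregateSketch aggregate = new AggregateSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < 4; t++) {
            Contributor contributor = aggregate.createContributor();
            Thread worker = new Thread(() -> {
                for (int i = 1; i <= 1000; i++) {
                    contributor.addValue(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(aggregate.getN() + " values, sum " + aggregate.getSum());
    }
}
```

    When per-partition results are only needed after all data has been added, combining them once at the end avoids taking this lock on every value, which is the trade-off the text above points to.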

    Since:
    2.0
    See Also:
    Serialized Form
    • Constructor Detail

      • AggregateSummaryStatistics

        public AggregateSummaryStatistics()
        Initializes a new AggregateSummaryStatistics with default statistics implementations.
      • AggregateSummaryStatistics

        public AggregateSummaryStatistics(SummaryStatistics prototypeStatistics)
        Initializes a new AggregateSummaryStatistics with the specified statistics object as a prototype for contributing statistics and for the internal aggregate statistics. This provides for customized statistics implementations to be used by contributing and aggregate statistics.
        Parameters:
        prototypeStatistics - a SummaryStatistics serving as a prototype both for the internal aggregate statistics and for contributing statistics obtained via the createContributingStatistics() method. Being a prototype means that other objects are initialized by copying this object's state. If null, a new, default statistics object is used. Any statistic values in the prototype are propagated to contributing statistics objects and (once) into these aggregate statistics.
        See Also:
        createContributingStatistics()
      • AggregateSummaryStatistics

        public AggregateSummaryStatistics(SummaryStatistics prototypeStatistics,
                                          SummaryStatistics initialStatistics)
        Initializes a new AggregateSummaryStatistics with the specified statistics object as a prototype for contributing statistics and a separate statistics object as the internal aggregate statistics. This allows different statistics implementations to be used by contributing and aggregate statistics, and allows an initial state to be supplied for the aggregate statistics.
        Parameters:
        prototypeStatistics - a SummaryStatistics serving as a prototype for contributing statistics obtained via the createContributingStatistics() method. Being a prototype means that other objects are initialized by copying this object's state. If null, a new, default statistics object is used. Any statistic values in the prototype are propagated to contributing statistics objects, but not into these aggregate statistics.
        initialStatistics - a SummaryStatistics to serve as the internal aggregate statistics object. If null, a new, default statistics object is used.
        See Also:
        createContributingStatistics()
    • Method Detail

      • getMax

        public double getMax()
        Returns the maximum of the available values. This version returns the maximum over all the aggregated data.
        Specified by:
        getMax in interface StatisticalSummary
        Returns:
        The max or Double.NaN if no values have been added.
        See Also:
        StatisticalSummary.getMax()
      • getMin

        public double getMin()
        Returns the minimum of the available values. This version returns the minimum over all the aggregated data.
        Specified by:
        getMin in interface StatisticalSummary
        Returns:
        The min or Double.NaN if no values have been added.
        See Also:
        StatisticalSummary.getMin()
      • getN

        public long getN()
        Returns the number of available values. This version returns a count of all the aggregated data.
        Specified by:
        getN in interface StatisticalSummary
        Returns:
        The number of available values
        See Also:
        StatisticalSummary.getN()
      • getStandardDeviation

        public double getStandardDeviation()
        Returns the standard deviation of the available values. This version returns the standard deviation of all the aggregated data.
        Specified by:
        getStandardDeviation in interface StatisticalSummary
        Returns:
        The standard deviation, Double.NaN if no values have been added, or 0.0 for a single-value set.
        See Also:
        StatisticalSummary.getStandardDeviation()
      • getSum

        public double getSum()
        Returns the sum of the values that have been added. This version returns the sum of all the aggregated data.
        Specified by:
        getSum in interface StatisticalSummary
        Returns:
        The sum or Double.NaN if no values have been added
        See Also:
        StatisticalSummary.getSum()
      • getVariance

        public double getVariance()
        Returns the variance of the available values. This version returns the variance of all the aggregated data.
        Specified by:
        getVariance in interface StatisticalSummary
        Returns:
        The variance, Double.NaN if no values have been added, or 0.0 for a single-value set.
        See Also:
        StatisticalSummary.getVariance()
      • getSumOfLogs

        public double getSumOfLogs()
        Returns the sum of the logs of all the aggregated data.
        Returns:
        the sum of logs
        See Also:
        SummaryStatistics.getSumOfLogs()
      • getGeometricMean

        public double getGeometricMean()
        Returns the geometric mean of all the aggregated data.
        Returns:
        the geometric mean
        See Also:
        SummaryStatistics.getGeometricMean()
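
        The sum of logs and the geometric mean are closely related: the geometric mean of n positive values equals the exponential of their mean natural logarithm. The sketch below shows that relationship with plain arrays. GeometricMeanSketch and its methods are hypothetical names, and the handling of an empty input (0.0 for the sum, NaN for the mean) is an assumption of this sketch rather than documented behavior of this class.

```java
// Library-free illustration of geometricMean = exp(sumOfLogs / n).
final class GeometricMeanSketch {

    /** Sum of the natural logs of the values; 0.0 for an empty array (sketch assumption). */
    static double sumOfLogs(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += Math.log(v);
        }
        return total;
    }

    /** Geometric mean via the sum of logs; NaN for an empty array (sketch assumption). */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        return Math.exp(sumOfLogs(values) / values.length);
    }

    public static void main(String[] args) {
        double[] data = {1.0, 10.0, 100.0};
        System.out.println("sum of logs    = " + sumOfLogs(data));
        System.out.println("geometric mean = " + geometricMean(data));
    }
}
```

        Working through the sum of logs rather than multiplying the values directly avoids overflow of the running product, and it means sums of logs from several data sets can simply be added before the final exponentiation.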
      • getSumsq

        public double getSumsq()
        Returns the sum of the squares of all the aggregated data.
        Returns:
        The sum of squares
        See Also:
        SummaryStatistics.getSumsq()
      • getSecondMoment

        public double getSecondMoment()
        Returns a statistic related to the Second Central Moment. Specifically, what is returned is the sum of squared deviations from the sample mean across all of the aggregated data.
        Returns:
        second central moment statistic
        See Also:
        SummaryStatistics.getSecondMoment()
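
        As a concrete reading of "sum of squared deviations from the sample mean", the following library-free sketch computes that quantity for an array and derives a variance from it. SecondMomentSketch is a hypothetical name. Dividing by n - 1 (the bias-corrected sample variance) is an assumption of this sketch; the NaN and 0.0 edge cases follow the return values documented for getVariance() above.

```java
// Library-free illustration of the second-moment statistic and a variance derived from it.
final class SecondMomentSketch {

    /** Sum of squared deviations from the mean; NaN for an empty array (sketch assumption). */
    static double secondMoment(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;

        double m2 = 0.0;
        for (double v : values) {
            double deviation = v - mean;
            m2 += deviation * deviation;
        }
        return m2;
    }

    /** Variance with an assumed n - 1 divisor: NaN when empty, 0.0 for a single value. */
    static double variance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        return secondMoment(values) / (n - 1);
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("second moment = " + secondMoment(data));
        System.out.println("variance      = " + variance(data));
    }
}
```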
      • createContributingStatistics

        public SummaryStatistics createContributingStatistics()
        Creates and returns a SummaryStatistics whose data will be aggregated with those of this AggregateSummaryStatistics.
        Returns:
        a SummaryStatistics whose data will be aggregated with those of this AggregateSummaryStatistics. The initial state is a copy of the configured prototype statistics.
      • aggregate

        public static StatisticalSummaryValues aggregate(java.util.Collection<SummaryStatistics> statistics)
        Computes aggregate summary statistics. This method can be used to combine statistics computed over partitions or subsamples; that is, the StatisticalSummaryValues returned should contain the same values that would have been obtained by computing a single StatisticalSummary over the combined dataset.

        Returns null if the collection is empty or null.

        Parameters:
        statistics - collection of SummaryStatistics to aggregate
        Returns:
        summary statistics for the combined dataset
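
        To show why per-partition summaries are enough to recover the statistics of the combined dataset, here is a library-free sketch of one well-known way to merge them (the pairwise update of count, mean, and sum of squared deviations). PartitionMergeSketch, Part, of, and merge are hypothetical names, and nothing here claims to be how aggregate(Collection) is implemented. The null result for a null or empty input mirrors the behavior documented above.

```java
import java.util.Arrays;
import java.util.List;

// Library-free illustration of merging per-partition summaries.
final class PartitionMergeSketch {

    /** Hypothetical per-partition summary: count, mean, second moment, min, max. */
    static final class Part {
        final long n;
        final double mean;
        final double m2;   // sum of squared deviations from the mean
        final double min;
        final double max;

        Part(long n, double mean, double m2, double min, double max) {
            this.n = n;
            this.mean = mean;
            this.m2 = m2;
            this.min = min;
            this.max = max;
        }

        /** Summarizes one partition directly from its values. */
        static Part of(double... values) {
            long n = values.length;
            double sum = 0.0;
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double v : values) {
                sum += v;
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
            double mean = (n == 0) ? 0.0 : sum / n;
            double m2 = 0.0;
            for (double v : values) {
                m2 += (v - mean) * (v - mean);
            }
            return new Part(n, mean, m2, min, max);
        }
    }

    /** Merges summaries without revisiting the raw data; null for null or empty input. */
    static Part merge(List<Part> parts) {
        if (parts == null || parts.isEmpty()) {
            return null;
        }
        Part acc = new Part(0, 0.0, 0.0, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY);
        for (Part p : parts) {
            if (p.n == 0) {
                continue; // an empty partition contributes nothing
            }
            long n = acc.n + p.n;
            double delta = p.mean - acc.mean;
            double mean = acc.mean + delta * p.n / n;
            // The cross term accounts for the gap between the two partition means.
            double m2 = acc.m2 + p.m2 + delta * delta * acc.n * p.n / n;
            acc = new Part(n, mean, m2, Math.min(acc.min, p.min), Math.max(acc.max, p.max));
        }
        return acc;
    }

    public static void main(String[] args) {
        Part merged = merge(Arrays.asList(Part.of(1, 2, 3), Part.of(4, 5, 6, 7)));
        Part direct = Part.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println("merged mean = " + merged.mean + ", direct mean = " + direct.mean);
        System.out.println("merged m2   = " + merged.m2 + ", direct m2   = " + direct.m2);
    }
}
```

        Because the merge touches only the summaries, each partition can be processed on its own thread with no shared lock, and the combination happens once at the end.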