class OnTheFlyStatistics:
Computes the sample mean and variance of provided data incrementally.

The algorithm used is described in the "updating formulas" (1.3) in the paper
"Algorithms for Computing the Sample Variance: Analysis and Recommendations"
by Tony F. Chan, Gene H. Golub, and Randall J. LeVeque,
The American Statistician, August 1983, Vol. 37, No. 3, pp. 242-247

See also "Formulas for Robust, One-Pass Parallel Computation of Covariances
and Arbitrary-Order Statistical Moments" by Philippe Pébay, Sandia Report
def __init__(self, mean = None, sampleVariance = None, sampleSize = None):
The parameters `mean`, `sampleVariance`, and `sampleSize` can either all be
`None`, in which case an instance without any data points is
constructed, or all three parameters can be specified, in which case
the instance is constructed as if data were given that result in the
parameters = [mean, sampleVariance, sampleSize]
for parameter in parameters:
    if parameter is not None:
for parameter in parameters:
self.S = sampleVariance * (self.n - 1)
Adds a datum to the sample.
delta = datum - self.M
self.M = self.M + delta / self.n
self.S = self.S + delta * (datum - self.M)
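The update above can be tried in isolation. The following is a minimal sketch, assuming that `n` is incremented before the mean update and that `M` and `S` hold the running mean and the running sum of squared deviations; the free function `welford_update` is a hypothetical name and not part of this class.

```python
def welford_update(n, mean, s, datum):
    """Return the updated (n, mean, s) after one more datum.

    n    -- number of data points seen so far
    mean -- running mean (M in the class above)
    s    -- running sum of squared deviations from the mean (S above)
    """
    n += 1
    delta = datum - mean          # deviation from the old mean
    mean += delta / n             # new mean
    s += delta * (datum - mean)   # uses the already-updated mean
    return n, mean, s


n, mean, s = 0, 0.0, 0.0
for x in [2.0, 4.0, 4.0, 5.0]:
    n, mean, s = welford_update(n, mean, s, x)

# Bessel-corrected sample variance, as in getSampleVariance().
sample_variance = s / (n - 1)
```

Only three numbers are stored, so memory use does not depend on how many data points are added.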
Merges the given sample with this one.

This assumes that both samples are drawn from the same population.
self.addDatum(sample.getSampleMean())
for s in [self, sample]:
    mergedSampleSize += s.getSampleSize()
for s in [self, sample]:
    mergedMean += s.getSampleSize() * s.getSampleMean()
mergedMean /= mergedSampleSize
for s in [self, sample]:
    mergedS += (s.getSampleSize() - 1) * s.getSampleVariance()
m = sample.getSampleSize()
self.n = mergedSampleSize
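For illustration, merging two samples can be sketched with the standard pairwise combination formula from the papers cited in the class description. This is not necessarily the exact expression used in this method, parts of which are not shown here; `moments` and `merge_moments` are hypothetical helper names.

```python
def moments(data):
    """Return (n, mean, s) for a non-empty list, computed directly."""
    n = len(data)
    mean = sum(data) / n
    s = sum((x - mean) ** 2 for x in data)
    return n, mean, s


def merge_moments(n_a, mean_a, s_a, n_b, mean_b, s_b):
    """Combine the (n, mean, s) triples of two samples."""
    n = n_a + n_b
    if n == 0:
        return 0, 0.0, 0.0
    delta = mean_b - mean_a
    # Size-weighted mean of the two sample means.
    mean = (n_a * mean_a + n_b * mean_b) / n
    # Sum of both squared-deviation sums plus a correction for the
    # distance between the two sample means.
    s = s_a + s_b + delta * delta * n_a * n_b / n
    return n, mean, s


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0]
merged = merge_moments(*moments(a), *moments(b))
direct = moments(a + b)
```

In this sketch, `merged` and `direct` agree up to floating-point rounding.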
Merges the given samples with this one.

This assumes that all samples are drawn from the same population.
for sample in samples:
Returns the number of data points added so far.
Returns the mean of all the values added so far.

If no values have been added so far, a `ValueError` is raised.
raise ValueError("Tried to get mean without having supplied data.")
Returns the unbiased sample variance of all the values added so far.

The returned value contains Bessel's correction, i.e. the sum of
squares of differences is divided by \f$ n - 1 \f$ rather than
\f$ n \f$, where \f$ n \f$ is the sample size.

If fewer than two values have been added so far, a `ValueError` is raised.
raise ValueError("Tried to get variance without having supplied enough data.")
return self.S / (self.n - 1)
Returns the unbiased sample standard deviation of all the values added so far.

The returned value contains Bessel's correction, i.e. the sum of
squares of differences is divided by \f$ n - 1 \f$ rather than
\f$ n \f$, where \f$ n \f$ is the sample size.

If fewer than two values have been added so far, an exception is raised.
Returns the standard error of the mean, i.e. the unbiased sample
standard deviation divided by the square root of the sample size.
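The definition above can be written as a short stand-alone function. This is a sketch under the assumption that the state is the pair `S` (sum of squared deviations) and `n` (sample size); `standard_error_of_mean` is a hypothetical name, and the error message is illustrative only.

```python
import math


def standard_error_of_mean(s, n):
    """Standard error of the mean from the squared-deviation sum s
    and the sample size n, using Bessel's correction."""
    if n < 2:
        raise ValueError("Need at least two data points.")
    sample_variance = s / (n - 1)
    sample_std_dev = math.sqrt(sample_variance)
    return sample_std_dev / math.sqrt(n)


# Four data points whose squared deviations from the mean sum to 4.75.
se = standard_error_of_mean(4.75, 4)
```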
Returns a `str` that contains the state of this instance.

@see unserializeFromString
ret += str(self.n) + ";"
ret += str(self.M) + ";"
Discards the current state, and loads the state specified in the given
Throws if `state` is of the wrong type.
Throws if `state` does not encode a valid state.
The state to load. Must be a `str` created by
if not isinstance(state, str):
parts = state.split(";")
if int(parts[0]) != 1:
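A round trip through a semicolon-separated state string might look like the sketch below. The field layout `version;n;M;S`, the reading of the leading `1` as a format version, and the function names `serialize` and `unserialize` are assumptions made for illustration; the actual format is defined by the methods of this class.

```python
FORMAT_VERSION = 1  # assumed meaning of the leading field


def serialize(n, mean, s):
    """Encode the state as 'version;n;M;S' (assumed layout)."""
    return ";".join([str(FORMAT_VERSION), str(n), repr(mean), repr(s)])


def unserialize(state):
    """Decode a string produced by serialize() into (n, mean, s)."""
    if not isinstance(state, str):
        raise TypeError("state must be a str")
    parts = state.split(";")
    if len(parts) != 4:
        raise ValueError("state has the wrong number of fields")
    # int() and float() raise ValueError themselves on malformed fields.
    if int(parts[0]) != FORMAT_VERSION:
        raise ValueError("unsupported state version")
    n, mean, s = int(parts[1]), float(parts[2]), float(parts[3])
    if n < 0 or s < 0:
        raise ValueError("state does not encode a valid state")
    return n, mean, s


encoded = serialize(4, 3.75, 4.75)
decoded = unserialize(encoded)
```

Using `repr` for the floating-point fields keeps the round trip exact in Python 3.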
The equality operator.

Returns whether the state of this object is the same as the state of the
`rhs` object; if `rhs` is not of this instance's type, `NotImplemented`
@param[in] rhs The right-hand-side instance to compare to.
if not isinstance(rhs, self.__class__):
    return NotImplemented
return self.__dict__ == rhs.__dict__
The inequality operator.

Returns whether the state of this object differs from the state of the
`rhs` object; if `rhs` is not of this instance's type, `NotImplemented`
@param[in] rhs The right-hand-side instance to compare to.
if not isinstance(rhs, self.__class__):
    return NotImplemented
return not self == rhs
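The comparison pattern used by both operators can be seen on a small stand-in class. `Moments` is a hypothetical class introduced only for this sketch; it mirrors the `isinstance` check and the `__dict__` comparison shown above.

```python
class Moments:
    """Hypothetical stand-in holding a sample size and a mean."""

    def __init__(self, n, mean):
        self.n = n
        self.mean = mean

    def __eq__(self, rhs):
        # Returning NotImplemented lets Python try the reflected
        # comparison on rhs before falling back to identity.
        if not isinstance(rhs, self.__class__):
            return NotImplemented
        return self.__dict__ == rhs.__dict__

    def __ne__(self, rhs):
        if not isinstance(rhs, self.__class__):
            return NotImplemented
        return not self == rhs


same = Moments(1, 2.0) == Moments(1, 2.0)
different = Moments(1, 2.0) != Moments(2, 2.0)
# Comparing with an unrelated type evaluates to False instead of raising.
mixed = Moments(1, 2.0) == 5
```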
Returns a hash of this object that depends on its state, and nothing
Returns a `str` describing the data of this instance.
ret = "OnTheFlyStatistics: {"
ret += ", Standard Deviation: "