java.lang.Object
  org.apache.commons.math3.random.EmpiricalDistribution
public class EmpiricalDistribution
Represents an empirical probability distribution -- a probability distribution derived from observed data without making any assumptions about the functional form of the population distribution that the data come from.
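An EmpiricalDistribution maintains data structures, called distribution digests, that describe empirical distributions and support loading observed data (load), reporting statistics for the whole sample and for each bin (getSampleStats, getBinStats), and generating random values from the distribution (getNextValue).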
Applications can use EmpiricalDistribution to build grouped frequency histograms representing the input data or to generate random values "like" those in the input file -- i.e., the values generated will follow the distribution of the values in the file.
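The implementation uses what amounts to the Variable Kernel Method with Gaussian smoothing:
Digesting the input file
1. Pass the file once to compute min and max.
2. Divide the range from min to max into binCount "bins."
3. Pass the data file again, computing bin counts and univariate statistics (mean, standard deviation) for each of the bins.
4. Divide the interval (0,1) into subintervals associated with the bins, with the length of a bin's subinterval proportional to its count.
Generating random values from the distribution
1. Generate a uniformly distributed value in (0,1).
2. Select the subinterval to which the value belongs.
3. Generate a random Gaussian value with mean and standard deviation equal to those of the associated bin.
USAGE NOTES:
binCount is set by default to 1000. A good rule of thumb is to set the bin count to approximately the length of the input file divided by 10.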
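As a rough illustration of the binning-plus-Gaussian-smoothing scheme named above, the following self-contained sketch digests an array in two passes and then samples by choosing a bin and drawing a Gaussian around that bin's mean. The class EmpiricalSketch, its fields, and the helper suggestedBinCount are hypothetical names introduced here; equal-width bins and the use of java.util.Random are assumptions. It is not the library's implementation.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical stand-in for the digest-and-sample scheme; not the library's code. */
final class EmpiricalSketch {
    final int binCount;
    final double min;
    final double max;
    final double delta;                  // assumed equal bin width: (max - min) / binCount
    final long[] counts;                 // observations per bin
    final double[] means;                // per-bin mean
    final double[] stdDevs;              // per-bin sample standard deviation
    final double[] generatorUpperBounds; // cumulative bin proportions, last entry 1.0
    private final Random random;

    EmpiricalSketch(double[] data, int binCount, long seed) {
        if (data == null || data.length == 0 || binCount < 1) {
            throw new IllegalArgumentException("need data and a positive bin count");
        }
        this.binCount = binCount;
        this.random = new Random(seed);

        // Pass 1: find min and max.
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        for (double v : data) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        min = lo;
        max = hi;
        delta = (hi - lo) / binCount;

        // Pass 2: per-bin count, mean and variance (Welford's online update).
        counts = new long[binCount];
        means = new double[binCount];
        stdDevs = new double[binCount];
        double[] m2 = new double[binCount];
        for (double v : data) {
            int b = findBin(v);
            counts[b]++;
            double d = v - means[b];
            means[b] += d / counts[b];
            m2[b] += d * (v - means[b]);
        }

        // Subintervals of (0,1) with lengths proportional to bin counts.
        generatorUpperBounds = new double[binCount];
        double cumulative = 0.0;
        for (int i = 0; i < binCount; i++) {
            stdDevs[i] = counts[i] > 1 ? Math.sqrt(m2[i] / (counts[i] - 1)) : 0.0;
            cumulative += (double) counts[i] / data.length;
            generatorUpperBounds[i] = cumulative;
        }
        generatorUpperBounds[binCount - 1] = 1.0; // guard against rounding
    }

    /** Index of the bin containing value, clamped to [0, binCount - 1]. */
    int findBin(double value) {
        int raw = (int) Math.ceil((value - min) / delta) - 1;
        return Math.min(Math.max(raw, 0), binCount - 1);
    }

    /** Uniform draw selects a bin; a Gaussian draw smooths within it. */
    double nextValue() {
        double u = random.nextDouble();
        for (int i = 0; i < binCount; i++) {
            if (u <= generatorUpperBounds[i] && counts[i] > 0) {
                return means[i] + stdDevs[i] * random.nextGaussian();
            }
        }
        throw new IllegalStateException("no bin selected");
    }

    /** Rule of thumb from the usage notes: about one bin per ten observations. */
    static int suggestedBinCount(int sampleSize) {
        return Math.max(1, sampleSize / 10);
    }

    public static void main(String[] args) {
        double[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8};
        EmpiricalSketch sketch = new EmpiricalSketch(data, 4, 42L);
        System.out.println(Arrays.toString(sketch.counts));
        System.out.println(sketch.nextValue());
    }
}
```

Bins holding a single observation get a standard deviation of zero in this sketch, so a draw from such a bin returns that observation unchanged.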
Nested Class Summary | | |
---|---|---|
private class | EmpiricalDistribution.ArrayDataAdapter | DataAdapter for data provided as an array of doubles. |
private class | EmpiricalDistribution.DataAdapter | Provides methods for computing sampleStats and binStats, abstracting the source of data. |
private class | EmpiricalDistribution.DataAdapterFactory | Factory of DataAdapter objects. |
private class | EmpiricalDistribution.StreamDataAdapter | DataAdapter for data provided through an input stream. |
Field Summary | | |
---|---|---|
private int | binCount | Number of bins |
private List<SummaryStatistics> | binStats | List of SummaryStatistics objects characterizing the bins |
static int | DEFAULT_BIN_COUNT | Default bin count |
private double | delta | Grid size |
private boolean | loaded | Whether the distribution has been loaded |
private double | max | Max loaded value |
private double | min | Min loaded value |
private RandomDataImpl | randomData | RandomDataImpl instance to use in repeated calls to getNextValue() |
private SummaryStatistics | sampleStats | Sample statistics |
private static long | serialVersionUID | Serializable version identifier |
private double[] | upperBounds | Upper bounds of subintervals in (0,1) "belonging" to the bins |
Constructor Summary | |
---|---|
EmpiricalDistribution() | Creates a new EmpiricalDistribution with the default bin count. |
EmpiricalDistribution(int binCount) | Creates a new EmpiricalDistribution with the specified bin count. |
EmpiricalDistribution(int binCount, RandomDataImpl randomData) | Creates a new EmpiricalDistribution with the specified bin count using the provided RandomDataImpl instance as the source of random data. |
EmpiricalDistribution(int binCount, RandomGenerator generator) | Creates a new EmpiricalDistribution with the specified bin count using the provided RandomGenerator as the source of random data. |
EmpiricalDistribution(RandomDataImpl randomData) | Creates a new EmpiricalDistribution with default bin count using the provided RandomDataImpl as the source of random data. |
EmpiricalDistribution(RandomGenerator generator) | Creates a new EmpiricalDistribution with default bin count using the provided RandomGenerator as the source of random data. |
Method Summary | | |
---|---|---|
private void | fillBinStats(Object in) | Fills binStats array (second pass through data file). |
private int | findBin(double value) | Returns the index of the bin to which the given value belongs. |
int | getBinCount() | Returns the number of bins. |
List<SummaryStatistics> | getBinStats() | Returns a List of SummaryStatistics instances containing statistics describing the values in each of the bins. |
double[] | getGeneratorUpperBounds() | Returns a fresh copy of the array of upper bounds of the subintervals of [0,1] used in generating data from the empirical distribution. |
double | getNextValue() | Generates a random value from this distribution. |
StatisticalSummary | getSampleStats() | Returns a StatisticalSummary describing this distribution. |
double[] | getUpperBounds() | Returns a fresh copy of the array of upper bounds for the bins. |
boolean | isLoaded() | Property indicating whether or not the distribution has been loaded. |
void | load(double[] in) | Computes the empirical distribution from the provided array of numbers. |
void | load(File file) | Computes the empirical distribution from the input file. |
void | load(URL url) | Computes the empirical distribution using data read from a URL. |
void | reSeed(long seed) | Reseeds the random number generator used by getNextValue(). |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int DEFAULT_BIN_COUNT
private static final long serialVersionUID
private final List<SummaryStatistics> binStats
private SummaryStatistics sampleStats
private double max
private double min
private double delta
private final int binCount
private boolean loaded
private double[] upperBounds
private final RandomDataImpl randomData
Constructor Detail |
---|
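public EmpiricalDistribution()
Creates a new EmpiricalDistribution with the default bin count.
public EmpiricalDistribution(int binCount)
Creates a new EmpiricalDistribution with the specified bin count.
Parameters:
binCount - number of bins
public EmpiricalDistribution(int binCount, RandomGenerator generator)
Creates a new EmpiricalDistribution with the specified bin count using the provided RandomGenerator as the source of random data.
Parameters:
binCount - number of bins
generator - random data generator (may be null, resulting in default JDK generator)
public EmpiricalDistribution(RandomGenerator generator)
Creates a new EmpiricalDistribution with default bin count using the provided RandomGenerator as the source of random data.
Parameters:
generator - random data generator (may be null, resulting in default JDK generator)
public EmpiricalDistribution(int binCount, RandomDataImpl randomData)
Creates a new EmpiricalDistribution with the specified bin count using the provided RandomDataImpl instance as the source of random data.
Parameters:
binCount - number of bins
randomData - random data generator (may be null, resulting in default JDK generator)
public EmpiricalDistribution(RandomDataImpl randomData)
Creates a new EmpiricalDistribution with default bin count using the provided RandomDataImpl as the source of random data.
Parameters:
randomData - random data generator (may be null, resulting in default JDK generator)
Method Detail |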
---|
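public void load(double[] in) throws NullArgumentException
Computes the empirical distribution from the provided array of numbers.
Parameters:
in - the input data array
Throws:
NullArgumentException - if in is null
public void load(URL url) throws IOException, NullArgumentException
Computes the empirical distribution using data read from a URL.
Parameters:
url - URL of the input file
Throws:
IOException - if an IO error occurs
NullArgumentException - if url is null
public void load(File file) throws IOException, NullArgumentException
Computes the empirical distribution from the input file.
Parameters:
file - the input file
Throws:
IOException - if an IO error occurs
NullArgumentException - if file is null
private void fillBinStats(Object in) throws IOException
Fills binStats array (second pass through data file).
Parameters:
in - object providing access to the data
Throws:
IOException - if an IO error occurs
private int findBin(double value)
Returns the index of the bin to which the given value belongs.
Parameters:
value - the value whose bin we are trying to find
public double getNextValue() throws MathIllegalStateException
Generates a random value from this distribution.
Throws:
MathIllegalStateException - if the distribution has not been loaded
public StatisticalSummary getSampleStats()
Returns a StatisticalSummary describing this distribution.
Preconditions: the distribution must be loaded before invoking this method.
Throws:
IllegalStateException - if the distribution has not been loaded
public int getBinCount()
Returns the number of bins.
public List<SummaryStatistics> getBinStats()
Returns a List of SummaryStatistics instances containing statistics describing the values in each of the bins. The list is indexed on the bin number.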
public double[] getUpperBounds()
Returns a fresh copy of the array of upper bounds for the bins. Bins are:
[min,upperBounds[0]],(upperBounds[0],upperBounds[1]],...,(upperBounds[binCount-2], upperBounds[binCount-1] = max].
Note: In versions 1.0-2.0 of commons-math, this method incorrectly returned the array of probability generator upper bounds now returned by getGeneratorUpperBounds().
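The bin layout described for getUpperBounds() can be sketched in plain Java as follows. This is a minimal illustration, assuming equal-width bins with grid size delta = (max - min) / binCount; the class BinBoundsSketch and both method signatures are hypothetical and are not part of the library.

```java
/** Hypothetical helper illustrating the bin layout; not the library's code. */
final class BinBoundsSketch {

    /** Upper bound of each bin; the last bound is pinned to max. */
    static double[] upperBounds(double min, double max, int binCount) {
        double delta = (max - min) / binCount; // assumed equal-width grid
        double[] bounds = new double[binCount];
        for (int i = 0; i < binCount - 1; i++) {
            bounds[i] = min + delta * (i + 1);
        }
        bounds[binCount - 1] = max;
        return bounds;
    }

    /**
     * Index of the bin containing value. Bins are closed on the right, so a
     * value equal to an upper bound belongs to the lower bin; min itself is
     * clamped into bin 0.
     */
    static int findBin(double value, double min, double max, int binCount) {
        double delta = (max - min) / binCount;
        int raw = (int) Math.ceil((value - min) / delta) - 1;
        return Math.min(Math.max(raw, 0), binCount - 1);
    }

    public static void main(String[] args) {
        double[] bounds = upperBounds(0.0, 8.0, 4);
        System.out.println(java.util.Arrays.toString(bounds));
        System.out.println(findBin(2.0, 0.0, 8.0, 4));
        System.out.println(findBin(2.5, 0.0, 8.0, 4));
    }
}
```

The clamping in findBin is what makes the first bin closed on both sides, matching the [min,upperBounds[0]] notation above.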
public double[] getGeneratorUpperBounds()
Returns a fresh copy of the array of upper bounds of the subintervals of [0,1] used in generating data from the empirical distribution. Subintervals correspond to bins with lengths proportional to bin counts.
In versions 1.0-2.0 of commons-math, this array was (incorrectly) returned by getUpperBounds().
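To make "lengths proportional to bin counts" concrete, the sketch below builds cumulative upper bounds from an array of bin counts and uses them to map a uniform draw to a bin. The class GeneratorBoundsSketch and its methods are hypothetical names; this is one plausible reading of the description, not the library's code.

```java
/** Hypothetical helper illustrating cumulative generator bounds; not the library's code. */
final class GeneratorBoundsSketch {

    /** Cumulative proportions of the bin counts; the last entry is pinned to 1.0. */
    static double[] generatorUpperBounds(long[] binCounts) {
        long total = 0;
        for (long c : binCounts) {
            total += c;
        }
        if (total == 0) {
            throw new IllegalArgumentException("no observations");
        }
        double[] bounds = new double[binCounts.length];
        double cumulative = 0.0;
        for (int i = 0; i < binCounts.length; i++) {
            cumulative += (double) binCounts[i] / total;
            bounds[i] = cumulative;
        }
        bounds[bounds.length - 1] = 1.0; // guard against rounding error
        return bounds;
    }

    /** First bin whose upper bound is at least u, for u in [0,1]. */
    static int selectBin(double u, double[] bounds) {
        for (int i = 0; i < bounds.length; i++) {
            if (u <= bounds[i]) {
                return i;
            }
        }
        return bounds.length - 1;
    }

    public static void main(String[] args) {
        double[] bounds = generatorUpperBounds(new long[] {2, 2, 4});
        System.out.println(java.util.Arrays.toString(bounds));
        System.out.println(selectBin(0.3, bounds));
    }
}
```

A bin holding half of the observations receives a subinterval of length 0.5, so a uniform draw lands in it about half the time.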
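public boolean isLoaded()
Property indicating whether or not the distribution has been loaded.
public void reSeed(long seed)
Reseeds the random number generator used by getNextValue().
Parameters:
seed - random generator seed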
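Reseeding is mainly useful for reproducible runs. The sketch below shows the idea with java.util.Random standing in for the generator. The class ReseedSketch, its fixed mean and standard deviation, and the sameSequence helper are hypothetical, and the assumption that the library's generator is equally deterministic for a given seed is not confirmed by this page.

```java
import java.util.Random;

/** Hypothetical illustration of reseeding for reproducibility; not the library's code. */
final class ReseedSketch {
    private final Random random = new Random();

    /** Resets the generator so that later draws depend only on the seed. */
    void reSeed(long seed) {
        random.setSeed(seed);
    }

    /** Stand-in for a smoothed draw: Gaussian with fixed, made-up mean and deviation. */
    double nextValue() {
        return 10.0 + 2.0 * random.nextGaussian();
    }

    /** Draws n values after reseeding with the given seed. */
    static double[] draw(long seed, int n) {
        ReseedSketch sketch = new ReseedSketch();
        sketch.reSeed(seed);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = sketch.nextValue();
        }
        return values;
    }

    /** True if two independently reseeded generators produce identical draws. */
    static boolean sameSequence(long seed, int n) {
        return java.util.Arrays.equals(draw(seed, n), draw(seed, n));
    }

    public static void main(String[] args) {
        System.out.println(sameSequence(42L, 5));
    }
}
```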