Package weka.associations
Class PriorEstimation
java.lang.Object
weka.associations.PriorEstimation
- All Implemented Interfaces:
Serializable
,RevisionHandler
Class implementing the prior estimation of the predictive apriori algorithm
for mining association rules.
Reference: T. Scheffer (2001). Finding Association Rules That Trade Support
Optimally against Confidence. Proc of the 5th European Conf.
on Principles and Practice of Knowledge Discovery in Databases (PKDD'01),
pp. 424-435. Freiburg, Germany: Springer-Verlag.
- Version:
- $Revision: 1.7 $
- Author:
- Stefan Mutter (mutter@cs.waikato.ac.nz)
Constructor Summary
PriorEstimation(Instances instances, int numRules, int numIntervals, boolean car)
- Constructor
Method Summary
final RuleItem addCons(int[] itemArray)
- Generates a class association rule out of a given premise.
final void buildDistribution(double conf, double length)
- Updates the distribution of the confidence values.
final double calculatePriorSum(boolean weighted, double mPoint)
- Calculates the numerator or the denominator of the prior equation.
final Hashtable estimatePrior()
- Method to estimate the prior probabilities.
final double findIntervall(double conf)
- Finds the mid point of the interval a given confidence value falls into.
final void generateDistribution()
- Calculates the prior distribution.
final double[] getMidPoints()
- Returns an ordered array of all mid points.
String getRevision()
- Returns the revision string.
static final double logbinomialCoefficient(int upperIndex, int lowerIndex)
- Calculates the base 2 logarithm of a binomial coefficient.
double midPoint(double size, int number)
- Calculates the mid point of an interval.
final void midPoints()
- Splits the interval [0,1] into a predefined number of intervals and calculates their mid points.
final int[] randomCARule(int maxLength, int actualLength, Random randNum)
- Constructs an item set of certain length randomly (class association rule mining).
final int[] randomRule(int maxLength, int actualLength, Random randNum)
- Constructs an item set of certain length randomly (standard association rule mining).
final RuleItem splitItemSet(int premiseLength, int[] itemArray)
- Splits an item set into premise and consequence and thereby constructs an association rule.
final void updateCounters(ItemSet itemSet)
- Updates the support count of an item set.
-
Constructor Details
-
PriorEstimation
Constructor
- Parameters:
instances
- the instances to be used for generating the associations
numRules
- the number of random rules used for generating the prior
numIntervals
- the number of intervals used to discretise [0,1]
car
- flag indicating whether standard or class association rules are mined
-
-
Method Details
-
generateDistribution
Calculates the prior distribution.
- Throws:
Exception
- if the prior cannot be estimated successfully
-
randomRule
Constructs an item set of a given length randomly. This method is used for standard association rule mining.
- Parameters:
maxLength
- the number of attributes of the instances
actualLength
- the number of attributes that should be present in the item set
randNum
- the random number generator
- Returns:
- a randomly constructed item set in the form of an int array
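As an illustration of what "an item set of a given length in the form of an int array" could look like, here is a minimal sketch. It is not the Weka implementation: the class name `RandomItemSetSketch`, the method name `randomItemSet`, and the encoding (one slot per attribute, `-1` for an absent attribute, `1` as a placeholder for a present one) are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, not part of weka.associations.
final class RandomItemSetSketch {

    // Returns an array of maxLength slots in which exactly actualLength
    // randomly chosen slots are marked as present (1); all others are -1.
    // The -1 / 1 encoding is an assumption of this sketch.
    static int[] randomItemSet(int maxLength, int actualLength, Random rand) {
        if (actualLength < 0 || actualLength > maxLength) {
            throw new IllegalArgumentException("actualLength must be in [0, maxLength]");
        }
        int[] positions = new int[maxLength];
        for (int i = 0; i < maxLength; i++) {
            positions[i] = i;
        }
        int[] items = new int[maxLength];
        Arrays.fill(items, -1);
        // Partial Fisher-Yates shuffle: the first actualLength entries of
        // positions become a uniformly random selection without repetition.
        for (int i = 0; i < actualLength; i++) {
            int j = i + rand.nextInt(maxLength - i);
            int tmp = positions[i];
            positions[i] = positions[j];
            positions[j] = tmp;
            items[positions[i]] = 1;
        }
        return items;
    }
}
```

Passing a seeded `Random` makes the selection reproducible, which is convenient when comparing priors estimated from the same set of random rules.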
-
randomCARule
Constructs an item set of a given length randomly. This method is used for class association rule mining.
- Parameters:
maxLength
- the number of attributes of the instances
actualLength
- the number of attributes that should be present in the item set
randNum
- the random number generator
- Returns:
- a randomly constructed item set in the form of an int array
-
buildDistribution
public final void buildDistribution(double conf, double length)
Updates the distribution of the confidence values. For every confidence value, the interval to which it belongs is located and the confidence is added to the confidence already accumulated in that interval.
- Parameters:
conf
- the confidence of the randomly created rule
length
- the length of the randomly created rule
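The accumulation step described above can be sketched as follows. This is an illustration under stated assumptions, not the Weka code: the class `ConfidenceDistributionSketch` and its methods are hypothetical, the intervals are assumed to be of equal width over [0,1], and the `length` parameter is left out because this page does not say how it enters the distribution.

```java
// Hypothetical accumulator, not part of weka.associations.
final class ConfidenceDistributionSketch {

    // sums[i] holds the total confidence seen so far in interval i.
    private final double[] sums;

    ConfidenceDistributionSketch(int numIntervals) {
        if (numIntervals <= 0) {
            throw new IllegalArgumentException("numIntervals must be positive");
        }
        this.sums = new double[numIntervals];
    }

    // Locates the interval of conf and adds conf to that interval's total.
    void add(double conf) {
        if (conf < 0.0 || conf > 1.0) {
            throw new IllegalArgumentException("conf must be in [0,1]");
        }
        // Assumption: half-open intervals [a, b); conf == 1.0 is clamped
        // into the last interval.
        int index = Math.min((int) (conf * sums.length), sums.length - 1);
        sums[index] += conf;
    }

    double sumAt(int index) {
        return sums[index];
    }
}
```

With four intervals, adding 0.3 and 0.4 places both values in the second interval, so its accumulated confidence is about 0.7.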
-
findIntervall
public final double findIntervall(double conf)
Finds the mid point of the interval a given confidence value falls into.
- Parameters:
conf
- the confidence of a rule
- Returns:
- the mid point of the interval the confidence belongs to
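One possible way to perform such a lookup, given an ordered array of mid points, is a binary search for the nearest mid point. The sketch below is not taken from Weka; `MidPointLookupSketch` and `nearestMidPoint` are hypothetical names. It assumes equal-width intervals, in which case the nearest mid point is the mid point of the containing interval, and it assigns a value lying exactly on a boundary to the upper interval.

```java
import java.util.Arrays;

// Hypothetical helper, not part of weka.associations.
final class MidPointLookupSketch {

    // midPoints must be sorted in ascending order and non-empty.
    static double nearestMidPoint(double[] midPoints, double conf) {
        int pos = Arrays.binarySearch(midPoints, conf);
        if (pos >= 0) {
            return midPoints[pos]; // conf is itself a mid point
        }
        int insertion = -pos - 1; // index of the first element greater than conf
        if (insertion == 0) {
            return midPoints[0];
        }
        if (insertion == midPoints.length) {
            return midPoints[midPoints.length - 1];
        }
        double lower = midPoints[insertion - 1];
        double upper = midPoints[insertion];
        // Ties (conf exactly on an interval boundary) go to the upper interval;
        // this boundary convention is an assumption of the sketch.
        return (conf - lower < upper - conf) ? lower : upper;
    }
}
```

For the mid points 0.125, 0.375, 0.625, 0.875 of four equal intervals, a confidence of 0.3 maps to 0.375 and a confidence of 1.0 maps to 0.875.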
-
calculatePriorSum
public final double calculatePriorSum(boolean weighted, double mPoint)
Calculates the numerator or the denominator of the prior equation.
- Parameters:
weighted
- indicates whether the numerator or the denominator is calculated
mPoint
- the mid point of an interval
- Returns:
- the numerator or denominator of the prior equation
-
logbinomialCoefficient
public static final double logbinomialCoefficient(int upperIndex, int lowerIndex)
Calculates the base 2 logarithm of a binomial coefficient.
- Parameters:
upperIndex
- upper index of the binomial coefficient
lowerIndex
- lower index of the binomial coefficient
- Returns:
- the base 2 logarithm of the binomial coefficient
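For readers who want to see how such a quantity can be computed without overflowing, here is a minimal sketch that sums logarithms instead of multiplying factorials. It is an independent illustration, not the Weka source; `LogBinomialSketch` and `log2Binomial` are hypothetical names, `n` plays the role of the upper index and `k` the lower index, and the handling of invalid indices is an assumption.

```java
// Hypothetical helper, not part of weka.associations.
final class LogBinomialSketch {

    // Returns log2 of C(n, k) using
    // C(n, k) = product over i = 1..m of (n - m + i) / i, with m = min(k, n - k).
    static double log2Binomial(int n, int k) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("require 0 <= k <= n");
        }
        int m = Math.min(k, n - k); // symmetry keeps the loop short
        double naturalLogSum = 0.0;
        for (int i = 1; i <= m; i++) {
            naturalLogSum += Math.log((double) (n - m + i) / i);
        }
        return naturalLogSum / Math.log(2.0); // convert natural log to base 2
    }
}
```

For example, `log2Binomial(5, 2)` is the base 2 logarithm of 10 (about 3.32), and `log2Binomial(4, 0)` is 0.0 because the loop body never runs.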
-
estimatePrior
Method to estimate the prior probabilities.
- Returns:
- a hashtable containing the prior probabilities
- Throws:
Exception
- if the prior cannot be calculated
-
midPoints
public final void midPoints()
Splits the interval [0,1] into a predefined number of intervals and calculates their mid points.
-
midPoint
public double midPoint(double size, int number)
Calculates the mid point of an interval.
- Parameters:
size
- the size of each interval
number
- the number of the interval. The intervals are numbered from 0 to m_numIntervals.
- Returns:
- the mid point of the interval
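The mid point calculation can be illustrated with a short sketch. The formula used here (`size * number + size / 2`) and the 0-based numbering that stops at `numIntervals - 1` are assumptions based on the description above, not a copy of the Weka code; `MidPointSketch` is a hypothetical class.

```java
// Hypothetical helper, not part of weka.associations.
final class MidPointSketch {

    // Mid point of the interval [size * number, size * (number + 1)).
    static double midPoint(double size, int number) {
        return size * number + size / 2.0;
    }

    // Splits [0,1] into numIntervals equal parts and returns their
    // mid points in ascending order.
    static double[] midPoints(int numIntervals) {
        if (numIntervals <= 0) {
            throw new IllegalArgumentException("numIntervals must be positive");
        }
        double size = 1.0 / numIntervals;
        double[] result = new double[numIntervals];
        for (int i = 0; i < numIntervals; i++) {
            result[i] = midPoint(size, i);
        }
        return result;
    }
}
```

With four intervals the size is 0.25 and the mid points are 0.125, 0.375, 0.625 and 0.875.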
-
getMidPoints
public final double[] getMidPoints()
Returns an ordered array of all mid points.
- Returns:
- an ordered array of doubles containing all mid points
-
splitItemSet
Splits an item set into premise and consequence and thereby constructs an association rule. The length of the premise is given. The attributes for premise and consequence are chosen randomly. The result is a RuleItem.
- Parameters:
premiseLength
- the length of the premise
itemArray
- a (randomly generated) item set
- Returns:
- a randomly generated association rule stored in a RuleItem
-
addCons
Generates a class association rule out of a given premise. It randomly chooses a class label as consequence.
- Parameters:
itemArray
- the (randomly constructed) premise of the class association rule
- Returns:
- a class association rule stored in a RuleItem
-
updateCounters
Updates the support count of an item set.
- Parameters:
itemSet
- the item set
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-