Package weka.classifiers.trees
Class SimpleCart
java.lang.Object
weka.classifiers.Classifier
weka.classifiers.RandomizableClassifier
weka.classifiers.trees.SimpleCart
- All Implemented Interfaces:
Serializable, Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class SimpleCart
extends RandomizableClassifier
implements AdditionalMeasureProducer, TechnicalInformationHandler
Class implementing minimal cost-complexity pruning.
Note: missing values are handled with the "fractional instances" method rather than the surrogate split method.
For more information, see:
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California. BibTeX:
@book{Breiman1984,
  address = {Belmont, California},
  author = {Leo Breiman and Jerome H. Friedman and Richard A. Olshen and Charles J. Stone},
  publisher = {Wadsworth International Group},
  title = {Classification and Regression Trees},
  year = {1984}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary splits. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default: no)
-C Proportion of the training data to use, in the range (0, 1]. (default 1)
- Version:
- $Revision: 10491 $
- Author:
- Haijian Shi (hs69@cs.waikato.ac.nz)
Constructor Summary
Constructors
SimpleCart()
Method Summary
Modifier and Type | Method | Description
void | buildClassifier(Instances data) | Build the classifier.
void | calculateAlphas() | Updates the alpha field for all nodes.
double[] | distributionForInstance(Instance instance) | Computes class probabilities for instance using the decision tree.
Enumeration | enumerateMeasures() | Return an enumeration of the measure names.
Capabilities | getCapabilities() | Returns default capabilities of the classifier.
boolean | getHeuristic() | Get whether heuristic search is used for nominal attributes in multi-class problems.
double | getMeasure(String additionalMeasureName) | Returns the value of the named measure.
double | getMinNumObj() | Get the minimal number of instances at the terminal nodes.
int | getNumFoldsPruning() | Get the number of folds in internal cross-validation.
String[] | getOptions() | Gets the current settings of the classifier.
String | getRevision() | Returns the revision string.
double | getSizePer() | Get the training set size, given as a proportion in (0, 1].
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
boolean | getUseOneSE() | Get whether the 1SE rule is used to choose the final model.
boolean | getUsePrune() | Get whether minimal cost-complexity pruning is used.
String | globalInfo() | Return a description suitable for displaying in the explorer/experimenter.
String | heuristicTipText() | Returns the tip text for this property.
Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(String[] args) | Main method.
double | measureTreeSize() | Return the size of the tree.
String | minNumObjTipText() | Returns the tip text for this property.
void | modelErrors() | Updates the numIncorrectModel field for all nodes when the subtree (to be pruned) is rooted.
String | numFoldsPruningTipText() | Returns the tip text for this property.
int | numInnerNodes() | Method to count the number of inner nodes in the tree.
int | numLeaves() | Compute number of leaf nodes.
int | numNodes() | Compute size of the tree.
void | prune(double alpha) | Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
int | prune(double[] alphas, double[] errors, Instances test) | Method for performing one fold in the cross-validation of minimal cost-complexity pruning.
void | setHeuristic(boolean value) | Set whether heuristic search is used for nominal attributes in multi-class problems.
void | setMinNumObj(double value) | Set the minimal number of instances at the terminal nodes.
void | setNumFoldsPruning(int value) | Set the number of folds in internal cross-validation.
void | setOptions(String[] options) | Parses a given list of options.
void | setSizePer(double value) | Set the training set size, given as a proportion in (0, 1].
void | setUseOneSE(boolean value) | Set whether the 1SE rule is used to choose the final model.
void | setUsePrune(boolean value) | Set whether minimal cost-complexity pruning is used.
String | sizePerTipText() | Returns the tip text for this property.
String | toString() | Prints the decision tree using the protected toString method from below.
void | treeErrors() | Updates the numIncorrectTree field for all nodes.
String | useOneSETipText() | Returns the tip text for this property.
String | usePruneTipText() | Returns the tip text for this property.
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, seedTipText, setSeed
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
Constructor Details
-
SimpleCart
public SimpleCart()
-
-
Method Details
-
globalInfo
public String globalInfo()
Return a description suitable for displaying in the explorer/experimenter.
- Returns:
- a description suitable for displaying in the explorer/experimenter
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
-
buildClassifier
public void buildClassifier(Instances data) throws Exception
Build the classifier.
- Specified by:
buildClassifier in class Classifier
- Parameters:
data - the training instances
- Throws:
Exception - if something goes wrong
-
prune
public void prune(double alpha) throws Exception
Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
- Parameters:
alpha - the cost-complexity parameter
- Throws:
Exception - if something goes wrong
-
prune
public int prune(double[] alphas, double[] errors, Instances test) throws Exception
Method for performing one fold in the cross-validation of minimal cost-complexity pruning. Generates a sequence of alpha-values with error estimates for the corresponding (partially pruned) trees, given the test set of that fold.
- Parameters:
alphas - array to hold the generated alpha-values
errors - array to hold the corresponding error estimates
test - test set of that fold (to obtain error estimates)
- Returns:
- the iteration of the pruning
- Throws:
Exception - if something goes wrong
-
modelErrors
public void modelErrors() throws Exception
Updates the numIncorrectModel field for all nodes when the subtree (to be pruned) is rooted. This is needed for calculating the alpha-values.
- Throws:
Exception - if something goes wrong
-
treeErrors
public void treeErrors() throws Exception
Updates the numIncorrectTree field for all nodes. This is needed for calculating the alpha-values.
- Throws:
Exception - if something goes wrong
-
calculateAlphas
public void calculateAlphas() throws Exception
Updates the alpha field for all nodes.
- Throws:
Exception - if something goes wrong
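The relationship between the numIncorrectModel, numIncorrectTree and alpha fields is not spelled out on this page. The sketch below uses the textbook CART definition from Breiman et al.: for an inner node, alpha is the increase in training error caused by collapsing the node to a leaf, divided by the number of leaves removed. It assumes that numIncorrectModel counts the errors a node would make as a leaf and that numIncorrectTree counts the errors of the subtree rooted there. The class and method names are hypothetical and this is not Weka's own code.

```java
class AlphaSketch {

    /**
     * Textbook cost-complexity alpha for a single inner node (hypothetical helper).
     *
     * @param numIncorrectModel training errors if the node were collapsed to a leaf (assumed meaning)
     * @param numIncorrectTree  training errors made by the subtree rooted at the node (assumed meaning)
     * @param numLeaves         number of leaves in that subtree, at least 2 for an inner node
     * @param totalWeight       total weight of the training set
     */
    static double alpha(double numIncorrectModel, double numIncorrectTree,
                        int numLeaves, double totalWeight) {
        if (numLeaves < 2) {
            throw new IllegalArgumentException("an inner node has at least two leaves");
        }
        double errorAsLeaf = numIncorrectModel / totalWeight;
        double errorAsTree = numIncorrectTree / totalWeight;
        // Error increase per leaf removed when the subtree is replaced by one leaf.
        return (errorAsLeaf - errorAsTree) / (numLeaves - 1);
    }

    /** Index of the smallest alpha, i.e. the "weakest link" that is pruned first. */
    static int weakestLink(double[] alphas) {
        int best = 0;
        for (int i = 1; i < alphas.length; i++) {
            if (alphas[i] < alphas[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Node with 10 errors as a leaf, 4 errors as a 3-leaf subtree, 100 training instances.
        System.out.println(alpha(10, 4, 3, 100));
        System.out.println(weakestLink(new double[] {0.03, 0.01, 0.05}));
    }
}
```

Pruning the weakest link repeatedly yields the nested sequence of trees from which a final alpha is chosen by cross-validation.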
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws Exception
Computes class probabilities for instance using the decision tree.
- Overrides:
distributionForInstance in class Classifier
- Parameters:
instance - the instance for which class probabilities are to be computed
- Returns:
- the class probabilities for the given instance
- Throws:
Exception - if something goes wrong
-
toString
public String toString()
Prints the decision tree using the protected toString method from below.
-
numNodes
public int numNodes()
Compute size of the tree.
- Returns:
- size of the tree
-
numInnerNodes
public int numInnerNodes()
Method to count the number of inner nodes in the tree.
- Returns:
- the number of inner nodes
-
numLeaves
public int numLeaves()
Compute number of leaf nodes.
- Returns:
- number of leaf nodes
-
listOptions
public Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClassifier
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(String[] options) throws Exception
Parses a given list of options. Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary splits. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default: no)
-C Proportion of the training data to use, in the range (0, 1]. (default 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClassifier
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
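The option list above can also be assembled programmatically before it is handed to setOptions. The following is a minimal sketch in plain Java: the class SimpleCartOptionsSketch and its buildOptions helper are hypothetical, and only the flag letters come from this page. The call to setOptions itself is left as a comment because it needs Weka on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleCartOptionsSketch {

    /**
     * Builds an option array using the flags documented for SimpleCart.
     * -U and -H are "don't use" switches, so they are emitted only when the
     * corresponding feature is turned off; -A is emitted only when the 1 SE rule is wanted.
     */
    static String[] buildOptions(int seed, double minNumObj, int numFolds,
                                 boolean usePrune, boolean heuristic,
                                 boolean useOneSE, double sizePer) {
        List<String> opts = new ArrayList<>();
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-M");
        opts.add(Double.toString(minNumObj));
        opts.add("-N");
        opts.add(Integer.toString(numFolds));
        if (!usePrune) {
            opts.add("-U");
        }
        if (!heuristic) {
            opts.add("-H");
        }
        if (useOneSE) {
            opts.add("-A");
        }
        opts.add("-C");
        opts.add(Double.toString(sizePer));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] options = buildOptions(42, 5.0, 10, false, true, true, 0.8);
        System.out.println(String.join(" ", options));
        // With Weka available, the array would then be passed on, for example:
        //   SimpleCart cart = new SimpleCart();
        //   cart.setOptions(options);
    }
}
```

Passing the documented defaults produces only the valued flags, since none of the switches -U, -H or -A is set by default.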
-
getOptions
public String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClassifier
- Returns:
- the current setting of the classifier
-
enumerateMeasures
public Enumeration enumerateMeasures()
Return an enumeration of the measure names.
- Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
- Returns:
- an enumeration of the measure names
-
measureTreeSize
public double measureTreeSize()
Return the size of the tree.
- Returns:
- the size of the tree
-
getMeasure
public double getMeasure(String additionalMeasureName)
Returns the value of the named measure.
- Specified by:
getMeasure in interface AdditionalMeasureProducer
- Parameters:
additionalMeasureName - the name of the measure to query for its value
- Returns:
- the value of the named measure
- Throws:
IllegalArgumentException - if the named measure is not supported
-
minNumObjTipText
public String minNumObjTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMinNumObj
public void setMinNumObj(double value)
Set the minimal number of instances at the terminal nodes.
- Parameters:
value - minimal number of instances at the terminal nodes
-
getMinNumObj
public double getMinNumObj()
Get the minimal number of instances at the terminal nodes.
- Returns:
- minimal number of instances at the terminal nodes
-
numFoldsPruningTipText
public String numFoldsPruningTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumFoldsPruning
public void setNumFoldsPruning(int value)
Set the number of folds in internal cross-validation.
- Parameters:
value - number of folds in internal cross-validation.
-
getNumFoldsPruning
public int getNumFoldsPruning()
Get the number of folds in internal cross-validation.
- Returns:
- number of folds in internal cross-validation.
-
usePruneTipText
public String usePruneTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUsePrune
public void setUsePrune(boolean value)
Set whether minimal cost-complexity pruning is used.
- Parameters:
value - whether minimal cost-complexity pruning is used
-
getUsePrune
public boolean getUsePrune()
Get whether minimal cost-complexity pruning is used.
- Returns:
- whether minimal cost-complexity pruning is used
-
heuristicTipText
public String heuristicTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setHeuristic
public void setHeuristic(boolean value)
Set whether heuristic search is used for nominal attributes in multi-class problems.
- Parameters:
value - whether heuristic search is used for nominal attributes in multi-class problems
-
getHeuristic
public boolean getHeuristic()
Get whether heuristic search is used for nominal attributes in multi-class problems.
- Returns:
- whether heuristic search is used for nominal attributes in multi-class problems
-
useOneSETipText
public String useOneSETipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUseOneSE
public void setUseOneSE(boolean value)
Set whether the 1SE rule is used to choose the final model.
- Parameters:
value - whether the 1SE rule is used to choose the final model
-
getUseOneSE
public boolean getUseOneSE()
Get whether the 1SE rule is used to choose the final model.
- Returns:
- whether the 1SE rule is used to choose the final model
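The page does not define the 1SE rule. In the CART literature it means choosing the most heavily pruned tree whose cross-validated error is within one standard error of the minimum. The sketch below illustrates that selection under stated assumptions: the candidate trees are indexed from least pruned to most pruned, and an error estimate and standard error are already available for each. The class OneSeRuleSketch and its select method are hypothetical and are not taken from the Weka source.

```java
class OneSeRuleSketch {

    /**
     * Picks the index of the tree to keep.
     *
     * @param cvErrors  cross-validated error per candidate tree, ordered from
     *                  least pruned (index 0) to most pruned (assumed ordering)
     * @param stdErrors standard error of each estimate, same ordering
     * @param useOneSE  if false, simply return the minimum-error tree
     */
    static int select(double[] cvErrors, double[] stdErrors, boolean useOneSE) {
        // Find the tree with the smallest cross-validated error.
        int best = 0;
        for (int i = 1; i < cvErrors.length; i++) {
            if (cvErrors[i] < cvErrors[best]) {
                best = i;
            }
        }
        if (!useOneSE) {
            return best;
        }
        // Accept any more heavily pruned tree that stays within one standard error.
        double threshold = cvErrors[best] + stdErrors[best];
        int chosen = best;
        for (int i = best + 1; i < cvErrors.length; i++) {
            if (cvErrors[i] <= threshold) {
                chosen = i;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] errors = {0.30, 0.22, 0.20, 0.21, 0.25, 0.35};
        double[] se = {0.015, 0.015, 0.015, 0.015, 0.015, 0.015};
        System.out.println(select(errors, se, false));
        System.out.println(select(errors, se, true));
    }
}
```

The rule trades a small, statistically insignificant increase in estimated error for a smaller tree.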
-
sizePerTipText
public String sizePerTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setSizePer
public void setSizePer(double value)
Set the training set size, given as a proportion in (0, 1].
- Parameters:
value - training set size
-
getSizePer
public double getSizePer()
Get the training set size, given as a proportion in (0, 1].
- Returns:
- training set size
-
getRevision
public String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(String[] args)
Main method.
- Parameters:
args - the options for the classifier
-