Package weka.classifiers.trees
Class SimpleCart
- java.lang.Object
  - weka.classifiers.Classifier
    - weka.classifiers.RandomizableClassifier
      - weka.classifiers.trees.SimpleCart
- All Implemented Interfaces:
- java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class SimpleCart extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
Class implementing minimal cost-complexity pruning.
Note: missing values are handled with the "fractional instances" method instead of the surrogate split method.
For more information, see:
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.
BibTeX:
@book{Breiman1984,
   address = {Belmont, California},
   author = {Leo Breiman and Jerome H. Friedman and Richard A. Olshen and Charles J. Stone},
   publisher = {Wadsworth International Group},
   title = {Classification and Regression Trees},
   year = {1984}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use the minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary split. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default no)
-C Fraction of the training data to use, in the range (0, 1]. (default 1)
- Version:
- $Revision: 10491 $
- Author:
- Haijian Shi (hs69@cs.waikato.ac.nz)
- See Also:
- Serialized Form
-
-
Constructor Summary
SimpleCart()
-
Method Summary
Modifier and Type | Method | Description
void | buildClassifier(Instances data) | Build the classifier.
void | calculateAlphas() | Updates the alpha field for all nodes.
double[] | distributionForInstance(Instance instance) | Computes class probabilities for instance using the decision tree.
java.util.Enumeration | enumerateMeasures() | Return an enumeration of the measure names.
Capabilities | getCapabilities() | Returns default capabilities of the classifier.
boolean | getHeuristic() | Get whether heuristic search is used for nominal attributes in multi-class problems.
double | getMeasure(java.lang.String additionalMeasureName) | Returns the value of the named measure.
double | getMinNumObj() | Get minimal number of instances at the terminal nodes.
int | getNumFoldsPruning() | Get number of folds in internal cross-validation.
java.lang.String[] | getOptions() | Gets the current settings of the classifier.
java.lang.String | getRevision() | Returns the revision string.
double | getSizePer() | Get the training set size, as a fraction of the training data.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
boolean | getUseOneSE() | Get whether the 1SE rule is used to choose the final model.
boolean | getUsePrune() | Get whether minimal cost-complexity pruning is used.
java.lang.String | globalInfo() | Return a description suitable for displaying in the explorer/experimenter.
java.lang.String | heuristicTipText() | Returns the tip text for this property.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] args) | Main method.
double | measureTreeSize() | Return the size of the tree.
java.lang.String | minNumObjTipText() | Returns the tip text for this property.
void | modelErrors() | Updates the numIncorrectModel field for all nodes when subtree (to be pruned) is rooted.
java.lang.String | numFoldsPruningTipText() | Returns the tip text for this property.
int | numInnerNodes() | Method to count the number of inner nodes in the tree.
int | numLeaves() | Compute number of leaf nodes.
int | numNodes() | Compute size of the tree.
void | prune(double alpha) | Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
int | prune(double[] alphas, double[] errors, Instances test) | Method for performing one fold in the cross-validation of minimal cost-complexity pruning.
void | setHeuristic(boolean value) | Set whether to use heuristic search for nominal attributes in multi-class problems.
void | setMinNumObj(double value) | Set minimal number of instances at the terminal nodes.
void | setNumFoldsPruning(int value) | Set number of folds in internal cross-validation.
void | setOptions(java.lang.String[] options) | Parses a given list of options.
void | setSizePer(double value) | Set the training set size, as a fraction of the training data in (0, 1].
void | setUseOneSE(boolean value) | Set whether to use the 1SE rule to choose the final model.
void | setUsePrune(boolean value) | Set whether to use minimal cost-complexity pruning.
java.lang.String | sizePerTipText() | Returns the tip text for this property.
java.lang.String | toString() | Prints the decision tree using the protected toString method from below.
void | treeErrors() | Updates the numIncorrectTree field for all nodes.
java.lang.String | useOneSETipText() | Returns the tip text for this property.
java.lang.String | usePruneTipText() | Returns the tip text for this property.
-
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
-
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Return a description suitable for displaying in the explorer/experimenter.
- Returns:
- a description suitable for displaying in the explorer/experimenter
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns default capabilities of the classifier.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class Classifier
- Returns:
- the capabilities of this classifier
- See Also:
Capabilities
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Build the classifier.
- Specified by:
- buildClassifier in class Classifier
- Parameters:
- data - the training instances
- Throws:
- java.lang.Exception - if something goes wrong
-
prune
public void prune(double alpha) throws java.lang.Exception
Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
- Parameters:
- alpha - the cost-complexity parameter
- Throws:
- java.lang.Exception - if something goes wrong
-
prune
public int prune(double[] alphas, double[] errors, Instances test) throws java.lang.Exception
Method for performing one fold in the cross-validation of minimal cost-complexity pruning. Generates a sequence of alpha-values with error estimates for the corresponding (partially pruned) trees, given the test set of that fold.
- Parameters:
- alphas - array to hold the generated alpha-values
- errors - array to hold the corresponding error estimates
- test - test set of that fold (to obtain error estimates)
- Returns:
- the iteration of the pruning
- Throws:
- java.lang.Exception - if something goes wrong
-
modelErrors
public void modelErrors() throws java.lang.Exception
Updates the numIncorrectModel field for all nodes when subtree (to be pruned) is rooted. This is needed for calculating the alpha-values.
- Throws:
- java.lang.Exception - if something goes wrong
-
treeErrors
public void treeErrors() throws java.lang.Exception
Updates the numIncorrectTree field for all nodes. This is needed for calculating the alpha-values.
- Throws:
- java.lang.Exception - if something goes wrong
-
calculateAlphas
public void calculateAlphas() throws java.lang.Exception
Updates the alpha field for all nodes.
- Throws:
- java.lang.Exception - if something goes wrong
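The three methods above maintain the per-node quantities that feed the alpha computation. As a rough illustration, the textbook CART "weakest link" value for a node is (errors if the node were a leaf minus errors of its subtree) divided by (number of leaves in the subtree minus 1). The sketch below is not SimpleCart's internal code: the class name, method names, and the normalisation by the training set size are assumptions made for illustration.

```java
// Illustrative only: the textbook CART weakest-link alpha, not SimpleCart's internal code.
class WeakestLinkSketch {

    /**
     * @param errorsAsLeaf  training errors if the node were collapsed into a leaf
     *                      (the role this page gives to numIncorrectModel)
     * @param errorsSubtree training errors made by the subtree below the node
     *                      (the role this page gives to numIncorrectTree)
     * @param numLeaves     number of leaves in that subtree
     * @param numTrain      total number of training instances (assumed normaliser)
     */
    static double alpha(double errorsAsLeaf, double errorsSubtree, int numLeaves, double numTrain) {
        if (numLeaves < 2) {
            // A leaf has nothing to prune, so it can never be the weakest link.
            return Double.MAX_VALUE;
        }
        return ((errorsAsLeaf - errorsSubtree) / numTrain) / (numLeaves - 1);
    }

    /** Index of the node with the smallest alpha, i.e. the next subtree to collapse. */
    static int weakestLink(double[] alphas) {
        int best = 0;
        for (int i = 1; i < alphas.length; i++) {
            if (alphas[i] < alphas[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] alphas = {
            alpha(30, 10, 5, 100),
            alpha(12, 10, 2, 100),
            alpha(8, 8, 1, 100)
        };
        System.out.println("weakest link: node " + weakestLink(alphas));
    }
}
```

Pruning with a given alpha then amounts to repeatedly collapsing the weakest link while its value does not exceed that alpha.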
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Computes class probabilities for instance using the decision tree.
- Overrides:
- distributionForInstance in class Classifier
- Parameters:
- instance - the instance for which class probabilities are to be computed
- Returns:
- the class probabilities for the given instance
- Throws:
- java.lang.Exception - if something goes wrong
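The class description says missing values are handled with "fractional instances". A minimal sketch of that idea, assuming a toy binary tree: when the tested attribute is missing, the instance is sent down both branches and the resulting class distributions are mixed in proportion to how the training data was split. The node type, the field names, and the use of NaN to mark a missing value are assumptions of this sketch, not SimpleCart's actual data structures.

```java
// Illustrative only: a toy binary tree, not SimpleCart's internal node structure.
class FractionalInstanceSketch {

    static class Node {
        double[] classProbs;   // class distribution, set for leaves only
        int attribute;         // index of the numeric attribute tested at an inner node
        double splitPoint;     // go left if value < splitPoint, otherwise right
        double leftFraction;   // share of training instances that went left
        Node left, right;

        boolean isLeaf() {
            return left == null;
        }
    }

    static Node leaf(double... probs) {
        Node n = new Node();
        n.classProbs = probs;
        return n;
    }

    static Node split(int attribute, double splitPoint, double leftFraction, Node left, Node right) {
        Node n = new Node();
        n.attribute = attribute;
        n.splitPoint = splitPoint;
        n.leftFraction = leftFraction;
        n.left = left;
        n.right = right;
        return n;
    }

    /** Missing values are encoded as NaN in this sketch (an assumption). */
    static double[] distribution(Node node, double[] instance) {
        if (node.isLeaf()) {
            return node.classProbs.clone();
        }
        double value = instance[node.attribute];
        if (!Double.isNaN(value)) {
            // Known value: follow exactly one branch.
            return distribution(value < node.splitPoint ? node.left : node.right, instance);
        }
        // Missing value: follow both branches and weight them by the training proportions.
        double[] l = distribution(node.left, instance);
        double[] r = distribution(node.right, instance);
        double[] mixed = new double[l.length];
        for (int i = 0; i < mixed.length; i++) {
            mixed[i] = node.leftFraction * l[i] + (1.0 - node.leftFraction) * r[i];
        }
        return mixed;
    }
}
```

Unlike surrogate splits, this approach needs no backup attribute at each node; it only needs the branch proportions recorded at training time.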
-
toString
public java.lang.String toString()
Prints the decision tree using the protected toString method from below.
- Overrides:
- toString in class java.lang.Object
- Returns:
- a textual description of the classifier
-
numNodes
public int numNodes()
Compute size of the tree.
- Returns:
- size of the tree
-
numInnerNodes
public int numInnerNodes()
Method to count the number of inner nodes in the tree.
- Returns:
- the number of inner nodes
-
numLeaves
public int numLeaves()
Compute number of leaf nodes.
- Returns:
- number of leaf nodes
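The three counting methods are related: the tree size is the number of inner nodes plus the number of leaves. The sketch below shows the recursion on a toy binary tree; the Node class is hypothetical, and it assumes every inner node has exactly two children, as in a CART-style binary tree.

```java
// Illustrative only: recursive node counts on a toy binary tree.
class TreeSizeSketch {

    static class Node {
        final Node left;
        final Node right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        /** Creates a leaf. */
        Node() {
            this(null, null);
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    static int numLeaves(Node n) {
        return n.isLeaf() ? 1 : numLeaves(n.left) + numLeaves(n.right);
    }

    static int numInnerNodes(Node n) {
        return n.isLeaf() ? 0 : 1 + numInnerNodes(n.left) + numInnerNodes(n.right);
    }

    static int numNodes(Node n) {
        return numInnerNodes(n) + numLeaves(n);
    }
}
```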
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableClassifier
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use the minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary split. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default no)
-C Fraction of the training data to use, in the range (0, 1]. (default 1)
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableClassifier
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
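Because -U and -H are negative switches (they turn a feature off), it is easy to get their sense backwards when assembling an options array by hand. The helper below is a hypothetical sketch, not part of Weka: it only builds a String[] in the shape that setOptions(java.lang.String[]) is documented to accept, using the flags listed above.

```java
// Illustrative only: a hypothetical helper that assembles the documented flags.
class SimpleCartOptionsSketch {

    static String[] toOptions(int seed, double minNumObj, int numFolds,
                              boolean usePrune, boolean heuristic,
                              boolean useOneSE, double sizePer) {
        if (sizePer <= 0 || sizePer > 1) {
            throw new IllegalArgumentException("sizePer must be in (0, 1]");
        }
        java.util.List<String> opts = new java.util.ArrayList<>();
        opts.add("-S");
        opts.add(Integer.toString(seed));
        opts.add("-M");
        opts.add(Double.toString(minNumObj));
        opts.add("-N");
        opts.add(Integer.toString(numFolds));
        if (!usePrune) {
            opts.add("-U");   // flag present means pruning is switched off
        }
        if (!heuristic) {
            opts.add("-H");   // flag present means the heuristic is switched off
        }
        if (useOneSE) {
            opts.add("-A");   // flag present means the 1 SE rule is switched on
        }
        opts.add("-C");
        opts.add(Double.toString(sizePer));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Documented defaults: seed 1, min 2 instances, 5 folds, pruning and heuristic on.
        System.out.println(String.join(" ", toOptions(1, 2, 5, true, true, false, 1)));
    }
}
```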
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableClassifier
- Returns:
- the current setting of the classifier
-
enumerateMeasures
public java.util.Enumeration enumerateMeasures()
Return an enumeration of the measure names.
- Specified by:
- enumerateMeasures in interface AdditionalMeasureProducer
- Returns:
- an enumeration of the measure names
-
measureTreeSize
public double measureTreeSize()
Return the size of the tree.
- Returns:
- the size of the tree
-
getMeasure
public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
- Specified by:
- getMeasure in interface AdditionalMeasureProducer
- Parameters:
- additionalMeasureName - the name of the measure to query for its value
- Returns:
- the value of the named measure
- Throws:
- java.lang.IllegalArgumentException - if the named measure is not supported
-
minNumObjTipText
public java.lang.String minNumObjTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMinNumObj
public void setMinNumObj(double value)
Set minimal number of instances at the terminal nodes.
- Parameters:
- value - minimal number of instances at the terminal nodes
-
getMinNumObj
public double getMinNumObj()
Get minimal number of instances at the terminal nodes.
- Returns:
- minimal number of instances at the terminal nodes
-
numFoldsPruningTipText
public java.lang.String numFoldsPruningTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setNumFoldsPruning
public void setNumFoldsPruning(int value)
Set number of folds in internal cross-validation.
- Parameters:
- value - number of folds in internal cross-validation.
-
getNumFoldsPruning
public int getNumFoldsPruning()
Get number of folds in internal cross-validation.
- Returns:
- number of folds in internal cross-validation.
-
usePruneTipText
public java.lang.String usePruneTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUsePrune
public void setUsePrune(boolean value)
Set whether to use minimal cost-complexity pruning.
- Parameters:
- value - whether to use minimal cost-complexity pruning
-
getUsePrune
public boolean getUsePrune()
Get whether minimal cost-complexity pruning is used.
- Returns:
- whether minimal cost-complexity pruning is used
-
heuristicTipText
public java.lang.String heuristicTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setHeuristic
public void setHeuristic(boolean value)
Set whether to use heuristic search for nominal attributes in multi-class problems.
- Parameters:
- value - whether to use heuristic search for nominal attributes in multi-class problems
-
getHeuristic
public boolean getHeuristic()
Get whether heuristic search is used for nominal attributes in multi-class problems.
- Returns:
- whether heuristic search is used for nominal attributes in multi-class problems
-
useOneSETipText
public java.lang.String useOneSETipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setUseOneSE
public void setUseOneSE(boolean value)
Set whether to use the 1SE rule to choose the final model.
- Parameters:
- value - whether to use the 1SE rule to choose the final model
-
getUseOneSE
public boolean getUseOneSE()
Get whether the 1SE rule is used to choose the final model.
- Returns:
- whether the 1SE rule is used to choose the final model
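For readers unfamiliar with the 1SE rule, here is a minimal sketch of the usual formulation from the CART literature: among the candidate pruned trees, pick the most heavily pruned one whose cross-validated error is within one standard error of the minimum. The class, method, and parameter names are hypothetical, and this is not claimed to match SimpleCart's internal selection code.

```java
// Illustrative only: the textbook one-standard-error rule for choosing a pruned tree.
class OneSERuleSketch {

    /**
     * Candidates are assumed to be ordered from least pruned (index 0)
     * to most pruned (last index).
     *
     * @return index of the selected candidate
     */
    static int select(double[] cvErrors, double[] stdErrors, boolean useOneSE) {
        int best = 0;
        for (int i = 1; i < cvErrors.length; i++) {
            // "<=" prefers the more heavily pruned tree when errors tie.
            if (cvErrors[i] <= cvErrors[best]) {
                best = i;
            }
        }
        if (!useOneSE) {
            return best;
        }
        double threshold = cvErrors[best] + stdErrors[best];
        // Walk back from the most pruned tree and stop at the first one that is good enough.
        for (int i = cvErrors.length - 1; i > best; i--) {
            if (cvErrors[i] <= threshold) {
                return i;
            }
        }
        return best;
    }
}
```

The rule trades a small, statistically insignificant increase in estimated error for a smaller tree.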
-
sizePerTipText
public java.lang.String sizePerTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui.
-
setSizePer
public void setSizePer(double value)
Set the training set size, as a fraction of the training data in (0, 1].
- Parameters:
- value - training set size
-
getSizePer
public double getSizePer()
Get the training set size, as a fraction of the training data.
- Returns:
- training set size
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] args)
Main method.
- Parameters:
- args - the options for the classifier
-
-