Package weka.classifiers.meta
Class Bagging
- java.lang.Object
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class Bagging extends RandomizableIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler
Class for bagging a classifier to reduce variance. Can do classification and regression depending on the base learner.
For more information, see
Leo Breiman (1996). Bagging predictors. Machine Learning. 24(2):123-140.
BibTeX:
@article{Breiman1996,
  author = {Leo Breiman},
  journal = {Machine Learning},
  number = {2},
  pages = {123-140},
  title = {Bagging predictors},
  volume = {24},
  year = {1996}
}
Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
- Version:
- $Revision: 11572 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (len@reeltwo.com), Richard Kirkby (rkirkby@cs.waikato.ac.nz)
- See Also:
- Serialized Form
-
Constructor Summary
Constructor | Description
Bagging() | Constructor.
-
Method Summary
Modifier and Type | Method | Description
java.lang.String | bagSizePercentTipText() | Returns the tip text for this property.
void | buildClassifier(Instances data) | Bagging method.
java.lang.String | calcOutOfBagTipText() | Returns the tip text for this property.
double[] | distributionForInstance(Instance instance) | Calculates the class membership probabilities for the given test instance.
java.util.Enumeration | enumerateMeasures() | Returns an enumeration of the additional measure names.
int | getBagSizePercent() | Gets the size of each bag, as a percentage of the training set size.
boolean | getCalcOutOfBag() | Get whether the out of bag error is calculated.
double | getMeasure(java.lang.String additionalMeasureName) | Returns the value of the named measure.
java.lang.String[] | getOptions() | Gets the current settings of the Classifier.
java.lang.String | getRevision() | Returns the revision string.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String | globalInfo() | Returns a string describing the classifier.
java.util.Enumeration | listOptions() | Returns an enumeration describing the available options.
static void | main(java.lang.String[] argv) | Main method for testing this class.
double | measureOutOfBagError() | Gets the out of bag error that was calculated as the classifier was built.
void | setBagSizePercent(int newBagSizePercent) | Sets the size of each bag, as a percentage of the training set size.
void | setCalcOutOfBag(boolean calcOutOfBag) | Set whether the out of bag error is calculated.
void | setOptions(java.lang.String[] options) | Parses a given list of options.
java.lang.String | toString() | Returns description of the bagged classifier.
-
Methods inherited from class weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, setClassifier
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableIteratedSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableIteratedSingleClassifierEnhancer
- Parameters:
- options - the list of options as an array of strings
- Throws:
- java.lang.Exception - if an option is not supported
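The option layout above can be illustrated without Weka on the classpath. The following is a minimal sketch, not Weka's own option parser: the class BaggingOptionsSketch and its methods are hypothetical names introduced here, and the flag values are arbitrary. It only shows how one options array separates the flags meant for Bagging from the flags that follow -- and are passed to the base classifier.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Weka.
final class BaggingOptionsSketch {

    // An example options array: 80% bags, 25 iterations, seed 7, REPTree as base
    // learner, then REPTree-specific flags after the "--" separator.
    static String[] exampleOptions() {
        return new String[] {
            "-P", "80", "-I", "25", "-S", "7",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "5", "-L", "4"
        };
    }

    // Options before "--": the ones addressed to the meta classifier itself.
    static String[] metaOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    // Options after "--": the ones passed on to the designated base classifier.
    static String[] baseOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? new String[0]
                       : Arrays.copyOfRange(options, sep + 1, options.length);
    }
}
```

An array shaped like the one returned by exampleOptions() is the kind of input that setOptions(java.lang.String[]) documents: a flat list of strings in which each flag is followed by its value, if it takes one.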
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the Classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableIteratedSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
bagSizePercentTipText
public java.lang.String bagSizePercentTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getBagSizePercent
public int getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.
- Returns:
- the bag size, as a percentage.
-
setBagSizePercent
public void setBagSizePercent(int newBagSizePercent)
Sets the size of each bag, as a percentage of the training set size.
- Parameters:
- newBagSizePercent - the bag size, as a percentage.
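To make the meaning of the bag size percentage concrete, here is a minimal sketch of drawing one bag by sampling with replacement. It is an illustration under stated assumptions, not Weka's implementation: the class BootstrapSketch and its methods are hypothetical, the rounding uses integer division, and instance weights are ignored.

```java
import java.util.Random;

// Hypothetical helper class for illustration; not part of Weka.
final class BootstrapSketch {

    // Number of instances in one bag for a given percentage of the training set.
    // Assumption: integer division, so fractions are truncated.
    static int bagSize(int numInstances, int bagSizePercent) {
        return (int) ((long) numInstances * bagSizePercent / 100);
    }

    // Draws the indices of one bag uniformly at random, with replacement.
    static int[] sampleIndices(int numInstances, int bagSizePercent, Random rng) {
        if (numInstances <= 0) {
            throw new IllegalArgumentException("need at least one instance");
        }
        int[] indices = new int[bagSize(numInstances, bagSizePercent)];
        for (int i = 0; i < indices.length; i++) {
            indices[i] = rng.nextInt(numInstances); // the same index may repeat
        }
        return indices;
    }
}
```

Seeding the Random instance, as the -S option does for the classifier, makes the drawn bags reproducible across runs.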
-
calcOutOfBagTipText
public java.lang.String calcOutOfBagTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setCalcOutOfBag
public void setCalcOutOfBag(boolean calcOutOfBag)
Set whether the out of bag error is calculated.
- Parameters:
- calcOutOfBag - whether to calculate the out of bag error
-
getCalcOutOfBag
public boolean getCalcOutOfBag()
Get whether the out of bag error is calculated.
- Returns:
- whether the out of bag error is calculated
-
measureOutOfBagError
public double measureOutOfBagError()
Gets the out of bag error that was calculated as the classifier was built.
- Returns:
- the out of bag error
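As a rough illustration of what an out of bag error measures, the sketch below computes a misclassification rate in which each training instance is judged only by the models whose bag did not contain it. This is the general technique under simplifying assumptions (unweighted majority vote, nominal class, ties resolved toward the lowest class index); the class OutOfBagSketch is hypothetical, and the exact computation inside Weka may differ.

```java
// Hypothetical helper class for illustration; not part of Weka.
final class OutOfBagSketch {

    // inBag[m][i]     : true if instance i was drawn into the bag of model m
    // predicted[m][i] : class index predicted by model m for instance i
    // actual[i]       : true class index of instance i
    static double outOfBagError(boolean[][] inBag, int[][] predicted,
                                int[] actual, int numClasses) {
        int counted = 0;
        int errors = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) {          // only out-of-bag models may vote
                    votes[predicted[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue;                    // instance was in every bag; skip it
            }
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c;
                }
            }
            counted++;
            if (best != actual[i]) {
                errors++;
            }
        }
        // NaN signals that no instance had any out-of-bag vote.
        return counted == 0 ? Double.NaN : (double) errors / counted;
    }
}
```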
-
enumerateMeasures
public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the additional measure names.
- Specified by:
- enumerateMeasures in interface AdditionalMeasureProducer
- Returns:
- an enumeration of the measure names
-
getMeasure
public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
- Specified by:
- getMeasure in interface AdditionalMeasureProducer
- Parameters:
- additionalMeasureName - the name of the measure to query for its value
- Returns:
- the value of the named measure
- Throws:
- java.lang.IllegalArgumentException - if the named measure is not supported
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Bagging method.
- Overrides:
- buildClassifier in class IteratedSingleClassifierEnhancer
- Parameters:
- data - the training data to be used for generating the bagged classifier.
- Throws:
- java.lang.Exception - if the classifier could not be built successfully
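The overall shape of a bagging build can be sketched with a deliberately trivial base learner. In the example below each "model" is just the mean target value of its bootstrap sample, and the bagged prediction is the average over all models. This is a toy under stated assumptions, not the Weka code path: the class BaggingLoopSketch is hypothetical, bags are 100% of the training size, and the parameters numIterations and seed only mirror the roles of the -I and -S options.

```java
import java.util.Random;

// Hypothetical toy for illustration; not part of Weka.
final class BaggingLoopSketch {

    // Builds numIterations base models, each the mean of one bootstrap sample.
    static double[] trainBaggedMeans(double[] targets, int numIterations, long seed) {
        if (targets.length == 0 || numIterations <= 0) {
            throw new IllegalArgumentException("need data and at least one iteration");
        }
        Random rng = new Random(seed);
        double[] models = new double[numIterations];
        for (int m = 0; m < numIterations; m++) {
            double sum = 0.0;
            // Draw a bag of the same size as the training set, with replacement.
            for (int k = 0; k < targets.length; k++) {
                sum += targets[rng.nextInt(targets.length)];
            }
            models[m] = sum / targets.length;
        }
        return models;
    }

    // Bagged regression prediction: the average of the base models' predictions.
    static double predict(double[] models) {
        double sum = 0.0;
        for (double m : models) {
            sum += m;
        }
        return sum / models.length;
    }
}
```

Averaging over many resampled models is what lets bagging reduce variance: individual bags fluctuate, while their mean is more stable.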
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.
- Overrides:
- distributionForInstance in class Classifier
- Parameters:
- instance - the instance to be classified
- Returns:
- predicted class probability distribution
- Throws:
- java.lang.Exception - if distribution can't be computed successfully
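One common way for a bagged ensemble to produce class membership probabilities is to sum the distributions returned by the base models and normalize the result. The sketch below shows that idea only; the class DistributionAveragingSketch is hypothetical, every model is weighted equally, and it should not be read as the exact logic of this method.

```java
// Hypothetical helper class for illustration; not part of Weka.
final class DistributionAveragingSketch {

    // perModel[m][c] is the probability model m assigns to class c.
    static double[] combine(double[][] perModel) {
        if (perModel.length == 0) {
            throw new IllegalArgumentException("need at least one base model");
        }
        double[] sums = new double[perModel[0].length];
        for (double[] dist : perModel) {
            for (int c = 0; c < sums.length; c++) {
                sums[c] += dist[c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total == 0.0) {
            return sums;            // nothing to normalize; all entries are zero
        }
        for (int c = 0; c < sums.length; c++) {
            sums[c] /= total;       // normalize so the entries sum to 1
        }
        return sums;
    }
}
```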
-
toString
public java.lang.String toString()
Returns description of the bagged classifier.
- Overrides:
- toString in class java.lang.Object
- Returns:
- description of the bagged classifier as a string
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
- argv - the options
-