Package weka.clusterers
Class sIB
java.lang.Object
weka.clusterers.AbstractClusterer
weka.clusterers.RandomizableClusterer
weka.clusterers.sIB
- All Implemented Interfaces:
Serializable, Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
Cluster data using the sequential information bottleneck algorithm.
Note: only the hard clustering scheme is supported. sIB assigns each instance to the cluster that has the minimum cost/distance to the instance. The trade-off parameter beta is set to infinity, so 1/beta is zero.
For more information, see:
Noam Slonim, Nir Friedman, Naftali Tishby: Unsupervised document classification using sequential information maximization. In: Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 129-136, 2002. BibTeX:
@inproceedings{Slonim2002,
   author = {Noam Slonim and Nir Friedman and Naftali Tishby},
   booktitle = {Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval},
   pages = {129-136},
   title = {Unsupervised document classification using sequential information maximization},
   year = {2002}
}

Valid options are:
-I <num> maximum number of iterations (default 100).
-M <num> minimum number of changes in a single iteration (default 0).
-N <num> number of clusters. (default 2).
-R <num> number of restarts. (default 5).
-U set not to normalize the data (default true).
-V set to output debug info (default false).
-S <num> Random number seed. (default 1)
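The hard-assignment rule in the class description can be illustrated with a small stand-alone sketch. It assumes the merge cost from the cited Slonim et al. paper for 1/beta = 0, namely d(x, t) = (p(x) + p(t)) * JS(p(y|x), p(y|t)), where JS is the weighted Jensen-Shannon divergence. The class and method names (`SibAssignmentSketch`, `mergeCost`, `assign`) are hypothetical and are not taken from the Weka source; whether Weka computes the cost in exactly this form is an assumption.

```java
/** Illustrative sketch of hard assignment by minimum merge cost; not Weka's internal code. */
final class SibAssignmentSketch {

    /** Kullback-Leibler divergence KL(p || q) in nats; terms with p[i] == 0 contribute 0. */
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / q[i]);
            }
        }
        return sum;
    }

    /** Weighted Jensen-Shannon divergence with weights w1 and w2, where w1 + w2 = 1. */
    static double js(double w1, double[] p, double w2, double[] q) {
        double[] mix = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            mix[i] = w1 * p[i] + w2 * q[i];
        }
        return w1 * kl(p, mix) + w2 * kl(q, mix);
    }

    /** Assumed merge cost: d(x, t) = (p(x) + p(t)) * JS(p(y|x), p(y|t)). */
    static double mergeCost(double px, double[] pyGivenX, double pt, double[] pyGivenT) {
        double total = px + pt;
        return total * js(px / total, pyGivenX, pt / total, pyGivenT);
    }

    /** Hard assignment: index of the cluster with the minimum merge cost. */
    static int assign(double px, double[] pyGivenX, double[] pt, double[][] pyGivenT) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int t = 0; t < pt.length; t++) {
            double cost = mergeCost(px, pyGivenX, pt[t], pyGivenT[t]);
            if (cost < bestCost) {
                bestCost = cost;
                best = t;
            }
        }
        return best;
    }
}
```

Because only the single cheapest cluster is kept, each instance ends up in exactly one cluster, which is what "hard clustering" means here.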
- Version:
- $Revision: 5538 $
- Author:
- Noam Slonim, Anna Huang
Constructor Summary
Constructors
sIB()
Method Summary
Modifier and Type, Method, Description
void buildClusterer(Instances data)
    Generates a clusterer.
int clusterInstance(Instance instance)
    Clusters a given instance. This is the method defined in the Clusterer interface; it does nothing but return the cluster assigned to the instance.
String debugTipText()
    Returns the tip text for this property.
Capabilities getCapabilities()
    Returns default capabilities of the clusterer.
boolean getDebug()
    Get debug mode.
int getMaxIterations()
    Get the max number of iterations.
int getMinChange()
    Get the minimum number of changes.
boolean getNotUnifyNorm()
    Get whether to normalize instances to unify prior probability before building the clusterer.
int getNumClusters()
    Get the number of clusters.
int getNumRestarts()
    Get the number of restarts.
String[] getOptions()
    Gets the current settings.
String getRevision()
    Returns the revision string.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String globalInfo()
    Returns a string describing this clusterer.
Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main
String maxIterationsTipText()
    Returns the tip text for this property.
String minChangeTipText()
    Returns the tip text for this property.
String notUnifyNormTipText()
    Returns the tip text for this property.
int numberOfClusters()
    Get the number of clusters.
String numClustersTipText()
    Returns the tip text for this property.
String numRestartsTipText()
    Returns the tip text for this property.
void setDebug(boolean v)
    Set debug mode (verbose output).
void setMaxIterations(int i)
    Set the max number of iterations.
void setMinChange(int m)
    Set the minimum number of changes.
void setNotUnifyNorm(boolean b)
    Set whether to normalize instances to unify prior probability before building the clusterer.
void setNumClusters(int n)
    Set the number of clusters.
void setNumRestarts(int i)
    Set the number of restarts.
void setOptions(String[] options)
    Parses a given list of options.
String toString()
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
Methods inherited from class weka.clusterers.AbstractClusterer
distributionForInstance, forName, makeCopies, makeCopy
-
Constructor Details
-
sIB
public sIB()
-
-
Method Details
-
buildClusterer
Generates a clusterer.
- Specified by:
buildClusterer in interface Clusterer
- Specified by:
buildClusterer in class AbstractClusterer
- Parameters:
data - the training instances
- Throws:
Exception - if something goes wrong
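The options documented for this class (maximum iterations, minimum changes per iteration, restarts, seed) suggest a control loop of roughly the following shape. This is a sketch under stated assumptions, not the Weka implementation: it clusters 1-D points, uses the increase in squared error as a stand-in for the information-bottleneck cost, assumes an iteration loop stops once the number of reassignments is at most the minimum-change setting, and assumes the best-scoring restart is kept. All names (`RestartLoopSketch`, `cluster`, `cost`, `sse`) are hypothetical.

```java
import java.util.Random;

/** Sequential reassignment with restarts; the cost is a stand-in, NOT the sIB cost. */
final class RestartLoopSketch {

    /** Increase in squared error when adding xi to a cluster with n members and the given sum. */
    static double cost(double xi, double sum, int n) {
        if (n == 0) {
            return 0.0; // joining an empty cluster adds no error
        }
        double d = xi - sum / n;
        return n / (n + 1.0) * d * d;
    }

    /** Total squared distance of every point to the mean of its assigned cluster. */
    static double sse(double[] x, int[] assign, double[] sum, int[] count) {
        double total = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - sum[assign[i]] / count[assign[i]];
            total += d * d;
        }
        return total;
    }

    static int[] cluster(double[] x, int k, int maxIterations, int minChange,
                         int numRestarts, long seed) {
        Random rnd = new Random(seed); // assumption: one generator shared by all restarts
        int[] best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int r = 0; r < numRestarts; r++) {
            int[] assign = new int[x.length];
            double[] sum = new double[k];
            int[] count = new int[k];
            for (int i = 0; i < x.length; i++) { // random initial partition
                assign[i] = rnd.nextInt(k);
                sum[assign[i]] += x[i];
                count[assign[i]]++;
            }
            for (int iter = 0; iter < maxIterations; iter++) {
                int changes = 0;
                for (int i = 0; i < x.length; i++) {
                    int old = assign[i];
                    sum[old] -= x[i]; // draw the instance out of its cluster
                    count[old]--;
                    int target = old; // ties keep the current cluster
                    double targetCost = cost(x[i], sum[old], count[old]);
                    for (int t = 0; t < k; t++) {
                        double c = cost(x[i], sum[t], count[t]);
                        if (c < targetCost) {
                            targetCost = c;
                            target = t;
                        }
                    }
                    sum[target] += x[i]; // merge it into the cheapest cluster
                    count[target]++;
                    assign[i] = target;
                    if (target != old) {
                        changes++;
                    }
                }
                if (changes <= minChange) {
                    break; // assumed stopping rule
                }
            }
            double score = sse(x, assign, sum, count);
            if (score < bestScore) {
                bestScore = score;
                best = assign;
            }
        }
        return best;
    }
}
```

With the documented defaults this would be called as `cluster(x, 2, 100, 0, 5, 1)`.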
-
clusterInstance
Clusters a given instance. This is the method defined in the Clusterer interface; it does nothing but return the cluster assigned to the instance.
- Specified by:
clusterInstance in interface Clusterer
- Overrides:
clusterInstance in class AbstractClusterer
- Parameters:
instance - the instance to be assigned to a cluster
- Returns:
- the number of the assigned cluster as an integer
- Throws:
Exception - if instance could not be clustered successfully
-
setOptions
Parses a given list of options. Valid options are:
-I <num> maximum number of iterations (default 100).
-M <num> minimum number of changes in a single iteration (default 0).
-N <num> number of clusters. (default 2).
-R <num> number of restarts. (default 5).
-U set not to normalize the data (default true).
-V set to output debug info (default false).
-S <num> Random number seed. (default 1)
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class RandomizableClusterer
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
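To make the shape of the options array concrete, here is a small stand-alone sketch that maps the documented flags to settings. It does not call Weka and is not how setOptions is implemented; `SibOptionsSketch` and its fields are hypothetical names, and the defaults are copied from the option list above. Because the documented "(default true)" for -U is ambiguous, the sketch only records whether the flag was given.

```java
/** Hypothetical holder for the documented sIB options; not part of Weka. */
final class SibOptionsSketch {
    int maxIterations = 100;    // -I
    int minChange = 0;          // -M
    int numClusters = 2;        // -N
    int numRestarts = 5;        // -R
    int seed = 1;               // -S
    boolean uFlagGiven = false; // -U (only records the presence of the flag)
    boolean debug = false;      // -V

    /** Parses flags such as {"-N", "3", "-V"}; unknown flags are rejected. */
    static SibOptionsSketch parse(String[] options) {
        SibOptionsSketch s = new SibOptionsSketch();
        int i = 0;
        while (i < options.length) {
            String flag = options[i++];
            switch (flag) {
                case "-U":
                    s.uFlagGiven = true;
                    break;
                case "-V":
                    s.debug = true;
                    break;
                case "-I":
                case "-M":
                case "-N":
                case "-R":
                case "-S":
                    if (i >= options.length) {
                        throw new IllegalArgumentException("Missing value for " + flag);
                    }
                    s.setNumeric(flag, Integer.parseInt(options[i++]));
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + flag);
            }
        }
        return s;
    }

    /** Stores the value of a numeric flag in the matching field. */
    private void setNumeric(String flag, int value) {
        switch (flag) {
            case "-I": maxIterations = value; break;
            case "-M": minChange = value; break;
            case "-N": numClusters = value; break;
            case "-R": numRestarts = value; break;
            default:   seed = value; break; // "-S"
        }
    }
}
```

Each flag and its value occupy separate array elements, so "-N 3" is passed as the two strings "-N" and "3".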
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options.
-
getOptions
Gets the current settings.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
-
debugTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setDebug
public void setDebug(boolean v)
Set debug mode (verbose output).
- Parameters:
v - true for verbose output
-
getDebug
public boolean getDebug()
Get debug mode.
- Returns:
- true if debug mode is set
-
maxIterationsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setMaxIterations
public void setMaxIterations(int i)
Set the max number of iterations.
- Parameters:
i - max number of iterations
-
getMaxIterations
public int getMaxIterations()
Get the max number of iterations.
- Returns:
- max number of iterations
-
minChangeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setMinChange
public void setMinChange(int m)
Set the minimum number of changes.
- Parameters:
m - the minimum number of changes
-
getMinChange
public int getMinChange()
Get the minimum number of changes.
- Returns:
- the minimum number of changes
-
numClustersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNumClusters
public void setNumClusters(int n)
Set the number of clusters.
- Parameters:
n - number of clusters
-
getNumClusters
public int getNumClusters()
Get the number of clusters.
- Returns:
- the number of clusters
-
numberOfClusters
public int numberOfClusters()
Get the number of clusters.
- Specified by:
numberOfClusters in interface Clusterer
- Specified by:
numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters
-
numRestartsTipText
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNumRestarts
public void setNumRestarts(int i)
Set the number of restarts.
- Parameters:
i - number of restarts
-
getNumRestarts
public int getNumRestarts()
Get the number of restarts.
- Returns:
- number of restarts
-
notUnifyNormTipText
Returns the tip text for this property.
- Returns:
- tip text for this property
-
setNotUnifyNorm
public void setNotUnifyNorm(boolean b)
Set whether to normalize instances to unify prior probability before building the clusterer.
- Parameters:
b - true to normalize, otherwise false
-
getNotUnifyNorm
public boolean getNotUnifyNorm()
Get whether to normalize instances to unify prior probability before building the clusterer.
- Returns:
- true if set to normalize, false otherwise
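One plausible reading of "normalize instances to unify prior probability" is sketched below: if the prior p(x) of an instance is proportional to the sum of its attribute values (for example, word counts of a document), then scaling every instance to sum to 1 gives all instances the same prior. This interpretation is an assumption and is not confirmed by this page; `UnifyPriorSketch`, `normalizeRows`, and `prior` are hypothetical names.

```java
/** Illustrative row normalization; an assumed interpretation, not Weka's code. */
final class UnifyPriorSketch {

    /** Returns a copy in which every non-empty row is scaled to sum to 1. */
    static double[][] normalizeRows(double[][] counts) {
        double[][] out = new double[counts.length][];
        for (int i = 0; i < counts.length; i++) {
            double rowSum = 0.0;
            for (double v : counts[i]) {
                rowSum += v;
            }
            out[i] = new double[counts[i].length];
            for (int j = 0; j < counts[i].length; j++) {
                out[i][j] = rowSum > 0.0 ? counts[i][j] / rowSum : 0.0;
            }
        }
        return out;
    }

    /** Prior of each instance: its row sum divided by the total mass of the matrix. */
    static double[] prior(double[][] m) {
        double[] p = new double[m.length];
        double total = 0.0;
        for (int i = 0; i < m.length; i++) {
            for (double v : m[i]) {
                p[i] += v;
            }
            total += p[i];
        }
        for (int i = 0; i < p.length; i++) {
            p[i] /= total;
        }
        return p;
    }
}
```

For counts `{{2, 2}, {10, 30}}` the second instance dominates the prior before normalization; after `normalizeRows`, both instances carry equal weight.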
-
globalInfo
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
Returns default capabilities of the clusterer.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Specified by:
getCapabilities in interface Clusterer
- Overrides:
getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
-
toString
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
-