Package org.biojava.bio.seq.io
Class WordTokenization
java.lang.Object
org.biojava.utils.Unchangeable
org.biojava.bio.seq.io.WordTokenization
- All Implemented Interfaces:
Serializable, Annotatable, SymbolTokenization, Changeable
- Direct Known Subclasses:
CrossProductTokenization, DoubleTokenization, IntegerTokenization, NameTokenization, SubIntegerTokenization
public abstract class WordTokenization
extends Unchangeable
implements SymbolTokenization, Serializable
Base class for tokenizations which accept whitespace-separated
'words'. Input is split at whitespace, except where the whitespace is
enclosed in double quotes ("), parentheses (), or square brackets [].
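The splitting rule described above can be illustrated with a minimal, self-contained sketch. It is not the BioJava implementation: the class name WordSplitSketch and its split method are hypothetical, and the choices to keep the enclosing delimiters in each word, to allow nesting, and to reject unbalanced input are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; not part of BioJava.
class WordSplitSketch {

    // Split at whitespace unless the whitespace sits inside "...", (...) or [...].
    // Assumption: delimiters stay part of the word, and ( and [ share one nesting counter,
    // so mismatched pairs such as "(]" are not detected.
    static List<String> split(String str) {
        List<String> words = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        int depth = 0;            // nesting depth of () and []
        boolean inQuotes = false; // true between a pair of double quotes

        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (!inQuotes && (c == '(' || c == '[')) {
                depth++;
            } else if (!inQuotes && (c == ')' || c == ']')) {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unbalanced '" + c + "' at index " + i);
                }
                depth--;
            }

            boolean enclosed = inQuotes || depth > 0;
            if (Character.isWhitespace(c) && !enclosed) {
                // Unenclosed whitespace ends the current word, if there is one.
                if (word.length() > 0) {
                    words.add(word.toString());
                    word.setLength(0);
                }
            } else {
                word.append(c);
            }
        }
        if (inQuotes || depth > 0) {
            throw new IllegalArgumentException("Unterminated quote or bracket");
        }
        if (word.length() > 0) {
            words.add(word.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // Four words: a, (b c), [d e], "f g"
        System.out.println(split("a (b c) [d e] \"f g\""));
    }
}
```

In this sketch a bracketed or quoted group is handed on as a single word, which is the property a subclass would rely on when one symbol's textual form itself contains spaces.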
- Since:
- 1.2
- Author:
- Thomas Down, Greg Cox, Keith James
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.biojava.bio.Annotatable
Annotatable.AnnotationForwarder
Nested classes/interfaces inherited from interface org.biojava.bio.seq.io.SymbolTokenization
SymbolTokenization.TokenType
-
Field Summary
Fields inherited from interface org.biojava.bio.Annotatable
ANNOTATION
Fields inherited from interface org.biojava.bio.seq.io.SymbolTokenization
CHARACTER, FIXEDWIDTH, SEPARATED, UNKNOWN
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
getAlphabet() - The alphabet to which this tokenization applies.
getAnnotation() - Should return the associated annotation object.
getTokenType() - Determine the style of tokenization represented by this object.
parseStream(SeqIOListener siol) - Return an object which can parse an arbitrary character stream into symbols.
protected Symbol[] parseString
protected List splitString(String str)
String tokenizeSymbolList(SymbolList sl) - Return a string representation of a list of symbols.
Methods inherited from class org.biojava.utils.Unchangeable
addChangeListener, addChangeListener, addForwarder, getForwarders, getListeners, isUnchanging, removeChangeListener, removeChangeListener, removeForwarder
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.biojava.utils.Changeable
addChangeListener, addChangeListener, isUnchanging, removeChangeListener, removeChangeListener
Methods inherited from interface org.biojava.bio.seq.io.SymbolTokenization
parseToken, tokenizeSymbol
-
Constructor Details
-
WordTokenization
-
-
Method Details
-
getAlphabet
Description copied from interface: SymbolTokenization
The alphabet to which this tokenization applies.
- Specified by:
getAlphabet in interface SymbolTokenization
-
getTokenType
Description copied from interface: SymbolTokenization
Determine the style of tokenization represented by this object.
- Specified by:
getTokenType in interface SymbolTokenization
-
getAnnotation
Description copied from interface: Annotatable
Should return the associated annotation object.
- Specified by:
getAnnotation in interface Annotatable
- Returns:
- an Annotation object, never null
-
tokenizeSymbolList
public String tokenizeSymbolList(SymbolList sl) throws IllegalSymbolException, IllegalAlphabetException
Description copied from interface: SymbolTokenization
Return a string representation of a list of symbols.
- Specified by:
tokenizeSymbolList in interface SymbolTokenization
- Parameters:
sl - A SymbolList
- Throws:
IllegalAlphabetException - if alphabets don't match
IllegalSymbolException
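As a rough illustration of turning a list of symbols into a whitespace-separated string, the sketch below joins one word per symbol with a single space. It uses stand-in types rather than the BioJava API: WordJoinSketch, its join method, the Set standing in for an alphabet, and the use of IllegalArgumentException in place of the library's checked exceptions are all assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for illustration; not part of BioJava.
class WordJoinSketch {

    // Join the word for each symbol with single spaces.
    // 'alphabet' stands in for an alphabet, 'tokenizeSymbol' for a per-symbol tokenizer.
    static <S> String join(List<S> symbols, Set<S> alphabet, Function<S, String> tokenizeSymbol) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (S sym : symbols) {
            if (!alphabet.contains(sym)) {
                // Stand-in for an illegal-symbol error.
                throw new IllegalArgumentException("Symbol not in alphabet: " + sym);
            }
            if (!first) {
                sb.append(' ');
            }
            sb.append(tokenizeSymbol.apply(sym));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<Integer> alphabet = new HashSet<>(Arrays.asList(1, 2, 3));
        // Prints the three symbols as space-separated words.
        System.out.println(join(Arrays.asList(3, 1, 2), alphabet, sym -> sym.toString()));
    }
}
```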
-
parseStream
Description copied from interface: SymbolTokenization
Return an object which can parse an arbitrary character stream into symbols.
- Specified by:
parseStream in interface SymbolTokenization
- Parameters:
siol - The listener which gets notified of parsed symbols.
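A stream parser of this kind has to cope with input that arrives in arbitrary chunks, so a word may be cut in two by a chunk boundary. The sketch below shows one way to buffer the unfinished word and notify a listener only when a word is complete. It is a simplified stand-in, not the BioJava parser: ChunkedWordParserSketch is a hypothetical name, a plain Consumer of String replaces the listener, words are reported as strings rather than symbols, and quoting and bracketing are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for illustration; not part of BioJava.
class ChunkedWordParserSketch {
    private final Consumer<String> listener;                 // notified of each completed word
    private final StringBuilder pending = new StringBuilder(); // word still being read

    ChunkedWordParserSketch(Consumer<String> listener) {
        this.listener = listener;
    }

    // Consume one chunk; a word may continue from the previous chunk.
    void characters(char[] data, int start, int len) {
        for (int i = start; i < start + len; i++) {
            char c = data[i];
            if (Character.isWhitespace(c)) {
                flush();
            } else {
                pending.append(c);
            }
        }
    }

    // Signal end of input so that a trailing word is not lost.
    void close() {
        flush();
    }

    private void flush() {
        if (pending.length() > 0) {
            listener.accept(pending.toString());
            pending.setLength(0);
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        ChunkedWordParserSketch parser = new ChunkedWordParserSketch(words::add);
        char[] first = "12 3".toCharArray();
        char[] second = "4 56".toCharArray();
        parser.characters(first, 0, first.length);   // "3" is left pending
        parser.characters(second, 0, second.length); // completes "34"
        parser.close();
        System.out.println(words);
    }
}
```

The explicit close step matters in this design: without it, the final word would stay in the buffer because no trailing whitespace ever arrives to terminate it.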
-
splitString
- Throws:
IllegalSymbolException
-
parseString
- Throws:
IllegalSymbolException
-