Package org.biojava.bio.symbol
This package is not intended to have strong biological ties. It exists to make tasks such as dynamic programming much easier to implement. It also handles serialization of well-known alphabets so that the singleton properties of alphabets and Symbols are maintained wherever they apply.
All coordinates are 'bio-coordinates': legal indexes start from 1 and ranges are inclusive (4 to 7 includes 4, 5, 6 and 7).
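The coordinate convention can be sketched in plain Java. The class below is a hypothetical helper written for illustration, not part of the BioJava API; it only shows how a 1-based, inclusive range maps onto Java's 0-based, end-exclusive `String.substring`.

```java
// Hypothetical helper (not a BioJava class): illustrates 1-based, inclusive ranges.
final class BioCoordinates {
    private BioCoordinates() {}

    /** Returns the residues from start to end, both 1-based and inclusive. */
    static String subSequence(String seq, int start, int end) {
        if (start < 1 || end > seq.length() || start > end) {
            throw new IndexOutOfBoundsException(
                "range " + start + ".." + end + " is not within 1.." + seq.length());
        }
        // substring is 0-based and end-exclusive, so only the start is shifted.
        return seq.substring(start - 1, end);
    }

    /** Number of positions covered by an inclusive range. */
    static int length(int start, int end) {
        return end - start + 1;
    }

    public static void main(String[] args) {
        // Positions 4, 5, 6 and 7 of an eight-residue sequence.
        System.out.println(subSequence("ACGTTGCA", 4, 7));
        System.out.println(length(4, 7));
    }
}
```

Under this convention the range 4 to 7 has length 4, not 3, which is the usual source of off-by-one errors when mixing bio-coordinates with array indexes.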
A Symbol is a single token. The Symbol maintains a name, a token (char), and an Annotation bundle. A set of Symbols is represented by an Alphabet instance. If the Alphabet can guarantee that it only ever contains a finite number of Symbols, then it must implement FiniteAlphabet. The Symbol objects within a FiniteAlphabet can be tested for equality by comparing their references directly. A SymbolList is a string over the Symbols from a single Alphabet instance. This allows you to represent a sequence of tokens, such as DNA nucleotides or stock-market prices.
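The singleton idea behind finite alphabets can be illustrated with a Java enum. This is a minimal sketch under stated assumptions: `Nucleotide`, `forToken` and `parse` are hypothetical names chosen for the example and are not the BioJava Symbol, FiniteAlphabet or SymbolList types. An enum is used because each constant exists exactly once, so reference comparison is a valid equality test.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a finite alphabet; not the BioJava types.
enum Nucleotide {
    ADENINE('a'), CYTOSINE('c'), GUANINE('g'), THYMINE('t');

    private final char token;

    Nucleotide(char token) {
        this.token = token;
    }

    char token() {
        return token;
    }

    /** Looks up the unique symbol for a token, ignoring case. */
    static Nucleotide forToken(char token) {
        char lower = Character.toLowerCase(token);
        for (Nucleotide n : values()) {
            if (n.token == lower) {
                return n;
            }
        }
        throw new IllegalArgumentException("no symbol for token '" + token + "'");
    }

    /** Builds an unmodifiable list of symbols, all drawn from this one alphabet. */
    static List<Nucleotide> parse(String text) {
        List<Nucleotide> symbols = new ArrayList<>();
        for (char c : text.toCharArray()) {
            symbols.add(forToken(c));
        }
        return Collections.unmodifiableList(symbols);
    }

    public static void main(String[] args) {
        List<Nucleotide> seq = parse("gattaca");
        // Reference comparison is safe because each symbol is a singleton.
        System.out.println(seq.get(0) == Nucleotide.GUANINE);
        System.out.println(seq.get(2) == seq.get(3));
    }
}
```

Java enums also keep their singleton property across serialization, which mirrors the guarantee this package describes for well-known alphabets.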
CrossProductAlphabet and CrossProductSymbol represent alphabets and symbols that are the combination of two or more alphabets and symbols under cross-product. For example, the cross-product alphabet DNA x DNA would contain all di-nucleotides. DNA x DNA x DNA x Protein would contain all combinations of three nucleotides and a single amino-acid. Dice x Coin would contain every possible combination of dice rolls (1..6) and coin flips (Heads, Tails) as the Symbol objects (1, Heads), (1, Tails), (2, Heads) ... (6, Tails). If any one of the Alphabets that make up the source of a CrossProductAlphabet is not finite, then the resulting CrossProductAlphabet will not be finite either.
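The Dice x Coin example can be enumerated with a short, generic routine. This is a sketch only: `CrossProductSketch` is a hypothetical class that models symbols as plain strings, and it does not use or reproduce the BioJava cross-product classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a cross-product over finite alphabets.
final class CrossProductSketch {
    private CrossProductSketch() {}

    /** Returns every tuple that takes one symbol from each alphabet, in order. */
    static List<List<String>> crossProduct(List<List<String>> alphabets) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start from the single empty tuple
        for (List<String> alphabet : alphabets) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String symbol : alphabet) {
                    List<String> tuple = new ArrayList<>(prefix);
                    tuple.add(symbol);
                    extended.add(tuple);
                }
            }
            result = extended;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dice = Arrays.asList("1", "2", "3", "4", "5", "6");
        List<String> coin = Arrays.asList("Heads", "Tails");
        List<List<String>> product = crossProduct(Arrays.asList(dice, coin));
        System.out.println(product.size()); // 6 * 2 = 12 combined symbols
        System.out.println(product.get(0));
        System.out.println(product.get(product.size() - 1));
    }
}
```

The enumeration only terminates because every input list is finite, which is why a cross-product with any non-finite source alphabet cannot itself be finite.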
Locations within a SymbolList can be represented by a Location object. This interface defines a subset of points that are within the Location. It uses bio-coordinates, and defines all the operations that you are likely to need to build your own Locations (union, intersection and the like).
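The kind of operations a location supports can be sketched for the simplest case, a single contiguous range. `RangeSketch` is a hypothetical class written for this illustration and is not the BioJava Location interface; in particular, its `union` only handles ranges that overlap or touch, whereas a full implementation would need a compound location for disjoint ranges.

```java
import java.util.Optional;

// Hypothetical contiguous range in bio-coordinates (1-based, inclusive).
final class RangeSketch {
    final int min;
    final int max;

    RangeSketch(int min, int max) {
        if (min < 1 || max < min) {
            throw new IllegalArgumentException("invalid range " + min + ".." + max);
        }
        this.min = min;
        this.max = max;
    }

    boolean contains(int point) {
        return point >= min && point <= max;
    }

    boolean overlaps(RangeSketch other) {
        return min <= other.max && other.min <= max;
    }

    /** The shared points of both ranges, or empty if they do not overlap. */
    Optional<RangeSketch> intersection(RangeSketch other) {
        if (!overlaps(other)) {
            return Optional.empty();
        }
        return Optional.of(
            new RangeSketch(Math.max(min, other.min), Math.min(max, other.max)));
    }

    /** Merges two ranges that overlap or are directly adjacent. */
    RangeSketch union(RangeSketch other) {
        boolean adjacent = max + 1 == other.min || other.max + 1 == min;
        if (!overlaps(other) && !adjacent) {
            throw new IllegalArgumentException("disjoint ranges need a compound location");
        }
        return new RangeSketch(Math.min(min, other.min), Math.max(max, other.max));
    }

    @Override
    public String toString() {
        return min + ".." + max;
    }

    public static void main(String[] args) {
        RangeSketch a = new RangeSketch(4, 7);
        RangeSketch b = new RangeSketch(6, 10);
        System.out.println(a.intersection(b).get()); // points 6 and 7
        System.out.println(a.union(b));              // points 4 through 10
    }
}
```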
Class Description
An abstract implementation of Alphabet.
An abstract implementation of Location.
AbstractLocation decorator (wrapper).
An abstract class implementing basic functionality of a translation table that translates Symbols from one Alphabet to another.
Base class for simple contiguous Location implementations.
An abstract class implementing basic functionality of a translation table that translates Symbols from one Alphabet to another.
The base-class for Symbol implementations.
Abstract helper implementation of the SymbolList core interface.
The set of AtomicSymbols which can be concatenated together to make a SymbolList.
Map between Symbols and index numbers.
Utility methods for working with Alphabets.
A symbol that is not ambiguous.
A symbol that can be represented as a string of Symbols.
Between view onto an underlying Location instance.
SymbolList implementation using constant-size chunks.
Circular view onto an underlying Location instance.
A utility class for codon preferences.
Packing utility class for DNA.
A Packing implementation which handles the DNA alphabet, without any support for ambiguity symbols.
An efficient implementation of an Alphabet over the infinite set of double values.
A range of double values.
A single double value.
A class to represent a contiguous range of double symbols.
Symbol list which just consists of non-informative symbols.
Encapsulates an edit operation on a SymbolList.
An alphabet over a finite set of Symbols.
An atomic symbol consisting only of itself.
A 'fuzzy' location a-la Embl fuzzy locations.
Determines how a FuzzyLocation should be treated when used as a normal Location.
FuzzyPointLocation represents two types of EMBL-style partially-defined locations.
Determines how a FuzzyPointLocation should be treated when used as a normal Location.
This extends SymbolList with API for manipulating, inserting and deleting gaps.
The exception to indicate that an invalid alphabet has been used.
The exception to indicate that a symbol is not valid within a context.
An efficient implementation of an Alphabet over the infinite set of integer values.
A single int value.
A class to represent a finite contiguous subset of the infinite IntegerAlphabet.
A set of integers, often used to represent positions on biological sequences.
Tools class containing a number of operators for working with Location objects.
A translation table that will handle the many-to-one mappings that you see, for example, with genetic codes.
Produced by LocationTools as a result of union operations.
MotifTools contains utility methods for sequence motifs.
A SymbolList that stores symbols as bit-patterns in an array of longs.
This class makes PackedSymbolLists.
An encapsulation of the way symbols map to bit-patterns.
A factory that is used to maintain associations between alphabets and preferred bit-packings for them.
A location representing a single point.
A simple implementation of Location that contains all points between getMin and getMax inclusive.
An alignment that relabels another alignment.
A translation table that can also translate from the target to source alphabet.
A simple no-frills implementation of the FiniteAlphabet interface.
A basic implementation of AtomicSymbol.
A simple no-frills implementation of the CodonPref object that encapsulates codon preference data.
This implementation of GappedSymbolList wraps a SymbolList, allowing you to insert gaps.
An aligned block.
A genetic code translation table representing a translation table in the DDBJ/EMBL/GenBank Feature Table (appendix V).
A no-frills implementation of a translation table that maps between two alphabets.
A no-frills implementation of TranslationTable that uses a Map to map from symbols in a finite source alphabet into a target alphabet.
Basic implementation of SymbolList.
This class makes SimpleSymbolLists.
Class that implements the SymbolPropertyTable interface.
A no-frills implementation of TranslationTable that uses a Map to map from symbols in a finite source alphabet into a target alphabet.
An alphabet that contains a single atomic symbol.
Soft masking is usually displayed by making the masked regions somehow different from the non masked regions.
Implementations will define how soft masking looks.
Suffix tree implementation.
A node in the suffix tree.
A single symbol.
A sequence of symbols that belong to an alphabet.
This interface exists to hide implementational details of SymbolLists when making chunked symbol lists.
Tools class for constructing views of SymbolList objects.
Class for maintaining properties associated with a symbol.
Encapsulates the mapping from a source to a destination alphabet.
A suffix tree is an efficient method for encoding the frequencies of motifs in a sequence.
end Tree modification methods
An object to return statistics about the frequency of the wobble base in a set of synonymous codons.