Class PackedSymbolList

All Implemented Interfaces:
Serializable, SymbolList, Changeable

public class PackedSymbolList extends AbstractSymbolList implements Serializable

A SymbolList that stores symbols as bit-patterns in an array of longs.

Bit-packed symbol lists are space-efficient compared to the usual pointer storage model employed by implementations like SimpleSymbolList. This comes at the cost of encoding and decoding symbols each time they are written to or read from storage. In practice, the smaller memory footprint when storing large sequences often makes applications run faster, because it reduces effects such as page swapping.

Symbols can be mapped to and from bit-patterns. The Packing interface encapsulates this. A SymbolList can then be stored by writing these bit-patterns into memory. This implementation stores the bits in the long elements of an array. The first symbol will be packed into bits 0 through packing.wordLength()-1 of the long at index 0.
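
The layout described above can be illustrated with a minimal, self-contained sketch. It does not use the real Packing interface: the class BitPackingSketch, its constants and the integer symbol codes are hypothetical. It also assumes a fixed word length of 2 bits and that a symbol never straddles two longs, which the text above does not state.

```java
/** Hypothetical illustration of bit-packing; not part of the documented API. */
final class BitPackingSketch {
    // Assumed word length: 2 bits per symbol, enough for a 4-letter alphabet.
    static final int WORD_LENGTH = 2;
    // Assumption: whole symbols only, so 32 two-bit symbols fit in one long.
    static final int SYMS_PER_LONG = 64 / WORD_LENGTH;
    static final long MASK = (1L << WORD_LENGTH) - 1;

    /** Packs integer symbol codes (0..3) into an array of longs. */
    static long[] pack(int[] codes) {
        long[] words = new long[(codes.length + SYMS_PER_LONG - 1) / SYMS_PER_LONG];
        for (int i = 0; i < codes.length; i++) {
            // The first symbol lands in bits 0..WORD_LENGTH-1 of words[0].
            int shift = (i % SYMS_PER_LONG) * WORD_LENGTH;
            words[i / SYMS_PER_LONG] |= (codes[i] & MASK) << shift;
        }
        return words;
    }

    /** Decodes the code at a 1-based index, mirroring the symbolAt convention. */
    static int codeAt(long[] words, int index) {
        int i = index - 1;
        int shift = (i % SYMS_PER_LONG) * WORD_LENGTH;
        return (int) ((words[i / SYMS_PER_LONG] >>> shift) & MASK);
    }

    public static void main(String[] args) {
        long[] words = pack(new int[] {0, 1, 2, 3});
        System.out.println(Long.toBinaryString(words[0]));
        System.out.println(codeAt(words, 4));
    }
}
```

In this sketch, decoding costs a division, a shift and a mask per access, which is the encoding/decoding overhead mentioned earlier.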

Example Usage

 SymbolList symL = ...;
 SymbolList packed = new PackedSymbolList(
   PackingFactory.getPacking(symL.getAlphabet(), true),
   symL
 );
 
Author:
Matthew Pocock, David Huen (new constructor for Symbol arrays and some speedups)
  • Constructor Details

    • PackedSymbolList

      public PackedSymbolList(Packing packing, long[] syms, int length)

      Create a new PackedSymbolList directly from a bit pattern.

      Warning: This is a risky developer method. You must be sure that the syms array is packed in a way that is consistent with the packing. Also, it is your responsibility to ensure that the length is sensible.

      Parameters:
      packing - the Packing used
      syms - a long array containing already packed symbols
      length - the length of the sequence packed in syms
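
One way to check that a length is "sensible" before calling this constructor is to compare it with the capacity of the array. The sketch below is a hypothetical helper, not part of the documented API; it takes the word length as a plain int and assumes that symbols never straddle two longs.

```java
/** Hypothetical pre-flight check for a pre-packed long array. */
final class PackedLengthCheck {
    /** Number of longs needed to hold `length` symbols of `wordLength` bits each. */
    static int requiredLongs(int length, int wordLength) {
        if (wordLength < 1 || wordLength > 64) {
            throw new IllegalArgumentException("wordLength must be in 1..64: " + wordLength);
        }
        if (length < 0) {
            throw new IllegalArgumentException("length must not be negative: " + length);
        }
        // Assumption: only whole symbols are stored in each long.
        int symsPerLong = 64 / wordLength;
        // Use long arithmetic so the rounding-up step cannot overflow.
        return (int) (((long) length + symsPerLong - 1) / symsPerLong);
    }

    /** Throws if `syms` is too short to contain `length` packed symbols. */
    static void checkLength(long[] syms, int length, int wordLength) {
        int needed = requiredLongs(length, wordLength);
        if (syms.length < needed) {
            throw new IllegalArgumentException(
                "array holds " + syms.length + " longs but " + needed + " are needed");
        }
    }

    public static void main(String[] args) {
        System.out.println(requiredLongs(33, 2));
        checkLength(new long[2], 33, 2);
    }
}
```

This check only guards against an array that is too short; it cannot tell whether the bits were written with the same packing.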
    • PackedSymbolList

      public PackedSymbolList(Packing packing, SymbolList symList) throws IllegalAlphabetException

      Create a new PackedSymbolList as a packed copy of another symbol list.

      This will create a new and independent symbol list that is a copy of the symbols in symList. Both lists can be modified independently.

      Parameters:
      packing - the way to bit-pack symbols
      symList - the SymbolList to copy
      Throws:
      IllegalAlphabetException
    • PackedSymbolList

      public PackedSymbolList(Packing packing, Symbol[] symbols, int length, Alphabet alfa) throws IllegalAlphabetException, IllegalArgumentException

      Create a new PackedSymbolList from an array of Symbols.

      This will create a new and independent SymbolList formed from the symbol array.

      Parameters:
      packing - the way to bit-pack symbols
      symbols - an array of Symbols
      length - the number of Symbols to process from symbols
      alfa - the alphabet from which the Symbols are drawn
      Throws:
      IllegalAlphabetException
      IllegalArgumentException
  • Method Details

    • getAlphabet

      public Alphabet getAlphabet()
      Description copied from interface: SymbolList
      The alphabet that this SymbolList is over.

      Every symbol within this SymbolList is a member of this alphabet. alphabet.contains(symbol) == true for each symbol that is within this sequence.

      Specified by:
      getAlphabet in interface SymbolList
      Returns:
      the alphabet
    • length

      public int length()
      Description copied from interface: SymbolList
      The number of symbols in this SymbolList.
      Specified by:
      length in interface SymbolList
      Returns:
      the length
    • symbolAt

      public Symbol symbolAt(int indx)
      Description copied from interface: SymbolList
      Return the symbol at index, counting from 1.
      Specified by:
      symbolAt in interface SymbolList
      Parameters:
      indx - the offset into this SymbolList
      Returns:
      the Symbol at that index
    • getSyms

      public long[] getSyms()

      Return the long array within which the symbols are bit-packed.

      Warning: This is a risky developer method. This is the actual array that this object uses to store the bits representing symbols. You should not modify it in any way. If you do, you will change the symbols returned by symbolAt(). This method is provided primarily as an easy way for developers to extract the bit pattern for storage, so that it can be fetched later and fed into the appropriate constructor.

      Returns:
      the actual long array used to store bit-packed symbols
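
The risk in the warning above can be illustrated with a small stand-in class. SharedArraySketch is hypothetical and is not PackedSymbolList; it assumes 2-bit codes and exists only to show why writing to the returned array changes what is decoded, and why working on a copy is safer when the bits are exported for storage.

```java
/** Hypothetical stand-in that exposes its internal array, as getSyms() is documented to do. */
final class SharedArraySketch {
    private static final int WORD_LENGTH = 2;           // assumed bits per symbol
    private static final int SYMS_PER_LONG = 64 / WORD_LENGTH;
    private static final long MASK = (1L << WORD_LENGTH) - 1;

    private final long[] syms;

    SharedArraySketch(long[] syms) {
        this.syms = syms;
    }

    /** Returns the actual backing array: callers share state with this object. */
    long[] getSyms() {
        return syms;
    }

    /** Returns an independent copy that is safe to modify or serialize. */
    long[] copyOfSyms() {
        return syms.clone();
    }

    /** Decodes the 2-bit code at a 1-based index. */
    int codeAt(int index) {
        int i = index - 1;
        int shift = (i % SYMS_PER_LONG) * WORD_LENGTH;
        return (int) ((syms[i / SYMS_PER_LONG] >>> shift) & MASK);
    }

    public static void main(String[] args) {
        SharedArraySketch list = new SharedArraySketch(new long[] {0xE4L});
        list.copyOfSyms()[0] = 0L;   // modifies only the copy
        System.out.println(list.codeAt(4));
        list.getSyms()[0] = 0L;      // modifies the list itself
        System.out.println(list.codeAt(4));
    }
}
```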