Interface SymbolList

All Superinterfaces:
Changeable
All Known Subinterfaces:
Alignment, ARAlignment, GappedSequence, GappedSymbolList, RichSequence, Sequence, StatePath, UnequalLengthAlignment
All Known Implementing Classes:
AbstractSymbolList, AbstractULAlignment, AbstractULAlignment.SubULAlignment, AlignmentPair, AssembledSymbolList, ChunkedSymbolList, CircularView, DummySequence, DummySymbolList, FlexibleAlignment, InfinitelyAmbiguousSymbolList, NewAssembledSymbolList, NewSimpleAssembly, PackedSymbolList, PhredSequence, RelabeledAlignment, RevCompSequence, SimilarityPairFeature.EmptyPairwiseAlignment, SimpleAlignment, SimpleAssembly, SimpleGappedSequence, SimpleGappedSymbolList, SimpleRichSequence, SimpleSequence, SimpleStatePath, SimpleSymbolList, SubSequence, ThinRichSequence, ViewSequence

public interface SymbolList extends Changeable
A sequence of symbols that belong to an alphabet.

This uses biological coordinates (1 to length).

Author:
Matthew Pocock, Mark Schreiber, Francois Pepin
  • Field Details

    • EDIT

      static final ChangeType EDIT
      Signals that the SymbolList is being edited. The getChange field of the event should contain the SymbolList.Edit object describing the change.
    • EMPTY_LIST

      static final SymbolList EMPTY_LIST
      A useful object that represents an empty symbol list, to avoid returning null.
  • Method Details

    • getAlphabet

      The alphabet that this SymbolList is over.

      Every symbol within this SymbolList is a member of this alphabet; that is, alphabet.contains(symbol) == true for each symbol within this sequence.

      Returns:
      the alphabet
    • length

      int length()
      The number of symbols in this SymbolList.
      Returns:
      the length
    • symbolAt

      Return the symbol at index, counting from 1.
      Parameters:
      index - the offset into this SymbolList
      Returns:
      the Symbol at that index
      Throws:
      IndexOutOfBoundsException - if index is less than 1, or greater than the length of the symbol list
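
      The 1-based coordinate convention can be illustrated with a minimal sketch. OneBasedSymbols below is a hypothetical String-backed stand-in, not a BioJava class; it only shows how an index in the range 1 to length maps onto Java's 0-based indexing and when the bounds check fires.

```java
// Hypothetical stand-in, not part of BioJava: a String-backed model of
// 1-based ("biological") coordinates.
final class OneBasedSymbols {
    private final String tokens;

    OneBasedSymbols(String tokens) {
        this.tokens = tokens;
    }

    int length() {
        return tokens.length();
    }

    // Valid indices run from 1 to length() inclusive.
    char symbolAt(int index) {
        if (index < 1 || index > tokens.length()) {
            throw new IndexOutOfBoundsException(
                "index " + index + " is outside 1.." + tokens.length());
        }
        return tokens.charAt(index - 1); // shift to Java's 0-based indexing
    }

    public static void main(String[] args) {
        OneBasedSymbols seq = new OneBasedSymbols("atcg");
        // Loops over a symbol list therefore start at 1 and include length().
        for (int i = 1; i <= seq.length(); i++) {
            System.out.println(i + " -> " + seq.symbolAt(i));
        }
    }
}
```

      Note the loop bounds: starting at 0 or stopping before length() is the usual mistake when moving between Java collections and biological coordinates.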
    • toList

      Returns a List of symbols.

      This is an immutable list of symbols. Do not edit it.

      Returns:
      a List of Symbols
    • iterator

      An Iterator over all Symbols in this SymbolList.

      This is an ordered iterator over the Symbols. It cannot be used to edit the underlying symbols.

      Returns:
      an iterator
    • subList

      SymbolList subList(int start, int end) throws IndexOutOfBoundsException
      Return a new SymbolList for the symbols start to end inclusive.

      The resulting SymbolList will count from 1 to (end-start + 1) inclusive, and refer to the symbols start to end of the original sequence.

      Parameters:
      start - the first symbol of the new SymbolList
      end - the last symbol (inclusive) of the new SymbolList
      Throws:
      IndexOutOfBoundsException
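
      The inclusive coordinate arithmetic can be sketched on a plain String. InclusiveSlice and slice are hypothetical names used only for illustration and are not part of BioJava; rejecting start > end is an assumption of this sketch, not documented behaviour of subList.

```java
// Hypothetical helper, not part of BioJava: inclusive 1-based slicing on a String.
final class InclusiveSlice {

    // Returns the tokens from start to end inclusive, counting from 1.
    static String slice(String tokens, int start, int end) {
        // Assumption of this sketch: start > end is treated as out of bounds.
        if (start < 1 || end > tokens.length() || start > end) {
            throw new IndexOutOfBoundsException(
                "range " + start + ".." + end + " is outside 1.." + tokens.length());
        }
        // The inclusive 1-based end equals the exclusive 0-based end,
        // so only the start needs shifting.
        return tokens.substring(start - 1, end);
    }

    public static void main(String[] args) {
        String region = slice("atcaaaaacgctagc", 4, 8);
        // The result holds end - start + 1 symbols.
        System.out.println(region + " has length " + region.length());
    }
}
```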
    • seqString

      Stringify this symbol list.

      It is expected that this will use the symbol's token to render each symbol. It should be parsable back into a SymbolList using the default token parser for this alphabet.

      Returns:
      a string representation of the symbol list
    • subStr

      String subStr(int start, int end) throws IndexOutOfBoundsException
      Return a region of this symbol list as a String.

      This should use the same rules as seqString.

      Parameters:
      start - the first symbol to include
      end - the last symbol to include
      Returns:
      the string representation
      Throws:
      IndexOutOfBoundsException - if either start or end is not within the SymbolList
    • edit

      Apply an edit to the SymbolList as specified by the edit object.

      Description

      All edits can be broken down into a series of operations that change contiguous blocks of the sequence. An Edit represents one of those operations.

      When applied, this Edit will replace 'length' symbols, starting at position 'pos', with the SymbolList 'replacement'. This allows insertions (length=0), deletions (replacement=SymbolList.EMPTY_LIST) and replacements (length>=1 and replacement.length()>=1).

      The pos and pos+length should always be valid positions on the SymbolList to be edited (between 0 and symL.length()+1).

      • To append to a sequence, set pos=symL.length()+1 and length=0.
      • To insert something at the beginning of the sequence, set pos=1 and length=0.
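
      The replace semantics described above can be modelled on a plain String. EditSketch and applyEdit are hypothetical names, not BioJava API, and the bounds check (pos >= 1 and pos+length <= length()+1) is an assumption of this sketch rather than the library's documented validation.

```java
// Hypothetical model, not part of BioJava: an edit expressed as
// (pos, length, replacement) applied to a String of symbol tokens.
final class EditSketch {

    // Replaces `length` symbols starting at 1-based `pos` with `replacement`.
    static String applyEdit(String tokens, int pos, int length, String replacement) {
        // Assumed bounds: the edited block must lie within 1..tokens.length()+1.
        if (pos < 1 || length < 0 || pos + length > tokens.length() + 1) {
            throw new IndexOutOfBoundsException(
                "edit at " + pos + " of length " + length
                    + " does not fit in 1.." + (tokens.length() + 1));
        }
        String before = tokens.substring(0, pos - 1);     // symbols 1..pos-1
        String after = tokens.substring(pos - 1 + length); // symbols after the block
        return before + replacement + after;
    }

    public static void main(String[] args) {
        String seq = "atcg";
        System.out.println(applyEdit(seq, 2, 2, ""));                // deletion
        System.out.println(applyEdit(seq, 1, 0, "gg"));              // insertion at the start
        System.out.println(applyEdit(seq, seq.length() + 1, 0, "tt")); // append
        System.out.println(applyEdit(seq, 2, 2, "a"));               // replacement
    }
}
```

      Insertion, deletion and replacement all reduce to the same operation; only the values of length and replacement differ.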

      Examples

       SymbolList seq = DNATools.createDNA("atcaaaaacgctagc");
       System.out.println(seq.seqString());
      
       // delete 5 bases from position 4
       Edit ed = new Edit(4, 5, SymbolList.EMPTY_LIST);
       seq.edit(ed);
       System.out.println(seq.seqString());
      
       // delete one base from the start
       ed = new Edit(1, 1, SymbolList.EMPTY_LIST);
       seq.edit(ed);
      
       // delete one base from the end
       ed = new Edit(seq.length(), 1, SymbolList.EMPTY_LIST);
       seq.edit(ed);
       System.out.println(seq.seqString());
      
       // overwrite 2 bases from position 3 with "tt"
       ed = new Edit(3, 2, DNATools.createDNA("tt"));
       seq.edit(ed);
       System.out.println(seq.seqString());
      
       // add 6 bases to the start
       ed = new Edit(1, 0, DNATools.createDNA("aattgg"));
       seq.edit(ed);
       System.out.println(seq.seqString());
      
       // add 4 bases to the end
       ed = new Edit(seq.length() + 1, 0, DNATools.createDNA("tttt"));
       seq.edit(ed);
       System.out.println(seq.seqString());
      
       // full edit: replace 2 bases from position 3 with "aatagaa"
       ed = new Edit(3, 2, DNATools.createDNA("aatagaa"));
       seq.edit(ed);
       System.out.println(seq.seqString());
       
      Parameters:
      edit - the Edit to perform
      Throws:
      IndexOutOfBoundsException - if the edit does not lie within the SymbolList
      IllegalAlphabetException - if the SymbolList to insert has an incompatible alphabet
      ChangeVetoException - if either the SymbolList does not support the edit, or if the change was vetoed