Class DNATools

java.lang.Object
org.biojava.bio.seq.DNATools

public final class DNATools extends Object
Useful functionality for processing DNA sequences.
Author:
Matthew Pocock, Keith James (docs), Mark Schreiber, David Huen, Richard Holland
  • Method Details

    • a

      public static AtomicSymbol a()
    • g

      public static AtomicSymbol g()
    • c

      public static AtomicSymbol c()
    • t

      public static AtomicSymbol t()
    • n

      public static Symbol n()
    • m

      public static Symbol m()
    • r

      public static Symbol r()
    • w

      public static Symbol w()
    • s

      public static Symbol s()
    • y

      public static Symbol y()
    • k

      public static Symbol k()
    • v

      public static Symbol v()
    • h

      public static Symbol h()
    • d

      public static Symbol d()
    • b

      public static Symbol b()
    • getDNA

      public static FiniteAlphabet getDNA()
      Return the DNA alphabet.
      Returns:
      a flyweight version of the DNA alphabet
    • getDNAxDNA

      public static FiniteAlphabet getDNAxDNA()
      Gets the (DNA x DNA) Alphabet.
      Returns:
      a flyweight version of the (DNA x DNA) alphabet
    • getCodonAlphabet

      Gets the (DNA x DNA x DNA) Alphabet.
      Returns:
      a flyweight version of the (DNA x DNA x DNA) alphabet
    • createDNA

      public static SymbolList createDNA(String dna) throws IllegalSymbolException
      Return a new DNA SymbolList for dna.
      Parameters:
      dna - a String to parse into DNA
      Returns:
      a SymbolList created from dna
      Throws:
      IllegalSymbolException - if dna contains any non-DNA characters
    • createDNASequence

      public static Sequence createDNASequence(String dna, String name) throws IllegalSymbolException
      Return a new DNA Sequence for dna.
      Parameters:
      dna - a String to parse into DNA
      name - a String to use as the name
      Returns:
      a Sequence created from dna
      Throws:
      IllegalSymbolException - if dna contains any non-DNA characters
    • createGappedDNASequence

      Get a new DNA sequence as a GappedSequence.
      Throws:
      IllegalSymbolException
    • index

      public static int index(Symbol sym) throws IllegalSymbolException
      Return an integer index for a symbol - compatible with forIndex.

      The index for a symbol is stable across virtual machines and invocations.

      Parameters:
      sym - the Symbol to index
      Returns:
      the index for that symbol
      Throws:
      IllegalSymbolException - if sym is not a member of the DNA alphabet
    • forIndex

      public static Symbol forIndex(int index) throws IndexOutOfBoundsException
      Return the symbol for an index - compatible with index.

      The index for a symbol is stable across virtual machines and invocations.

      Parameters:
      index - the index to look up
      Returns:
      the symbol at that index
      Throws:
      IndexOutOfBoundsException - if index is not between 0 and 3
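
      The round trip between index and forIndex can be pictured with a standalone sketch. It does not use BioJava: the class name IndexSketch, the use of char in place of Symbol, and the a, g, c, t ordering are all assumptions made for illustration, so check the library before relying on particular index values.

```java
// Standalone sketch, not BioJava code: a stable base <-> index mapping.
// The ordering "agct" is an assumption made for illustration only.
final class IndexSketch {
    private static final String ORDER = "agct";

    // Analogue of index(Symbol): rejects anything that is not one of the four bases.
    static int index(char base) {
        int i = ORDER.indexOf(Character.toLowerCase(base));
        if (i < 0) {
            throw new IllegalArgumentException("Not an unambiguous DNA base: " + base);
        }
        return i;
    }

    // Analogue of forIndex(int): only 0 to 3 are valid.
    static char forIndex(int index) {
        if (index < 0 || index >= ORDER.length()) {
            throw new IndexOutOfBoundsException("Index must be between 0 and 3: " + index);
        }
        return ORDER.charAt(index);
    }

    public static void main(String[] args) {
        for (int i = 0; i < ORDER.length(); i++) {
            // forIndex followed by index returns the integer we started with.
            System.out.println(i + " -> " + forIndex(i) + " -> " + index(forIndex(i)));
        }
    }
}
```

      Because the mapping is a fixed constant rather than something derived at run time, the same base always gets the same integer, which is the property the description above calls stable.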
    • complement

      public static Symbol complement(Symbol sym) throws IllegalSymbolException
      Complement the symbol.
      Parameters:
      sym - the symbol to complement
      Returns:
      a Symbol that is the complement of sym
      Throws:
      IllegalSymbolException - if sym is not a member of the DNA alphabet
    • forSymbol

      public static Symbol forSymbol(char token) throws IllegalSymbolException
      Retrieve the symbol for a character token.
      Parameters:
      token - the char to look up
      Returns:
      the symbol for that char
      Throws:
      IllegalSymbolException - if the char is not a valid IUB DNA code
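
      For readers unfamiliar with the IUB codes that forSymbol accepts, the following standalone sketch lists the bases each single-letter code stands for. It is plain Java rather than BioJava code: the class IubCodeSketch and the method basesFor are hypothetical names, and a String of bases stands in for the library's Symbol objects.

```java
// Standalone sketch, not BioJava code: which bases each IUB DNA code matches.
final class IubCodeSketch {

    // Returns the unambiguous bases matched by an IUB code, in alphabetical order.
    static String basesFor(char token) {
        switch (Character.toLowerCase(token)) {
            case 'a': return "a";
            case 'c': return "c";
            case 'g': return "g";
            case 't': return "t";
            case 'm': return "ac";   // amino
            case 'r': return "ag";   // purine
            case 'w': return "at";   // weak
            case 's': return "cg";   // strong
            case 'y': return "ct";   // pyrimidine
            case 'k': return "gt";   // keto
            case 'v': return "acg";  // not t
            case 'h': return "act";  // not g
            case 'd': return "agt";  // not c
            case 'b': return "cgt";  // not a
            case 'n': return "acgt"; // any base
            default:
                throw new IllegalArgumentException("Not a valid IUB DNA code: " + token);
        }
    }

    public static void main(String[] args) {
        for (char code : "acgtmrwsykvhdbn".toCharArray()) {
            System.out.println(code + " = " + basesFor(code));
        }
    }
}
```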
    • complement

      Retrieve a complement view of list.
      Parameters:
      list - the SymbolList to complement
      Returns:
      a SymbolList that is the complement
      Throws:
      IllegalAlphabetException - if list is not a complementable alphabet
    • reverseComplement

      Retrieve a reverse-complement view of list.
      Parameters:
      list - the SymbolList to reverse-complement
      Returns:
      a SymbolList that is the reverse complement of list
      Throws:
      IllegalAlphabetException - if list is not a complementable alphabet
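
      The difference between a complement and a reverse complement is easy to see on plain strings. The sketch below is standalone Java, not the library's implementation: ComplementSketch is a hypothetical class, it works on lowercase Strings instead of SymbolList views, and it builds new strings rather than returning views.

```java
// Standalone sketch, not BioJava code: complement and reverse complement on
// lowercase strings, including the IUB ambiguity codes.
final class ComplementSketch {
    // Each character in FROM is complemented by the character at the same
    // position in TO (a<->t, c<->g, m<->k, r<->y, v<->b, h<->d; w, s, n map to themselves).
    private static final String FROM = "acgtmrwsykvhdbn";
    private static final String TO   = "tgcakywsrmbdhvn";

    static char complement(char base) {
        int i = FROM.indexOf(base);
        if (i < 0) {
            throw new IllegalArgumentException("Not a DNA code: " + base);
        }
        return TO.charAt(i);
    }

    // Complements every position and keeps the original order.
    static String complement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        for (int i = 0; i < dna.length(); i++) {
            out.append(complement(dna.charAt(i)));
        }
        return out.toString();
    }

    // Complements every position and reverses the order.
    static String reverseComplement(String dna) {
        return new StringBuilder(complement(dna)).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(complement("aacg"));        // ttgc
        System.out.println(reverseComplement("aacg")); // cgtt
    }
}
```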
    • flip

      Returns a SymbolList that is reverse complemented if the strand is negative, and the original one if it is not.
      Parameters:
      list - the SymbolList to view
      strand - the Strand to use
      Returns:
      the appropriate view of the SymbolList
      Throws:
      IllegalAlphabetException - if list is not a complementable alphabet
    • complementTable

      Get a translation table for complementing DNA symbols.
      Since:
      1.1
    • dnaToken

      public static char dnaToken(Symbol sym) throws IllegalSymbolException
      Get a single-character token for a DNA symbol.
      Throws:
      IllegalSymbolException - if sym is not a member of the DNA alphabet
    • getDNADistribution

      public static Distribution getDNADistribution(double fractionGC)
      Return a SimpleDistribution with the specified GC content.
      Parameters:
      fractionGC - (G+C) content as a fraction.
    • getDNAxDNADistribution

      public static Distribution getDNAxDNADistribution(double fractionGC0, double fractionGC1)
      Return a (DNA x DNA) cross-product Distribution with the specified GC content in each component Alphabet.
      Parameters:
      fractionGC0 - (G+C) content of first sequence as a fraction.
      fractionGC1 - (G+C) content of second sequence as a fraction.
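
      One way to picture the two GC-content distributions is the standalone sketch below. It rests on two assumptions that the documentation does not state: that the GC fraction is split evenly between g and c (and the remainder evenly between a and t), and that the two components of the cross product are independent, so each pair weight is the product of the single-base weights. GcDistributionSketch and its methods are hypothetical names, and plain maps stand in for the library's Distribution type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone sketch, not BioJava code: base weights for a given GC fraction.
final class GcDistributionSketch {

    // Assumption: g and c share fractionGC equally; a and t share the rest equally.
    static Map<Character, Double> single(double fractionGC) {
        if (fractionGC < 0.0 || fractionGC > 1.0) {
            throw new IllegalArgumentException("fractionGC must be in [0, 1]: " + fractionGC);
        }
        double gc = fractionGC / 2.0;
        double at = (1.0 - fractionGC) / 2.0;
        Map<Character, Double> weights = new LinkedHashMap<>();
        weights.put('a', at);
        weights.put('c', gc);
        weights.put('g', gc);
        weights.put('t', at);
        return weights;
    }

    // Assumption: the two positions are independent, so weights multiply.
    static Map<String, Double> pair(double fractionGC0, double fractionGC1) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<Character, Double> first : single(fractionGC0).entrySet()) {
            for (Map.Entry<Character, Double> second : single(fractionGC1).entrySet()) {
                String key = "" + first.getKey() + second.getKey();
                weights.put(key, first.getValue() * second.getValue());
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        System.out.println(single(0.5));
        System.out.println(pair(0.4, 0.7));
    }
}
```

      Under these assumptions the four single-base weights sum to 1, and so do the sixteen pair weights.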
    • toRNA

      public static SymbolList toRNA(SymbolList syms) throws IllegalAlphabetException
      Converts a SymbolList from the DNA Alphabet to the RNA Alphabet.
      Parameters:
      syms - the SymbolList to convert to RNA
      Returns:
      a view on syms where Symbols have been converted to RNA. Most significantly, t's are now u's. The 5' to 3' order of the Symbols is conserved.
      Throws:
      IllegalAlphabetException - if syms is not DNA.
      Since:
      1.4
    • transcribeToRNA

      Transcribes DNA to RNA. This method represents the biological reality more closely than toRNA(SymbolList syms) does. The supplied DNA SymbolList is assumed to be the template strand in the 5' to 3' orientation. The resulting RNA is transcribed from this template, so it is effectively the reverse complement in the RNA alphabet. The method is equivalent to calling reverseComplement() and toRNA() in sequence.

      If you are dealing with cDNA sequences that you want converted to RNA, you would be better off calling toRNA(SymbolList syms).

      Parameters:
      syms - the SymbolList to convert to RNA
      Returns:
      a view on syms where Symbols have been converted to RNA.
      Throws:
      IllegalAlphabetException - if syms is not DNA.
      Since:
      1.4
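
      The contrast between toRNA and transcribeToRNA can be sketched on plain strings. This is standalone Java, not BioJava code: TranscriptionSketch is a hypothetical class, input is assumed to be lowercase a, c, g, t only, and new strings are built where the library returns views.

```java
// Standalone sketch, not BioJava code: direct conversion versus transcription.
final class TranscriptionSketch {

    // Analogue of toRNA: same order, t replaced by u. No validation is done here.
    static String toRNA(String dna) {
        return dna.replace('t', 'u');
    }

    private static char complement(char base) {
        switch (base) {
            case 'a': return 't';
            case 'c': return 'g';
            case 'g': return 'c';
            case 't': return 'a';
            default:
                throw new IllegalArgumentException("Not an unambiguous DNA base: " + base);
        }
    }

    static String reverseComplement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            out.append(complement(dna.charAt(i)));
        }
        return out.toString();
    }

    // Analogue of transcribeToRNA: treat the input as the 5' to 3' template
    // strand, reverse complement it, then convert to the RNA alphabet.
    static String transcribeToRNA(String template) {
        return toRNA(reverseComplement(template));
    }

    public static void main(String[] args) {
        System.out.println(toRNA("atgc"));           // augc
        System.out.println(transcribeToRNA("atgc")); // gcau
    }
}
```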
    • toProtein

      Convenience method that directly converts a DNA sequence to RNA then to protein. The translated protein is from the +1 reading frame of the SymbolList. The whole SymbolList is translated, although up to 2 DNA residues may be truncated if full codons cannot be formed.
      Parameters:
      syms - the sequence to be translated.
      Returns:
      the translated protein sequence.
      Throws:
      IllegalAlphabetException - if syms is not from the DNA alphabet.
      Since:
      1.5.1
    • toProtein

      public static SymbolList toProtein(SymbolList syms, int start, int end) throws IllegalAlphabetException
      Convenience method to translate a region of a DNA sequence directly into protein. The start and end can be specified, but if the length of the specified region is not evenly divisible by three, the translated region is truncated until a full terminal codon can be formed.
      Parameters:
      syms - the DNA sequence to be translated.
      start - the location to begin translation.
      end - the end of the translated region.
      Returns:
      the translated protein sequence.
      Throws:
      IllegalAlphabetException - if syms is not from the DNA alphabet.
      Since:
      1.5.1
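
      The truncation rule described for both toProtein variants can be sketched as follows. This is standalone Java, not BioJava code: CodonSketch is a hypothetical class, it stops at splitting the region into codons rather than translating them, and it assumes 1-based inclusive start and end coordinates.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch, not BioJava code: which codons of a region would be translated.
final class CodonSketch {

    // Assumption: start and end are 1-based and inclusive.
    static List<String> codons(String dna, int start, int end) {
        if (start < 1 || end > dna.length() || start > end) {
            throw new IndexOutOfBoundsException("Bad region: " + start + ".." + end);
        }
        String region = dna.substring(start - 1, end);
        // Drop the 1 or 2 trailing residues that cannot form a full codon.
        int usable = region.length() - (region.length() % 3);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < usable; i += 3) {
            result.add(region.substring(i, i + 3));
        }
        return result;
    }

    // Whole-sequence variant: the +1 reading frame of the full string.
    static List<String> codons(String dna) {
        return dna.isEmpty() ? new ArrayList<>() : codons(dna, 1, dna.length());
    }

    public static void main(String[] args) {
        System.out.println(codons("atggcctaaga"));       // [atg, gcc, taa]
        System.out.println(codons("atggcctaaga", 2, 8)); // [tgg, cct]
    }
}
```

      In the first call the sequence has 11 residues, so the final 2 are dropped. In the second, the 7-residue region yields two codons and the last residue is dropped.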