Class Utils


  • public class Utils
    extends java.lang.Object
    Version:
    $Id: Utils.java 918 2008-06-04 01:28:08Z twobeers $
    Author:
    Andrew Rambaut, Alexei Drummond
    • Method Detail

      • translate

        public static Sequence translate​(Sequence sequence,
                                         GeneticCode geneticCode)
        Translates the given Sequence into the corresponding amino acid Sequence under the given genetic code. This is a convenience method that calls AminoAcidState[] translate(final State[] states, GeneticCode geneticCode).
        Parameters:
        sequence - the Sequence.
        geneticCode - the genetic code to use for the translation
        Returns:
        the translated Sequence
      • translate

        public static Sequence translate​(Sequence sequence,
                                         GeneticCode geneticCode,
                                         int readingFrame)
        Translates the given Sequence into the corresponding amino acid Sequence under the given genetic code. This is a convenience method that delegates to the State[]-based translate method.
        Parameters:
        sequence - the Sequence.
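        geneticCode - the genetic code to use for the translation
        readingFrame - the reading frame in which to translate
        Returns:
        the translated Sequence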
      • translate

        public static AminoAcidState[] translate​(State[] states,
                                                 GeneticCode geneticCode)
        Translates each of a given sequence of NucleotideStates or CodonStates to the AminoAcidState corresponding to it under the given genetic code. Translation doesn't stop at stop codons; these are translated to AminoAcids.STOP_STATE. If translating from NucleotideState and the number of states is not a multiple of 3, then the excess states at the end are silently dropped.
        Parameters:
        states - States to translate; must all be of the same type, either NucleotideState or CodonState.
        geneticCode - the genetic code to use for the translation
        Returns:
        the translated AminoAcidStates
      • translate

        public static AminoAcidState[] translate​(State[] states,
                                                 GeneticCode geneticCode,
                                                 int readingFrame)
        Translates each of a given sequence of NucleotideStates or CodonStates to the AminoAcidState corresponding to it under the given genetic code. Translation doesn't stop at stop codons; these are translated to AminoAcids.STOP_STATE. If translating from NucleotideState and the number of states is not a multiple of 3, then the excess states at the end are silently dropped.
        Parameters:
        states - States to translate; must all be of the same type, either NucleotideState or CodonState.
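        geneticCode - the genetic code to use for the translation
        readingFrame - the reading frame in which to translate
        Returns:
        the translated AminoAcidStates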
      • isPredominantlyRNA

        public static boolean isPredominantlyRNA​(java.lang.CharSequence sequenceString,
                                                 int maximumNonGapsToLookAt)
        Is the given nucleotide sequence predominantly RNA, i.e. does it contain more occurrences of "U" than of "T"?
        Parameters:
        sequenceString - the sequence string to inspect to determine if it's RNA
        maximumNonGapsToLookAt - for performance reasons, only look at a maximum of this many non-gap residues in deciding if the sequence is predominantly RNA. Can be -1 or Integer.MAX_VALUE to look at the entire sequence.
        Returns:
        true if the given nucleotide sequence is predominantly RNA
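The counting rule described for isPredominantlyRNA can be sketched as follows. This is a minimal illustration, not the library's implementation: the class and method names are hypothetical, '-' is assumed to be the only gap character, and any negative limit (not just -1) is treated as "look at the whole sequence".

```java
// Illustrative sketch only; names and gap handling are assumptions.
final class RnaGuessSketch {
    /**
     * Returns true if 'U' occurs more often than 'T' (case-insensitive)
     * among the first maxNonGaps non-gap characters of seq.
     * A negative maxNonGaps means "no limit".
     */
    static boolean isPredominantlyRna(CharSequence seq, int maxNonGaps) {
        int limit = maxNonGaps < 0 ? Integer.MAX_VALUE : maxNonGaps;
        int uCount = 0;
        int tCount = 0;
        int seen = 0; // non-gap residues inspected so far
        for (int i = 0; i < seq.length() && seen < limit; i++) {
            char c = Character.toUpperCase(seq.charAt(i));
            if (c == '-') {
                continue; // gaps do not count toward the limit
            }
            seen++;
            if (c == 'U') {
                uCount++;
            } else if (c == 'T') {
                tCount++;
            }
        }
        return uCount > tCount;
    }

    public static void main(String[] args) {
        System.out.println(isPredominantlyRna("ACGUUGCA", -1));
        System.out.println(isPredominantlyRna("--UU--TTTT", 2));
    }
}
```

With a limit of 2, the sketch inspects only the two leading "U" residues of "--UU--TTTT" and answers true; with no limit it sees four "T" residues and answers false, which shows how a small limit trades accuracy for speed.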
      • reverseComplement

        public static java.lang.String reverseComplement​(java.lang.String nucleotideSequence)
      • reverseComplementWithGaps

        public static java.lang.String reverseComplementWithGaps​(java.lang.String nucleotideSequence)
      • translateCharSequence

        public static java.lang.String translateCharSequence​(java.lang.CharSequence nucleotideSequence,
                                                             GeneticCode geneticCode)
        Translates the given nucleotideSequence into an amino acid sequence string, using the given geneticCode. The translation is done triplet by triplet, starting with the triplet at index 0..2 in nucleotideSequence, then the one at index 3..5, and so on, until fewer than 3 nucleotides are left.

        This method uses translate(State[],GeneticCode) to do the translation, hence it shares some properties with that method: 1.) Any excess nucleotides at the end will be silently discarded, 2.) Translation doesn't stop at stop codons; instead, they are translated to "*", which is AminoAcids.STOP_STATE's code.

        Parameters:
        nucleotideSequence - nucleotide sequence to translate
        geneticCode - genetic code to use for the translation
        Returns:
        A string with length nucleotideSequence.length() / 3 (rounded down), the translation of nucleotideSequence with the given genetic code
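The triplet-by-triplet behaviour described above can be sketched without the library. The following is an illustration under stated assumptions, not the actual implementation: it hard-codes the standard genetic code instead of taking a GeneticCode argument, treats "U" as "T", and maps any codon containing another character to "X"; the class and method names are hypothetical.

```java
// Illustrative sketch only; the real method takes a GeneticCode argument.
final class TripletTranslationSketch {
    // Base order used to index the code table below.
    private static final String BASES = "TCAG";
    // Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
    private static final String STANDARD_CODE =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    static String translate(CharSequence nucleotides) {
        StringBuilder protein = new StringBuilder(nucleotides.length() / 3);
        // Stop as soon as fewer than 3 nucleotides remain; the excess is dropped.
        for (int i = 0; i + 3 <= nucleotides.length(); i += 3) {
            int index = 0;
            boolean known = true;
            for (int j = 0; j < 3; j++) {
                char c = Character.toUpperCase(nucleotides.charAt(i + j));
                int base = BASES.indexOf(c == 'U' ? 'T' : c);
                if (base < 0) {
                    known = false; // assumption: unknown codons become 'X'
                } else {
                    index = index * 4 + base;
                }
            }
            // Stop codons map to '*' and translation simply continues.
            protein.append(known ? STANDARD_CODE.charAt(index) : 'X');
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ATGGCCTAAGG"));
    }
}
```

For the 11-nucleotide input "ATGGCCTAAGG" the sketch returns "MA*": three complete codons are translated, the stop codon TAA appears as "*", and the trailing "GG" is discarded, giving a result of length 11 / 3 rounded down.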
      • translate

        public static java.lang.String translate​(java.lang.String nucleotideSequence,
                                                 GeneticCode geneticCode)
        A wrapper for translateCharSequence(CharSequence,GeneticCode) that takes a nucleotide sequence as a String only rather than a CharSequence. This is to preserve backwards compatibility with existing compiled code.
        Parameters:
        nucleotideSequence - nucleotide sequence string to translate
        geneticCode - genetic code to use for the translation
        Returns:
        A string with length nucleotideSequence.length() / 3 (rounded down), the translation of nucleotideSequence with the given genetic code
      • stripGaps

        public static State[] stripGaps​(State[] sequence)
        Strips a sequence of gaps
        Parameters:
        sequence - the sequence
        Returns:
        the stripped sequence
      • stripStates

        public static State[] stripStates​(State[] sequence,
                                          java.util.List<State> stripStates)
        Strips all occurrences of the given states from a sequence
        Parameters:
        sequence - the sequence
        stripStates - the states to strip
        Returns:
        an array of states
      • replaceStates

        public static State[] replaceStates​(State[] sequence,
                                            java.util.List<State> searchStates,
                                            State replaceState)
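        Searches a sequence for any of the given states and replaces each occurrence with the given replacement state
        Parameters:
        sequence - the sequence
        searchStates - the states to search for
        replaceState - the state to substitute for each match
        Returns:
        an array of states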
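The filtering done by stripStates and the substitution done by replaceStates can be illustrated on plain strings. This is only a sketch: characters stand in for State objects, the class and method names are hypothetical, and it assumes a new sequence is returned rather than the input being modified.

```java
// Illustrative sketch only; characters stand in for State objects.
final class StateFilterSketch {
    /** Returns a copy of sequence without any character listed in statesToStrip. */
    static String strip(String sequence, String statesToStrip) {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (statesToStrip.indexOf(c) < 0) {
                result.append(c); // keep everything that is not being stripped
            }
        }
        return result.toString();
    }

    /** Returns a copy of sequence with every character in searchStates replaced. */
    static String replace(String sequence, String searchStates, char replaceState) {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            result.append(searchStates.indexOf(c) >= 0 ? replaceState : c);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("AC-G?T", "-?"));
        System.out.println(replace("AC-G?T", "-?", 'N'));
    }
}
```

Stripping shortens the sequence ("AC-G?T" becomes "ACGT"), whereas replacing preserves its length ("AC-G?T" becomes "ACNGNT"), which matters when site positions must stay aligned.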
      • reverse

        public static State[] reverse​(State[] sequence)
      • getStateIndices

        public static byte[] getStateIndices​(State[] sequence)
      • getGaplessLocation

        public static int getGaplessLocation​(Sequence sequence,
                                             int gappedLocation)
        Gets the site location index for this sequence excluding any gaps. The location is indexed from 0.
        Parameters:
        sequence - the sequence
        gappedLocation - the location including gaps
        Returns:
        the location without gaps.
      • getGappedLocation

        public static int getGappedLocation​(Sequence sequence,
                                            int gaplessLocation)
        Gets the site location index for this sequence that corresponds to a location given excluding all gaps. The first non-gapped site in the sequence has a gaplessLocation of 0.
        Parameters:
        sequence - the sequence
        gaplessLocation - the location excluding gaps
        Returns:
        the site location including gaps
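The relationship between gapped and gapless locations can be sketched on a plain string. The code below is an illustration, not the library's implementation: the names are hypothetical, '-' is assumed to be the only gap character, and two behaviours are assumptions of the sketch: a gapped location that falls on a gap maps to the number of non-gap sites before it, and an out-of-range gapless location yields -1.

```java
// Illustrative sketch only; edge-case behaviour is an assumption.
final class GapLocationSketch {
    private static final char GAP = '-';

    /** Counts the non-gap sites that precede gappedLocation (0-based). */
    static int gaplessLocation(CharSequence seq, int gappedLocation) {
        int gapless = 0;
        for (int i = 0; i < gappedLocation; i++) {
            if (seq.charAt(i) != GAP) {
                gapless++;
            }
        }
        return gapless;
    }

    /**
     * Finds the index in seq of the non-gap site whose gap-free index is
     * gaplessLocation (0-based), or -1 if there are too few non-gap sites.
     */
    static int gappedLocation(CharSequence seq, int gaplessLocation) {
        int gapless = -1; // gap-free index of the last non-gap site seen
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) != GAP) {
                gapless++;
                if (gapless == gaplessLocation) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String aligned = "--AC-GT";
        System.out.println(gaplessLocation(aligned, 5));
        System.out.println(gappedLocation(aligned, 2));
    }
}
```

In "--AC-GT" the "G" sits at gapped location 5 and gapless location 2, so the two functions are inverses of each other on non-gap sites.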
      • guessSequenceType

        public static SequenceType guessSequenceType​(java.lang.CharSequence seq)
        Guess type of sequence from contents.
        Parameters:
        seq - the sequence
        Returns:
        SequenceType.NUCLEOTIDE or SequenceType.AMINO_ACID, if sequence is believed to be of that type. If the sequence contains characters that are valid for neither of these two sequence types, then this method returns null.
      • getStopCodonCount

        public static int getStopCodonCount​(Sequence sequence)
        Counts the number of stop codons in an amino acid sequence
        Parameters:
        sequence - the amino acid sequence in which to count stop codons
        Returns:
        the number of stop codons
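Because translation represents a stop codon as "*" in the amino acid string, counting stop codons in a translated sequence amounts to counting that character. A minimal sketch on a plain string follows; the names are hypothetical, and the real method operates on a Sequence rather than a string.

```java
// Illustrative sketch only; the real method takes a Sequence.
final class StopCountSketch {
    /** Counts the '*' characters in an amino acid string. */
    static int countStops(CharSequence aminoAcids) {
        int count = 0;
        for (int i = 0; i < aminoAcids.length(); i++) {
            if (aminoAcids.charAt(i) == '*') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countStops("MA*KL*"));
    }
}
```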
      • cleanSequence

        public static State[] cleanSequence​(java.lang.CharSequence seq,
                                            SequenceType type)
        Produce a clean sequence filtered of spaces and digits.
        Parameters:
        seq - the sequence
        type - the sequence type
        Returns:
        An array of valid states of SequenceType (may be shorter than the original sequence)
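The filtering step described for cleanSequence can be sketched on a plain string. This is an illustration only: it removes whitespace and digits and nothing else, ignores the SequenceType argument, and returns a String rather than a State[]; the class and method names are hypothetical.

```java
// Illustrative sketch only; the real method also takes a SequenceType
// and returns State[].
final class CleanSequenceSketch {
    /** Returns raw with all whitespace and digit characters removed. */
    static String clean(CharSequence raw) {
        StringBuilder cleaned = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (Character.isWhitespace(c) || Character.isDigit(c)) {
                continue; // drop layout characters such as line numbers and spacing
            }
            cleaned.append(c);
        }
        return cleaned.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("  1 ACGT 61 TTGA\n"));
    }
}
```

This kind of cleaning is useful for sequences pasted from formatted listings, where position numbers and spacing are interleaved with residues; the result ("ACGTTTGA" for the input above) may be shorter than the input.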
      • toString

        public static java.lang.String toString​(State[] states)