Class HgvsProtein

java.lang.Object
org.snpeff.snpEffect.Hgvs
org.snpeff.snpEffect.HgvsProtein

public class HgvsProtein extends Hgvs
Coding change in HGVS notation (amino acid changes). References: http://www.hgvs.org/mutnomen/recs.html
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static boolean
    debug
     

    Fields inherited from class org.snpeff.snpEffect.Hgvs

    duplication, genome, hgvsTrId, marker, MAX_SEQUENCE_LEN_HGVS, strandMinus, strandPlus, tr, variant, variantEffect
  • Constructor Summary

    Constructors
    Constructor
    Description
    HgvsProtein(VariantEffect variantEffect)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    protected String
    aaCode(char aa1Letter)
     
    protected String
    aaCode(String aa1Letter)
    Use one-letter / three-letter amino acid (AA) codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
    protected String
    del()
    Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted separated by a "_" (underscore).
    protected String
    delins()
    Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues.
    protected String
    dup()
    Duplications
    protected String
    fs()
    Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11).
    protected String
    ins()
    Insertions. Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication).
    protected boolean
    isDuplication()
    Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins ...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)
    protected String
    pos(int codonNum)
    Protein position
    protected String
    pos(int start, int end)
     
    protected String
    pos(Transcript tr, int codonNum)
    Protein position
    protected String
    pos(Transcript tr, int start, int end)
    Position string given two coordinates
    protected String
    posDel()
    Position for deletions
    protected String
    posDelIns()
    Position for 'delins'
    protected String
    posDup()
    Position for 'duplications' (a special kind of insertion)
    protected String
    posFs()
    Frame shifts ....are described using ...
    protected String
    posIns()
    Position for insertions
    protected String
    posSnpOrMnp()
    Position: SNP or MNP
    protected String
    snpOrMnp()
    SNP or MNP changes
    String
    toString()
     
    protected String
    translocation()
    Translocation nomenclature.
    protected String
    typeOfReference()
    Return "p." string with/without transcript ID, according to user command line options.

    Methods inherited from class org.snpeff.snpEffect.Hgvs

    initStrand, parseTranscript, removeTranscript

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Field Details

    • debug

      public static boolean debug
  • Constructor Details

    • HgvsProtein

      public HgvsProtein(VariantEffect variantEffect)

  • Method Details

    • aaCode

      protected String aaCode(char aa1Letter)
    • aaCode

      protected String aaCode(String aa1Letter)
      Use one-letter / three-letter amino acid (AA) codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
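      The conversion described above can be sketched as follows. This is a minimal illustration, not SnpEff's implementation: the class name AaCodeSketch and the method name threeLetter are hypothetical, the table holds the standard 20 amino acids, and mapping "*" to "Ter" is an assumption based on the HGVS convention quoted in this page.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.aaCode. */
final class AaCodeSketch {
    // One-letter codes and their three-letter equivalents, index-aligned.
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV*";
    private static final String[] THREE = {
        "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
        "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
        "Ter" // assumption: "*" (stop) is rendered as "Ter"
    };

    /** Converts a single one-letter code to its three-letter code. */
    static String threeLetter(char aa) {
        int i = ONE.indexOf(Character.toUpperCase(aa));
        if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return THREE[i];
    }

    /** Converts a string of one-letter codes, residue by residue. */
    static String threeLetter(String aas) {
        StringBuilder sb = new StringBuilder();
        for (char aa : aas.toCharArray()) sb.append(threeLetter(aa));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(threeLetter('W'));   // Trp
        System.out.println(threeLetter("QSK")); // GlnSerLys
    }
}
```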
    • del

      protected String del()
      Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore). Deletions remove either a small internal segment of the protein (in-frame deletion), part of the N-terminus of the protein (initiation codon change) or the entire C-terminal part of the protein (nonsense change). A nonsense change is a special type of deletion removing the entire C-terminal part of a protein starting at the site of the variant (specified 2013-03-16).

      1) In-frame deletions are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore).
         p.Gln8del in the sequence MKMGHQQQCC denotes a Glutamine-8 (Gln, Q) deletion to MKMGHQQCC
         p.(Cys28_Met30del) denotes that neither RNA nor protein was analysed but the predicted change is a deletion of three amino acids, from Cysteine-28 to Methionine-30

      2) Initiating methionine change (Met1) causing an N-terminal deletion (see Discussion, see Examples). NOTE: changes extending the N-terminal protein sequence are described as an extension.
         p.0 - no protein is produced (experimental data should be available). NOTE: this change is not described as p.Met1_Leu833del, i.e. as a deletion removing the entire protein coding sequence
         p.Met1? - denotes that amino acid Methionine-1 (translation initiation site) is changed and that it is unclear what the consequence of this change is
         p.Met1_Lys45del - a new translation initiation site is activated (at Met46)

      3) Nonsense variants are a special type of amino acid deletion removing the entire C-terminal part of a protein starting at the site of the variant. A nonsense change is described using the format p.Trp26Ter (alternatively p.Trp26*). The description does not include the deletion at protein level from the site of the change to the C-terminal end of the protein (stop codon), like p.Trp26_Leu833del (the deletion of amino acid residue Trp26 to the last amino acid of the protein, Leu833).
         p.(Trp26Ter) indicates that neither RNA nor protein was analysed but amino acid Tryptophan26 (Trp, W) is predicted to change to a stop codon (Ter) (alternatively p.(W26*) or p.(Trp26*))
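      The in-frame deletion and nonsense formats above can be sketched as plain string building. This is an illustration under stated assumptions, not the body of del(): the class DelSketch and its methods are hypothetical, positions are 1-based and inclusive, and the protein is passed as a one-letter sequence.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.del. */
final class DelSketch {
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV*";
    private static final String[] THREE = {
        "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
        "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val", "Ter"
    };

    static String three(char aa) {
        int i = ONE.indexOf(aa);
        if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return THREE[i];
    }

    /** In-frame deletion of residues first..last (1-based, inclusive). */
    static String del(String protein, int first, int last) {
        String start = three(protein.charAt(first - 1)) + first;
        if (first == last) return "p." + start + "del";
        String end = three(protein.charAt(last - 1)) + last;
        return "p." + start + "_" + end + "del";
    }

    /** Nonsense change: only the residue replaced by the stop is named. */
    static String nonsense(char refAa, int pos) {
        return "p." + three(refAa) + pos + "Ter";
    }

    public static void main(String[] args) {
        System.out.println(del("MKMGHQQQCC", 8, 8));  // p.Gln8del
        System.out.println(del("MKMGHQQQCC", 9, 10)); // p.Cys9_Cys10del
        System.out.println(nonsense('W', 26));        // p.Trp26Ter
    }
}
```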
    • delins

      protected String delins()
      Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues. Deletion/insertions are described using "delins" as a deletion followed by an insertion after an indication of the amino acid(s) flanking the site of the deletion/insertion, separated by a "_" (underscore, see Discussion).

      Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions either use a short ("fs") or long ("fsTer#") description. The description of frame shifts does not include the deletion at protein level from the site of the frame shift to the natural end of the protein (stop codon). The inserted amino acid residues are not described; only the total length of the new shifted frame is given (i.e. including the first amino acid changed).
    • dup

      protected String dup()
      Duplications
    • fs

      protected String fs()
      Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions either use a short ("fs") or long ("fsTer#") description.
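      The short and long forms can be sketched as below. This is an assumption-laden illustration rather than what fs() returns: the class FsSketch and its methods are hypothetical, residues are passed as three-letter codes, and the long form places the new first residue before "fsTer#" as in the HGVS recommendations. The example residues and numbers are made up for illustration.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.fs. */
final class FsSketch {
    /** Short form: first affected residue and position, followed by "fs". */
    static String fsShort(String refAa3, int pos) {
        return "p." + refAa3 + pos + "fs";
    }

    /**
     * Long form: also names the new residue at that position and the length
     * of the shifted frame up to and including the new stop codon.
     */
    static String fsLong(String refAa3, int pos, String newAa3, int lengthToStop) {
        return "p." + refAa3 + pos + newAa3 + "fsTer" + lengthToStop;
    }

    public static void main(String[] args) {
        System.out.println(fsShort("Arg", 97));           // p.Arg97fs
        System.out.println(fsLong("Arg", 97, "Pro", 23)); // p.Arg97ProfsTer23
    }
}
```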
    • ins

      protected String ins()
      Insertions. Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication). Insertions are described using "ins" after an indication of the amino acids flanking the insertion site, separated by a "_" (underscore) and followed by a description of the amino acid(s) inserted. Since for large insertions the amino acids can be derived from the DNA and/or RNA descriptions, they need not be described exactly but the total number may be given (like "ins17").

      Examples:
      1) p.Lys2_Met3insGlnSerLys denotes that the sequence GlnSerLys (QSK) was inserted between amino acids Lysine-2 (Lys, K) and Methionine-3 (Met, M), changing MKMGHQQQCC to MKQSKMGHQQQCC
      2) p.Trp182_Gln183ins17 describes a variant that inserts 17 amino acids between amino acids Trp182 and Gln183. NOTE: it must be possible to deduce the 17 inserted amino acids from the description given at DNA or RNA level
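      The first example can be reproduced with a short sketch. It is illustrative only and does not mirror ins(): the class InsSketch and its methods are hypothetical, the protein is a one-letter sequence, and posBefore is the 1-based position of the residue immediately before the insertion site.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.ins. */
final class InsSketch {
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV*";
    private static final String[] THREE = {
        "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
        "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val", "Ter"
    };

    static String three(String aas) {
        StringBuilder sb = new StringBuilder();
        for (char aa : aas.toCharArray()) {
            int i = ONE.indexOf(aa);
            if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
            sb.append(THREE[i]);
        }
        return sb.toString();
    }

    /** Describes inserting `inserted` between posBefore and posBefore + 1. */
    static String ins(String protein, int posBefore, String inserted) {
        String left = three(protein.substring(posBefore - 1, posBefore)) + posBefore;
        String right = three(protein.substring(posBefore, posBefore + 1)) + (posBefore + 1);
        return "p." + left + "_" + right + "ins" + three(inserted);
    }

    /** Applies the same insertion to the one-letter sequence. */
    static String apply(String protein, int posBefore, String inserted) {
        return protein.substring(0, posBefore) + inserted + protein.substring(posBefore);
    }

    public static void main(String[] args) {
        System.out.println(ins("MKMGHQQQCC", 2, "QSK"));   // p.Lys2_Met3insGlnSerLys
        System.out.println(apply("MKMGHQQQCC", 2, "QSK")); // MKQSKMGHQQQCC
    }
}
```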
    • isDuplication

      protected boolean isDuplication()
      Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins ...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)
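      The tandem-duplication rule quoted above can be sketched as a simple sequence comparison: an insertion counts as a duplication only if the inserted residues equal the residues that end immediately before the insertion site. This is a hedged illustration, not the logic of isDuplication(); the class DupSketch, the method isTandemDup, and the 1-based posBefore parameter are all hypothetical.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.isDuplication. */
final class DupSketch {
    /**
     * Returns true if `inserted`, placed right after position posBefore
     * (1-based), is a copy of the residues ending at posBefore, i.e. the
     * new copy sits directly 3' of the original copy.
     */
    static boolean isTandemDup(String protein, int posBefore, String inserted) {
        int len = inserted.length();
        if (len == 0 || posBefore < len || posBefore > protein.length()) return false;
        return protein.regionMatches(posBefore - len, inserted, 0, len);
    }

    public static void main(String[] args) {
        String protein = "MKMGHQQQCC";
        System.out.println(isTandemDup(protein, 5, "GH"));  // true: GH precedes the site
        System.out.println(isTandemDup(protein, 2, "QSK")); // false: not a copy
    }
}
```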
    • pos

      protected String pos(int codonNum)
      Protein position
    • pos

      protected String pos(int start, int end)
    • pos

      protected String pos(Transcript tr, int codonNum)
      Protein position
    • pos

      protected String pos(Transcript tr, int start, int end)
      Position string given two coordinates
    • posDel

      protected String posDel()
      Position for deletions
    • posDelIns

      protected String posDelIns()
      Position for 'delins'
    • posDup

      protected String posDup()
      Position for 'duplications' (a special kind of insertion)
    • posFs

      protected String posFs()
      Frame shifts ....are described using ... the change of the first amino acid affected ... the description does not include a description of the deletion from the site of the change
    • posIns

      protected String posIns()
      Position for insertions
    • posSnpOrMnp

      protected String posSnpOrMnp()
      Position: SNP or MNP
    • snpOrMnp

      protected String snpOrMnp()
      SNP or MNP changes
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • translocation

      protected String translocation()
      Translocation nomenclature. From HGVS: Translocations at protein level occur when a translocation at DNA level leads to the production of a fusion protein, joining the N-terminal end of the protein on one chromosome to the C-terminal end of the protein on the other chromosome (and vice versa). No recommendations have been made so far to describe protein translocations.

      Example: t(X;17)(DMD:p.Met1_Val1506; SGCA:p.Val250_*387) describes a fusion protein resulting from a translocation between the chromosomes X and 17; the fusion protein contains an N-terminal segment of DMD (dystrophin, amino acids Methionine-1 to Valine-1506), and a C-terminal segment of SGCA (alpha-sarcoglycan, amino acids Valine-250 to the stop codon at 387)
    • typeOfReference

      protected String typeOfReference()
      Return "p." string with/without transcript ID, according to user command line options.
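      One way such a prefix could be built is sketched below. Everything here is an assumption for illustration: the class name, the boolean flag standing in for the user's command line option, the "ID:p." layout (taken from the general HGVS "reference:description" convention), and the placeholder transcript ID. The actual method may format the ID differently.

```java
/** Illustrative sketch only; not the code behind HgvsProtein.typeOfReference. */
final class TypeOfReferenceSketch {
    /**
     * Returns "p." on its own, or prefixed with the transcript ID when the
     * (hypothetical) useTranscriptId flag is set and an ID is available.
     */
    static String typeOfReference(boolean useTranscriptId, String transcriptId) {
        if (!useTranscriptId || transcriptId == null || transcriptId.isEmpty()) return "p.";
        return transcriptId + ":p.";
    }

    public static void main(String[] args) {
        // "TR_EXAMPLE.1" is a made-up placeholder, not a real transcript ID.
        System.out.println(typeOfReference(false, "TR_EXAMPLE.1")); // p.
        System.out.println(typeOfReference(true, "TR_EXAMPLE.1"));  // TR_EXAMPLE.1:p.
    }
}
```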