Class HaplotypeProbabilities

java.lang.Object
picard.fingerprint.HaplotypeProbabilities
Direct Known Subclasses:
CappedHaplotypeProbabilities, HaplotypeProbabilitiesFromGenotype, HaplotypeProbabilitiesFromGenotypeLikelihoods, HaplotypeProbabilitiesFromSequence, HaplotypeProbabilityOfNormalGivenTumor

public abstract class HaplotypeProbabilities extends Object
Abstract class for storing and calculating various likelihoods and probabilities for haplotype alleles given evidence.
  • Constructor Details

    • HaplotypeProbabilities

      protected HaplotypeProbabilities(HaplotypeBlock haplotypeBlock)
  • Method Details

    • getHaplotype

      public HaplotypeBlock getHaplotype()
      Returns the haplotype for which the probabilities apply.
    • getPriorProbablities

      public double[] getPriorProbablities()
    • getPosteriorProbabilities

      public double[] getPosteriorProbabilities()
      Returns the posterior probabilities, in order, of the AA, Aa and aa haplotypes. Mathematically, this is P(H | D, F), where H is the vector of possible haplotypes {AA,Aa,aa}, D is the data seen by the class, and F is the population frequency of each genotype. The returned values are normalized and use the population frequencies as the prior.
    • getPosteriorLikelihoods

      public double[] getPosteriorLikelihoods()
      Returns the unnormalized posterior likelihoods, in order, of the AA, Aa and aa haplotypes. Mathematically, this is proportional to P(H | D, F), where H is the vector of possible haplotypes {AA,Aa,aa}, D is the data seen by the class, and F is the population frequency of each genotype. The returned values are not normalized and use the population frequencies as the prior.
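The relationship between the two posterior methods above can be sketched as follows. This is a minimal illustration that assumes the posterior is the element-wise product of the likelihoods and the priors, divided by its sum in the normalized case. The class `PosteriorSketch` and its methods are hypothetical and do not depend on Picard; they are not the library's implementation.

```java
import java.util.Arrays;

// Hypothetical stand-alone helper; not part of the Picard API.
final class PosteriorSketch {

    // Unnormalized posterior: likelihoods[i] * priors[i] for i in {AA, Aa, aa}.
    static double[] posteriorLikelihoods(double[] likelihoods, double[] priors) {
        if (likelihoods.length != priors.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] result = new double[likelihoods.length];
        for (int i = 0; i < result.length; i++) {
            result[i] = likelihoods[i] * priors[i];
        }
        return result;
    }

    // Normalized posterior: the unnormalized values divided by their sum.
    static double[] posteriorProbabilities(double[] likelihoods, double[] priors) {
        double[] result = posteriorLikelihoods(likelihoods, priors);
        double sum = Arrays.stream(result).sum();
        if (sum <= 0) {
            throw new IllegalArgumentException("Posterior mass must be positive");
        }
        for (int i = 0; i < result.length; i++) {
            result[i] /= sum;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] likelihoods = {0.5, 0.25, 0.25}; // P(D | AA), P(D | Aa), P(D | aa)
        double[] priors = {0.25, 0.5, 0.25};      // assumed population genotype frequencies
        System.out.println(Arrays.toString(posteriorProbabilities(likelihoods, priors)));
    }
}
```

The unnormalized form is enough when only ratios between haplotypes matter; the normalized form is needed when the values are read as probabilities.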
    • getLikelihoods

      public abstract double[] getLikelihoods()
      Returns the likelihoods, in order, of the AA, Aa and aa haplotypes given the evidence.

      Mathematically, this is P(evidence | haplotype), where haplotype={AA,Aa,aa}. The returned values are normalized.

    • getLogLikelihoods

      public double[] getLogLikelihoods()
    • getRepresentativeSnp

      public abstract Snp getRepresentativeSnp()
      Returns a representative SNP for this haplotype. Different subclasses may implement this in different ways, but should do so in a deterministic/repeatable fashion.
    • getObsAllele1

      public int getObsAllele1()
      Returns the number of observations of alleles supporting the first/major haplotype allele. This count is not meaningful for every subclass, but it is useful to have in the API, so a default implementation is provided here.
      Returns:
      int
    • getObsAllele2

      public int getObsAllele2()
      Returns the number of observations of alleles supporting the second/minor haplotype allele. This count is not meaningful for every subclass, but it is useful to have in the API, so a default implementation is provided here.
      Returns:
      int
    • getTotalObs

      public int getTotalObs()
      Returns the total number of observations of any allele. This count is not meaningful for every subclass, but it is useful to have in the API, so a default implementation is provided here.
      Returns:
      int
    • hasEvidence

      public boolean hasEvidence()
      Returns true if evidence has been added, false if the probabilities are just the priors.
    • merge

      public abstract HaplotypeProbabilities merge(HaplotypeProbabilities other)
      Merges in the likelihood information from the supplied haplotype probabilities object.
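Because `merge` is abstract, each subclass defines how evidence is combined. One common approach, shown here purely as an assumption and not as any subclass's actual behavior, is to treat the two evidence sets as independent, multiply the likelihoods element-wise, and renormalize. The class `MergeSketch` is hypothetical and does not depend on Picard.

```java
// Hypothetical stand-alone helper; not part of the Picard API.
final class MergeSketch {

    // Combines two likelihood vectors over {AA, Aa, aa}, assuming the
    // underlying evidence sets are independent, then renormalizes to sum to 1.
    static double[] mergeLikelihoods(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double[] merged = new double[a.length];
        double sum = 0.0;
        for (int i = 0; i < merged.length; i++) {
            merged[i] = a[i] * b[i];
            sum += merged[i];
        }
        if (sum <= 0) {
            throw new IllegalArgumentException("The two evidence sets share no supported haplotype");
        }
        for (int i = 0; i < merged.length; i++) {
            merged[i] /= sum;
        }
        return merged;
    }
}
```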
    • getMostLikelyHaplotype

      public DiploidHaplotype getMostLikelyHaplotype()
      Gets the most likely haplotype given the probabilities.
    • getMostLikelyGenotype

      public DiploidGenotype getMostLikelyGenotype(Snp snp)
      Gets the genotype for this Snp given the most likely haplotype.
    • scaledEvidenceProbabilityUsingGenotypeFrequencies

      public double scaledEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
      Returns the scaled probability of the collected evidence, given a vector of priors on the haplotypes, using the internal likelihoods, which may be scaled by an unknown factor. That factor carries through to the result, hence the name.

      Mathematically:

      P(Evidence| P(h_i)=F_i) = \sum_i P(Evidence | h_i) P(h_i) = \sum_i P(Evidence | h_i) F_i = c * \sum_i Likelihood_i * F_i

      Here, h_i are the three possible haplotypes, F_i are the given priors, and Likelihood_i are the stored likelihoods, which are scaled from the actual likelihoods by an unknown factor, c. Note that the calculation ignores the internal haplotype probabilities (i.e. the priors).

      Parameters:
      genotypeFrequencies - vector of (possibly scaled) probabilities of the three haplotypes
      Returns:
      P(evidence | P_h) / c
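The sum in the formula above is a dot product of the stored likelihoods with the supplied frequencies. A minimal sketch, assuming both vectors are ordered {AA, Aa, aa}; the class `ScaledEvidenceSketch` is hypothetical and does not depend on Picard.

```java
// Hypothetical stand-alone helper; not part of the Picard API.
final class ScaledEvidenceSketch {

    // Computes sum_i likelihoods[i] * genotypeFrequencies[i].
    // If the likelihoods are off by an unknown factor, the result is off by the same factor.
    static double scaledEvidenceProbability(double[] likelihoods, double[] genotypeFrequencies) {
        if (likelihoods.length != genotypeFrequencies.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < likelihoods.length; i++) {
            sum += likelihoods[i] * genotypeFrequencies[i];
        }
        return sum;
    }
}
```

Multiplying every likelihood by a constant multiplies the result by that constant, which is why only ratios of such values are comparable across calls that share the same scaling.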
    • shiftedLogEvidenceProbabilityUsingGenotypeFrequencies

      public double shiftedLogEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
    • shiftedLogEvidenceProbabilityGivenOtherEvidence

      public double shiftedLogEvidenceProbabilityGivenOtherEvidence(HaplotypeProbabilities otherHp)
      Returns the log-probability of the evidence, using as priors the posteriors of another object.
      Parameters:
      otherHp - an additional HaplotypeProbabilities object representing the same underlying HaplotypeBlock
      Returns:
      log10(P(evidence | P(h_i)=P(h_i|otherHp))) + c, where c is an unknown constant
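A minimal sketch of this quantity, assuming it is the base-10 logarithm of the dot product between this object's likelihoods and the other object's posterior probabilities. The class `ShiftedLogEvidenceSketch` is hypothetical and does not depend on Picard; the caller is assumed to supply the other object's posteriors as the `priors` argument.

```java
// Hypothetical stand-alone helper; not part of the Picard API.
final class ShiftedLogEvidenceSketch {

    // Computes log10(sum_i likelihoods[i] * priors[i]).
    // Scaling the likelihoods by a factor k shifts the result by log10(k).
    static double shiftedLogEvidence(double[] likelihoods, double[] priors) {
        if (likelihoods.length != priors.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < likelihoods.length; i++) {
            sum += likelihoods[i] * priors[i];
        }
        return Math.log10(sum);
    }
}
```

In log space the unknown multiplicative factor on the likelihoods becomes an additive constant, which matches the "shifted" naming.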
    • shiftedLogEvidenceProbability

      public double shiftedLogEvidenceProbability()
      Returns log(P(evidence)) + c, where c is an unknown constant, assuming that the prior on haplotypes is given by the internal haplotypeFrequencies.
    • getLodMostProbableGenotype

      public double getLodMostProbableGenotype()
      Returns the LOD score between the most probable haplotype and the second most probable.
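A minimal sketch of such a score, assuming the LOD is the base-10 log of the ratio between the largest and second-largest posterior values. The class `LodSketch` is hypothetical and does not depend on Picard.

```java
import java.util.Arrays;

// Hypothetical stand-alone helper; not part of the Picard API.
final class LodSketch {

    // Returns log10(best) - log10(secondBest) over the supplied posteriors.
    // The result is 0 for a tie and positive infinity if the runner-up is 0.
    static double lodMostProbable(double[] posteriors) {
        if (posteriors.length < 2) {
            throw new IllegalArgumentException("Need at least two values");
        }
        double[] sorted = posteriors.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);                  // ascending order
        double best = sorted[sorted.length - 1];
        double secondBest = sorted[sorted.length - 2];
        return Math.log10(best) - Math.log10(secondBest);
    }
}
```

Because the score is a ratio, it is the same whether the input is normalized or merely proportional to the posterior.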
    • deepCopy

      public abstract HaplotypeProbabilities deepCopy()