Package picard.analysis.replicates
Class IndependentReplicateMetric
java.lang.Object
htsjdk.samtools.metrics.MetricBase
picard.analysis.MergeableMetricBase
picard.analysis.replicates.IndependentReplicateMetric
@DocumentedFeature(groupName="Metrics",
summary="Metrics")
public class IndependentReplicateMetric
extends MergeableMetricBase
A class to store information relevant for biological rate estimation.
-
Nested Class Summary
Nested classes/interfaces inherited from class picard.analysis.MergeableMetricBase
MergeableMetricBase.MergeByAdding, MergeableMetricBase.MergeByAssertEquals, MergeableMetricBase.MergingIsManual, MergeableMetricBase.NoMergingIsDerived, MergeableMetricBase.NoMergingKeepsValue
-
Field Summary
Fields
Field: Description
biSiteHeterogeneityRate: The rate of heterogeneity within doubleton sets.
biSiteHomogeneityRate: The rate of homogeneity within doubleton sets.
independentReplicationRateFromBiDups
independentReplicationRateFromTriDups: The biological duplication rate (as a fraction of the duplicate sets) calculated from tripleton sets.
independentReplicationRateFromUmi: Given the UMIs, one can estimate the rate of biological duplication directly, as the rate of having different UMIs across all duplicate sets.
nAlternateAllelesBiDups: The number of doubletons where the two reads matched the alternate.
nAlternateAllelesTriDups: The number of tripletons where all three reads matched the alternate.
nAlternateReads: The number of alternate alleles in the reads.
nBadBarcodes: The number of sets where the UMIs had poor quality bases and were not used for any comparisons.
nDifferentAllelesBiDups: The number of doubletons where the two reads had different bases in the locus.
nDifferentAllelesTriDups: The number of tripletons where the reads had different bases in the locus.
nDuplicateSets: The number of duplicate sets examined.
nExactlyDouble: The number of sets of size exactly 2 found.
nExactlyTriple: The number of sets of size exactly 3 found.
nGoodBarcodes: The number of sets where the UMIs had good quality bases and were used for comparisons.
nMatchingUMIsInDiffBiDups: The number of UMIs that match within bi-sets whose reads come from different alleles.
nMatchingUMIsInSameBiDups: The number of UMIs that match within bi-sets whose reads come from the same allele.
nMismatchingAllelesBiDups: The number of doubletons where at least one of the reads didn't match either allele of the het site.
nMismatchingAllelesTriDups: The number of tripletons where at least one of the reads didn't match either allele of the het site.
nMismatchingUMIsInContraOrientedBiDups: The number of bi-sets with mismatching UMIs and opposite orientation.
nMismatchingUMIsInCoOrientedBiDups: The number of bi-sets with mismatching UMIs and same orientation.
nMismatchingUMIsInDiffBiDups: The number of UMIs that differ within bi-sets whose reads come from different alleles.
nMismatchingUMIsInSameBiDups: The number of UMIs that differ within bi-sets whose reads come from the same allele.
nReadsInBigSets: The number of reads in duplicate sets of size greater than 3.
nReferenceAllelesBiDups: The number of doubletons where the two reads matched the reference.
nReferenceAllelesTriDups: The number of tripletons where all three reads matched the reference.
nReferenceReads: The number of reference alleles in the reads.
nSites: The count of sites used.
nThreeAllelesSites: The count of sites in which a third allele was found.
nTotalReads: The total number of reads over the het sites.
pSameAlleleWhenMismatchingUmi: When the UMIs mismatch, we expect about as many bi-sets with different alleles as with the same allele (assuming that a different UMI implies a biological duplicate); thus, this value should be near 0.5.
pSameUmiInIndependentBiDup: When the alleles are different, we know that this is a biological duplication, so we expect nearly all the UMIs to be different (allowing for equality due to chance).
replicationRateFromReplicateSets: An estimate of the duplication rate that is based on the duplicate sets we observed.
triSiteHeterogeneityRate: The rate of heterogeneity within tripleton sets.
triSiteHomogeneityRate: The rate of homogeneity within tripleton sets.
-
Constructor Summary
Constructors
Constructor: IndependentReplicateMetric()
-
Method Summary
Modifier and Type, Method: Description
void calculateDerivedFields(): Placeholder method that will calculate the derived fields from the other ones.
Methods inherited from class picard.analysis.MergeableMetricBase
canMerge, merge, merge, mergeIfCan
Methods inherited from class htsjdk.samtools.metrics.MetricBase
equals, hashCode, toString
-
Field Details
-
nSites
The count of sites used. -
nThreeAllelesSites
The count of sites in which a third allele was found. -
nTotalReads
The total number of reads over the het sites. -
nDuplicateSets
The number of duplicate sets examined. -
nExactlyTriple
The number of sets of size exactly 3 found. -
nExactlyDouble
The number of sets of size exactly 2 found. -
nReadsInBigSets
The number of reads in duplicate sets of size greater than 3. -
nDifferentAllelesBiDups
The number of doubletons where the two reads had different bases in the locus. -
nReferenceAllelesBiDups
The number of doubletons where the two reads matched the reference. -
nAlternateAllelesBiDups
The number of doubletons where the two reads matched the alternate. -
nDifferentAllelesTriDups
The number of tripletons where the reads had different bases in the locus. -
nMismatchingAllelesBiDups
The number of doubletons where at least one of the reads didn't match either allele of the het site. -
nReferenceAllelesTriDups
The number of tripletons where all three reads matched the reference. -
nAlternateAllelesTriDups
The number of tripletons where all three reads matched the alternate. -
nMismatchingAllelesTriDups
The number of tripletons where at least one of the reads didn't match either allele of the het site. -
nReferenceReads
The number of reference alleles in the reads. -
nAlternateReads
The number of alternate alleles in the reads. -
nMismatchingUMIsInDiffBiDups
The number of UMIs that differ within bi-sets whose reads come from different alleles. -
nMatchingUMIsInDiffBiDups
The number of UMIs that match within bi-sets whose reads come from different alleles. -
nMismatchingUMIsInSameBiDups
The number of UMIs that differ within bi-sets whose reads come from the same allele. -
nMatchingUMIsInSameBiDups
The number of UMIs that match within bi-sets whose reads come from the same allele. -
nMismatchingUMIsInCoOrientedBiDups
The number of bi-sets with mismatching UMIs and same orientation. -
nMismatchingUMIsInContraOrientedBiDups
The number of bi-sets with mismatching UMIs and opposite orientation. -
nBadBarcodes
The number of sets where the UMIs had poor quality bases and were not used for any comparisons. -
nGoodBarcodes
The number of sets where the UMIs had good quality bases and were used for comparisons. -
biSiteHeterogeneityRate
The rate of heterogeneity within doubleton sets. -
triSiteHeterogeneityRate
The rate of heterogeneity within tripleton sets. -
biSiteHomogeneityRate
The rate of homogeneity within doubleton sets. -
triSiteHomogeneityRate
The rate of homogeneity within tripleton sets. -
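The four rate fields above are described only as rates of heterogeneity and homogeneity within doubleton or tripleton sets; this page does not give their formulas. The following is a minimal sketch of one plausible reading, assuming that heterogeneity is the share of classified sets whose reads carry different alleles and that homogeneity is the share whose reads all carry the same allele. The class `SiteRateSketch` and its method names are hypothetical and are not part of Picard.

```java
/** Hypothetical helper for illustration; not part of Picard. */
final class SiteRateSketch {

    /** Returns num / den, or 0.0 when den is zero (no classified sets). */
    private static double fraction(long num, long den) {
        return den == 0 ? 0.0 : (double) num / den;
    }

    /** ASSUMED formula: share of classified sets whose reads carry different alleles. */
    static double heterogeneityRate(long nDifferent, long nReference, long nAlternate) {
        return fraction(nDifferent, nDifferent + nReference + nAlternate);
    }

    /** ASSUMED formula: share of classified sets whose reads all carry the same allele. */
    static double homogeneityRate(long nDifferent, long nReference, long nAlternate) {
        return fraction(nReference + nAlternate, nDifferent + nReference + nAlternate);
    }

    public static void main(String[] args) {
        // Made-up doubleton counts: 10 different, 45 reference-only, 45 alternate-only.
        System.out.println("heterogeneity = " + heterogeneityRate(10, 45, 45));
        System.out.println("homogeneity   = " + homogeneityRate(10, 45, 45));
    }
}
```

Under this reading the doubleton inputs would be nDifferentAllelesBiDups, nReferenceAllelesBiDups and nAlternateAllelesBiDups, and the tripleton inputs their TriDups counterparts; whether sets with mismatching alleles enter the denominator is not stated here, so the sketch leaves them out.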
independentReplicationRateFromBiDups
-
independentReplicationRateFromTriDups
The biological duplication rate (as a fraction of the duplicates sets) calculated from tripleton sets. -
pSameUmiInIndependentBiDup
When the alleles are different, we know that this is a biological duplication, so we expect nearly all the UMIs to be different (allowing for equality due to chance). We therefore expect this value to be near 1. -
pSameAlleleWhenMismatchingUmi
When the UMIs mismatch, we expect about as many bi-sets with different alleles as with the same allele (assuming that a different UMI implies a biological duplicate); thus, this value should be near 0.5. -
independentReplicationRateFromUmi
Given the UMIs, one can estimate the rate of biological duplication directly, as the rate of having different UMIs across all duplicate sets. This is only a good estimate if the assumptions hold, for example if pSameUmiInIndependentBiDup is near 1. -
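The UMI-based descriptions above can be illustrated with simple ratios over the four UMI bi-set counts. This is a sketch under stated assumptions, not the Picard implementation: it assumes that pSameAlleleWhenMismatchingUmi is the same-allele share of bi-sets with mismatching UMIs, and that the UMI-based duplication estimate is the mismatching-UMI share of all bi-sets with usable UMIs. The class `UmiRateSketch` and its method names are hypothetical.

```java
/** Hypothetical helper for illustration; not part of Picard. */
final class UmiRateSketch {

    /** Returns num / den, or NaN when there is nothing to compare. */
    private static double ratio(long num, long den) {
        return den == 0 ? Double.NaN : (double) num / den;
    }

    /**
     * ASSUMED formula: among bi-sets whose UMIs mismatch, the share whose
     * reads carry the same allele. Expected to be near 0.5 per the text above.
     */
    static double pSameAlleleWhenMismatchingUmi(long mismatchingInSame, long mismatchingInDiff) {
        return ratio(mismatchingInSame, mismatchingInSame + mismatchingInDiff);
    }

    /**
     * ASSUMED formula: share of bi-sets with usable UMIs whose UMIs mismatch,
     * used here as a stand-in for a UMI-based duplication estimate.
     */
    static double mismatchingUmiFraction(long mismatchingInSame, long mismatchingInDiff,
                                         long matchingInSame, long matchingInDiff) {
        final long mismatching = mismatchingInSame + mismatchingInDiff;
        return ratio(mismatching, mismatching + matchingInSame + matchingInDiff);
    }

    public static void main(String[] args) {
        // Made-up counts for 1000 bi-sets with good barcodes.
        System.out.println(pSameAlleleWhenMismatchingUmi(50, 50));
        System.out.println(mismatchingUmiFraction(50, 50, 899, 1));
    }
}
```

With the made-up counts, the mismatching bi-sets split evenly between same-allele and different-allele sets, which is the balanced situation the description of pSameAlleleWhenMismatchingUmi calls for.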
replicationRateFromReplicateSets
An estimate of the duplication rate that is based on the duplicate sets we observed.
-
-
Constructor Details
-
IndependentReplicateMetric
public IndependentReplicateMetric()
-
-
Method Details
-
calculateDerivedFields
public void calculateDerivedFields()
Description copied from class: MergeableMetricBase
Placeholder method that will calculate the derived fields from the other ones. Classes that are derived from non-trivial derived classes should consider calling super.calculateDerivedFields() as well. Fields whose value will change due to this method should be annotated with NoMergingKeepsValue.
Overrides: calculateDerivedFields in class MergeableMetricBase
-
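The advice about calling super.calculateDerivedFields() can be sketched with two stand-in classes. Everything here is hypothetical: `ParentMetric` and `ChildMetric` are not Picard classes, the derived fields `doubleFraction` and `tripleFraction` are invented for the example, and the merge annotations are omitted so the sketch compiles without Picard on the classpath.

```java
/** Hypothetical stand-ins for illustration; these are not the Picard classes. */
final class DerivedFieldsSketch {

    /** Plays the role of a non-trivial metric class with one derived field. */
    static class ParentMetric {
        public long nDuplicateSets = 0;     // raw count
        public long nExactlyDouble = 0;     // raw count
        public double doubleFraction = 0.0; // derived (hypothetical field)

        public void calculateDerivedFields() {
            doubleFraction = nDuplicateSets == 0 ? 0.0 : (double) nExactlyDouble / nDuplicateSets;
        }
    }

    /** Adds its own derived field and keeps the parent's one up to date. */
    static class ChildMetric extends ParentMetric {
        public long nExactlyTriple = 0;     // raw count
        public double tripleFraction = 0.0; // derived (hypothetical field)

        @Override
        public void calculateDerivedFields() {
            // Recompute the parent's derived fields first, then our own.
            super.calculateDerivedFields();
            tripleFraction = nDuplicateSets == 0 ? 0.0 : (double) nExactlyTriple / nDuplicateSets;
        }
    }

    public static void main(String[] args) {
        ChildMetric m = new ChildMetric();
        m.nDuplicateSets = 10;
        m.nExactlyDouble = 6;
        m.nExactlyTriple = 3;
        m.calculateDerivedFields();
        System.out.println(m.doubleFraction + " " + m.tripleFraction);
    }
}
```

Without the super call, `doubleFraction` would keep its initial value after the raw counts change, which is the stale-field problem the method description warns about.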