Package picard.fingerprint
Class ClusterCrosscheckMetrics
- java.lang.Object
-
- picard.cmdline.CommandLineProgram
-
- picard.fingerprint.ClusterCrosscheckMetrics
-
@DocumentedFeature
public class ClusterCrosscheckMetrics extends CommandLineProgram
Summary
Clusters the results from a CrosscheckFingerprints run according to the LOD score. The resulting metric file can be used to assist in diagnosing results from CrosscheckFingerprints. It clusters the connectivity graph between the different groups. Two groups are connected if they have a LOD score greater than the LOD_THRESHOLD.
Details
The results of running CrosscheckFingerprints can be difficult to analyze, especially when many groups are related (meaning LOD greater than LOD_THRESHOLD) in a non-transitive manner (A is related to B, B is related to C, but A doesn't seem to be related to C). ClusterCrosscheckMetrics clusters the metrics from CrosscheckFingerprints so that all the groups in a cluster are related to each other either directly or indirectly (thus A, B and C would end up in one cluster). Two samples can be in different clusters only if no sample in one cluster gets a high LOD score (above LOD_THRESHOLD) when compared with any sample in the other cluster.
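The clustering described above amounts to finding connected components in a graph whose edges are the comparisons with LOD above the threshold. The following is a minimal sketch of that idea using a simple union-find structure; it is not taken from the Picard source. The class LodClusteringSketch, the Comparison type and the sample LOD values are hypothetical and exist only for illustration.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class LodClusteringSketch {

    /** Hypothetical stand-in for one crosscheck comparison; not a Picard class. */
    static final class Comparison {
        final String leftGroup;
        final String rightGroup;
        final double lod;

        Comparison(String leftGroup, String rightGroup, double lod) {
            this.leftGroup = leftGroup;
            this.rightGroup = rightGroup;
            this.lod = lod;
        }
    }

    /** Follows parent links until reaching the representative of the group's cluster. */
    static String find(Map<String, String> parent, String group) {
        String root = group;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        return root;
    }

    /** Links every pair whose LOD exceeds the threshold and returns the resulting clusters. */
    static Collection<TreeSet<String>> cluster(List<Comparison> comparisons, double lodThreshold) {
        Map<String, String> parent = new LinkedHashMap<>();
        for (Comparison c : comparisons) {
            if (c.lod > lodThreshold) {
                parent.putIfAbsent(c.leftGroup, c.leftGroup);
                parent.putIfAbsent(c.rightGroup, c.rightGroup);
                // Merge the two clusters by pointing one root at the other.
                parent.put(find(parent, c.leftGroup), find(parent, c.rightGroup));
            }
        }
        // Collect the members of each cluster under its representative.
        Map<String, TreeSet<String>> clusters = new LinkedHashMap<>();
        for (String group : parent.keySet()) {
            clusters.computeIfAbsent(find(parent, group), k -> new TreeSet<>()).add(group);
        }
        return clusters.values();
    }

    public static void main(String[] args) {
        List<Comparison> comparisons = List.of(
                new Comparison("A", "B", 7.2),   // A is related to B
                new Comparison("B", "C", 5.1),   // B is related to C
                new Comparison("A", "C", -1.3),  // A does not seem related to C
                new Comparison("D", "D", 12.0),  // D matches only itself
                new Comparison("E", "A", -4.0)); // E has no high LOD with anything
        System.out.println(cluster(comparisons, 3.0));
    }
}
```

With the made-up values above and a threshold of 3, the sketch prints [[A, B, C], [D]]: A and C share a cluster through B, D forms a cluster of size one through its self-comparison, and E is left out because none of its comparisons exceeds the threshold.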
Example
java -jar picard.jar ClusterCrosscheckMetrics \
      INPUT=sample.crosscheck_metrics \
      LOD_THRESHOLD=3 \
      OUTPUT=sample.clustered.crosscheck_metrics
The resulting file consists of metrics of the ClusteredCrosscheckMetric class and contains the original crosscheck metric values for groups that end up in the same cluster (regardless of the LOD score of each comparison). In addition, it notes the cluster identifier (in ClusteredCrosscheckMetric.CLUSTER) and the size of the cluster (in ClusteredCrosscheckMetric.CLUSTER_SIZE). Groups that do not have a high LOD score with any other group (including themselves!) will not be included in the metric file. Note that cross-group comparisons are not included in the metric file.
Field Summary
Fields
Modifier and Type    Field            Description
File                 INPUT
double               LOD_THRESHOLD
File                 OUTPUT
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
Constructor Summary
Constructors
Constructor                   Description
ClusterCrosscheckMetrics()
-
Method Summary
Modifier and Type    Method      Description
protected int        doWork()    Do the work after command line has been parsed.
Methods inherited from class picard.cmdline.CommandLineProgram
customCommandLineValidation, getCommandLine, getCommandLineParser, getDefaultHeaders, getFaqLink, getMetricsFile, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, requiresReference, setDefaultHeaders, useLegacyParser
Field Detail
-
INPUT
@Argument(shortName="I", doc="The cross-check metrics file to be clustered.")
public File INPUT
-
OUTPUT
@Argument(shortName="O", optional=true, doc="Output file to write metrics to. Will write to stdout if null.")
public File OUTPUT
-
LOD_THRESHOLD
@Argument(shortName="LOD", doc="LOD score to be used as the threshold for clustering.")
public double LOD_THRESHOLD
-
-
Method Detail
-
doWork
protected int doWork()
Description copied from class: CommandLineProgram
Do the work after command line has been parsed. A RuntimeException may be thrown by this method, and is reported appropriately.
- Specified by:
doWork in class CommandLineProgram
- Returns:
program exit status.