Package org.biojava.bio.seq.io
Class SeqIOConstants
java.lang.Object
org.biojava.bio.seq.io.SeqIOConstants
SeqIOConstants contains constants used to identify sequence formats, alphabets, etc., in the context of reading and writing sequences.
An int used to specify symbol alphabet and sequence format type is derived thus:
- The two least significant bytes are reserved for format types such as RAW, FASTA, EMBL etc.
- The two most significant bytes are reserved for alphabet and symbol information such as AMBIGUOUS, DNA, RNA, AA etc.
- Bitwise OR combinations of each component int are used to specify combinations of format type and symbol information. To derive an int identifier for DNA with ambiguity codes in Fasta format, bitwise OR the AMBIGUOUS, DNA and FASTA values.
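The derivation described above can be sketched in a few lines. With the real class, the identifier would simply be SeqIOConstants.AMBIGUOUS | SeqIOConstants.DNA | SeqIOConstants.FASTA. The stand-alone sketch below does not depend on BioJava: the class name, the mask constants, the compose helper and the numeric value chosen for FASTA are all illustrative assumptions, and only the split into an upper (alphabet) and lower (format) half comes from the description.

```java
// Sketch only: constant values are illustrative assumptions,
// not copied from org.biojava.bio.seq.io.SeqIOConstants.
final class SeqIdComposeSketch {
    // Alphabet and symbol flags occupy the two most significant bytes.
    static final int AMBIGUOUS = 1 << 16;
    static final int DNA       = 1 << 17;

    // Format types occupy the two least significant bytes (hypothetical value).
    static final int FASTA = 102;

    static final int FORMAT_MASK   = 0x0000FFFF;
    static final int ALPHABET_MASK = 0xFFFF0000;

    /** Combines one format type with any number of alphabet flags by bitwise OR. */
    static int compose(int format, int... alphabetFlags) {
        int id = format & FORMAT_MASK;
        for (int flag : alphabetFlags) {
            id |= flag & ALPHABET_MASK;
        }
        return id;
    }

    public static void main(String[] args) {
        // DNA with ambiguity codes in Fasta format.
        int id = compose(FASTA, AMBIGUOUS, DNA);
        System.out.println("0x" + Integer.toHexString(id));
    }
}
```

Because the two halves of the int do not overlap, the order in which the values are ORed together does not matter.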
Author: Keith James
Field Summary
Fields (Modifier and Type, Field, Description):
- static final int AA: indicates that a sequence contains AA (amino acid) symbols.
- static final int AMBIGUOUS: indicates that a sequence contains ambiguity symbols.
- static final int DNA: indicates that a sequence contains DNA (deoxyribonucleic acid) symbols.
- static final int EMBL: indicates that the sequence format is EMBL.
- static final int EMBL_AA: premade EMBL | AA.
- static final int EMBL_DNA: premade EMBL | DNA.
- static final int EMBL_RNA: premade EMBL | RNA.
- static final int FASTA: indicates that the sequence format is Fasta.
- static final int FASTA_AA: premade FASTA | AA.
- static final int FASTA_DNA: premade FASTA | DNA.
- static final int FASTA_RNA: premade FASTA | RNA.
- static final int GCG: indicates that the sequence format is GCG.
- static final int GENBANK: indicates that the sequence format is GENBANK.
- static final int GENBANK_AA: premade GENBANK | AA.
- static final int GENBANK_DNA: premade GENBANK | DNA.
- static final int GENBANK_RNA: premade GENBANK | RNA.
- static final int GENPEPT: indicates that the sequence format is GENPEPT.
- static final int GFF: indicates that the sequence format is GFF.
- static final int IG: indicates that the sequence format is IG.
- static final int INTEGER: indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data.
- static final LifeScienceIdentifier LSID_EMBL_AA: sequence format LSID for EMBL AA.
- static final LifeScienceIdentifier LSID_EMBL_DNA: sequence format LSID for EMBL DNA.
- static final LifeScienceIdentifier LSID_EMBL_RNA: sequence format LSID for EMBL RNA.
- static final LifeScienceIdentifier LSID_FASTA_AA: sequence format LSID for Fasta AA.
- static final LifeScienceIdentifier LSID_FASTA_DNA: sequence format LSID for Fasta DNA.
- static final LifeScienceIdentifier LSID_FASTA_RNA: sequence format LSID for Fasta RNA.
- static final LifeScienceIdentifier LSID_GENBANK_AA: sequence format LSID for Genbank AA.
- static final LifeScienceIdentifier LSID_GENBANK_DNA: sequence format LSID for Genbank DNA.
- static final LifeScienceIdentifier LSID_GENBANK_RNA: sequence format LSID for Genbank RNA.
- static final LifeScienceIdentifier LSID_SWISSPROT: sequence format LSID for Swissprot.
- static final int NBRF: indicates that the sequence format is NBRF.
- static final int PDB: indicates that the sequence format is PDB.
- static final int PHRED: indicates that the sequence format is PHRED.
- static final int RAW: indicates that the sequence format is raw (symbols only).
- static final int REFSEQ: indicates that the sequence format is REFSEQ.
- static final int REFSEQ_AA: premade REFSEQ | AA.
- static final int REFSEQ_DNA: premade REFSEQ | DNA.
- static final int REFSEQ_RNA: premade REFSEQ | RNA.
- static final int RNA: indicates that a sequence contains RNA (ribonucleic acid) symbols.
- static final int SWISSPROT: indicates that the sequence format is SWISSPROT.
- static final int UNKNOWN: indicates that the sequence format is unknown.
Field Details
- AMBIGUOUS
AMBIGUOUS indicates that a sequence contains ambiguity symbols. The first bit of the most significant word of the int is set.
- DNA
DNA indicates that a sequence contains DNA (deoxyribonucleic acid) symbols. The second bit of the most significant word of the int is set.
- RNA
RNA indicates that a sequence contains RNA (ribonucleic acid) symbols. The third bit of the most significant word of the int is set.
- AA
AA indicates that a sequence contains AA (amino acid) symbols. The fourth bit of the most significant word of the int is set.
- INTEGER
INTEGER indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data. The fifth bit of the most significant word of the int is set.
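Since each alphabet flag is a single bit in the most significant word, an identifier can be taken apart again with bitwise AND. The following is a minimal sketch, assuming that "first bit of the most significant word" means bit 16 of the int (so the five flags are 1 << 16 through 1 << 20). The class and method names are hypothetical and are not part of BioJava.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: bit positions follow one reading of the field descriptions
// (first bit of the most significant word = bit 16); treat them as assumptions.
final class AlphabetFlagsSketch {
    static final int AMBIGUOUS = 1 << 16;
    static final int DNA       = 1 << 17;
    static final int RNA       = 1 << 18;
    static final int AA        = 1 << 19;
    static final int INTEGER   = 1 << 20;

    /** Returns true if the given alphabet flag is set in the identifier. */
    static boolean has(int id, int flag) {
        return (id & flag) != 0;
    }

    /** Lists the names of all alphabet flags set in the identifier. */
    static List<String> alphabetNames(int id) {
        List<String> names = new ArrayList<>();
        if (has(id, AMBIGUOUS)) names.add("AMBIGUOUS");
        if (has(id, DNA))       names.add("DNA");
        if (has(id, RNA))       names.add("RNA");
        if (has(id, AA))        names.add("AA");
        if (has(id, INTEGER))   names.add("INTEGER");
        return names;
    }

    /** Extracts the format type stored in the two least significant bytes. */
    static int formatPart(int id) {
        return id & 0xFFFF;
    }

    public static void main(String[] args) {
        int id = AMBIGUOUS | DNA | 102; // 102 is a made-up format value
        System.out.println(alphabetNames(id) + " format=" + formatPart(id));
    }
}
```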
- UNKNOWN
UNKNOWN indicates that the sequence format is unknown.
- RAW
RAW indicates that the sequence format is raw (symbols only).
- FASTA
FASTA indicates that the sequence format is Fasta.
- NBRF
NBRF indicates that the sequence format is NBRF.
- IG
IG indicates that the sequence format is IG.
- EMBL
EMBL indicates that the sequence format is EMBL.
- SWISSPROT
SWISSPROT indicates that the sequence format is SWISSPROT. Always protein, so it already has the AA bit set.
- GENBANK
GENBANK indicates that the sequence format is GENBANK.
- GENPEPT
GENPEPT indicates that the sequence format is GENPEPT. Always protein, so it already has the AA bit set.
- REFSEQ
REFSEQ indicates that the sequence format is REFSEQ.
- GCG
GCG indicates that the sequence format is GCG.
- GFF
GFF indicates that the sequence format is GFF.
- PDB
PDB indicates that the sequence format is PDB. Always protein, so it already has the AA bit set.
- PHRED
PHRED indicates that the sequence format is PHRED. Always DNA, so it already has the DNA bit set. It also has the INTEGER bit set for quality data.
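Some format constants (SWISSPROT, GENPEPT, PDB, PHRED) already carry alphabet bits, which has two practical consequences: ORing the same flag in again changes nothing, and a format check should compare only the lower two bytes. The sketch below illustrates both points. The numeric format values, the FORMAT_MASK constant and the helper names are assumptions made for illustration and are not taken from the BioJava source.

```java
// Sketch only: numeric format values are illustrative assumptions.
final class PresetAlphabetSketch {
    static final int DNA     = 1 << 17;
    static final int AA      = 1 << 19;
    static final int INTEGER = 1 << 20;

    static final int FORMAT_MASK = 0x0000FFFF;

    // Hypothetical format numbers; alphabet bits are folded in for the
    // formats whose descriptions say the bit is already set.
    static final int GENBANK   = 112;
    static final int SWISSPROT = 111 | AA;
    static final int PHRED     = 130 | DNA | INTEGER;

    /** Compares only the format part, ignoring any alphabet flags. */
    static boolean sameFormat(int id, int format) {
        return (id & FORMAT_MASK) == (format & FORMAT_MASK);
    }

    /** Returns true if the given alphabet flag is set in the identifier. */
    static boolean hasFlag(int id, int flag) {
        return (id & flag) != 0;
    }

    public static void main(String[] args) {
        // ORing in a flag that is already present is a no-op.
        System.out.println((SWISSPROT | AA) == SWISSPROT);
        // A masked comparison matches regardless of alphabet flags.
        System.out.println(sameFormat(GENBANK | DNA, GENBANK));
        // PHRED carries both DNA and INTEGER.
        System.out.println(hasFlag(PHRED, DNA) && hasFlag(PHRED, INTEGER));
    }
}
```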
- EMBL_DNA
EMBL_DNA premade EMBL | DNA.
- EMBL_RNA
EMBL_RNA premade EMBL | RNA.
- EMBL_AA
EMBL_AA premade EMBL | AA.
- GENBANK_DNA
GENBANK_DNA premade GENBANK | DNA.
- GENBANK_RNA
GENBANK_RNA premade GENBANK | RNA.
- GENBANK_AA
GENBANK_AA premade GENBANK | AA.
- REFSEQ_DNA
REFSEQ_DNA premade REFSEQ | DNA.
- REFSEQ_RNA
REFSEQ_RNA premade REFSEQ | RNA.
- REFSEQ_AA
REFSEQ_AA premade REFSEQ | AA.
- FASTA_DNA
FASTA_DNA premade FASTA | DNA.
- FASTA_RNA
FASTA_RNA premade FASTA | RNA.
- FASTA_AA
FASTA_AA premade FASTA | AA.
- LSID_FASTA_DNA
LSID_FASTA_DNA sequence format LSID for Fasta DNA.
- LSID_FASTA_RNA
LSID_FASTA_RNA sequence format LSID for Fasta RNA.
- LSID_FASTA_AA
LSID_FASTA_AA sequence format LSID for Fasta AA.
- LSID_EMBL_DNA
LSID_EMBL_DNA sequence format LSID for EMBL DNA.
- LSID_EMBL_RNA
LSID_EMBL_RNA sequence format LSID for EMBL RNA.
- LSID_EMBL_AA
LSID_EMBL_AA sequence format LSID for EMBL AA.
- LSID_GENBANK_DNA
LSID_GENBANK_DNA sequence format LSID for Genbank DNA.
- LSID_GENBANK_RNA
LSID_GENBANK_RNA sequence format LSID for Genbank RNA.
- LSID_GENBANK_AA
LSID_GENBANK_AA sequence format LSID for Genbank AA.
- LSID_SWISSPROT
LSID_SWISSPROT sequence format LSID for Swissprot.
Constructor Details
- SeqIOConstants
public SeqIOConstants()