Class SubstructureIdentifier

java.lang.Object
org.biojava.nbio.structure.SubstructureIdentifier
All Implemented Interfaces:
Serializable, StructureIdentifier

public class SubstructureIdentifier extends Object implements StructureIdentifier
This is the canonical way to identify a part of a structure.

The current syntax allows the specification of a set of residues from the first model of a structure. Future versions may be extended to represent additional properties.

Identifiers should adhere to the following specification, although some additional forms may be tolerated where unambiguous for backwards compatibility.

                name          := pdbID
                               | pdbID '.' chainID
                               | pdbID '.' range
                range         := range (',' range)?
                               | chainID
                               | chainID '_' resNum '-' resNum
                pdbID         := [1-9][a-zA-Z0-9]{3}
                               | PDB_[a-zA-Z0-9]{8}
                chainID       := [a-zA-Z0-9]+
                resNum        := [-+]?[0-9]+[A-Za-z]?
 
For example:
                1TIM                                    #whole structure (short format)
                1tim                                    #same as above
                4HHB.C                                  #single chain
                3AA0.A,B                                #two chains
                4GCR.A_1-40                             #substructure
                3iek.A_17-28,A_56-294,A_320-377         #substructure of 3 disjoint parts
                PDB_00001TIM                            #whole structure (extended format)
                pdb_00001tim                            #same as above
                PDB_00004HHB.C                          #single chain
                PDB_00003AA0.A,B                        #two chains
                PDB_00004GCR.A_1-40                     #substructure
                pdb_00003iek.A_17-28,A_56-294,A_320-377 #substructure of 3 disjoint parts
 
More options may be added to the specification at a future time.
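
The grammar above can be checked mechanically. The following is a minimal sketch, not the parser used by this class: it translates the specification into a single java.util.regex pattern. The class name IdentifierGrammarSketch and the method isWellFormed are hypothetical, and the case-insensitive match on the PDB_ prefix is an assumption based on the lowercase examples listed above.

```java
import java.util.regex.Pattern;

public class IdentifierGrammarSketch {
    // pdbID: short form, or extended form. The "PDB_" prefix is matched
    // case-insensitively (an assumption, since the examples include "pdb_00001tim").
    private static final String PDB_ID =
            "(?:[1-9][a-zA-Z0-9]{3}|(?i:PDB_)[a-zA-Z0-9]{8})";
    private static final String CHAIN_ID = "[a-zA-Z0-9]+";
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";
    // One range: a whole chain, or chainID '_' resNum '-' resNum.
    private static final String RANGE =
            CHAIN_ID + "(?:_" + RES_NUM + "-" + RES_NUM + ")?";
    // name: a pdbID, optionally followed by '.' and a comma-separated list of ranges.
    private static final Pattern NAME =
            Pattern.compile(PDB_ID + "(?:\\." + RANGE + "(?:," + RANGE + ")*)?");

    /** Returns true if the whole string matches the grammar sketched above. */
    public static boolean isWellFormed(String id) {
        return id != null && NAME.matcher(id).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "1TIM", "4HHB.C", "3AA0.A,B", "4GCR.A_1-40",
            "pdb_00003iek.A_17-28,A_56-294,A_320-377", "1ABC.A:1-100"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isWellFormed(s));
        }
    }
}
```

The sketch is stricter than the class itself may be: forms tolerated for backwards compatibility, such as a colon separator, are rejected here.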
Author:
dmyersturnbull, Spencer Bliven
  • Constructor Details

    • SubstructureIdentifier

      public SubstructureIdentifier(String id)
      Create a new identifier from a string.
      Parameters:
      id - the identifier string, in the syntax described above
    • SubstructureIdentifier

      public SubstructureIdentifier(String pdbId, List<ResidueRange> ranges)
      Create a new identifier based on a set of ranges. If ranges is empty, includes all residues.
      Parameters:
      pdbId - a PDB ID; must not be null
      ranges - the residue ranges to include
    • SubstructureIdentifier

      public SubstructureIdentifier(PdbId pdbId, List<ResidueRange> ranges)
      Create a new identifier based on a set of ranges. If ranges is empty, includes all residues.
      Parameters:
      pdbId - the PDB ID of the structure
      ranges - the residue ranges to include
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getIdentifier

      public String getIdentifier()
      Get the String form of this identifier. This provides the canonical form for a StructureIdentifier and has all the information needed to recreate a particular substructure. Example: 3iek.A_17-28,A_56-294
      Specified by:
      getIdentifier in interface StructureIdentifier
      Returns:
      The String form of this identifier
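
      As an illustration of the string form, the following minimal sketch joins a PDB ID and a list of ranges into the layout shown in the example above. The classes IdentifierStringSketch and Range are hypothetical stand-ins written for this example; they are not the library's implementation, and any normalization the library applies (for instance to letter case) is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IdentifierStringSketch {

    /** Hypothetical stand-in for a residue range; null bounds mean the whole chain. */
    public static final class Range {
        final String chain;
        final String start;
        final String end;

        public Range(String chain, String start, String end) {
            this.chain = chain;
            this.start = start;
            this.end = end;
        }

        /** Formats as "chain" or "chain_start-end". */
        String format() {
            return (start == null || end == null) ? chain : chain + "_" + start + "-" + end;
        }
    }

    /** Joins the PDB ID and ranges as pdbID '.' range (',' range)*. */
    public static String toIdentifier(String pdbId, List<Range> ranges) {
        if (ranges.isEmpty()) {
            return pdbId; // no ranges: the identifier names the whole structure
        }
        List<String> parts = new ArrayList<>();
        for (Range r : ranges) {
            parts.add(r.format());
        }
        return pdbId + "." + String.join(",", parts);
    }

    public static void main(String[] args) {
        List<Range> ranges = Arrays.asList(
                new Range("A", "17", "28"), new Range("A", "56", "294"));
        System.out.println(toIdentifier("3iek", ranges)); // prints 3iek.A_17-28,A_56-294
    }
}
```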
    • getPdbId

      public PdbId getPdbId()
      Get the PDB identifier part of the SubstructureIdentifier.
      Returns:
      the PDB ID
    • getResidueRanges

      public List<ResidueRange> getResidueRanges()
    • toCanonical

      public SubstructureIdentifier toCanonical()
      Returns this object, since a SubstructureIdentifier is already in canonical form.
      Specified by:
      toCanonical in interface StructureIdentifier
      Returns:
      A SubstructureIdentifier equivalent to this
    • reduce

      public Structure reduce(Structure s) throws StructureException
      Takes a complete structure as input and reduces it to the residues present in the specified ranges.

      The returned structure will be a shallow copy of the input, with shared Chains, Residues, etc.

      Ligands are handled in a special way. If a full chain is selected (e.g. '1ABC.A'), then any waters and ligands with a matching chain name are included. If a residue range is present (e.g. '1ABC.A_1-100'), then any ligands (technically, non-water non-polymer atoms) within StructureTools.DEFAULT_LIGAND_PROXIMITY_CUTOFF of the selected range are included, regardless of chain.

      Specified by:
      reduce in interface StructureIdentifier
      Parameters:
      s - A full structure, e.g. as loaded from the PDB. The structure ID should match that returned by getPdbId().
      Returns:
      A shallow copy of s containing only the selected residues
      Throws:
      StructureException
      See Also:
      • StructureTools#getReducedStructure(Structure, String)
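
      The range selection and the shallow-copy behavior described for reduce can be sketched with plain collections. This is a simplified illustration, not the library's code: the classes ReduceSketch, Residue and Range are hypothetical, residue numbers are compared as plain integers (insertion codes are ignored), and ligand handling is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReduceSketch {

    /** Hypothetical residue: a chain name and an integer residue number. */
    public static final class Residue {
        public final String chain;
        public final int number;

        public Residue(String chain, int number) {
            this.chain = chain;
            this.number = number;
        }
    }

    /** Hypothetical range; null bounds select the whole chain. */
    public static final class Range {
        public final String chain;
        public final Integer start;
        public final Integer end;

        public Range(String chain, Integer start, Integer end) {
            this.chain = chain;
            this.start = start;
            this.end = end;
        }

        public boolean contains(Residue r) {
            if (!chain.equals(r.chain)) {
                return false;
            }
            if (start == null || end == null) {
                return true;
            }
            return r.number >= start && r.number <= end;
        }
    }

    /**
     * Returns a new list holding the same Residue objects as the input
     * (a shallow copy), restricted to the given ranges. An empty range
     * list selects every residue.
     */
    public static List<Residue> reduce(List<Residue> full, List<Range> ranges) {
        if (ranges.isEmpty()) {
            return new ArrayList<>(full);
        }
        List<Residue> out = new ArrayList<>();
        for (Residue r : full) {
            for (Range range : ranges) {
                if (range.contains(r)) {
                    out.add(r); // shared reference, not a copy
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Residue> full = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            full.add(new Residue("A", i));
        }
        full.add(new Residue("B", 1));
        List<Residue> reduced = reduce(full, Arrays.asList(new Range("A", 2, 3)));
        System.out.println(reduced.size());                // prints 2
        System.out.println(reduced.get(0) == full.get(1)); // prints true
    }
}
```

      Because the result shares objects with the input, a change made to a residue through one list is visible through the other.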
    • loadStructure

      public Structure loadStructure(AtomCache cache) throws IOException, StructureException
      Loads the complete structure based on getPdbId().
      Specified by:
      loadStructure in interface StructureIdentifier
      Parameters:
      cache - A source of structures
      Returns:
      A Structure containing at least the atoms identified by this, or null if no PDB ID is set
      Throws:
      StructureException - For errors loading and parsing the structure
      IOException - Errors reading the structure from disk
    • copyLigandsByProximity

      protected static void copyLigandsByProximity(Structure full, Structure reduced)
      Supplements the reduced structure with ligands from the full structure based on a distance cutoff. Ligand groups are moved (destructively) from full to reduced if they fall within the cutoff of any atom in the reduced structure. The default cutoff, StructureTools.DEFAULT_LIGAND_PROXIMITY_CUTOFF, is used.
      Parameters:
      full - Structure containing all ligands
      reduced - Structure with a subset of the polymer groups from full
    • copyLigandsByProximity

      protected static void copyLigandsByProximity(Structure full, Structure reduced, double cutoff, int fromModel, int toModel)
      Supplements the reduced structure with ligands from the full structure based on a distance cutoff. Ligand groups are moved (destructively) from full to reduced if they fall within the cutoff of any atom in the reduced structure.
      Parameters:
      full - Structure containing all ligands
      reduced - Structure with a subset of the polymer groups from full
      cutoff - Distance cutoff (Å)
      fromModel - source model in full
      toModel - destination model in reduced
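
      The distance test and the destructive move can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the classes LigandProximitySketch, Atom and Ligand are hypothetical, models are not represented, the comparison treats a distance exactly equal to the cutoff as within range, and the atoms of the reduced structure are taken as a fixed list so that ligands moved earlier do not influence later ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LigandProximitySketch {

    /** Hypothetical atom: Cartesian coordinates in angstroms. */
    public static final class Atom {
        public final double x, y, z;

        public Atom(double x, double y, double z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Hypothetical ligand group: a name and its atoms. */
    public static final class Ligand {
        public final String name;
        public final List<Atom> atoms;

        public Ligand(String name, Atom... atoms) {
            this.name = name;
            this.atoms = Arrays.asList(atoms);
        }
    }

    // Compares squared distances to avoid a square root per atom pair.
    static boolean within(Atom a, Atom b, double cutoff) {
        double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return dx * dx + dy * dy + dz * dz <= cutoff * cutoff;
    }

    static boolean isNear(Ligand ligand, List<Atom> reducedAtoms, double cutoff) {
        for (Atom la : ligand.atoms) {
            for (Atom ra : reducedAtoms) {
                if (within(la, ra, cutoff)) {
                    return true;
                }
            }
        }
        return false;
    }

    /**
     * Removes from fullLigands every ligand with an atom within cutoff of
     * any atom in reducedAtoms, and returns the removed ligands. The input
     * list is modified, mirroring the destructive move described above.
     */
    public static List<Ligand> moveNearbyLigands(
            List<Ligand> fullLigands, List<Atom> reducedAtoms, double cutoff) {
        List<Ligand> moved = new ArrayList<>();
        Iterator<Ligand> it = fullLigands.iterator();
        while (it.hasNext()) {
            Ligand ligand = it.next();
            if (isNear(ligand, reducedAtoms, cutoff)) {
                it.remove();
                moved.add(ligand);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<Atom> reducedAtoms = Arrays.asList(new Atom(0, 0, 0));
        List<Ligand> full = new ArrayList<>();
        full.add(new Ligand("HEM", new Atom(3, 0, 0)));
        full.add(new Ligand("FAR", new Atom(20, 0, 0)));
        List<Ligand> moved = moveNearbyLigands(full, reducedAtoms, 5.0);
        System.out.println(moved.get(0).name); // prints HEM
        System.out.println(full.get(0).name);  // prints FAR
    }
}
```

      The pairwise loop is quadratic in the number of atoms; a spatial grid would scale better for large structures, but the simple form keeps the cutoff logic visible.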