pybedtools.cbedtools.Interval

class pybedtools.cbedtools.Interval

Class to represent a genomic interval.

Constructor:

Interval(chrom, start, end, name=".", score=".", strand=".", otherfields=None)

Class to represent a genomic interval of any format. Requires at least 3 args: chrom (string), start (int), end (int).

start is always the 0-based start coordinate. If this Interval is to represent a GFF object (which uses a 1-based coordinate system), then subtract 1 from the 4th item in the line to get the start position in 0-based coords for this Interval. The 1-based GFF coord will still be available, albeit as a string, in fields[3].
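The coordinate shift described above can be sketched in plain Python. The helper name `gff_coords` is hypothetical and not part of pybedtools. It also assumes the GFF end value is used unchanged, which is the usual result of converting a 1-based inclusive range to a 0-based half-open one.

```python
# Hypothetical helper (not a pybedtools function): derive the 0-based
# coordinates an Interval expects from one tab-delimited GFF line.
def gff_coords(line):
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    start = int(fields[3]) - 1  # 4th item is the 1-based GFF start
    end = int(fields[4])        # assumption: end is kept as-is (half-open)
    return chrom, start, end, fields

gff_line = "chr1\tsource\tgene\t23\t44\t.\t-\t.\tID=gene1"
chrom, start, end, fields = gff_coords(gff_line)
print(chrom, start, end)  # chr1 22 44
print(fields[3])          # 23 (the original 1-based start, still a string)
```

The untouched `fields` list is what keeps the original 1-based value available as a string at index 3.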

otherfields is a list of fields that don’t fit into the other kwargs, and will be stored in the fields attribute of the Interval.

All the items in otherfields must be strings for proper conversion to C++.

By convention, for BED files, otherfields is everything past the first 6 items in the line. This allows an Interval to represent composite features (e.g., a GFF line concatenated to the end of a BED line).

For other formats (VCF, GFF, SAM), pass the entire line as a list in otherfields. The fields then keep the order they had in the line, so Interval.file_type can always be checked and the wanted fields extracted by position.
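As a rough illustration of these two conventions, the sketch below builds an otherfields list from an already-split line. The function `otherfields_for` and its `file_type` argument are hypothetical names chosen for this example, not part of the pybedtools API.

```python
# Hypothetical helper (not a pybedtools function): choose what to pass
# as otherfields, following the conventions described above.
def otherfields_for(fields, file_type):
    fields = [str(f) for f in fields]  # every item must be a string
    if file_type == "bed":
        return fields[6:]              # BED: everything past the first 6 items
    return fields                      # VCF/GFF/SAM: the entire line

bed_fields = ["chr1", 22, 44, "feat", 0, "-", 30, 40]
gff_fields = "chr1\tsource\tgene\t23\t44\t.\t-\t.\tID=gene1".split("\t")

bed_other = otherfields_for(bed_fields, "bed")
gff_other = otherfields_for(gff_fields, "gff")
print(bed_other)       # ['30', '40']
print(len(gff_other))  # 9
```

Converting every item with `str` up front reflects the requirement that all items in otherfields be strings.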

Example usage:

>>> from pybedtools import Interval
>>> i = Interval("chr1", 22, 44, strand='-')
>>> i
Interval(chr1:22-44)

__init__($self, /, *args, **kwargs)

Initialize self. See help(type(self)) for accurate signature.

Methods

append
deparse_attrs

Attributes

attrs
chrom the chromosome of the feature
count
end the end of the feature
fields
file_type bed/vcf/gff
length the length of the feature
name
o_amt
o_end
o_start
score
start The 0-based start of the feature.
stop the end of the feature
strand the strand of the feature