4. Class Spectrum
The Spectrum class offers a Python object for mass spectrometry data.
The spectrum object holds the basic information on the spectrum and offers
methods to interrogate properties of the spectrum.
The data, i.e. mass over charge (m/z) and intensity values, are decoded on demand
and can be accessed via the corresponding properties, e.g. spec.Spectrum.peaks.
The Spectrum class is used in the run.Run class.
There, each spectrum is accessible as a Spectrum object.
Theoretical spectra can also be created using the setter functions.
For example, m/z values, intensities, and peaks can be set by the
corresponding properties: spec.Spectrum.mz, spec.Spectrum.i, spec.Spectrum.peaks.
-
class
spec.
Spectrum
[source]¶ -
__init__
(measuredPrecision = value*)[source]¶ Initializes a pymzml.spec.Spectrum class.
Parameters: measuredPrecision (float) – in m/z, mandatory
-
xmlTree
¶ xmlTree property returns an iterator over the original xmlTree structure the spectrum was initialized with.
Example:
>>> for element in spectrum.xmlTree:
...     print( element, element.tag, element.items() )
Please refer to the XML documentation of Python and cElementTree for more details.
-
mz
¶ Returns the list of m/z values. If the m/z values are encoded, the function
_decode()
is used to decode the encoded data. The mz property can also be set, e.g. for theoretical data. However, it is recommended to use the peaks property to set m/z and intensity tuples at the same time.
Return type: list Returns: Returns a list of mz from the actual analysed spectrum
-
i
¶ Returns the list of the intensity values. If the intensity values are encoded, the function
_decode()
is used to decode the encoded data. The i property can also be set, e.g. for theoretical data. However, it is recommended to use the peaks property to set m/z and intensity tuples at the same time.
Return type: list Returns: Returns a list of intensity values from the actual analysed spectrum.
-
peaks
¶ Returns the list of peaks of the spectrum as tuples (m/z, intensity).
Return type: list of tuples Returns: Returns list of tuples (m/z, intensity) Example:
>>> import pymzml
>>> run = pymzml.run.Reader("spectra.mzML.gz", MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     for mz, i in spectrum.peaks:
...         print(mz, i)
Note
The peaks property can also be set, e.g. for theoretical data. It requires a list of m/z-intensity tuples.
-
centroidedPeaks
¶ Returns the centroided version of a profile spectrum. Performs a Gauss fit to determine centroided m/z and intensities, if the spectrum is in measured profile mode. Returns a list of tuples of fitted m/z-intensity values. If the spectrum peaks are already centroided, these peaks are returned.
Return type: list of tuples Returns: Returns list of tuples (m/z, intensity) Example:
>>> import pymzml
>>> run = pymzml.run.Reader("spectra.mzML.gz", MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     for mz, i in spectrum.centroidedPeaks:
...         print(mz, i)
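The Gauss fit mentioned above can be illustrated with a three-point fit around a local maximum. This is only a sketch of the general technique, not pymzML's own code: the function name `gaussian_centroid` is hypothetical, and it assumes equally spaced m/z values and strictly positive intensities. Because the logarithm of a Gaussian is a parabola, the centre follows from the vertex of the parabola through the three log-intensities.

```python
import math


def gaussian_centroid(left, apex, right):
    """Estimate centre and height of a Gaussian through three (m/z, intensity) points.

    Hypothetical helper: assumes equally spaced m/z values, positive
    intensities, and that `apex` is the highest of the three points.
    """
    (x0, y0), (x1, y1), (_x2, y2) = left, apex, right
    step = x1 - x0
    a, b, c = math.log(y0), math.log(y1), math.log(y2)
    curvature = a - 2.0 * b + c  # negative for a genuine peak
    if curvature >= 0:
        raise ValueError("apex is not a local maximum")
    # Vertex of the parabola through the three log-intensities.
    offset = step * (a - c) / (2.0 * curvature)
    log_height = b - (a - c) ** 2 / (8.0 * curvature)
    return x1 + offset, math.exp(log_height)


def _gauss(x, centre=500.002, sigma=0.01, height=1000.0):
    # Synthetic profile peak used only to exercise the sketch.
    return height * math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


profile = [(x, _gauss(x)) for x in (499.99, 500.00, 500.01)]
centroid_mz, centroid_intensity = gaussian_centroid(*profile)
```

For noise-free Gaussian data the fit recovers the true centre and height; real profile data will only approximate a Gaussian.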
-
reprofiledPeaks
¶ Returns the reprofiled version of a centroided spectrum.
Return type: list of reprofiled mz,i tuples Returns: Reprofiled peaks as tuple list Example:
>>> import pymzml
>>> run = pymzml.run.Reader("spectra.mzML.gz", MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     for mz, i in spectrum.reprofiledPeaks:
...         print(mz, i)
-
measuredPrecision
¶ Sets the measured and internal precision
Parameters: value (float) – measured precision (e.g. 5e-6)
-
__add__
(otherSpec)[source]¶ Adds two pymzml spectra together.
Parameters: otherSpec (object) – Spectrum object Example:
>>> import pymzml
>>> s = pymzml.spec.Spectrum( measuredPrecision = 20e-6 )
>>> file_to_read = "../mzML_example_files/xy.mzML.gz"
>>> run = pymzml.run.Reader(file_to_read , MS1_Precision = 5e-6 , MSn_Precision = 20e-6)
>>> for spec in run:
...     s += spec
-
__mul__
(value)[source]¶ Multiplies each intensity with a float, i.e. scales the spectrum.
Parameters: value (float) – Value to multiply the spectrum
-
__truediv__
(value)[source]¶ Divides each intensity by a float, i.e. scales the spectrum.
Parameters: value (float, int) – Value to divide the spectrum
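The effect of multiplying or dividing a spectrum can be pictured on a plain peak list. The sketch below is an illustration only and does not use pymzML; `scale_peaks` is a hypothetical helper that leaves m/z values untouched and scales each intensity.

```python
def scale_peaks(peaks, factor):
    """Return a new peak list with every intensity multiplied by `factor`."""
    return [(mz, intensity * factor) for mz, intensity in peaks]


peaks = [(100.0, 10.0), (200.0, 40.0)]

doubled = scale_peaks(peaks, 2.0)

# Division is the same operation with the reciprocal factor, e.g. to
# normalise the spectrum to its most intense peak.
base_peak_intensity = max(intensity for _, intensity in peaks)
normalised = scale_peaks(peaks, 1.0 / base_peak_intensity)
```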
-
strip
(scope='all')[source]¶ Reduces the size of the spectrum. Useful if spectra need to be added or stored.
Parameters: scope (string) – currently accepts only “all”, which removes the raw and profiled data as well as some internal lookup tables.
-
extremeValues
(key)[source]¶ Finds the extreme values, i.e. the minimum and maximum m/z or intensity.
Parameters: key (string) – m/z : “mz” or intensity : “i” Return type: tuple Returns: tuple of minimal and maximum m/z or intensity
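As a rough picture of what such a lookup returns, the following sketch computes the same kind of tuple from a plain list of (m/z, intensity) tuples. The helper `extreme_values` is hypothetical and independent of pymzML.

```python
def extreme_values(peaks, key):
    """Return (minimum, maximum) of the m/z values ("mz") or intensities ("i")."""
    index = {"mz": 0, "i": 1}[key]  # raises KeyError for any other key
    values = [peak[index] for peak in peaks]
    return min(values), max(values)


peaks = [(150.2, 30.0), (99.5, 800.0), (401.7, 12.5)]
mz_range = extreme_values(peaks, "mz")
intensity_range = extreme_values(peaks, "i")
```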
-
reduce
(mzRange=(None, None))[source]¶ Works on peaks and reduces the spectrum to an m/z range.
Example:
>>> import pymzml
>>> file_to_read = "../mzML_example_files/xy.mzML.gz"
>>> run = pymzml.run.Reader(file_to_read, MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spec in run:
...     spec.reduce( mzRange = (100,200) )
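Conceptually, reducing a spectrum to an m/z range is a filter over the peak list. The sketch below shows that idea on plain tuples. It is not the library's implementation, and two details are assumptions: both bounds are treated as inclusive, and `None` is treated as an open bound.

```python
def reduce_to_range(peaks, mz_range=(None, None)):
    """Keep only peaks whose m/z lies inside `mz_range`.

    Assumptions of this sketch: bounds are inclusive and a bound of
    None means "no limit" on that side.
    """
    low, high = mz_range
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if (low is None or mz >= low) and (high is None or mz <= high)
    ]


peaks = [(99.9, 1.0), (100.0, 2.0), (150.0, 3.0), (200.0, 4.0), (200.1, 5.0)]
reduced = reduce_to_range(peaks, (100, 200))
upper_only = reduce_to_range(peaks, (None, 100))
```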
-
deRef
()[source]¶ Strip some heavy data and return deepcopy of spectrum.
Example:
>>> import pymzml
>>> file_to_read = "../mzML_example_files/xy.mzML.gz"
>>> run = pymzml.run.Reader(file_to_read, MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spec in run:
...     tmp = spec.deRef()
-
removeNoise
(mode='median', noiseLevel=None)[source]¶ Function to remove noise from peaks, centroided peaks and reprofiled peaks.
Parameters: mode (string) – define mode for removing noise. Default = “median” (other modes: “mean”, “mad”) Return type: list of tuples Returns: Returns a list with tuples of m/z-intensity pairs above the noise threshold mad < median < mean
Threshold is calculated over the mad/median/mean of all intensity values. (mad = mean absolute deviation)
Example:
>>> import pymzml
>>> run = pymzml.run.Reader("spectra.mzML.gz", MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     for mz, i in spectrum.removeNoise( mode = 'mean'):
...         print(mz, i)
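The thresholding idea can be sketched without pymzML. The hypothetical helper `remove_noise` below derives the threshold from all intensities and keeps the peaks above it. Only the "median" and "mean" modes are shown, and the strict comparison (`>`) is an assumption of this sketch.

```python
import statistics


def remove_noise(peaks, mode="median"):
    """Keep peaks whose intensity lies above a threshold derived from all intensities."""
    intensities = [intensity for _, intensity in peaks]
    if mode == "median":
        threshold = statistics.median(intensities)
    elif mode == "mean":
        threshold = statistics.mean(intensities)
    else:
        raise ValueError("this sketch only supports 'median' and 'mean'")
    # Assumption: peaks exactly at the threshold are treated as noise.
    return [(mz, intensity) for mz, intensity in peaks if intensity > threshold]


peaks = [(100.0, 1.0), (101.0, 2.0), (102.0, 3.0), (103.0, 100.0)]
above_median = remove_noise(peaks, mode="median")  # threshold 2.5
above_mean = remove_noise(peaks, mode="mean")      # threshold 26.5
```

In this toy spectrum the single intense peak pulls the mean well above the median, so the "mean" mode is the stricter filter.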
-
highestPeaks
(n)[source]¶ Function to retrieve the n-highest centroided peaks of the spectrum.
Parameters: n (int) – Number of n-highest peaks Return type: list Returns: list of centroided peaks (mz, intensity tuples) Example:
>>> import pymzml
>>> run = pymzml.run.Reader("../mzML_example_files/deconvolution.mzML.gz", MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     if spectrum["ms level"] == 2:
...         if spectrum["id"] == 1770:
...             for mz,i in spectrum.highestPeaks(5):
...                 print(mz,i)
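Selecting the n most intense peaks from a peak list can be sketched with the standard library. The helper `highest_peaks` is hypothetical, and the ordering of the result (most intense first) is a choice of this sketch rather than documented library behaviour.

```python
import heapq


def highest_peaks(peaks, n):
    """Return the n peaks with the highest intensity, most intense first."""
    return heapq.nlargest(n, peaks, key=lambda peak: peak[1])


peaks = [(100.0, 5.0), (101.0, 50.0), (102.0, 20.0), (103.0, 1.0)]
top_two = highest_peaks(peaks, 2)
```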
-
estimatedNoiseLevel
(mode='median')[source]¶ Calculates noise threshold for function
removeNoise()
-
hasOverlappingPeak
(mz)[source]¶ Checks if a spectrum has more than one peak for a given m/z value within the measured precision
Parameters: mz (float) – m/z value which should be checked Returns: Returns True
if a nearby peak is detected, otherwise False
Return type: bool
-
hasPeak
(mz2find)[source]¶ Checks if a Spectrum has a certain peak. Needs a certain mz value as input and returns a list of peaks if a peak is found in the spectrum, otherwise
[]
is returned. Every peak is a tuple of m/z and intensity. Parameters: mz2find (float) – m/z value which should be found Return type: list Returns: m/z and intensity as tuple in list Example:
>>> import pymzml, get_example_file
>>> example_file = get_example_file.open_example('deconvolution.mzML.gz')
>>> run = pymzml.run.Reader(example_file, MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     if spectrum["ms level"] == 2:
...         peak_to_find = spectrum.hasPeak(1016.5404)
...         print(peak_to_find)
[(1016.5404, 19141.735187697403)]
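A peak lookup within a relative precision can be sketched as follows. This is not how pymzML performs the search internally; `find_peaks` is a hypothetical helper, and the tolerance is assumed to be the query m/z multiplied by the precision (e.g. 20e-6 for 20 ppm).

```python
def find_peaks(peaks, target_mz, precision=20e-6):
    """Return all peaks within target_mz * precision of target_mz, or []."""
    tolerance = target_mz * precision
    return [
        (mz, intensity)
        for mz, intensity in peaks
        if abs(mz - target_mz) <= tolerance
    ]


peaks = [(1016.5404, 19141.7), (1016.6000, 5.0)]
hit = find_peaks(peaks, 1016.55)   # tolerance is about 0.0203 m/z
miss = find_peaks(peaks, 900.0)
```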
-
hasDeconvolutedPeak
(mass2find)[source]¶ Checks if a deconvoluted spectrum contains a certain peak. Needs a mass value as input and returns a list of peaks if a peak is found in the spectrum. If the mass is not found
[]
is returned. Every peak is a tuple of m/z and intensity. Parameters: mass2find (float) – mass value which should be found Return type: list Returns: mass and intensity as tuple in list if mass is found, otherwise []
Example:
>>> import pymzml, get_example_file
>>> example_file = get_example_file.open_example('deconvolution.mzML.gz')
>>> run = pymzml.run.Reader(example_file, MS1_Precision = 5e-6, MSn_Precision = 20e-6)
>>> for spectrum in run:
...     if spectrum["ms level"] == 2:
...         peak_to_find = spectrum.hasDeconvolutedPeak(1044.5804)
...         print(peak_to_find)
[(1044.5596, 3809.4356300564586)]
-
similarityTo
(spec2)[source]¶ Compares two spectra and returns the cosine between them.
Parameters: spec2 (pymzml.spec.Spectrum) – another pymzml spectrum that is compared to the current spectrum. Returns: value between 0 and 1, i.e. the cosine between the two spectra. Return type: float Note
Spectra data is transformed into an n-dimensional vector, where m/z values are binned in bins of 10 m/z and the intensities are added up. Then the cosine is calculated between those two vectors. The more similar the spectra are, the closer the value is to 1.
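The binning and cosine calculation described in the note can be sketched on plain peak lists. The helper names are hypothetical, and details such as where bin edges fall and returning 0.0 for an empty spectrum are assumptions of this sketch, not documented behaviour.

```python
import math
from collections import defaultdict


def binned_vector(peaks, bin_width=10.0):
    """Sum intensities per m/z bin; returns a sparse vector {bin index: intensity}."""
    vector = defaultdict(float)
    for mz, intensity in peaks:
        vector[int(mz // bin_width)] += intensity
    return vector


def cosine_similarity(peaks_a, peaks_b, bin_width=10.0):
    """Cosine between the binned intensity vectors of two peak lists."""
    a = binned_vector(peaks_a, bin_width)
    b = binned_vector(peaks_b, bin_width)
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty spectrum is similar to nothing
    return dot / (norm_a * norm_b)


# Same bins and same relative intensities give a cosine of 1.
same_shape = cosine_similarity(
    [(101.0, 10.0), (205.0, 20.0)],
    [(108.0, 5.0), (202.0, 10.0)],
)
# No shared bins give a cosine of 0.
disjoint = cosine_similarity([(101.0, 10.0)], [(305.0, 10.0)])
```

The first comparison shows that absolute intensity does not matter, only the distribution over bins. It also shows the coarseness of 10 m/z bins: peaks at 101 and 108 count as the same dimension.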
-
tmzSet
¶ Creates a set out of transformed m/z values (including all values in the defined imprecision).
Return type: set
-
tmassSet
¶ Creates a set out of transformed mass values (including all values in the defined imprecision).
Return type: set
-
transformedPeaks
¶ m/z value is multiplied by the internal precision
Return type: list of tuples Returns: Returns a list of peaks (tuples of mz and intensity). Float m/z values are adjusted by the internal precision to integers.
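The integer transformation behind transformedPeaks and tmzSet can be pictured as follows. This is a sketch under stated assumptions: the factor of 1000 stands in for the internal precision, whose actual value is not given here, and both helper names are hypothetical.

```python
def transform_peaks(peaks, factor):
    """Multiply each m/z by `factor` and round to an integer; intensities are kept."""
    return [(int(round(mz * factor)), intensity) for mz, intensity in peaks]


def transformed_mz_set(peaks, factor, precision):
    """Set of all integer m/z values within the relative `precision` of each peak."""
    values = set()
    for mz, _intensity in peaks:
        tolerance = mz * precision
        low = int(round((mz - tolerance) * factor))
        high = int(round((mz + tolerance) * factor))
        values.update(range(low, high + 1))
    return values


FACTOR = 1000  # hypothetical stand-in for the internal precision

transformed = transform_peaks([(100.0004, 5.0), (200.0016, 1.0)], FACTOR)
mz_set = transformed_mz_set([(100.0, 5.0)], FACTOR, precision=20e-6)
```

Working with integers makes membership tests on the set cheap, which is one plausible reason for such a transformation.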
-
transformed_deconvolutedPeaks
¶ Deconvoluted mz value is multiplied by the internal precision
Return type: list of tuples Returns: Returns a list of peaks (tuples of mz and intensity). Float m/z values are adjusted by the internal precision to integers.
-
deconvolute_peaks
(ppmFactor=4, minCharge=1, maxCharge=8, maxNextPeaks=100)[source]¶ Calculating uncharged masses and returning deconvoluted peaks.
The deconvolution of spectra is done by first identifying isotope envelopes and the charge state of these envelopes. The first peak of an isotope envelope is chosen as the monoisotopic peak, for which the mass is calculated from the m/z ratio. Isotope envelopes are identified by searching the centroided spectrum for peaks which show no preceding isotope peak within a specified mass accuracy. To be safe, the measured mass accuracy is multiplied by a user-adjustable factor (
ppmFactor
). When the current peak meets the criteria with no preceding peaks, the following peaks are analysed. The following peaks are considered to be part of the isotope envelope, as long as they fit within the measured precision and only one local maximum is present. The second local maximum is not considered as the starting point of a new isotope envelope, as one cannot be sure where this isotope envelope starts. However, the last peak before the second local maximum is considered to be part of the isotope envelope from the first local maximum, as the intensity of this peak shouldn’t have a big influence on the whole isotope envelope intensity. The charge range for detecting isotope envelopes can be specified (minCharge
, maxCharge
). An isotope envelope always gets the highest possible charge. With the charge, the mass can be calculated from the m/z value of the first peak of the isotope envelope. The intensity of the deconvoluted peak results from the sum of all isotope envelope peaks. In a last step, deconvoluted peaks are grouped together within the measured precision. This is necessary because isotope envelopes from the same fragment but with different charge states can lead to slightly different deconvoluted peaks. Parameters: - ppmFactor (int) – ppm factor (imprecision factor)
- minCharge (int) – minimum charge considered
- maxCharge (int) – maximum charge considered
- maxNextPeaks – maximum length for isotope envelope
Return type: tuple (mass, intensity)
Returns: Deconvoluted peaks, mass (instead of m/z) and intensity are returned
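Two steps of the description above, assigning the highest possible charge and converting the first envelope peak to an uncharged mass, can be sketched in isolation. This is not the library's algorithm: the helper names, the absolute tolerance of 0.01 m/z, and the use of the proton mass for the charge carrier are all assumptions of this sketch.

```python
PROTON_MASS = 1.007276466   # Da, assumed mass of the charge carrier
ISOTOPE_SPACING = 1.0033548  # Da, mass difference between 13C and 12C


def guess_charge(mz_first, mz_second, min_charge=1, max_charge=8, tolerance=0.01):
    """Return the highest charge whose expected isotope spacing matches, else None."""
    spacing = mz_second - mz_first
    # Try high charges first so the envelope gets the highest possible charge.
    for charge in range(max_charge, min_charge - 1, -1):
        if abs(spacing - ISOTOPE_SPACING / charge) <= tolerance:
            return charge
    return None


def neutral_mass(mz, charge):
    """Uncharged mass of an ion observed at `mz` carrying `charge` protons."""
    return mz * charge - charge * PROTON_MASS


# Two neighbouring isotope peaks roughly 0.5 m/z apart imply charge 2.
charge = guess_charge(500.0, 500.5017)
mass = neutral_mass(500.0, charge)
```

A full deconvolution would additionally walk along the envelope, sum the intensities of its peaks, and group the resulting masses within the measured precision, as described above.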
-
deconvolutedPeaks
¶ Calls
spec.Spectrum.deconvolute_peaks()
with standard parameters, which calculates uncharged masses and returns deconvoluted peaks. Return type: list Returns: list of deconvoluted peaks (mass (instead of m/z) / intensity tuples)
-