Objects

These are the common objects used throughout the package to represent clustering data.

class spectra_cluster.objects.Cluster(cluster_id, precursor_mz, consensus_mz, consensus_intens, spectra, ignore_duplicated=True)

Represents a cluster in a .clustering output file.

__init__(cluster_id, precursor_mz, consensus_mz, consensus_intens, spectra, ignore_duplicated=True)

Creates a new cluster object.

Parameters:
  • cluster_id – The cluster’s id
  • precursor_mz – The cluster’s average precursor m/z
  • consensus_mz – A list of doubles holding the consensus spectrum’s m/z values
  • consensus_intens – A list of doubles holding the consensus spectrum’s intensity values
  • spectra – A set of spectra associated with the cluster
  • ignore_duplicated – The constructor always removes duplicated spectra from the list of clustered spectra. If duplicates are found and this parameter is False, an Exception is raised. Setting it to True does not disable the filtering; it only suppresses the exception.
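
The effect of ignore_duplicated can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not the library's implementation: it assumes that spectra are hashable (plain title strings here) and uses ValueError where the documentation only says "an Exception".

```python
def deduplicate_spectra(spectra, ignore_duplicated=True):
    """Hypothetical stand-in for the filtering described above."""
    unique = []
    seen = set()
    for spectrum in spectra:
        if spectrum in seen:
            continue  # duplicates are always filtered out
        seen.add(spectrum)
        unique.append(spectrum)

    # The flag only controls whether finding duplicates is an error.
    if len(unique) != len(spectra) and not ignore_duplicated:
        raise ValueError("Duplicated spectra found")  # exception type is an assumption

    return tuple(unique)


filtered = deduplicate_spectra(["spec_1", "spec_2", "spec_1"])
```

With the default setting the duplicate is dropped silently; passing ignore_duplicated=False for the same input raises instead.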
static calculate_sequence_counts(spectra, ignore_i_l=False)

Calculates the sequence counts based on the passed spectra. PTMs are ignored for this assessment.

Parameters:
  • spectra – The spectra to derive the sequence counts from.
  • ignore_i_l – If set, I and L are treated as equivalent: every I is replaced by L, so the sequences in the returned map may not correspond to the originally identified sequences.
Returns:

A dict with a sequence as key and the number of occurrences as value.
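
As a rough illustration of the counting and of the ignore_i_l option, here is a sketch that works on plain strings. The helper count_sequences is hypothetical, and it assumes each spectrum is represented simply as a list of already cleaned peptide sequences and that every occurrence is counted; the library method operates on Spectrum objects and may count differently.

```python
from collections import Counter


def count_sequences(spectra_sequences, ignore_i_l=False):
    """Hypothetical sketch: count sequences over per-spectrum sequence lists."""
    counts = Counter()
    for sequences in spectra_sequences:
        for sequence in sequences:
            if ignore_i_l:
                # Treat I and L as equivalent by rewriting every I as L.
                sequence = sequence.replace("I", "L")
            counts[sequence] += 1
    return dict(counts)


example = [["PEPTIDE"], ["PEPTLDE"]]
strict = count_sequences(example)
merged = count_sequences(example, ignore_i_l=True)
```

In the merged result the key is "PEPTLDE" for both spectra, which shows why the returned sequences may differ from the ones originally identified.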

get_spectra()

Returns the stored spectra as a tuple. These objects should not be changed; otherwise, the cluster’s statistics may no longer be accurate.

Returns: A tuple containing the cluster’s spectra
set_spectra(new_spectra)

Updates the cluster’s stored spectra.

Parameters: new_spectra – A list of PSM objects.
class spectra_cluster.objects.PSM(sequence, ptms)

Defines a peptide-spectrum match (PSM).

__init__(sequence, ptms)

Creates a new PSM object.

Parameters:
  • sequence – The sequence associated with the PSM.
  • ptms – A set of PTMs

class spectra_cluster.objects.PTM(position, accession)

Defines a post-translational modification (PTM) within a peptide.

Variables:
  • position – The PTM’s position within the peptide string
  • accession – The PTM’s accession in UNIMOD (if starting with “MOD:”). This may also represent a PSI entry in the format [PSI-MS, MS:1001524, fragment neutral loss, 63.998283]
__init__(position, accession)

Creates a new PTM object.

Parameters:
  • position – 1-based position within the peptide (0 for terminus)
  • accession – MOD accession of the modification.

class spectra_cluster.objects.Spectrum(title, precursor_mz, charge, taxids, psms, similarity_score=0, json_properties=None)

A spectrum reference.

__init__(title, precursor_mz, charge, taxids, psms, similarity_score=0, json_properties=None)

Creates a new Spectrum reference.

Parameters:
  • title – The spectrum’s title.
  • precursor_mz – Measured precursor m/z
  • charge – Charge state
  • taxids – Set of taxids of the experiments in which the spectrum was observed
  • psms – A set of PSMs associated with the spectrum. If None is passed, an empty set is created.
  • similarity_score – The similarity of this spectrum with the cluster’s consensus spectrum.
  • json_properties – Additional properties of the spectrum encoded as a JSON string.

get_clean_sequence_psms()

Returns all PSMs with all special characters removed from the sequences.

Returns: A tuple of PSMs
get_clean_sequences()

Returns the identified sequences in upper case, with all additional characters removed.

Returns: Identified sequences
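
The kind of cleaning described here can be sketched with a regular expression. The helper clean_sequence is hypothetical and assumes that everything other than a letter counts as an "additional character"; the library's exact rules are not documented in this section and may differ.

```python
import re


def clean_sequence(sequence):
    """Hypothetical sketch: upper-case the sequence and keep only letters A-Z."""
    # Note: letters inside annotations such as "(ox)" would also be kept,
    # so this is only suitable for numeric or symbolic annotations.
    return re.sub(r"[^A-Z]", "", sequence.upper())


cleaned = clean_sequence("pEPT[+16]IDEk")
```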
get_filename()

The original filename can optionally be encoded in the title string. If present, this filename is returned; otherwise None is returned.

Returns: Original filename or None if not present
get_id()

The spectrum’s id can optionally be encoded in the title string. If present this id is returned, otherwise None.

Returns: Original spectrum id or None if not present
get_mass()

Calculates the molecular mass based on the precursor_mz and the charge.

Returns: The molecular mass
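
For orientation, the usual relation between precursor m/z, charge and neutral mass can be sketched as follows. This is the standard textbook formula for positively charged, protonated ions, not necessarily the exact expression or constant used by get_mass(); the function name neutral_mass and the proton mass value are assumptions of this sketch.

```python
PROTON_MASS = 1.007276  # approximate proton mass in Da (assumed constant)


def neutral_mass(precursor_mz, charge):
    """Sketch: neutral mass of an ion carrying `charge` protons."""
    # m/z = (M + z * m_proton) / z  =>  M = (m/z - m_proton) * z
    return (precursor_mz - PROTON_MASS) * charge


mass = neutral_mass(500.0, 2)
```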
get_property(key)

Get the property with the defined key.

Parameters: key – The property’s name.
Returns: The property’s value or None if it is not defined
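
Since the additional properties are passed to the constructor as a JSON string, a lookup of this kind can be sketched with the standard json module. The function below is a hypothetical stand-alone illustration, not the library's code, and the property name "RT" is invented for the example.

```python
import json


def lookup_property(json_properties, key):
    """Hypothetical sketch: read one property from a JSON-encoded string."""
    if json_properties is None:
        return None  # no properties were stored
    properties = json.loads(json_properties)
    # dict.get returns None for keys that are not defined
    return properties.get(key)


value = lookup_property('{"RT": 12.5}', "RT")
missing = lookup_property('{"RT": 12.5}', "unknown")
```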
get_title()

Optionally, the spectrum’s filename and id can be encoded in the title string. If this is the case, the original title is extracted from the string. If no fields were encoded, the whole title string is returned. Therefore, this function should always be used when the original spectrum’s title is needed.

Returns: The original spectrum’s title
is_identified()

Checks whether the spectrum was identified.

Returns: boolean