Objects
These are the common objects used to represent clustering data throughout all classes.
-
class spectra_cluster.objects.Cluster(cluster_id, precursor_mz, consensus_mz, consensus_intens, spectra, ignore_duplicated=True)
Represents a cluster in a .clustering output file.
-
__init__(cluster_id, precursor_mz, consensus_mz, consensus_intens, spectra, ignore_duplicated=True)
Creates a new cluster object.
Parameters: - cluster_id – The cluster’s id
- precursor_mz – The cluster’s average precursor m/z
- consensus_mz – A list of doubles holding the consensus spectrum’s m/z values
- consensus_intens – A list of doubles holding the consensus spectrum’s intensity values
- spectra – A set of spectra associated with the cluster
- ignore_duplicated – The constructor automatically removes duplicated spectra from the list of clustered spectra. If duplicated spectra are found, an exception is raised unless this parameter is True. Setting it to True does not disable the filtering; it only prevents the exception from being raised.
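The behaviour described for ignore_duplicated can be sketched as follows. This is a minimal illustration, not the library's code: `remove_duplicates` is a hypothetical helper, spectra are represented by plain title strings, duplicates are assumed to be detected by equal titles, and `ValueError` stands in for whatever exception type the constructor actually raises.

```python
def remove_duplicates(titles, ignore_duplicated=True):
    """Return the titles with duplicates removed, keeping first occurrences.

    Hypothetical stand-in for the constructor's filtering step.
    """
    seen = set()
    unique = []
    for title in titles:
        if title not in seen:
            seen.add(title)
            unique.append(title)

    # Filtering always happens; the flag only controls whether
    # finding a duplicate is reported as an error.
    if len(unique) != len(titles) and not ignore_duplicated:
        raise ValueError("duplicated spectra found")
    return unique


unique_titles = remove_duplicates(["scan=1", "scan=2", "scan=1"])
```

With the default `ignore_duplicated=True` the duplicate is dropped silently; passing `False` turns the same input into an error.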
-
static calculate_sequence_counts(spectra, ignore_i_l=False)
Calculates the sequence counts based on the passed spectra. PTMs are ignored for this assessment.
Parameters: - spectra – The spectra to derive the sequence counts from.
- ignore_i_l – If set, I and L are treated as equivalent: every I is replaced by L, so the sequences in the returned map may not correspond to the originally identified sequences.
Returns: A dict with a sequence as key and the number of occurrences as value.
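The counting logic described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: `count_sequences` is a hypothetical helper that takes plain sequence strings rather than Spectrum objects, and it assumes the sequences have already been stripped of PTM annotations.

```python
from collections import Counter


def count_sequences(sequences, ignore_i_l=False):
    """Count how often each peptide sequence occurs.

    Hypothetical stand-in operating on plain strings.
    """
    counts = Counter()
    for sequence in sequences:
        if ignore_i_l:
            # Isoleucine and leucine have the same mass, so treat them as one.
            sequence = sequence.replace("I", "L")
        counts[sequence] += 1
    return dict(counts)


counts = count_sequences(["PEPTIDE", "PEPTLDE", "ELVISK"], ignore_i_l=True)
# counts == {"PEPTLDE": 2, "ELVLSK": 1}
```

Note that the keys of the result contain L wherever the input had I, which is why the returned sequences may differ from the identified ones.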
-
get_spectra()
Returns the stored spectra in a tuple. These objects should not be changed; otherwise, the cluster’s statistics may no longer be accurate.
Returns: A tuple containing the cluster’s spectra
-
set_spectra(new_spectra)
Updates the cluster’s stored spectra.
Parameters: new_spectra – A list of PSM objects.
-
-
class spectra_cluster.objects.PSM(sequence, ptms)
Defines a peptide-spectrum match.
-
__init__(sequence, ptms)
Creates a new PSM object.
Parameters: - sequence – The sequence associated with the PSM.
- ptms – A set of PTMs
Returns:
-
-
class spectra_cluster.objects.PTM(position, accession)
Defines a post-translational modification within a peptide.
Variables: - position – The PTM’s position within the peptide string
- accession – The PTM’s accession in UNIMOD (if starting with “MOD:”). This may also represent a PSI entry in the format [PSI-MS, MS:1001524, fragment neutral loss, 63.998283]
-
__init__(position, accession)
Creates a new PTM object.
Parameters: - position – 1-based position within the peptide (0 for terminus)
- accession – MOD accession of the modification.
Returns:
-
class spectra_cluster.objects.Spectrum(title, precursor_mz, charge, taxids, psms, similarity_score=0, json_properties=None)
A spectrum reference.
-
__init__(title, precursor_mz, charge, taxids, psms, similarity_score=0, json_properties=None)
Creates a new Spectrum reference.
Parameters: - title – The spectrum’s title.
- precursor_mz – Measured precursor m/z
- charge – Charge state
- taxids – Set of taxids of the experiments in which the spectrum was observed
- psms – A set of PSMs associated with the spectrum. If None is passed, an empty set is created.
- similarity_score – The similarity of this spectrum with the cluster’s consensus spectrum.
- json_properties – Additional properties of the spectrum encoded as a JSON string.
-
get_clean_sequence_psms()
Returns all PSMs with all special characters removed from the sequences.
Returns: A tuple of PSMs
-
get_clean_sequences()
Returns the identified sequences in upper case, without any additional characters.
Returns: Identified sequences
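As a rough illustration of what "clean" means here, the sketch below upper-cases a sequence and drops everything that is not a letter. `clean_sequence` is a hypothetical helper; the exact rules the library applies are not documented in this section, so treat the regular expression as an assumption.

```python
import re


def clean_sequence(sequence):
    """Upper-case a peptide sequence and strip every non-letter character.

    Hypothetical stand-in; it assumes modification annotations consist
    only of digits and symbols (letters inside them would be kept).
    """
    return re.sub(r"[^A-Z]", "", sequence.upper())


cleaned = clean_sequence("pept[+79.97]ide")
# cleaned == "PEPTIDE"
```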
-
get_filename()
The original filename can optionally be encoded in the title string. If present, this filename is returned; otherwise None.
Returns: Original filename or None if not present
-
get_id()
The spectrum’s id can optionally be encoded in the title string. If present, this id is returned; otherwise None.
Returns: Original spectrum id or None if not present
-
get_mass()
Calculates the molecular mass based on the precursor_mz and the charge.
Returns: The molecular mass
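A common convention for deriving a neutral mass from a precursor m/z and a charge state is M = z * (m/z) - z * m_proton. The sketch below follows that convention; whether get_mass() uses exactly this formula or this proton mass constant is an assumption, and `neutral_mass` is a hypothetical name.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da


def neutral_mass(precursor_mz, charge):
    """Convert a precursor m/z and charge state into a neutral mass in Da.

    Hypothetical stand-in, assuming positive-mode protonated ions.
    """
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return precursor_mz * charge - charge * PROTON_MASS


mass = neutral_mass(500.0, 2)
```

For a doubly charged ion at m/z 500.0 this gives a mass slightly below 1000 Da, because two proton masses are subtracted.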
-
get_property(key)
Get the property with the defined key.
Parameters: key – The property’s name.
Returns: The property’s value or None if it is not defined
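Since json_properties is described as a JSON string, a lookup with a None fallback might look like the sketch below. It assumes the string encodes a JSON object; `lookup_property` and the property name "RT" are hypothetical and only serve the illustration.

```python
import json


def lookup_property(json_properties, key):
    """Return the value stored under key, or None if it is not defined.

    Hypothetical stand-in; assumes json_properties is None or a JSON object.
    """
    if not json_properties:
        return None
    return json.loads(json_properties).get(key)


value = lookup_property('{"RT": 12.5}', "RT")
# value == 12.5
```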
-
get_title()
Optionally, the spectrum’s filename and id can be encoded in the title string. If this is the case, the original title is extracted from the string. If no fields were encoded, the whole title string is returned. Therefore, this function should always be used when the original spectrum’s title is needed.
Returns: The original spectrum’s title
-
is_identified()
Checks whether the spectrum was identified.
Returns: boolean
-