.peptides_on_prot_sequence
- proteopy.pl.peptides_on_prot_sequence(adata, protein_id, group_by=None, ref_sequence=None, alt_pep_sequence_key=None, add_sequences=None, allow_overlaps=False, allow_multi_match=False, color_scheme=None, title=None, order=None, figsize=None, show=True, save=None, ax=None)[source]
Plot peptide coverage on a protein reference sequence.
Filters peptides belonging to the given
protein_idand draws them as broken bars on top of a grey reference bar. Extra sequences (e.g., domains or post-translational modification sites) can be overlaid via theadd_sequencesparameter.- Parameters:
adata (AnnData) – Peptide-level
AnnData.protein_id (str) – Value in
adata.var["protein_id"]to select peptides for.group_by (str | None) – Column in
adata.varused to assign peptides to colored rows. WhenNone, all peptides are placed in a single row labeledprotein_id.ref_sequence (str | None) – Full reference protein sequence as an amino-acid string.
alt_pep_sequence_key (str | None) – Column in
adata.varwhose values are used as peptide amino-acid strings. WhenNone,adata.var_names(i.e.,peptide_id) are used.add_sequences (dict[str, dict] | None) – Additional named sequences to overlay (e.g., domains). Each value must be a dict with a
"group"key and either a"seq"or"seq_coord"key. Coordinates are 0-based throughout."seq_coord"must be a half-open[start, end)tuple wherestart >= 0andend <= len(ref_sequence).allow_overlaps (bool) – When
False, raiseValueErrorif two sequences in the same group overlap positionally.allow_multi_match (bool) – When
True, a peptide sequence matching the reference at multiple positions is shown as one bar per match, labeled"<name> (match N)". WhenFalse, aValueErroris raised for ambiguous matches.color_scheme (str | dict | list | Colormap | callable | None) – Color specification for groups.
title (str | None) – Axes title.
figsize (tuple[float, float] | None) – Figure dimensions
(width, height)in inches.show (bool) – Call
plt.show()at the end.save (str | Path | None) – Path at which to save the figure (300 dpi,
bbox_inches="tight"). WhenNone, the figure is not saved.ax (matplotlib.axes.Axes | None) – Matplotlib Axes object to plot onto. If
None, a new figure and axes are created. The function always returns the Axes object used for plotting.
- Returns:
The Axes object used for plotting.
- Return type:
- Raises:
ValueError – If
adatais not at peptide level.ValueError – If
ref_sequenceisNone.ValueError – If no peptides match
protein_idinadata.var["protein_id"].ValueError – If a peptide sequence is not found in the reference.
ValueError – If a peptide matches the reference at multiple positions and
allow_multi_match=False.ValueError – If two sequences in the same group overlap and
allow_overlaps=False.KeyError – If
alt_pep_sequence_keyorgroup_byis not found inadata.var.
Examples
Build a minimal peptide-level AnnData with two proteins.
"ELPGW"belongs to P2 and is excluded by the filter.>>> import numpy as np >>> import pandas as pd >>> import anndata as ad >>> import proteopy as pr >>> ref = "MKTAYIAKQRQISFVKSHFSRQDEEP" >>> peptides = [ ... "KTAYIAK", "QRQISFVK", "SHFSRQ", "ELPGW", ... ] >>> var = pd.DataFrame( ... { ... "peptide_id": peptides, ... "protein_id": ["P1", "P1", "P1", "P2"], ... "proteoform": ["A", "A", "B", "A"], ... }, ... index=peptides, ... ) >>> obs = pd.DataFrame( ... {"sample_id": ["S1", "S2"]}, ... index=["S1", "S2"], ... ) >>> adata = ad.AnnData( ... X=np.ones((2, 4)), ... obs=obs, ... var=var, ... )
Filter to P1; group peptides by proteoform.
>>> pr.pl.peptides_on_prot_sequence( ... adata, ... protein_id="P1", ... ref_sequence=ref, ... group_by="proteoform", ... )