.karayel_2020
- proteopy.download.karayel_2020(intensities_path='karayel-2020_ms-proteomics_human-erythropoiesis_intensities.tsv', var_annotation_path='karayel-2020_ms-proteomics_human-erythropoiesis_protein-annotation.tsv', sample_annotation_path='karayel-2020_ms-proteomics_human-erythropoiesis_sample-annotation.tsv', *, sep=None, fill_na=None, force=False)
Save Karayel 2020 erythropoiesis dataset to disk.
Download and process the protein-level DIA-MS dataset from Karayel et al. [1] and save it as three tabular files: intensities in long format, protein annotations, and sample annotations.
The study quantified ~7,400 proteins from CD34+ hematopoietic stem/progenitor cells (HSPCs) isolated from healthy donors, across five sequential erythroid differentiation stages with four biological replicates each (20 samples total). Cells were FACS-sorted using CD235a, CD49d, and Band 3 surface markers. The differentiation stages are:
Progenitor: CFU-E progenitor cells (CD34+ HSPCs, negative fraction)
ProE&EBaso: Proerythroblasts and early basophilic erythroblasts
LBaso: Late basophilic erythroblasts
Poly: Polychromatic erythroblasts
Ortho: Orthochromatic erythroblasts
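The stage order listed above can be encoded as an ordered categorical so that sorting and plotting follow the differentiation sequence. The following is a minimal sketch, assuming the cell_type column of the sample annotation file uses exactly these five labels (not confirmed here); the rows are made-up placeholders, not real data.

```python
import pandas as pd

# Stage labels in differentiation order, as listed in the description.
STAGE_ORDER = ["Progenitor", "ProE&EBaso", "LBaso", "Poly", "Ortho"]

# Hypothetical sample annotation rows, deliberately out of order.
samples = pd.DataFrame(
    {
        "sample_id": ["s3", "s1", "s2"],
        "cell_type": ["Ortho", "Progenitor", "LBaso"],
        "replicate": [1, 1, 1],
    }
)

# An ordered categorical makes sort_values follow STAGE_ORDER
# instead of alphabetical order.
samples["cell_type"] = pd.Categorical(
    samples["cell_type"], categories=STAGE_ORDER, ordered=True
)
ordered = samples.sort_values("cell_type").reset_index(drop=True)
```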
Data are sourced from the PRIDE archive (PXD017276). Protein quantities marked as Filtered in the original data are converted to np.nan. Samples collected at day 7 are excluded.
- Parameters:
intensities_path (str | Path, optional) – Destination path for the intensities file. Columns: sample_id, protein_id, intensity.
var_annotation_path (str | Path, optional) – Destination path for the protein annotation file. Columns: protein_id, gene_id.
sample_annotation_path (str | Path, optional) – Destination path for the sample annotation file. Columns: sample_id, cell_type, replicate.
sep (str | None, optional) – Column separator for all output files. When None, the separator is inferred from each file extension via detect_separator_from_extension() (.tsv → tab, .csv → comma).
fill_na (float | int | None, optional) – If not None, replace NaN values in the long-format intensities DataFrame with this value before saving.
force (bool, optional) – If True, overwrite existing files at the output paths. Otherwise, raise FileExistsError when a destination file already exists.
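The sep and force semantics described above can be illustrated with a small standalone sketch. It is not the library's implementation: infer_separator and check_destination are hypothetical helpers written for this example, and raising ValueError for an unknown extension is an assumption, since the behavior of detect_separator_from_extension() for other extensions is not documented here.

```python
from pathlib import Path

# Documented rule: .tsv -> tab, .csv -> comma.
_SEPARATORS = {".tsv": "\t", ".csv": ","}


def infer_separator(path, sep=None):
    """Return sep if given, otherwise pick one from the file extension."""
    if sep is not None:
        return sep
    suffix = Path(path).suffix.lower()
    if suffix not in _SEPARATORS:
        # Assumption: unknown extensions are treated as an error.
        raise ValueError(f"Cannot infer separator from extension {suffix!r}")
    return _SEPARATORS[suffix]


def check_destination(path, force=False):
    """Raise FileExistsError unless overwriting was explicitly requested."""
    path = Path(path)
    if path.exists() and not force:
        raise FileExistsError(f"{path} already exists; pass force=True to overwrite")
    return path
```

Under this reading, an explicit sep always wins over the extension, and force=True is the only way to replace a file that is already present.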
- Returns:
Writes files to disk; does not return a value.
- Return type:
None
Examples
>>> import proteopy as pr
>>> pr.download.karayel_2020(
...     intensities_path="intensities.tsv",
...     var_annotation_path="protein_annotations.tsv",
...     sample_annotation_path="sample_annotations.tsv",
... )
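Once the files are written, the long-format intensities can be reshaped into a samples-by-proteins matrix with pandas. The sketch below uses tiny synthetic tables with the documented columns in place of the real files, so it runs without a download. It assumes that missing intensities are written as empty fields, which pandas reads back as NaN; how the library actually serializes NaN is not stated here.

```python
import io

import pandas as pd

# Synthetic stand-ins for the saved files (documented columns, made-up values).
intensities_tsv = (
    "sample_id\tprotein_id\tintensity\n"
    "s1\tP1\t10.0\n"
    "s1\tP2\t\n"  # empty field: read back as NaN
    "s2\tP1\t12.0\n"
    "s2\tP2\t8.0\n"
)
samples_tsv = (
    "sample_id\tcell_type\treplicate\n"
    "s1\tProgenitor\t1\n"
    "s2\tOrtho\t1\n"
)

intensities = pd.read_csv(io.StringIO(intensities_tsv), sep="\t")
samples = pd.read_csv(io.StringIO(samples_tsv), sep="\t")

# Long -> wide: one row per sample, one column per protein.
wide = intensities.pivot(
    index="sample_id", columns="protein_id", values="intensity"
)

# Attach the stage label of each sample to the wide matrix.
annotated = wide.join(samples.set_index("sample_id")[["cell_type"]])
```

With the real output, the io.StringIO objects would be replaced by the paths passed to karayel_2020, for example pd.read_csv("intensities.tsv", sep="\t").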
References