.normalize_median

proteopy.pp.normalize_median(adata, *, log_space, target='median', fill_na=None, zeros_to_na=False, batch_id=None, inplace=True, force=False)

Median normalization of intensities.

NAs are ignored when computing sample medians.

Parameters:
  • adata (AnnData) – Input AnnData.

  • log_space (bool) – Whether the input intensities are log-transformed. If this disagrees with the automatic log detection, an error is raised unless force=True.

  • target ({'max', 'median'}) – How to compute the normalization target from sample medians. 'max' uses the maximum sample median, 'median' uses the median of sample medians. Defaults to 'median'.

  • fill_na (float, optional) – Temporarily replace non-finite entries with this value for the median computation only; original values are restored afterward.

  • zeros_to_na (bool, default False) – Treat zeros as missing for the median computation only; original zeros are restored afterward.

  • batch_id (str, optional) – Column in adata.obs that identifies batches. When given, normalization is performed within each batch.

  • inplace (bool, default True) – Modify adata in place. If False, return a copy.

  • force (bool, default False) – Proceed even if log_space disagrees with automatic log detection.

Returns:

  • AnnData or None – Normalized AnnData when inplace is False; otherwise None.

  • pandas.DataFrame, optional – Per-sample factors when inplace is False.

Notes

Median normalization:
log_space=True

X + target - sample_median

log_space=False

X * target / sample_median

'max'

target = max of sample medians (within each batch if batch_id is given)

'median'

target = median of sample medians (within each batch if batch_id is given)
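
The formulas above can be sketched with plain NumPy. This is an illustrative re-implementation, not proteopy's actual code: the helper name median_normalize_sketch is hypothetical, samples are assumed to be rows of the matrix, and batch handling, fill_na, zeros_to_na and log detection are left out.

```python
import numpy as np

def median_normalize_sketch(X, log_space, target="median"):
    """Hypothetical helper mirroring the formulas in the Notes (not proteopy code)."""
    X = np.asarray(X, dtype=float)
    # One median per sample (row); NaNs are ignored, as stated in the docs.
    sample_median = np.nanmedian(X, axis=1, keepdims=True)
    if target == "median":
        t = np.median(sample_median)
    elif target == "max":
        t = np.max(sample_median)
    else:
        raise ValueError("target must be 'median' or 'max'")
    if log_space:
        # Additive shift on log-transformed intensities.
        return X + t - sample_median
    # Multiplicative scaling on linear intensities.
    return X * t / sample_median

# Toy matrix: 3 samples (rows) x 3 proteins (columns), one missing value.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, np.nan],
              [4.0, 8.0, 12.0]])
normalized = median_normalize_sketch(X, log_space=False)
```

After either variant, every sample has the same median, equal to the chosen target, and missing entries stay missing.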

Examples

>>> import proteopy as pr
>>> adata = pr.datasets.karayel_2020()

Normalize using the median of sample medians (default):

>>> pr.pp.normalize_median(adata, log_space=False)

Normalize using the maximum of sample medians:

>>> pr.pp.normalize_median(adata, target='max', log_space=False)
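
A NumPy-only illustration of why zeros_to_na can matter. It does not call proteopy; the toy matrix and variable names are made up for the example, and samples are assumed to be rows. Zeros are masked only while computing the sample medians and the data matrix itself is left untouched.

```python
import numpy as np

# Toy linear-scale intensities: 2 samples (rows) x 3 proteins (columns).
X = np.array([[0.0, 2.0, 4.0],
              [0.0, 0.0, 6.0]])

# Medians with zeros counted as real measurements.
medians_with_zeros = np.median(X, axis=1)

# Medians with zeros treated as missing (masked copy; X is not modified).
masked = np.where(X == 0, np.nan, X)
medians_zeros_as_na = np.nanmedian(masked, axis=1)
```

Here the second sample has a median of zero when zeros are counted, which would make the linear-scale factor target / sample_median undefined; with zeros masked, its median is 6.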