Infer the signaling activity of over 1000 secreted proteins from gene expression profiles.

Usage

SecAct.activity.inference(
  inputProfile,
  inputProfile_control = NULL,
  is.differential = FALSE,
  is.paired = FALSE,
  is.singleSampleLevel = FALSE,
  sigMatrix = "SecAct",
  is.filter.sig = FALSE,
  is.group.sig = TRUE,
  is.group.cor = 0.9,
  lambda = 5e+05,
  nrand = 1000,
  ncores = 1L,
  backend = "auto",
  rng_method = "mt19937",
  batch_size = NULL,
  output_h5 = NULL
)

Arguments

inputProfile

Gene expression matrix with gene symbol (row) x sample (column).

inputProfile_control

Gene expression matrix of control samples, with gene symbol (row) x sample (column) (Default: NULL).

is.differential

A logical flag indicating whether inputProfile already contains differential profiles computed against the control (Default: FALSE).

is.paired

A logical flag indicating whether to compute paired differential profiles between inputProfile and inputProfile_control (Default: FALSE). Set to TRUE only when the samples in inputProfile and inputProfile_control are paired.

is.singleSampleLevel

A logical flag indicating whether to calculate the activity change for each individual sample between inputProfile and inputProfile_control (Default: FALSE). If FALSE, the overall activity change between the two phenotypes is calculated.
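The three flags above (is.differential, is.paired, is.singleSampleLevel) determine how a differential profile is formed before inference. The following base R sketch shows one plausible reading of those options on toy data. The helper diff_profile and the toy matrices are hypothetical and are not part of SecAct; the exact arithmetic used by the package is an assumption here.

```r
# Toy matrices: genes in rows, samples in columns (hypothetical data).
set.seed(1)
genes <- paste0("G", 1:5)
treat <- matrix(rnorm(15), nrow = 5, dimnames = list(genes, paste0("T", 1:3)))
ctrl  <- matrix(rnorm(15), nrow = 5, dimnames = list(genes, paste0("C", 1:3)))

# Hypothetical helper, NOT a SecAct function.
diff_profile <- function(x, ctrl, paired = FALSE, single_sample = FALSE) {
  if (paired) {
    stopifnot(ncol(x) == ncol(ctrl))
    d <- x - ctrl             # sample i minus its matched control i
  } else {
    d <- x - rowMeans(ctrl)   # every sample minus the mean control profile
  }
  if (single_sample) {
    d                         # one differential profile per sample
  } else {
    # one overall profile: average change across samples
    matrix(rowMeans(d), ncol = 1, dimnames = list(rownames(d), "Change"))
  }
}

overall    <- diff_profile(treat, ctrl)
per_sample <- diff_profile(treat, ctrl, paired = TRUE, single_sample = TRUE)
dim(overall)
dim(per_sample)
```

A profile prepared this way outside the package would correspond to passing it as inputProfile with is.differential = TRUE.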

sigMatrix

Secreted protein signature matrix. One of "SecAct", "CytoSig", "SecAct-Breast", "SecAct-Colorectal", "SecAct-Glioblastoma", "SecAct-Kidney", "SecAct-Liver", "SecAct-Lung-Adeno", "SecAct-Ovarian", "SecAct-Pancreatic", or "SecAct-Prostate". The SecAct signatures were derived from all cancer ST samples; each SecAct-XXX signature was derived from XXX cancer ST samples only.

is.filter.sig

A logical flag indicating whether to filter the secreted protein signatures based on the genes present in inputProfile (Default: FALSE). Because some sequencing platforms (e.g., CosMx) cover only a subset of secreted proteins, setting this option to TRUE restricts the activity inference to those proteins.
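As an illustration of the filtering idea, the sketch below keeps only the signature columns whose protein is also measured as a gene in the expression profile. The matrices are toy data, and the exact filtering rule inside SecAct is an assumption.

```r
# Toy signature matrix: genes (rows) x secreted proteins (columns).
set.seed(1)
sig <- matrix(rnorm(20), nrow = 5,
              dimnames = list(c("IL6", "TGFB1", "ACTB", "GAPDH", "CCL2"),
                              c("IL6", "TGFB1", "CCL2", "WNT5A")))

# Toy expression profile from a panel that does not measure WNT5A.
expr <- matrix(rnorm(10), nrow = 5,
               dimnames = list(rownames(sig), c("S1", "S2")))

# Keep signatures only for proteins present among the measured genes.
keep <- colnames(sig) %in% rownames(expr)
sig_filtered <- sig[, keep, drop = FALSE]
colnames(sig_filtered)
```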

is.group.sig

A logical flag indicating whether to group similar signatures (Default: TRUE). Many secreted proteins, such as cytokines with similar cell surface receptors and downstream pathways, have cellular effects that appear redundant within a cellular context. When enabled, this option clusters secreted proteins based on Pearson correlations among their composite signatures. The output still reports activity estimates for all secreted proteins prior to clustering. Secreted proteins assigned to the same non-redundant cluster share the same inferred activity.

is.group.cor

A numeric value specifying the correlation cutoff used to define similar signatures (Default: 0.90). When r > 0.90, 1,170 secreted protein signatures are grouped into 657 non-redundant signature groups.
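To make the grouping step concrete, the sketch below clusters toy signatures by Pearson correlation and cuts the tree at 1 - 0.9. The use of hclust with complete linkage is an assumption chosen for illustration; the clustering procedure SecAct applies internally may differ.

```r
# Toy signature matrix: 200 genes x 4 secreted proteins (hypothetical data).
set.seed(1)
n_genes <- 200
base1 <- rnorm(n_genes)
sig <- cbind(
  A = base1,
  B = base1 + rnorm(n_genes, sd = 0.1),  # near-duplicate of A
  C = rnorm(n_genes),
  D = rnorm(n_genes)
)

# Pairwise Pearson correlations between signatures.
r <- cor(sig, method = "pearson")

# Cluster on the distance 1 - r. With complete linkage, cutting at
# height 1 - cutoff keeps every pair inside a group at r >= cutoff.
cutoff <- 0.9
tree   <- hclust(as.dist(1 - r), method = "complete")
groups <- cutree(tree, h = 1 - cutoff)
groups
```

In this toy case A and B fall into one group, while C and D each form their own group, giving three non-redundant groups from four signatures.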

lambda

Penalty factor in the ridge regression. If NULL, lambda will be assigned as 5e+05 or 10000 when sigMatrix = "SecAct" or "CytoSig", respectively.

nrand

Number of randomizations in the permutation test (Default: 1000).
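The roles of lambda and nrand can be illustrated with a small ridge regression followed by a permutation test, written in base R. This is a minimal sketch on simulated data, not the package's implementation: the permutation scheme (shuffling genes in the expression vector), the use of the permutation standard deviation as se, and the p-value formula are all assumptions.

```r
set.seed(0)
n_genes <- 100; n_prot <- 3
lambda  <- 10    # ridge penalty (toy value, far smaller than the package default)
nrand   <- 200   # number of permutations

# Toy signature matrix X (genes x proteins) and expression vector y.
X <- matrix(rnorm(n_genes * n_prot), n_genes)
y <- 2 * X[, 1] + rnorm(n_genes)   # expression driven by protein 1 only

# Ridge solution: beta = (X'X + lambda * I)^-1 X'y
proj <- solve(crossprod(X) + lambda * diag(n_prot), t(X))
beta <- drop(proj %*% y)

# Null distribution: recompute beta after shuffling the genes in y.
beta_rand <- replicate(nrand, drop(proj %*% sample(y)))  # n_prot x nrand

se     <- apply(beta_rand, 1, sd)   # spread of the permuted coefficients
zscore <- beta / se
pvalue <- (rowSums(abs(beta_rand) >= abs(beta)) + 1) / (nrand + 1)

round(cbind(beta, se, zscore, pvalue), 3)
```

A larger lambda shrinks the coefficients more strongly toward zero, and a larger nrand gives a finer-grained p-value at the cost of more computation.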

ncores

Number of threads for accelerator backends (ignored by pure R). Default 1.

backend

One of "auto", "gpu", "cpu-fast", "cpu-pure". "auto" selects the first available backend in the order GPU (RidgeCuda), CPU-fast (RidgeFast), CPU-pure, depending on what is installed. Default "auto".

rng_method

RNG for permutations. "mt19937" (default) is a GSL-compatible MT19937 generator with seed 0; results are bit-identical across backends when ncores=1. Accelerator backends may support "srand" for faster, non-reproducible runs. The pure-R backend supports only "mt19937".

batch_size

Optional positive integer. When supplied, permutation inference is performed over column-batches of Y via the backend's ridge_batch (memory-efficient path for large sample counts). Default NULL (no batching).
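The idea behind column batching can be sketched in base R: the ridge projection is applied to a few sample columns of Y at a time, and the partial results are bound together. This sketch does not call the backend's ridge_batch and covers only the coefficient step; the data and batch size are toy values.

```r
set.seed(0)
X <- matrix(rnorm(100 * 3), 100)                      # genes x proteins
Y <- matrix(rnorm(100 * 10), 100,
            dimnames = list(NULL, paste0("S", 1:10))) # genes x samples

# Ridge projection, computed once and reused for every batch.
proj <- solve(crossprod(X) + 10 * diag(3), t(X))

# Split the sample columns into batches of at most batch_size.
batch_size <- 4
idx     <- seq_len(ncol(Y))
batches <- unname(split(idx, ceiling(idx / batch_size)))

# Process one batch at a time, then bind the results column-wise.
beta_batched <- do.call(cbind, lapply(batches, function(b) {
  proj %*% Y[, b, drop = FALSE]
}))

# The batched result matches the all-at-once computation.
beta_full <- proj %*% Y
all.equal(beta_batched, beta_full)
```

Only one batch of columns needs to be held in working memory at a time, which is the motivation for batching when the number of samples is large.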

output_h5

Optional path to an HDF5 file. When supplied (requires batch_size and the rhdf5 package), the four result matrices are streamed to HDF5 datasets beta/se/zscore/pvalue and the function returns metadata in place of the matrices. Use when even the result matrices do not fit in RAM. Not compatible with is.group.sig=TRUE.

Value

A list with four items, each of which is a matrix:

beta: regression coefficients

se: standard errors of the coefficients

zscore: beta/se

pvalue: statistical significance