Signaling patterns and velocities for multi-cellular spatial transcriptomics data • SecAct

This tutorial demonstrates how to infer secreted protein activities of each spot for a liver cancer spatial transcriptomics (ST) sample from Visium platform (Paper). Each spot is 55 µm in diameter covering 1-10 cells. Further, SecAct provides additional modules to analyze the signaling patterns and velocities across the whole slide. Before running the tutorial, make sure that you have installed SecAct as well as our previous R package SpaCET. Here, SpaCET will be used to create a SpaCET object to store the ST data.

Read ST data to a SpaCET object

To load data into R, user can create a SpaCET object by using create.SpaCET.object.10X. Please make sure that visiumPath points to the standard output folders of 10x Space Ranger. If the ST data is not from 10x/Visium, you can use create.SpaCET.object instead [details].

library(SecAct)
library(SpaCET)

# set the path to the data folder
dataPath <- file.path(system.file(package="SecAct"), "extdata/")

# load ST data to create an SpaCET object
visiumPath <- paste0(dataPath,"Visium_HCC/")
SpaCET_obj <- create.SpaCET.object.10X(visiumPath = visiumPath)

# filter out spots with less than 1000 expressed genes
SpaCET_obj <- SpaCET.quality.control(SpaCET_obj, min.genes = 1000)

# plot the QC metrics
SpaCET.visualize.spatialFeature(
  SpaCET_obj, 
  spatialType = "QualityControl", 
  spatialFeatures = c("UMI","Gene"),
  imageBg = TRUE
)

Infer secreted protein activity

After loading ST data, user can run SecAct.activity.inference.ST to infer the activities of 1248 secreted proteins for each spot. The output are stored in SpaCET_obj @results $SecAct_output $SecretedProteinActivity, which includes four items, (1) beta: regression coefficients; (2) se: standard error; (3) zscore: beta/se; (4): pvalue: two-sided test p value of z score from permutation test.

# infer activity; ~0.5 hrs
SpaCET_obj <- SecAct.activity.inference.ST(
  inputProfile = SpaCET_obj,
  scale.factor = 1e+05
)
# The running times mentioned here and onwards were obtained
# from our HPC with 20 cores and 200 GB of memory.

# show activity
SpaCET_obj @results $SecAct_output $SecretedProteinActivity $zscore[1:6,1:3]

Estimate signaling pattern

After calculating the secreted protein activity, SecAct could further estimate the consensus pattern from these inferred signaling activities across the whole tissue slide. This module contains two steps.

First, SecAct filters 1248 secreted proteins to identify the significant secreted proteins mediating intercellular communication in this slide. To achieve this, SecAct will calculate the Spearman correlation of spots’ signaling activity and spots’ neighbors’ RNA expression. The p values were adjusted by the Benjamini-Hochberg (BH) method as false discovery rate (FDR). The cutoffs are r > 0.05 and FDR < 0.01.

Second, SecAct employs Non-negative Matrix Factorization (NMF) to estimate the consensus signaling patterns. A critical parameter in NMF is the factorization rank k. User can assign a number vector to k, e.g., k=2:5. Then, SecAct.signaling.pattern would find the optimal number of factors determined as the point preceding the largest decrease in the silhouette value. This will take a while. Based on our pre-calculation, k=3 is the optimal number of factors. To save time, we directly run against k=3.

# estimate signaling pattern; 3 minutes
SpaCET_obj <- SecAct.signaling.pattern(SpaCET_obj, k=3)


# plot signaling pattern
SpaCET.visualize.spatialFeature(
    SpaCET_obj, 
    spatialType = "SignalingPattern", 
    spatialFeatures = "All", 
    imageBg = FALSE,
    legend.position = "none",
    colors=c("#03c383","#fbbf45","#ef6a32")
)

Further, SecAct can identify secreted proteins associated with each signaling pattern according to the matrix W from NMF results. For one secreted protein (represented by a row in W), the signaling pattern with a value at least twice as large as any other pattern, is designated as the dominant pattern for that protein.

# identify secreted proteins dominated by pattern 3
pattern.gene <- SecAct.signaling.pattern.gene(SpaCET_obj, n=3)

# show these genes
head(pattern.gene)

Calculate signaling velocity

Several secreted proteins with pattern 3 are related to epithelial-mesenchymal transition process, such as COL1A1, TGFB1, and SPARC. By integrating secreted protein-coding gene expression and signaling activity, SecAct can also infer signaling velocity at each spatial spot, indicating the direction and strength of secreted signaling. Let’s take SPARC as an example.

# show SPARC signaling velocity 
SecAct.signaling.velocity.spotST(SpaCET_obj, gene = "SPARC")

# show animated SPARC signaling velocity
SecAct.signaling.velocity.spotST(SpaCET_obj, gene = "SPARC", animated=TRUE)

Deconvolve ST data

For the current ST data, user also can run our previous R package SpaCET to estimate the cell lineages for each spot. Click here for more details. Based on the deconvolution results, we can see the interface region consists of fibroblasts, macrophages, and endothelial cells.

# deconvolve ST data
SpaCET_obj <- SpaCET.deconvolution(
  SpaCET_obj, 
  cancerType = "LIHC", 
  coreNo = 8
)

# show the spatial distribution of all cell types.
SpaCET.visualize.spatialFeature(
  SpaCET_obj, 
  spatialType = "CellFraction", 
  spatialFeatures = c(
    "Malignant","CAF","Endothelial","Macrophage",
    "Hepatocyte","B cell","T CD4","T CD8"), 
  sameScaleForFraction = TRUE,
  pointSize = 0.1, 
  nrow = 2
)

Calculate hallmark score

User also can run SpaCET.GeneSetScore to estimate the hallmark scores for each spot. Click here for more details. You can see that Pattern 3 is correlated with epithelial-mesenchymal transition (EMT).

# run gene set calculation
SpaCET_obj <- SpaCET.GeneSetScore(SpaCET_obj, GeneSets="Hallmark")

# visualize EMT
SpaCET.visualize.spatialFeature(
  SpaCET_obj, 
  spatialType = "GeneSetScore", 
  spatialFeatures = "HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION",
  legend.position = "right",
  imageBg=TRUE,
  pointSize=1.2
)