This repository was archived by the owner on Dec 15, 2025. It is now read-only.

rickert-lab/Spatial_Permutation_and_Normalization

 
 


Spatial permutation and normalization of multiplexed immunofluorescence imaging data

Overview

This R script identifies statistically significant spatial features, i.e. positive or negative cell-cell colocalizations, using colocation quotient (CLQ) analysis. Here we describe how to calculate the CLQ, create a null distribution of CLQ values by permutation, and normalize the data. The normalization accounts for the number of cells within each subpopulation: subpopulations with a low cell count are more likely to yield a broader distribution of CLQ values during the permutation analysis, because random label sampling has a substantial impact on the CLQ calculation when few cells are involved.
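To illustrate why cell count matters, the following minimal sketch permutes labels on simulated cells and compares the spread of the null CLQ values for a rare pair and an abundant pair. Everything in it is an assumption made for illustration: the toy coordinates, the cell type names, k = 5 neighbors, 200 permutations and the helper names `clq` and `null_clq`. It uses base R only and is not the repository's implementation.

```r
set.seed(1)

# Literal reading of the formula CLQ_{b->a} = (C_{b->a} / N_a) / (N_b / (N - 1)),
# where C_{b->a} counts type-b cells among the k nearest neighbors of all type-a cells.
clq <- function(labels, nn, a, b) {
  is_a <- labels == a
  c_ba <- sum(labels[nn[is_a, , drop = FALSE]] == b)
  (c_ba / sum(is_a)) / (sum(labels == b) / (length(labels) - 1))
}

# Toy data (hypothetical): 300 cells at random positions, k = 5 neighbors
n <- 300
k <- 5
xy <- cbind(runif(n), runif(n))
d <- as.matrix(dist(xy))
diag(d) <- Inf                                        # a cell is not its own neighbor
nn <- t(apply(d, 1, function(row) order(row)[1:k]))   # n x k matrix of neighbor indices

# Two rare and two abundant subpopulations (hypothetical names)
labels <- rep(c("rare1", "rare2", "common1", "common2"),
              times = c(10, 10, 140, 140))

# Null distribution: shuffle the labels, which preserves the subpopulation proportions
null_clq <- function(a, b, n_perm = 200) {
  replicate(n_perm, clq(sample(labels), nn, a, b))
}

sd_rare   <- sd(null_clq("rare1", "rare2"))
sd_common <- sd(null_clq("common1", "common2"))
cat("Null spread, rare pair:", sd_rare, "\n")
cat("Null spread, abundant pair:", sd_common, "\n")
```

In this toy setting the rare pair should show a much wider null distribution than the abundant pair, which is the effect the normalization is meant to account for.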

  • get_CLQ() The colocation quotient (CLQ) quantifies how a cell subpopulation colocates spatially with another cell subpopulation among a set of nearest neighbors, defined here as 20. We calculated the colocation quotient for the pairwise cell types identified with CELESTA (Zhang et al., 2022, Nature Methods) under naïve and treatment conditions using the following equation: $\mathrm{CLQ}_{b \to a} = \dfrac{C_{b \to a} / N_a}{N_b / (N - 1)}$, where $C_{b \to a}$ is the number of cells of cell type b among the defined nearest neighbors of cells of cell type a, $N$ is the total number of cells, and $N_a$ and $N_b$ are the numbers of cells of cell type a and cell type b.

  • KNN_neighbors Finds the N nearest neighboring cells of each cell.

  • find_cell_type_neighbors Finds the cell types of the neighboring cells.

  • CLQ_permutated_matrix_gen1 Assesses the significance of the CLQ values by randomly permuting the cell labels (cell types) 500 times while preserving the subpopulation proportions.

  • write_counts Counts the number of cells in each subpopulation. It generates a summary table with the cell type number, the corresponding name and the cell count in the sample.

  • CLQ_matrix_gen Reads the CELESTA cell assignment file and generates the original CLQ matrix.

  • CLQ_permutated_matrix_gen2 Retrieves the permutated matrix output for each sample.

  • read_counts Retrieves the subpopulation counts output for each sample.

  • significance_matrix_gen This function identifies statistically significant CLQ values. CLQ values falling outside or at the tails of the distribution generated by the permutation analysis are considered significant, whereas values within the distribution are deemed non-significant, as they can be reproduced after spatial randomization. Percentile values < 0.05 or > 0.95 are considered significant. The normalization achieved through the permutation analysis facilitates spatial feature comparisons and also enables the comparison of different conditions from the same or independent experiments. CLQs were normalized according to the following formula: (Observed CLQ - Mean CLQ) / (Max CLQ - Mean CLQ).

  • plot_gen This function plots the distribution of all the permutation CLQ values for each cell pair. The blue bar is the normalized CLQ value and the red bar is the original CLQ value.

  • CLQ_normalization_by_sample This function requires (1) a named vector with the original CLQ values for one sample before normalization, where each element is named after the two cell types in the cell pair, connected by “_”; (2) a cell count file with the number of cells of each cell type in that sample; (3) the number of nearest neighbors used in the CLQ calculation; (4) a cell count threshold for rare cell populations (default 5); (5) the CELESTA input prior cell type signature matrix; and (6) clipping parameters (default 0.05; a warning message will suggest clipping more as needed). The original CLQ distribution is bell-shaped but skewed at the tail. The clipping parameter allows for better visualization when normalizing the data.
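The first steps in the list above (neighbor search, neighbor cell types, CLQ) can be sketched on four cells placed on a line. The helper names `knn_index`, `neighbor_types` and `clq_pair` are hypothetical and are not the repository's functions; the neighbor search is a brute-force base R stand-in for a spatial library, and the sketch assumes that C is counted over the neighbors of all cells of type a.

```r
# Step 1 (hypothetical helper): indices of the k nearest neighbors of every cell
knn_index <- function(xy, k) {
  d <- as.matrix(dist(xy))
  diag(d) <- Inf                                   # a cell is not its own neighbor
  nn <- apply(d, 1, function(row) order(row)[seq_len(k)])
  matrix(nn, ncol = k, byrow = TRUE)               # one row per cell, also when k = 1
}

# Step 2 (hypothetical helper): cell types of those neighbors, same shape as the index matrix
neighbor_types <- function(labels, nn) {
  matrix(labels[nn], nrow = nrow(nn))
}

# Step 3 (hypothetical helper): CLQ_{b->a} = (C_{b->a} / N_a) / (N_b / (N - 1))
clq_pair <- function(labels, nb_types, a, b) {
  is_a <- labels == a
  c_ba <- sum(nb_types[is_a, , drop = FALSE] == b) # type-b cells among neighbors of type-a cells
  (c_ba / sum(is_a)) / (sum(labels == b) / (length(labels) - 1))
}

# Four cells on a line with alternating types, one nearest neighbor each
xy     <- cbind(x = c(0, 1, 3, 4), y = 0)
labels <- c("a", "b", "a", "b")

nn       <- knn_index(xy, k = 1)
nb_types <- neighbor_types(labels, nn)

clq_ba <- clq_pair(labels, nb_types, a = "a", b = "b")
clq_aa <- clq_pair(labels, nb_types, a = "a", b = "a")
print(c(b_to_a = clq_ba, a_to_a = clq_aa))
```

Here both type a cells have a type b cell as their single nearest neighbor, so C = 2, N_a = N_b = 2 and N = 4, giving (2/2) / (2/3) = 1.5 for the b to a pair and 0 for the a to a pair.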

Dependencies

  • spdep: for obtaining spatial neighborhood information
  • ggplot2: for plotting

Usage

See example.R for a full run.

Inputs

The spatial permutation analysis requires two inputs:

1. CELESTA cell subpopulations: a data frame with one column named cell_types listing all the user-defined CELESTA cell subpopulations.

See file example: “cell_types_celesta.csv”

2. Segmented imaging data with CELESTA cell assignment: the _cell_type_assignment.csv output data frame from the CELESTA algorithm, which is available to download at https://github.com/plevritis-lab/CELESTA.

See file example: “TAFs1_cell_type_assignment.csv”
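As a minimal sketch of the first input, the snippet below builds a data frame with the single required column `cell_types`, writes it to a temporary .csv file and reads it back. The subpopulation names and the file name are hypothetical placeholders, not the contents of the example file.

```r
# Hypothetical subpopulation names, for illustration only
cell_types <- data.frame(
  cell_types = c("Tumor cells", "CD4 T cells", "Macrophages"),
  stringsAsFactors = FALSE
)

# Round trip through a .csv file in a temporary directory
path <- file.path(tempdir(), "cell_types_example.csv")
write.csv(cell_types, path, row.names = FALSE)
loaded <- read.csv(path, stringsAsFactors = FALSE)

# The analysis expects exactly one column, named cell_types
stopifnot(identical(names(loaded), "cell_types"))
print(loaded)
```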

Outputs

Spatial permutation outputs:

  1. After running the write_counts() function, the script will output a .csv file with the number of cells for each cell subpopulation.

See file example: “TAFs1_CellCounts.csv”

  2. After running the CLQ_matrix_gen function, the script will output a .csv file with the original CLQ values for each cell pair.

See file example: “TAFs1_CLQ.csv”

  3. After running the CLQ_permutated_matrix_gen function, the script will output a .csv file of 500 CLQ values obtained by randomly permuting the cell labels (cell subpopulations) 500 times while preserving the proportions. These values are plotted as a distribution by the plot_gen function.

See file example: “TAFs1_CLQ_Permutated.csv”

  4. After running the significance_matrix_gen function, the script will output a .csv file with the sample name, the identity and count of each cell subpopulation, the original CLQ value, the percentile, and whether the value is deemed significant.

Note that an original CLQ value of zero may be caused by insufficient cell numbers of the respective cell types. These values are filtered out in post-processing prior to colocatome generation.

See file example: “TAFs1_CLQ_data_full.csv” and .png images in the PA_figures_TAFs1 folder.

  5. After running the CLQ_normalization_by_sample function, the script will output a .csv file with normalized values.

See file example: “TAFs1_CLQ_Normalized_L0_R0.05”. L0 = left clipping parameter at 0 (no clipping) and R0.05 = right clipping parameter at 0.05.

Note that the folder contains only a subset of the distribution plots.
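The percentile, the significance call and the normalized value reported in these outputs can be sketched as follows for a single cell pair. This is a minimal sketch under stated assumptions, not the repository's code: the helper name `assess_clq` is hypothetical, the null values are a deterministic stand-in for the 500 permuted CLQ values, the percentile is taken as the fraction of permuted values at or below the observed one, and the Mean CLQ and Max CLQ in the normalization formula are assumed to come from the permuted values. Clipping is not shown.

```r
# Stand-in null distribution (hypothetical): 500 right-skewed values.
# In a real run these would be the permuted CLQ values for one cell pair.
null_clq <- qlnorm(ppoints(500), meanlog = 0, sdlog = 0.3)

# Hypothetical helper: percentile, significance call and normalized CLQ
assess_clq <- function(observed, null_values) {
  # Assumption: percentile = fraction of permuted values at or below the observed CLQ
  percentile <- mean(null_values <= observed)
  list(
    percentile  = percentile,
    # Significant when the observed value sits in either tail of the null distribution
    significant = percentile < 0.05 || percentile > 0.95,
    # (Observed CLQ - Mean CLQ) / (Max CLQ - Mean CLQ), with Mean and Max
    # assumed to be taken from the permuted values
    normalized  = (observed - mean(null_values)) /
                  (max(null_values) - mean(null_values))
  )
}

in_tail   <- assess_clq(2.0, null_clq)   # observed value far in the upper tail
in_center <- assess_clq(1.0, null_clq)   # observed value near the middle

str(in_tail)
str(in_center)
```

With this stand-in distribution, the observed value of 2.0 falls in the upper tail and is flagged as significant, while the observed value of 1.0 sits near the median and is not.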

Core CLQ publications and software

Core publications and existing software releases from the authors of the Co-Location Quotient (CLQ) can be found here: https://seg.gmu.edu/

Getting help

If you encounter a bug, please file an issue with a minimal reproducible example on GitHub. For questions and other discussion, please use community.rstudio.com.

About

Permutation analysis and normalization strategy for the colocatome paper.
