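When working with spatial omics data, we often annotate regions by hand based on anatomical knowledge, while the cell types within those regions come from mapping single-cell data onto the spatial data. Once the region labels have been assigned manually and the cell type labels have been transferred, the next step is to visualize the result and run some statistics on it.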

The visualization code (R) is below; it draws a stacked bar chart of cell type proportions per region:

library(pheatmap)
library(ggplot2)
library(viridis)
library(tidyverse)
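# Read the per-cell table; the first row holds the column names
df = read.table('*.csv', sep = ',')
colnames(df) = df[1, ]
# Drop the header row and the first (index) column
df = df[-1, ] %>% select(-1)
# Count cells per region and cell type. The region column is assumed to be
# 'Ribbon', the column used by the factor() call below and by the Python script
df = df %>% group_by(Ribbon) %>% count(celltype)
# Values not listed in `levels` (e.g. 'others') become NA
df$Gyrus_Sulcus = factor(df$Ribbon, levels = c('Frontal', 'Insular', 'Parietal', 'Temporal'))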

pdf("*.pdf", width = 10, height = 6)

df %>% filter(Gyrus_Sulcus != 'others') %>% 
    ggplot(aes(x = Gyrus_Sulcus, y = n, fill = celltype)) +
    geom_col(position = "fill") +

    scale_fill_manual(values = c('#bd6c48', '#21c36f', '#feb308', '#9b5fc0', '#6ecb3c', '#02d8e9', '#1d5dec', '#069af3', '#a2cffe', '#8cff9e', '#ffb7ce', '#ab9004', '#937c00', '#8f9805', '#b6c406', '#ca9bf7', '#fafe4b', '#fe46a5', '#ac1db8', '#e6daa6', '#afa88b', '#137e6d', '#2bb179', '#89a0b0', '#f29e8e', '#fe828c', '#63b365', '#c14a09', '#fe83cc', '#fef69e', '#610023', '#c04e01', '#9f2305', '#b75203', '#b04e0f', '#a0450e', '#d5ab09', '#6832e3', '#ffff81', '#fffd74', '#fdb147', '#4e7496', '#c69f59', '#d5b60a', '#728f02')) +
    theme_bw() + xlab("") + ylab("percentage (%)") +
    guides(fill = guide_legend(ncol = 1)) +
    scale_y_continuous(labels = scales::percent_format(suffix = "")) +
    theme(panel.grid = element_blank()) + 
    theme(panel.border = element_blank()) + theme(axis.line = element_line(colour = "black")) +
    theme(axis.text.x = element_text(angle = 90)) +
    coord_flip()
dev.off()
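
As a cross-check on the plot, the same per-region proportions can be computed as a table. This is a minimal Python sketch, assuming the input has the `Ribbon` and `celltype` columns that the Python script below reads; the toy rows and the cell type names (`Astro`, `Oligo`, `Micro`) are made up for illustration:

```python
import pandas as pd

# Toy stand-in for the per-cell table (hypothetical values, one row per cell)
cells = pd.DataFrame({
    "Ribbon":   ["Frontal", "Frontal", "Frontal", "Frontal", "Insular", "Insular"],
    "celltype": ["Astro",   "Astro",   "Oligo",   "Micro",   "Astro",   "Oligo"],
})

# Rows = regions, columns = cell types; normalize="index" makes each row sum to 1,
# which corresponds to one stacked bar drawn by geom_col(position = "fill")
proportions = pd.crosstab(cells["Ribbon"], cells["celltype"], normalize="index")
print(proportions.round(2))
```

Each row of `proportions` should match the segment widths of the corresponding bar, which makes it easy to spot a mislabeled region or a missing cell type before interpreting the figure.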

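Next, the significance of the differences between regions is checked with a chi-square test, run once per cell type (Python):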

import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

csv = pd.read_csv('*.csv')

celltypes = list(set(csv['celltype']))
regions = list(set(csv['Ribbon']))
regions_length = [len(csv[csv.Ribbon == i]) for i in regions]

for i in celltypes:
    
    celltypes_length = []
    
    for j in regions:
        csv_temp = csv[csv['Ribbon'] == j]
        csv_temp = csv_temp[csv_temp['celltype'] == i]
        celltypes_length.append(len(csv_temp))
    
    # Observed counts: one row per region, columns = [cells of this type, all other cells]
    observed_values = []
    for j in range(len(celltypes_length)):
        observed_values.append([celltypes_length[j], regions_length[j] - celltypes_length[j]])
    observed_values = np.array(observed_values)

    # Run the chi-square test on the region x (this type / other) table
    chi2, p_value, _, _ = chi2_contingency(observed_values)
    
    if p_value < 0.01:
        k = '**'
    elif p_value < 0.05:
        k = '*'
    else:
        k = '-'
    
    # Print the result
    print("celltype: ", i, "chi-square statistic: ", chi2, "p-value: ", p_value, "significance: ", k)
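
The contingency table built by the nested loops above can also be produced with `pd.crosstab`. The following is a sketch under the same assumptions (`Ribbon` and `celltype` columns); the helper name `chi2_for_celltype` and the toy counts are hypothetical, and it is meant as an equivalent way to build each table rather than a replacement for the script:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

def chi2_for_celltype(cells, celltype):
    """Chi-square test of one cell type against all other cells, across regions."""
    # Label every cell as belonging to the cell type of interest or not
    group = pd.Series(
        np.where(cells["celltype"] == celltype, "target", "other"),
        index=cells.index,
        name="group",
    )
    # Rows = regions, columns = ["other", "target"] cell counts
    table = pd.crosstab(cells["Ribbon"], group)
    chi2, p_value, dof, _expected = chi2_contingency(table.to_numpy())
    return table, chi2, p_value, dof

# Toy data (hypothetical): 100 cells per region with different Astro fractions
cells = pd.DataFrame({
    "Ribbon": ["Frontal"] * 100 + ["Insular"] * 100,
    "celltype": ["Astro"] * 30 + ["Oligo"] * 70 + ["Astro"] * 60 + ["Oligo"] * 40,
})

table, chi2, p_value, dof = chi2_for_celltype(cells, "Astro")
print(table)
print("chi2:", chi2, "p-value:", p_value, "dof:", dof)
```

With only two regions the table is 2x2, so `chi2_contingency` applies Yates' continuity correction by default; with the four regions used in this post the table is 4x2 and no correction is applied.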

At this point we have both the per-region cell type proportion plot and the significance test for each cell type.