Before running, copy the cell_type_annotation_model.pyc file into the
directory from which you run the program. The first step is to import
the packages:
import os
import time
from scipy.sparse import *
import random
import argparse
import anndata
import numpy as np
import pandas as pd
import scanpy as sc
import matplotlib.pyplot as plt
from torch import nn
import torch
import torch_geometric
from cell_type_annotation_model import DNNModel, SpatialModelTrainer
import torch.utils.data as Data
Next, set the random seed so that the experimental results are
reproducible.
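The original seed-setting code is not reproduced in this note, so the following is only a minimal sketch of how seeding is commonly done for the libraries imported above. The function name set_seed and the seed value 42 are assumptions, not taken from the original code.

```python
import os
import random

import numpy as np
import torch


def set_seed(seed=42):
    """Seed Python, NumPy and PyTorch so that repeated runs give the same numbers."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)  # only affects subprocesses started later
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    # Trade some speed for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)
```

Calling set_seed once at the top of the script, before any data splitting or model creation, is usually enough. Some GPU operations can still be non-deterministic even with these settings.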
Next, read in the single-cell data. Note that the normalization of the
single-cell data must be consistent with that of the spatial data. If
the .X matrix of the single-cell data holds the raw counts, then the .X
matrix of the spatial data should also hold the raw counts; if the .X
matrix of the single-cell data has been log-normalized, then the .X
matrix of the spatial data should be log-normalized as well, and the
same applies to any other transformation. I have already processed the
single-cell data and the spatial data consistently for this note, so I
will not elaborate on the normalization steps here.
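A quick sanity check can catch a mismatch between the two .X matrices before training. The sketch below is an assumption of mine, not part of the original workflow: it uses the heuristic that a raw count matrix contains only non-negative integers, and the helper names looks_like_raw_counts and check_same_scale are hypothetical. It is shown on toy matrices; in practice you would pass the .X of the single-cell and spatial AnnData objects.

```python
import numpy as np
from scipy import sparse


def looks_like_raw_counts(X, n_rows=500):
    """Heuristic: a raw count matrix holds only non-negative integers."""
    X = sparse.csr_matrix(X) if sparse.issparse(X) else np.asarray(X)
    block = X[:n_rows]  # inspect only the first rows to keep the check cheap
    values = block.data if sparse.issparse(block) else block.ravel()
    return bool(np.all(values >= 0) and np.all(values == np.round(values)))


def check_same_scale(sc_X, sp_X):
    """Raise if one matrix looks like raw counts and the other does not."""
    sc_raw = looks_like_raw_counts(sc_X)
    sp_raw = looks_like_raw_counts(sp_X)
    if sc_raw != sp_raw:
        raise ValueError(
            "single-cell and spatial .X appear to be on different scales "
            f"(single-cell raw={sc_raw}, spatial raw={sp_raw})"
        )
    return "raw counts" if sc_raw else "normalized"


# Toy data standing in for the .X matrices of the two datasets.
counts = np.array([[0, 3, 1], [2, 0, 5]])
lognorm = np.log1p(counts / counts.sum(axis=1, keepdims=True) * 1e4)

print(check_same_scale(counts, sparse.csr_matrix(counts)))
```

The check cannot tell two different normalizations apart (for example log-normalized versus scaled data), so it only guards against the most common mistake of mixing raw counts with processed values.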
Next is the training of the DNN model. The workflow is shown in the
first figure of the original Spatial-ID paper: the first step is to
train a DNN model on your own single-cell data, and the subsequent
steps build on that model. The following is the DNN training procedure
I wrote, for your reference.
label is the single-cell annotation column in your .obs,
train_percentage is the fraction of the data used for training, epochs
is the number of training epochs, and learning_rate is the learning
rate, which you need to tune yourself; model_save_path is the location
where the trained DNN model is stored. I track the accuracy (acc) here:
whenever acc improves, the model is saved automatically, and a final
model is also saved at the end of training. It is recommended to train
until acc exceeds 80% in this step. If acc is too low, there may be a
problem with the single-cell annotation.
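The training code itself is not reproduced in this note. The sketch below only illustrates how the parameters described above fit together: a split by train_percentage, a loop over epochs, a checkpoint whenever acc improves, and a final save. It is not the author's function. The small nn.Sequential network is a hypothetical stand-in for DNNModel, whose constructor is not shown here, and train_classifier, batch_size, the "_final" file suffix and the synthetic data are all assumptions.

```python
import os
import tempfile

import numpy as np
import torch
from torch import nn
import torch.utils.data as Data


def train_classifier(X, y, train_percentage=0.8, epochs=20, learning_rate=1e-3,
                     batch_size=64, model_save_path="dnn_model.pth", seed=0):
    """Train a stand-in classifier and return (model, best_acc, final_path)."""
    torch.manual_seed(seed)
    X = torch.as_tensor(X, dtype=torch.float32)
    y = torch.as_tensor(y, dtype=torch.long)
    n_classes = int(y.max()) + 1

    # Random train/validation split controlled by train_percentage.
    perm = torch.randperm(len(X))
    n_train = int(len(X) * train_percentage)
    train_idx, val_idx = perm[:n_train], perm[n_train:]
    loader = Data.DataLoader(Data.TensorDataset(X[train_idx], y[train_idx]),
                             batch_size=batch_size, shuffle=True)

    # Hypothetical stand-in network, NOT DNNModel from cell_type_annotation_model.
    model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(),
                          nn.Linear(64, n_classes))
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.CrossEntropyLoss()

    best_acc = 0.0
    for epoch in range(epochs):
        model.train()
        for xb, yb in loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()

        # Validation accuracy on the held-out cells.
        model.eval()
        with torch.no_grad():
            pred = model(X[val_idx]).argmax(dim=1)
            acc = (pred == y[val_idx]).float().mean().item()
        if acc > best_acc:  # checkpoint only when acc improves
            best_acc = acc
            torch.save(model.state_dict(), model_save_path)

    # Also keep the model from the last epoch (file suffix is an assumption).
    root, ext = os.path.splitext(model_save_path)
    final_path = root + "_final" + ext
    torch.save(model.state_dict(), final_path)
    return model, best_acc, final_path


# Synthetic, well-separated two-class data standing in for expression + labels.
rng = np.random.default_rng(0)
y_demo = rng.integers(0, 2, size=200)
X_demo = rng.normal(size=(200, 10)) + (y_demo[:, None] * 4 - 2)

save_path = os.path.join(tempfile.mkdtemp(), "dnn_model.pth")
model, best_acc, final_path = train_classifier(
    X_demo, y_demo, learning_rate=1e-2, model_save_path=save_path)
print(f"best validation acc: {best_acc:.3f}")
```

With real data, X would come from the single-cell .X matrix (densified if sparse) and y from the integer-encoded label column. Checkpointing on validation accuracy means the file at model_save_path is the best model seen during training rather than the last one.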
The above is the definition of the function, and the following are
the parameters and the corresponding calls. For model, pass in the DNN
model you trained above; result_csv stores the cell type assigned to
each bin/cell; new_adata is the data annotated with cell types; and
params holds the parameters that need to be adjusted. You need to tune
them yourself depending on how well the transfer works.
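The annotation function and its call are not reproduced in this note, and the interface of SpatialModelTrainer is not shown, so the sketch below covers only the output step: turning predicted class indices into cell type names, writing them to result_csv, and attaching them to the per-cell table. The helper save_annotation, the column name cell_type and the index label cell_id are assumptions. A plain pandas DataFrame stands in for the .obs table of new_adata.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def save_annotation(obs, pred_idx, class_names, result_csv):
    """Attach predicted cell types to a copy of `obs` and write them to a CSV file."""
    pred_idx = np.asarray(pred_idx)
    if len(pred_idx) != len(obs):
        raise ValueError("need exactly one prediction per bin/cell")
    annotated = obs.copy()
    # Map integer class indices back to the cell type names used in training.
    annotated["cell_type"] = np.asarray(class_names)[pred_idx]
    annotated[["cell_type"]].to_csv(result_csv, index_label="cell_id")
    return annotated


# Toy per-bin table and predictions standing in for real model output.
obs = pd.DataFrame(index=["bin_0", "bin_1", "bin_2"])
result_csv = os.path.join(tempfile.mkdtemp(), "result.csv")
annotated = save_annotation(obs, [1, 0, 1], ["B cell", "T cell"], result_csv)
print(annotated)
```

With an AnnData object, the same column would be assigned to new_adata.obs, which keeps the annotation aligned with the spatial coordinates for plotting.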