dataset.data_prep.CORTEX
- class dataset.data_prep.CORTEX(file_dir='/home/longlab/Data/Thesis/Data/expression_mRNA_17-Aug-2014.txt', n_genes=558)
Bases: Dataset
Loads CORTEX dataset.
A class with the necessary pre-processing steps for the gold-standard Zeisel data set, which contains 3005 mouse cortex cells and gold-standard labels for seven distinct cell types. Each cell type corresponds to a cluster to recover.
The pre-processing steps are:
1. Extracting the cell-type labels from the data.
2. Choosing the genes that are transcribed in more than 25 cells.
3. Selecting the 558 genes with the highest variance among the genes remaining from the previous step.
4. Performing a random permutation of the genes.
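The gene-selection steps (filtering, variance ranking, permutation) can be sketched with NumPy as below. This is only an illustration, not the code used by CORTEX: the function name `select_genes`, the `min_cells` and `seed` arguments, the genes-by-cells array layout, the reading of "transcribed" as a count greater than zero, and computing the variance on raw counts are all assumptions. Label extraction is left out because it depends on the layout of the input file.

```python
import numpy as np


def select_genes(counts, n_genes=558, min_cells=25, seed=0):
    """Hypothetical helper: filter, rank and shuffle genes.

    counts is assumed to have shape (n_total_genes, n_cells).
    Returns the reduced matrix and the original row indices of its genes.
    """
    counts = np.asarray(counts, dtype=float)

    # Keep genes transcribed (assumed: count > 0) in more than min_cells cells.
    expressed_in = (counts > 0).sum(axis=1)
    kept = np.flatnonzero(expressed_in > min_cells)

    # Among the kept genes, take the n_genes with the highest variance.
    variances = counts[kept].var(axis=1)
    top = kept[np.argsort(variances)[::-1][:n_genes]]

    # Randomly permute the order of the selected genes.
    rng = np.random.default_rng(seed)
    order = rng.permutation(top)
    return counts[order], order


# Toy input: 5 genes, 30 cells.
toy = np.zeros((5, 30))
toy[1, :10] = 5.0                  # expressed in only 10 cells, filtered out
toy[2] = 1.0                       # expressed everywhere, zero variance
toy[3] = np.arange(1, 31)          # expressed everywhere, high variance
toy[4] = np.tile([1.0, 3.0], 15)   # expressed everywhere, small variance

selected, gene_idx = select_genes(toy, n_genes=2)
print(selected.shape, sorted(gene_idx.tolist()))
```

On the toy input, genes 0 and 1 fail the 25-cell filter, and genes 3 and 4 are the two most variable of the remaining three, so they are the ones kept (in shuffled order).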
Parameters
- file_dir: str
The path to the expression data file.
- n_genes: int
Number of highly variable genes to select.
- n_cells:
Total number of cells.
- data: torch Tensor
The data.
- labels: torch Tensor
The labels.
Examples
>>> import data_prep
>>> from torch.utils.data import DataLoader
>>> cortex = data_prep.CORTEX()
>>> dl = DataLoader(cortex, batch_size=128, shuffle=True)
Methods