dataset.data_prep.CORTEX
- class dataset.data_prep.CORTEX(file_dir='/home/longlab/Data/Thesis/Data/expression_mRNA_17-Aug-2014.txt', n_genes=558)
 Bases: Dataset
Loads the CORTEX dataset.
A class that applies the necessary pre-processing steps to the gold-standard Zeisel data set, which contains 3005 mouse cortex cells with gold-standard labels for seven distinct cell types. Each cell type corresponds to a cluster to recover.
The pre-processing steps are:
1. Extracting the labels of the cell types from the data
2. Choosing the genes that are transcribed in more than 25 cells
3. Selecting the 558 genes with the highest variance among the genes remaining from the previous step
4. Performing a random permutation of the genes
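As an illustration of steps 2 to 4, here is a minimal NumPy sketch. It is not the class's actual implementation. The function name `select_genes`, the cells-by-genes layout of `counts`, the `min_cells` and `seed` arguments, and the reading of "transcribed" as "count greater than zero" are all assumptions made for this example. Label extraction (step 1) depends on the layout of the input file and is left out.

```python
import numpy as np


def select_genes(counts, n_genes=558, min_cells=25, seed=0):
    """Hypothetical helper: filter, rank and shuffle genes.

    counts is assumed to have shape (n_cells, n_total_genes).
    Returns the reduced matrix and the chosen column indices.
    """
    counts = np.asarray(counts, dtype=float)

    # Step 2: keep genes detected (count > 0) in more than min_cells cells.
    expressed_in = (counts > 0).sum(axis=0)
    kept = np.flatnonzero(expressed_in > min_cells)

    # Step 3: among the kept genes, take the n_genes with the highest variance.
    variances = counts[:, kept].var(axis=0)
    top = kept[np.argsort(variances)[::-1][:n_genes]]

    # Step 4: randomly permute the order of the selected genes.
    rng = np.random.default_rng(seed)
    order = rng.permutation(top)

    return counts[:, order], order
```

The returned index array makes it possible to map each column of the reduced matrix back to its gene in the original data.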
Parameters
- file_dir: str
 The path to the .csv file.
- n_genes: int
 Number of highly variable genes to select.
- n_cells: int
 Total number of cells.
- data: torch Tensor
 The data.
- labels: torch Tensor
 The labels.
Examples
>>> import data_prep
>>> from torch.utils.data import DataLoader
>>> cortex = data_prep.CORTEX()
>>> dl = DataLoader(cortex, batch_size=128, shuffle=True)
Methods