utils.helper.entropy_batch_mixing
- utils.helper.entropy_batch_mixing(latent_space, batches, K=50, n_jobs=8, n=100, n_iter=50)
Adapted from:
1) Haghverdi L, Lun ATL, Morgan MD, Marioni JC. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol. 2018 Jun;36(5):421-427. doi: 10.1038/nbt.4091.
2) Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nat Methods. 2018 Dec;15(12):1053-1058. doi: 10.1038/s41592-018-0229-2.
This function randomly chooses n cells, finds the K nearest neighbors of each chosen cell in the latent space, and calculates the average regional entropy of the batch labels over all n cells.
The procedure is repeated for n_iter iterations. Finally, the average of the iterations is returned as the final batch mixing score.
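The procedure described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the names `batch_entropy`, `entropy_batch_mixing_sketch`, and the `seed` argument are hypothetical, and the use of natural-log Shannon entropy without normalization is an assumption, since the entropy formula is not specified here. It relies on scikit-learn's `NearestNeighbors` for the neighbor search.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def batch_entropy(labels):
    """Shannon entropy (natural log, an assumption) of the batch labels in one neighborhood."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))


def entropy_batch_mixing_sketch(latent_space, batches, K=50, n_jobs=1,
                                n=100, n_iter=50, seed=0):
    """Hypothetical re-implementation of the described procedure."""
    latent_space = np.asarray(latent_space)
    batches = np.asarray(batches)
    rng = np.random.default_rng(seed)

    # Ask for K + 1 neighbors: a query cell is normally returned as its own
    # nearest neighbor, and that first column is dropped below.
    nn = NearestNeighbors(n_neighbors=K + 1, n_jobs=n_jobs).fit(latent_space)

    scores = []
    for _ in range(n_iter):
        # Randomly choose n distinct cells.
        chosen = rng.choice(latent_space.shape[0], size=n, replace=False)
        _, idx = nn.kneighbors(latent_space[chosen])
        neighbor_batches = batches[idx[:, 1:]]
        # Average regional entropy over the n chosen cells.
        scores.append(np.mean([batch_entropy(row) for row in neighbor_batches]))

    # Average over the n_iter repetitions.
    return float(np.mean(scores))
```

Under these assumptions, completely separated batches score 0, and two evenly mixed batches approach ln 2.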
Parameters
- latent_space : numpy ndarray
The latent space matrix.
- batches : numpy array or list
The batch number of each sample in the latent space matrix.
- K : int
Number of nearest neighbors.
- n_jobs : int
Number of parallel jobs. See the scikit-learn documentation for details.
- n : int
Number of cells to be chosen randomly.
- n_iter : int
Number of iterations, each of which randomly chooses n cells.
Returns
- score : float <= 1
The batch mixing score; the higher, the better.