Generating single-cell sequencing reads
tags: Resimpy
Introduction
resimpy_umi_sc is a module that simulates single-cell sequencing
reads, each consisting of a UMI only, from a gene-by-cell count matrix
generated by an external simulator called SPsimSeq. A typical
command-line call looks like this:
resimpy_umi_sc \
-r seq_errs \
-rs umi \
-perm_num 3 \
-umiup 1 \
-umiul 10 \
-umi_num 50 \
-pcr_num 8 \
-pcr_err 0.0001 \
-seq_err 0.0001 \
-ampl_rate 0.85 \
-sim_thres 3 \
-spl_rate 1 \
-seq_errs "1e-3;1e-2;0.1" \
-out_dir ./
The parameters are described below.
| Parameter acronym | Full name | Function |
|---|---|---|
| r | recipe | to specify a module to work on your requirement |
| rs | read structure | e.g., umi+seq or umi |
| perm_num | permutation number | in silico test numbers |
| umiup | UMI unit pattern | 1 for monomer blocks, 2 for dimer blocks, 3 for trimer blocks |
| umiul | UMI unit length (fixed) | the fixed length of a monomer UMI |
| umi_num | UMI number (fixed) | the fixed number of molecules/UMIs to be initiated in the initial read pool |
| sim_thres | similarity threshold (fixed) | the minimum number of nucleotides that must differ between each pair of randomly generated UMIs |
| pcr_num | PCR number/cycle | a fixed PCR number |
| pcr_err | PCR error | a fixed DNA polymerase error rate during PCR |
| seq_err | sequencing error | a fixed sequencing error rate |
| ampl_rate | amplification rate | PCR amplification rate |
| spl_rate | subsampling rate | subsampling rate used for sequencing |
| seq_errs | sequencing errors | sequencing error rates separated by semicolons, e.g., 1e-3;1e-2;0.1 |
| pcr_errs | PCR errors | DNA polymerase error rates separated by semicolons, e.g., 1e-3;1e-2;0.1 |
| pcr_nums | PCR numbers | PCR numbers separated by semicolons, e.g., 8;9;10;11;12 |
| umi_lens | UMI lengths | UMI lengths separated by semicolons, e.g., 8;9;10;11;12 |
| ampl_rates | amplification rates | amplification rates separated by semicolons, e.g., 0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1.0 |
| out_dir | output directory | a directory where you want to output results |
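
To make the umiup, umiul, umi_num, and sim_thres parameters in the table above concrete, here is a minimal Python sketch of how an initial UMI pool could be drawn. It is not Resimpy's implementation. The function names (`make_umi`, `make_umi_pool`), the rejection-sampling strategy, and the choice to measure the threshold as a Hamming distance over the full UMI sequence are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def make_umi(unit_len: int, unit_pattern: int, rng: random.Random) -> str:
    """Draw unit_len random bases and repeat each one unit_pattern times.

    unit_pattern=1 gives monomer blocks (e.g. ACG), 2 gives dimer blocks
    (AACCGG), 3 gives trimer blocks (AAACCCGGG). Assumed interpretation.
    """
    return "".join(rng.choice(BASES) * unit_pattern for _ in range(unit_len))


def make_umi_pool(umi_num, unit_len, unit_pattern, sim_thres,
                  seed=0, max_tries=100_000):
    """Rejection-sample umi_num UMIs that pairwise differ in >= sim_thres bases."""
    rng = random.Random(seed)
    pool = []
    for _ in range(max_tries):
        if len(pool) == umi_num:
            break
        cand = make_umi(unit_len, unit_pattern, rng)
        # Keep the candidate only if it is far enough from every accepted UMI.
        if all(hamming(cand, u) >= sim_thres for u in pool):
            pool.append(cand)
    if len(pool) < umi_num:
        raise RuntimeError("could not build the pool; relax sim_thres or umi_num")
    return pool


# Values mirror the example command: -umi_num 50 -umiul 10 -umiup 1 -sim_thres 3
pool = make_umi_pool(umi_num=50, unit_len=10, unit_pattern=1, sim_thres=3)
print(len(pool), pool[:3])
```

With these values the sequence space (4^10) is far larger than the pool, so rejections are rare; a short UMI combined with a high threshold or a large pool can make the loop exhaust `max_tries`.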
SPsimSeq is configured internally, so there is no need to specify it
on the command line. For now, all parameters used to generate the
SPsimSeq matrix are fixed; we are considering making them configurable.
In each permutation test, reads are generated under one varying
parameter (e.g., seq_errs) while every other parameter keeps its fixed
value (e.g., pcr_num). In this case seq_err is ignored because seq_errs
is specified, so the reads can be examined as a function of the
sequencing error rate alone. This is effectively a one-factor-at-a-time
experiment. Similarly, for pcr_errs, pcr_nums, umi_lens, and
ampl_rates, the commands look like this:
Reads changing with PCR errors
resimpy_umi_sc -r pcr_errs -rs umi+seq -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -pcr_errs "1e-3;1e-2;0.1" -out_dir ./
Reads changing with amplification rates
resimpy_umi_sc -r ampl_rates -rs umi+seq -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -ampl_rates "0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1.0" -out_dir ./
Reads changing with PCR numbers
resimpy_umi_sc -r pcr_nums -rs umi+seq -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -pcr_nums "6;7;8;9;10;11;12;13;14" -out_dir ./
Reads changing with UMI lengths
resimpy_umi_sc -r umi_lens -rs umi+seq -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -umi_lens "6;7;8;9;10;11;12" -out_dir ./
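
The one-factor design behind these commands can be sketched in Python as follows: the semicolon-separated list supplies the levels of the varying parameter, each level replaces the corresponding fixed value, and every level is repeated for each permutation. This is an illustrative sketch, not Resimpy's code. The names `FIXED`, `VARYING_TO_FIXED`, and `one_factor_conditions` are hypothetical, and the mapping from umi_lens to umiul is an assumption by analogy with seq_errs overriding seq_err.

```python
from itertools import product

# Fixed values taken from the example commands.
FIXED = {"pcr_num": 8, "pcr_err": 1e-4, "seq_err": 1e-4,
         "ampl_rate": 0.85, "umiul": 10}

# Assumed mapping: each varying (plural) option overrides one fixed option.
VARYING_TO_FIXED = {
    "seq_errs": "seq_err",
    "pcr_errs": "pcr_err",
    "pcr_nums": "pcr_num",
    "umi_lens": "umiul",
    "ampl_rates": "ampl_rate",
}


def parse_level(token: str):
    """Parse one list entry as an int when possible, otherwise as a float."""
    try:
        return int(token)
    except ValueError:
        return float(token)


def one_factor_conditions(recipe, levels_text, perm_num, fixed=FIXED):
    """Expand one varying parameter into perm_num x n_levels conditions."""
    target = VARYING_TO_FIXED[recipe]
    levels = [parse_level(tok) for tok in levels_text.split(";")]
    conditions = []
    for perm, level in product(range(perm_num), levels):
        params = dict(fixed)      # copy so the fixed defaults stay untouched
        params[target] = level    # the varying level replaces the fixed value
        conditions.append({"perm": perm, **params})
    return conditions


conds = one_factor_conditions("seq_errs", "1e-3;1e-2;0.1", perm_num=3)
print(len(conds))  # 3 permutations x 3 levels = 9
print(conds[0])
```

Every condition differs from the defaults in exactly one parameter, which is what allows a change in the simulated reads to be attributed to that parameter.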