Generating sequencing reads for detecting PCR artificial translocation

tags: Resimpy

Introduction

resimpy_umi_transloc is a module that simulates sequencing reads carrying two UMIs each, for detecting artificial translocations that arise during PCR. A typical command-line invocation looks like this:

resimpy_umi \
-r seq_errs \
-rs umi \
-tr 0.02 \
-perm_num 3 \
-umiup 1 \
-umiul 10 \
-umi_num 50 \
-pcr_num 8 \
-pcr_err 0.0001 \
-seq_err 0.0001 \
-ampl_rate 0.85 \
-sim_thres 3 \
-spl_rate 1 \
-seq_errs "1e-3;1e-2;0.1" \
-out_dir ./

The parameters are described below.

| Parameter acronym | Full name | Function |
| --- | --- | --- |
| r | recipe | specifies the module to run for your requirement |
| rs | read structure | e.g., umi+seq or umi |
| tr | translocation rate | artificial PCR translocation rate, e.g., 0.02 |
| perm_num | permutation number | number of in silico tests |
| umiup | UMI unit pattern | 1 for monomer blocks, 2 for dimer blocks, 3 for trimer blocks |
| umiul | UMI unit length (fixed) | the fixed length of a monomer UMI |
| umi_num | UMI number (fixed) | the fixed number of molecules/UMIs in the initial read pool |
| sim_thres | similarity threshold (fixed) | the minimum number of nucleotides that must differ between any two randomly generated UMIs |
| pcr_num | PCR number/cycle | a fixed number of PCR cycles |
| pcr_err | PCR error | a fixed DNA polymerase error rate during PCR |
| seq_err | sequencing error | a fixed sequencing error rate |
| ampl_rate | amplification rate | PCR amplification rate |
| spl_rate | subsampling rate | subsampling rate used for sequencing |
| seq_errs | sequencing errors | sequencing error rates separated by semicolons, e.g., 1e-3;1e-2;0.1 |
| pcr_errs | PCR errors | DNA polymerase error rates separated by semicolons, e.g., 1e-3;1e-2;0.1 |
| pcr_nums | PCR numbers | PCR cycle numbers separated by semicolons, e.g., 8;9;10;11;12 |
| umi_lens | UMI lengths | UMI lengths separated by semicolons, e.g., 8;9;10;11;12 |
| ampl_rates | amplification rates | amplification rates separated by semicolons, e.g., 0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1.0 |
| out_dir | output directory | the directory where results are written |
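To make the interplay of umiup, umiul, umi_num, and sim_thres concrete, here is a minimal sketch of how such a UMI pool could be drawn. It is not Resimpy's implementation: the function names, the rejection-sampling strategy, and the use of Hamming distance as the similarity measure are all assumptions made for illustration.

```python
import random


def make_umi(unit_len, unit_pattern, rng):
    """Draw unit_len random bases and repeat each one unit_pattern times
    (1 = monomer blocks, 2 = dimer blocks, 3 = trimer blocks)."""
    return "".join(rng.choice("ACGT") * unit_pattern for _ in range(unit_len))


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def make_umi_pool(umi_num, unit_len, unit_pattern, sim_thres,
                  seed=0, max_tries=100_000):
    """Rejection-sample umi_num UMIs so that every pair differs in at
    least sim_thres nucleotides (hypothetical helper, for illustration)."""
    rng = random.Random(seed)
    pool = []
    tries = 0
    while len(pool) < umi_num:
        if tries >= max_tries:
            raise RuntimeError("could not satisfy sim_thres; relax the settings")
        tries += 1
        candidate = make_umi(unit_len, unit_pattern, rng)
        if all(hamming(candidate, umi) >= sim_thres for umi in pool):
            pool.append(candidate)
    return pool


# Settings from the example command: -umiup 1 -umiul 10 -umi_num 50 -sim_thres 3
pool = make_umi_pool(umi_num=50, unit_len=10, unit_pattern=1, sim_thres=3)
print(len(pool), pool[:3])
```

With dimer or trimer blocks, every differing unit contributes two or three differing nucleotides, so the same sim_thres is easier to satisfy than with monomer blocks.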

In each permutation test, reads are generated while one parameter varies (e.g., seq_errs) and all other parameters stay at their fixed values (e.g., pcr_num). When seq_errs is specified, the fixed seq_err is not applied, so the reads can be examined as a function of the varying parameter alone. This is a one-factor-at-a-time experimental design. The corresponding commands for pcr_errs, pcr_nums, umi_lens, and ampl_rates look like this:
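The one-factor design can be sketched as a small parameter expansion: for every permutation and every value of the varying parameter, the matching fixed parameter is overridden and everything else is left untouched. The function name and the mapping from the plural recipe names to the fixed parameters (in particular umi_lens to umiul) are assumptions made for illustration, not Resimpy's internals.

```python
# Assumed mapping: which fixed parameter each varying recipe replaces.
OVERRIDES = {
    "seq_errs": "seq_err",
    "pcr_errs": "pcr_err",
    "pcr_nums": "pcr_num",
    "umi_lens": "umiul",
    "ampl_rates": "ampl_rate",
}


def one_factor_runs(fixed, recipe, values, perm_num):
    """Yield (permutation index, parameter set) for every simulated run."""
    target = OVERRIDES[recipe]
    for perm in range(perm_num):
        for value in values:
            params = dict(fixed)    # copy, so the fixed settings stay intact
            params[target] = value  # only the varying parameter changes
            yield perm, params


fixed = {"umiul": 10, "umi_num": 50, "pcr_num": 8, "pcr_err": 1e-4,
         "seq_err": 1e-4, "ampl_rate": 0.85, "sim_thres": 3, "spl_rate": 1}

# Parse the semicolon-separated list the same way it is written on the CLI.
seq_errs = [float(v) for v in "1e-3;1e-2;0.1".split(";")]

for perm, params in one_factor_runs(fixed, "seq_errs", seq_errs, perm_num=3):
    print(perm, params["seq_err"], params["pcr_num"])
```

With perm_num set to 3 and three sequencing error rates, this yields nine runs, each differing from the fixed settings in seq_err only.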

Reads changing with PCR errors

resimpy_general -r pcr_errs -rs umi+seq -tr 0.02 -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -seq_len 20 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -pcr_errs "1e-3;1e-2;0.1" -out_dir ./

Reads changing with amplification rates

resimpy_general -r ampl_rates -rs umi+seq -tr 0.02 -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -seq_len 20 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -ampl_rates "0.1;0.2;0.3;0.4;0.5;0.6;0.7;0.8;0.9;1.0" -out_dir ./

Reads changing with PCR numbers

resimpy_general -r pcr_nums -rs umi+seq -tr 0.02 -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -seq_len 20 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -pcr_nums "6;7;8;9;10;11;12;13;14" -out_dir ./
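When sweeping PCR numbers it helps to know roughly how large the read pool becomes. Assuming the common model in which every molecule is duplicated with probability ampl_rate in each cycle (an assumption about the amplification model, not a statement about Resimpy's code), the expected pool size after n cycles is umi_num * (1 + ampl_rate)^n. The helper name below is hypothetical.

```python
def expected_pool_size(umi_num, ampl_rate, pcr_num):
    """Expected number of molecules after pcr_num cycles when each molecule
    is copied with probability ampl_rate per cycle (assumed model)."""
    return umi_num * (1 + ampl_rate) ** pcr_num


# Values from the command above: -umi_num 50 -ampl_rate 0.85 -pcr_nums 6;...;14
for n in [int(v) for v in "6;7;8;9;10;11;12;13;14".split(";")]:
    print(n, round(expected_pool_size(50, 0.85, n)))
```

Because the pool grows geometrically, each extra cycle multiplies memory use and run time by about 1.85 at this amplification rate, which is worth keeping in mind when extending the pcr_nums list.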

Reads changing with UMI lengths

resimpy_general -r umi_lens -rs umi+seq -tr 0.02 -perm_num 3 -umiup 1 -umiul 10 -umi_num 50 -seq_len 20 -pcr_num 8 -pcr_err 0.0001 -seq_err 0.0001 -ampl_rate 0.85 -sim_thres 3 -spl_rate 1 -umi_lens "6;7;8;9;10;11;12" -out_dir ./
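All of the commands above pass -tr 0.02. To illustrate why two UMIs per read make PCR translocations detectable, here is a toy model, written under stated assumptions rather than taken from Resimpy: each molecule is a (left UMI, right UMI) pair, a molecule is copied with probability ampl_rate per cycle, and with probability tr a copy picks up the right UMI of another randomly chosen molecule. A read is then flagged as chimeric when its UMI pair does not occur among the initial molecules. All function names are hypothetical.

```python
import random


def pcr_cycle(pool, ampl_rate, tr, rng):
    """One PCR cycle of the toy model; returns the enlarged pool."""
    copies = []
    for left, right in pool:
        if rng.random() >= ampl_rate:
            continue  # this molecule is not amplified in this cycle
        if rng.random() < tr:
            # Assumed translocation event: the copy takes the right UMI
            # of a randomly chosen molecule in the pool.
            _, right = rng.choice(pool)
        copies.append((left, right))
    return pool + copies


def simulate(pairs, pcr_num, ampl_rate, tr, seed=0):
    """Run pcr_num cycles starting from the initial UMI pairs."""
    rng = random.Random(seed)
    pool = list(pairs)
    for _ in range(pcr_num):
        pool = pcr_cycle(pool, ampl_rate, tr, rng)
    return pool


def chimeric_fraction(pool, pairs):
    """Share of molecules whose UMI pair is absent from the initial pairs."""
    original = set(pairs)
    return sum(pair not in original for pair in pool) / len(pool)


# Placeholder UMI labels stand in for real sequences.
pairs = [(f"L{i}", f"R{i}") for i in range(50)]
pool = simulate(pairs, pcr_num=8, ampl_rate=0.85, tr=0.02)
print(len(pool), chimeric_fraction(pool, pairs))
```

In this model chimeras created in early cycles are themselves amplified later, so the chimeric fraction in the final pool depends on both tr and the number of cycles.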