Quick start guide
Overview
This quick start guide walks you through examples of using Resimpy.
Resimpy generates simulated sequencing data over a predefined number of
PCR cycles, controlled by a number of other parameters. It currently
provides three modules, resimpy_general, resimpy_umi_sc, and
resimpy_transloc, to support the generation of scRNA-seq reads,
UMI-exclusive reads, and other loosely structured RNA-seq reads. Given
the error assignment approaches inherent to the nucleotide synthesis
process, almost all simulation methods, Resimpy included, face
challenges such as time-consuming processing and high memory
requirements. However, we sped up Resimpy considerably with several
optimization strategies, for instance Pandas vectorization and an
optimized strategy for retrieving nucleotide positions. In our tests
with 50 initial sequences, Resimpy ran in seconds below 14 PCR cycles
and in minutes below 18-20 cycles.
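The jump from seconds to minutes follows from how quickly the molecule pool grows with each PCR cycle. The sketch below is only a back-of-the-envelope estimate, not Resimpy's internal model: it assumes every molecule is copied with probability `ampl_rate` in each cycle, so the expected pool size is `initial * (1 + ampl_rate) ** cycles`. The function name `expected_molecules` is hypothetical.

```python
def expected_molecules(initial: int, cycles: int, ampl_rate: float) -> float:
    """Expected pool size after `cycles` PCR cycles.

    Assumption (not taken from Resimpy): each molecule is duplicated with
    probability `ampl_rate` in every cycle, independently of the others.
    """
    if not 0.0 <= ampl_rate <= 1.0:
        raise ValueError("ampl_rate must lie in [0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial * (1.0 + ampl_rate) ** cycles


# 50 initial sequences with perfect doubling (ampl_rate = 1.0).
for cycles in (14, 18, 20):
    print(cycles, round(expected_molecules(50, cycles, 1.0)))
# 14 819200
# 18 13107200
# 20 52428800
```

Under this assumption the pool grows from under a million molecules at 14 cycles to tens of millions at 18-20 cycles, which is consistent with the run times reported above.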
Documentation
The API documentation of Resimpy is available on Read the Docs: https://resimpy.readthedocs.io/en/latest/index.html
System Requirements
Cross-platform.
Installation
pip install --upgrade resimpyx
Usage
Command-Line Interface (CLI)
Overview
usage: resimpy_general [-h] --recipe recipe --read_structure read_structure
                       --permutation_num permutation_num
                       [--umi_unit_pattern umi_unit_pattern]
                       [--umi_unit_len_fixed umi_unit_len_fixed]
                       [--umi_num_fixed umi_num_fixed]
                       [--seq_length seq_length]
                       [--sim_thres_fixed sim_thres_fixed]
                       [--pcr_num_fixed pcr_num_fixed]
                       [--ampl_rate_fixed ampl_rate_fixed]
                       [--seq_sub_spl_rate seq_sub_spl_rate]
                       [--pcr_err_fixed pcr_err_fixed]
                       [--seq_err_fixed seq_err_fixed]
                       [--ampl_set_rates ampl_set_rates]
                       [--umi_unit_set_lens umi_unit_set_lens]
                       [--pcr_set_nums pcr_set_nums]
                       [--pcr_set_errs pcr_set_errs]
                       [--seq_set_errs seq_set_errs]
                       [--out_directory out_directory]

Welcome to the resimpy_general module

optional arguments:
  -h, --help            show this help message and exit
  --recipe recipe, -r recipe
                        which condition among seq_errs, ampl_rates, pcr_errs,
                        pcr_nums, and umi_lens is used
  --read_structure read_structure, -rs read_structure
                        read structure consisting of a UMI block (umi) and a
                        sequence block (seq), e.g., umi or umi+seq
  --permutation_num permutation_num, -perm_num permutation_num
                        permutation test number
  --umi_unit_pattern umi_unit_pattern, -umiup umi_unit_pattern
                        unit UMI pattern. This is to specify if UMIs consist
                        of monomer, dimer, trimer, or other blocks
  --umi_unit_len_fixed umi_unit_len_fixed, -umiul umi_unit_len_fixed
                        unit UMI length fixed. This is to specify the length
                        of a monomer UMI. The final UMI length =
                        umi_unit_pattern * umi_unit_len_fixed
  --umi_num_fixed umi_num_fixed, -umi_num umi_num_fixed
                        UMI number
  --seq_length seq_length, -seq_len seq_length
                        genomic sequence length
  --sim_thres_fixed sim_thres_fixed, -sim_thres sim_thres_fixed
                        edit distance-measured similarities between UMIs
  --pcr_num_fixed pcr_num_fixed, -pcr_num pcr_num_fixed
                        Number of PCR cycles fixed
  --ampl_rate_fixed ampl_rate_fixed, -ampl_rate ampl_rate_fixed
                        PCR amplification rate fixed
  --seq_sub_spl_rate seq_sub_spl_rate, -spl_rate seq_sub_spl_rate
                        Subsampling rate for sequencing
  --pcr_err_fixed pcr_err_fixed, -pcr_err pcr_err_fixed
                        PCR error fixed
  --seq_err_fixed seq_err_fixed, -seq_err seq_err_fixed
                        Sequencing error fixed
  --ampl_set_rates ampl_set_rates, -ampl_rates ampl_set_rates
                        a semicolon-partitioned string of a set of
                        amplification rates
  --umi_unit_set_lens umi_unit_set_lens, -umi_lens umi_unit_set_lens
                        a semicolon-partitioned string of a set of unit UMI
                        lens
  --pcr_set_nums pcr_set_nums, -pcr_nums pcr_set_nums
                        a semicolon-partitioned string of a set of PCR numbers
  --pcr_set_errs pcr_set_errs, -pcr_errs pcr_set_errs
                        a semicolon-partitioned string of a set of PCR errors
  --seq_set_errs seq_set_errs, -seq_errs seq_set_errs
                        a semicolon-partitioned string of a set of sequencing
                        errors
  --out_directory out_directory, -out_dir out_directory
                        output directory