In this step, users can simulate the read-out (sequencing) of the DNA sequences.
Sequencing methods from three platforms are available: (i) Oxford Nanopore, (ii) Illumina, and (iii) PacBio.
The simulation follows the approach of Schwarz et al. [1] and covers six sequencing methods:
(i) single-read and (ii) paired-end sequencing by Illumina (Schirmer et al., 2016) [2], (iii) subread and (iv) CCS sequencing by PacBio
(Weirather et al., 2017) [3], and (v) 1D and (vi) 2D sequencing by Oxford Nanopore (Weirather et al., 2017) [3].
Every method is characterised by a raw error rate and by the rates and probabilities of three error types: insertions,
deletions, and mutations. The number of errors for a chosen method is calculated by multiplying the sequence length by
the raw error rate of the method. The type of each error is selected based on the weights of each error class, positional
weights or restrictions, and a random number generator. The errors are applied sequentially, allowing for error cross-talk.
For example, if a deletion creates a triplet with a high error probability, this triplet is included in
further error evaluations.
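
The procedure described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function name `simulate_sequencing`, the profile layout, and all numeric values are hypothetical. Error positions are drawn uniformly here, so positional weights and restrictions are left out, and the error count is assumed to be rounded to the nearest integer.

```python
import random

BASES = "ACGT"

# Hypothetical error profile; the values are illustrative only and are
# not taken from any of the cited sequencing methods.
EXAMPLE_PROFILE = {
    "raw_error_rate": 0.02,
    "weights": {"insertion": 0.2, "deletion": 0.3, "mutation": 0.5},
}


def simulate_sequencing(sequence, profile, rng):
    """Return (read, applied_errors) after applying errors one at a time."""
    # Number of errors = sequence length * raw error rate (rounding assumed).
    n_errors = round(len(sequence) * profile["raw_error_rate"])
    kinds = list(profile["weights"])
    weights = [profile["weights"][kind] for kind in kinds]

    seq = list(sequence)
    applied = []
    for _ in range(n_errors):
        # Pick the error type according to the class weights.
        kind = rng.choices(kinds, weights=weights, k=1)[0]
        if kind != "insertion" and not seq:
            continue  # nothing left to delete or mutate
        # Each error acts on the already modified sequence, so earlier
        # errors can influence later ones (error cross-talk).
        if kind == "insertion":
            pos = rng.randrange(len(seq) + 1)
            seq.insert(pos, rng.choice(BASES))
        elif kind == "deletion":
            pos = rng.randrange(len(seq))
            del seq[pos]
        else:  # mutation: replace the base with a different one
            pos = rng.randrange(len(seq))
            seq[pos] = rng.choice([b for b in BASES if b != seq[pos]])
        applied.append((kind, pos))
    return "".join(seq), applied


if __name__ == "__main__":
    read, errors = simulate_sequencing("ACGT" * 25, EXAMPLE_PROFILE, random.Random(42))
    print(read)
    print(errors)
```

For a 100-base sequence and a raw error rate of 0.02, the sketch applies two errors. Passing a seeded `random.Random` instance makes a run reproducible.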
References
-
[1] Schwarz, M., Welzel, M., Kabdullayeva, T., Becker, A., Freisleben, B. and Heider, D., 2020. MESA: automated assessment of synthetic DNA fragments and simulation of DNA synthesis, storage, sequencing and PCR errors. Bioinformatics, 36(11), pp.3322-3326.
-
[2] Schirmer, M., D'Amore, R., Ijaz, U.Z., Hall, N. and Quince, C., 2016. Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data. BMC Bioinformatics, 17, p.125.
-
[3] Weirather, J.L., de Cesare, M., Wang, Y., Piazza, P., Sebastiano, V., Wang, X.J., Buck, D. and Au, K.F., 2017. Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis. F1000Research, 6.