Hi,
I was wondering whether it is appropriate to use raw (i.e. unfiltered) sequencing data as input for a Cortex run.
Since Cortex analyses genetic variation through de novo assembly and has built-in filtering steps (e.g. --quality score threshold), would it be better to supply all sequences rather than a filtered dataset that comprises 80-90% of the original raw data?
I am concerned that stringent filtering could cause SNPs to be missed. Would it therefore be advisable to run Cortex on the raw sequencing data, to reduce potential gaps during the assembly process?
Kind regards,
Joe