Xpression Software

0 views
Skip to first unread message

Jaunita Rousu

unread,
Aug 5, 2024, 1:11:49 AM8/5/24
to necnoteco
Xpressioncan be run from the command-line and the GUI. This section provides documentation for the command-line scripts themselves, as opposed to the GUI Xpression user guide. The material overlaps since the GUI simply wraps the scripts, although more detail can be found here since it is generated from the source itself.

Next, BWA maps the quality reads for each sample to the reference_file given. reference_fasta is a file of nucleic genome data in fasta format. This file may contain multiple references, such as the contigs in a draft genome. All reference names must match those located in the reference_genbank file if an expression profile is to be generated. BWA will use threads number_of_cpu and mismatches to the reference given by allowed_mismatches.


Then the SAM file produced by BWA is used with reference_genbank to count reads over genic and intergenic loci. For a read to be used in further processing, it must have a MAPQ score (Phred-scale) of 37 or greater. This quality threshold corresponds to read being unique, where reads with duplicate mapping locations are ignored.


Briefly, reads in in_fastq_file are filtered for quality and separated by barcode by extract_reads(), these results written to disk by write_results(), and read counts are printed to stdout by print_read_counts() to be collected by the calling function.


The biologically relevant segment of the read is determined by the position given by bioseq_start. This is a 1-based position, so for a barcode length of 4, using sample_bioseq_start of 5 means the read will be stored starting at the 5th nucleotide inthe read.


the total count for AAAA will be incremented, and if the quality is high, (ie, minimum base-wise quality score >= 20) the read will be added to the AAAA record andthe good count for AAAA will be incremented.

3a8082e126
Reply all
Reply to author
Forward
0 new messages