Here is an explanation of each part of the command line; the parts are separated by the pipe symbol “|”. Using the attached example bam file, pipe the output of each successive part into “less -S” to see what it does.
1) samtools view Example_file.bam: use samtools’ “view” command to convert the binary bam file into the textual sam format. As with each of the following commands, its output is piped into the next command.
2) cut -f 1,10,11,12: keep only columns 1, 10, 11 and 12 of each line. In sam format these are the read name, the sequence, the base qualities and the first optional tag, which in this example is the RG:Z: read group tag.
3) sed 's/RG:Z://': the stream editor (sed) is used as ‘s/text to be replaced/new text/’; here it replaces RG:Z: with nothing, i.e. deletes it.
4) gawk '{print $1"__"$4, $2, "+", $3}': build the new header from information held in different columns. gawk prints columns 1 and 4 joined by a double underscore (__), followed by column 2, a + sign and column 3, all separated by single spaces, in the order the fastq format requires.
5) sed 's/^\(.\)/@\1/': the first field now holds the fastq header, but it is missing the “@” sign at the beginning. This command replaces the first character of each line with an “@” sign followed by that same character, as the fastq format requires.
6) tr ' ' '\n': translate, i.e. replace each space with a newline character (\n), so that each record is spread over four lines.
7) > Example_file.fastq: redirect STDOUT (standard output) into a file on disk.
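To see steps 2 to 6 at work without the bam file, here is a minimal sketch that runs them on a single made-up sam record. The read name, sequence, qualities and read group (read1, ACGTACGT, IIIIIIII, sampleA) are invented for illustration, and the function name sam_to_fastq is hypothetical. The sketch falls back to plain awk if gawk is not installed, on the assumption that both behave the same for this simple print statement.

```shell
#!/bin/sh
# Use gawk if it is installed, otherwise fall back to awk
# (assumption: both behave identically for this print statement).
AWK=$(command -v gawk || command -v awk)

# Steps 2 to 6 of the command line, wrapped in a hypothetical helper function.
sam_to_fastq() {
    cut -f 1,10,11,12 |                       # keep name, sequence, qualities, RG tag
    sed 's/RG:Z://' |                         # delete the "RG:Z:" prefix
    "$AWK" '{print $1"__"$4, $2, "+", $3}' |  # header, sequence, "+", qualities
    sed 's/^\(.\)/@\1/' |                     # put "@" in front of the header
    tr ' ' '\n'                               # one field per line
}

# One made-up, tab-separated sam record with the read group in column 12.
printf 'read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tRG:Z:sampleA\n' |
    sam_to_fastq

# With the real file, steps 1 and 7 would wrap around the same function:
# samtools view Example_file.bam | sam_to_fastq > Example_file.fastq
```

For this record the output should be the four lines @read1__sampleA, ACGTACGT, + and IIIIIIII. Note that the approach relies on column 12 being the RG:Z: tag and on none of the kept fields containing a space.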
--
******************************************************************
Ludovic Duvaux, Postdoctoral Research Associate
Department of Animal & Plant Sciences
Western Bank
University of Sheffield
Sheffield, S10 2TN
United Kingdom
Tel: +44 (0) 1142220112
******************************************************************