BAM to Fastq file conversion

c.mcinerney

Aug 1, 2013, 11:48:22 AM
to NGS...@googlegroups.com

Converting BAM to fastq file format:

This can be achieved using the following command:

samtools view AU05-445.bam | cut -f 1,10,11,12 | sed 's/RG:Z://' | gawk '{print $1"__"$4, $2, "+", $3}' | sed 's/^\(.\)/@\1/' | tr ' ' '\n' > AU05-445.fastq

Here is an explanation of each part of the command line; the parts are separated by the pipe symbol "|". Using the attached example BAM file, pipe the output of each stage into "less -S" to see what it does.

1)    samtools view Example_file.bam – use the samtools "view" command to convert the binary BAM file into text SAM format. As with each of the following commands, its output is piped into the next command.
2)    cut -f 1,10,11,12 – keep only columns 1, 10, 11 and 12: the read name, the sequence, the base qualities and the first optional tag, which is assumed here to be the read group tag (RG:Z:...).
3)    sed 's/RG:Z://' – the stream editor (sed) is used as 's/text to be replaced/new text/'; in this example it replaces RG:Z: with nothing, leaving only the read group name.
4)    gawk '{print $1"__"$4, $2, "+", $3}' – build the fastq record: print columns 1 and 4 joined by a double underscore (__) as the new header, followed by column 2 (the sequence), a "+" and column 3 (the qualities), each separated by a space.
5)    sed 's/^\(.\)/@\1/' – each line now starts with the fastq header, but the leading "@" sign required by the fastq format is still missing. This command replaces the first character of each line with "@" followed by that same character.
6)    tr ' ' '\n' – transliterate, i.e. replace each space with a newline character (\n), so that each record is spread over four lines.
7)    > Example_file.fastq – redirect STDOUT (standard output) into a file on disk.
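
The text-processing stages above can also be tried without samtools or a BAM file. The following is only a sketch: the function name sam_to_fastq and the single SAM record are made up for illustration, and plain awk is used in place of gawk on the assumption that it behaves the same for this one-line program.

```shell
# Hypothetical helper: the stages that follow "samtools view",
# reading tab-separated SAM text on standard input.
sam_to_fastq() {
    cut -f 1,10,11,12 |                      # read name, sequence, qualities, first tag
    sed 's/RG:Z://' |                        # strip the tag prefix, keep the group name
    awk '{print $1"__"$4, $2, "+", $3}' |    # header, sequence, "+", qualities
    sed 's/^\(.\)/@\1/' |                    # prepend "@" to each line
    tr ' ' '\n'                              # one field per line
}

# One made-up, unmapped SAM record; column 12 carries the read group tag.
printf 'read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tRG:Z:AU05-445\n' | sam_to_fastq
```

For this record the output is a four-line fastq entry whose header is @read1__AU05-445. Removing stages from the end of the function, one at a time, shows what each one contributes.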

Also attached is a Word file with the above text.

Best regards,
Caitríona and Claudius
Example_file.bam
Converting BAM to fastq.docx

Ludovic Duvaux

Aug 1, 2013, 11:56:13 AM
to NGS...@googlegroups.com, c.mcinerney
Hi all,

Here is a thread on the SEQanswers forum if you want to learn more about this (i.e. if you want to try other ways to convert BAM to fastq):
http://seqanswers.com/forums/showthread.php?t=7061

Another straightforward way is to use the Picard tool "SamToFastq", which is dedicated to this purpose.
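
A call to that tool might look roughly like the sketch below. The function name bam_to_fastq_picard, the JAVA and PICARD_JAR variables and the file names are all hypothetical, and the sketch assumes a recent Picard release that ships as a single picard.jar; older releases provided a separate SamToFastq.jar instead.

```shell
# Hypothetical wrapper around Picard's SamToFastq for single-end data.
# JAVA and PICARD_JAR are assumed environment variables; adjust them to
# wherever java and picard.jar live on your system.
bam_to_fastq_picard() {
    # $1 = input BAM file, $2 = output fastq file
    "${JAVA:-java}" -jar "${PICARD_JAR:-picard.jar}" SamToFastq \
        INPUT="$1" \
        FASTQ="$2"
}

# Usage (not run here, since it needs Java and Picard installed):
#   bam_to_fastq_picard Example_file.bam Example_file.fastq
```

Unlike the pipeline above, this does not add the read group name to the fastq headers.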

Cheers,
Ludovic


-- 
******************************************************************
Ludovic Duvaux, Postdoctoral Research Associate

Department of Animal & Plant Sciences
Western Bank
University of Sheffield
Sheffield, S10 2TN
United Kingdom 

Tel: +44 (0) 1142220112

****************************************************************** 