Desperately asking for help on process_radtags for already demultiplexed paired-end ddRAD data

이수랑

Jul 17, 2018, 2:44:52 AM

to Stacks

Dear Stacks users,

I have received paired-end ddRAD data and am trying to analyze them.

FYI, I am fairly new to Stacks, so please enlighten me with your experience and knowledge.

The data were already demultiplexed, so I do not have a barcode file. All of the files were already renamed with the ID of each sample.

My first attempt at cleaning and trimming was with process_radtags, as follows:

process_radtags -P -p ./raw -o ./samples -c -q -r --renz_1 ecoRI --renz_2 mspI

Unfortunately, the results I got seemed a little off.

Every .rem file was quite large, and of course the next step, denovo.pl, failed with an error about differing read lengths.

Could one of the flags be causing the problem? If I drop the -r flag from the command, would the result be okay?

As far as I understand, -r is for rescuing barcodes. If the data have already been demultiplexed, does that mean barcodes should not be rescued?

Should I run process_radtags separately for each sample, since the files have already been renamed?

The names look like 5473-AA-HiSeqddRAD_R1_AG01.fastq.gz and 5473-AA-HiSeqddRAD_R2_AG01.fastq.gz

I have no idea what is causing the problematic length variation.

Please give me some advice on the problem.

Thank you for the help in advance.

Nicolas Rochette

Jul 17, 2018, 12:38:25 PM

to Stacks

Hi,

Without the log file we don't really have enough information to help.

Best,

Nicolas

--
Stacks website: http://catchenlab.life.illinois.edu/stacks/
---
You received this message because you are subscribed to the Google Groups "Stacks" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stacks-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/stacks-users/9aa60de3-b974-4a05-b063-4fee2f72712e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

이수랑

Jul 18, 2018, 8:53:50 AM

to Stacks

Thank you Nicolas,

I am attaching the process_radtags log file.

This log was written after I changed one option: this time I did not use the -r flag, so the command was the following.

process_radtags -P -p ./raw -o ./samples -c -q --renz_1 ecoRI --renz_2 mspI

I hope this clarifies things.

Thank you for your advice in advance.

On Wednesday, July 18, 2018 at 1:38:25 AM UTC+9, Nicolas Rochette wrote:

process_radtags.raw.log

Nicolas Rochette

Jul 18, 2018, 12:51:43 PM

to Stacks

Having a lot of "Ambiguous RAD-tags" means that process_radtags doesn't see the restriction site remnants (e.g. AATTC for ecoRI). Please see steps 5 and 6 of

Rochette NC, Catchen JM (2017) Deriving genotypes from RAD-seq short-read data using Stacks. Nat Protoc, 12, 2640–2659. doi.org/10.1038/nprot.2017.123.

for indications about what to check.
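As a rough way to check this on already demultiplexed files, the sketch below counts the fraction of reads that carry the expected remnant at a given position. It is only an illustration: the function names, the `offset` and `max_reads` parameters, and the toy records are assumptions and not part of Stacks; only the AATTC remnant for ecoRI comes from the message above.

```python
import gzip
import io


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def read_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def remnant_fraction(handle, remnant, offset=0, max_reads=100000):
    """Fraction of the first max_reads reads with `remnant` at `offset`.

    offset > 0 is a hypothetical way to look past a leftover inline barcode.
    """
    remnant = remnant.upper()
    total = hits = 0
    for seq in read_sequences(handle):
        if total >= max_reads:
            break
        total += 1
        if seq.upper()[offset:offset + len(remnant)] == remnant:
            hits += 1
    return hits / total if total else 0.0


# Toy in-memory FASTQ: one read starts with AATTC, the other does not.
demo = io.StringIO(
    "@r1\nAATTCGGA\n+\nIIIIIIII\n"
    "@r2\nTTAGCAAT\n+\nIIIIIIII\n"
)
fraction = remnant_fraction(demo, "AATTC")
print(f"reads starting with AATTC: {fraction:.0%}")

# On a real file (name taken from the first message in this thread):
# with open_fastq("5473-AA-HiSeqddRAD_R1_AG01.fastq.gz") as fh:
#     print(remnant_fraction(fh, "AATTC"))
```

A low fraction at offset 0 would be consistent with the remnant being absent or shifted, for example by a barcode that is still in the read, which is the kind of thing the protocol steps cited above ask you to check.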

Regarding the length variation, are all your reads the same length?
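One way to answer this is to tabulate read lengths per file. The following is a minimal sketch, assuming standard 4-line FASTQ records; the function names and the toy data are made up for illustration and are not part of Stacks.

```python
import gzip
import io
from collections import Counter


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def length_counts(handle):
    """Count how many reads have each sequence length."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the sequence is the second line of each record
            counts[len(line.strip())] += 1
    return counts


# Toy in-memory FASTQ with two reads of length 6 and one of length 4.
demo = io.StringIO(
    "@r1\nACGTAC\n+\nIIIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nACGTAC\n+\nIIIIII\n"
)
counts = length_counts(demo)
for length, n in sorted(counts.items()):
    print(length, n)

# On a real file (name taken from the first message in this thread):
# with open_fastq("5473-AA-HiSeqddRAD_R1_AG01.fastq.gz") as fh:
#     print(sorted(length_counts(fh).items()))
```

If more than one length shows up within a single file, the reads were most likely trimmed at some point before they were delivered.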

Best,

Nicolas

이수랑

Jul 19, 2018, 2:59:26 AM

to stacks...@googlegroups.com

Dear Nicolas,

Thank you for the paper. The read lengths vary greatly among samples.

I thought the read lengths would naturally vary between samples depending on data quality. Some of the samples were of poor quality from the beginning.

Does this mean that I cannot use process_radtags?

If I can, could you give me advice on how to address the length issue?

Thank you for your help.

Nicolas Rochette

Jul 19, 2018, 12:53:34 PM

to Stacks

Hi 이수랑,

Your read lengths probably vary greatly within each sample as well. You would need to go back to the original data. The person who prepared the library should know the barcodes.

Best,

Nicolas

이수랑

Aug 13, 2018, 2:54:36 AM

to Stacks

Dear Nicolas,

I have checked the length variation within each sample and found that the read lengths vary from 130 to 150.

The RAD data are the product of paired-end ddRAD-seq.

I finally received the barcode file. It looks like the company used two index barcodes.

As I explained earlier, the data were initially demultiplexed and given to me with the original sample names I submitted.

Ideally, after process_radtags, reads are supposed to be uniform in length within each sample.

Why do some data produce varying read lengths? You said it has something to do with barcodes.

Are barcodes still present in the reads, and is that what causes the length variation?

Previously, you suggested not using force_diff_len to make the reads uniform in length.

Then how should I proceed with the next steps to identify SNPs?

You have helped me a lot already, but could you please give me more guidance on the problem?

I really appreciate your help in advance.

On Friday, July 20, 2018 at 1:53:34 AM UTC+9, Nicolas Rochette wrote:

Stacks newbie

Oct 29, 2019, 3:33:20 PM

to Stacks

I am having the same trouble. The process_radtags results are off: the bulk of the reads is in the .rem files, while the main files are more or less empty. My main reason for using process_radtags is that my original demultiplexed reads were of variable lengths, so I could not use Stacks without trimming. I am trying to trim with process_radtags, and it is discarding all my reads and putting them in the .rem files. My .log file is attached. Please, HELP!

process_radtags.allA.log

Masse T.

Oct 30, 2019, 7:04:37 AM

to stacks...@googlegroups.com

I have the same problem, please tell me if you find a solution!

However, since my reads were originally multiplexed and already trimmed and merged using PEAR, I thought maybe it is not even necessary to run process_radtags, so I continued with my original reads to the next step, reference mapping using gstacks. I haven't gotten an error about unequal lengths yet, and I hope I won't! If I do, I'll let you know.

--

Best Wishes

Mohadeseh

Catchen, Julian

Oct 30, 2019, 12:33:33 PM

to stacks...@googlegroups.com, Masse T.

We do not recommend trimming and merging with pear prior to running
Stacks. If you are going to run Stacks, start with the raw, paired-end
reads and follow the standard protocol. gstacks depends on the reads
being separate for its assembly algorithm, and, if you have a reference
genome, you should not be using pear at all.
