Attaching fake barcodes to GBS sequences

Erena Edae

Oct 27, 2023, 12:30:50 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
I have GBS sequences without barcodes. I want to attach a fake barcode to each sample's sequences so that the data are compatible with the TASSEL pipeline for SNP calling on a Unix machine. Any suggestions are appreciated.

Thanks,
Erena.

Lynn Carol Johnson

Oct 30, 2023, 9:31:21 AM
to tas...@googlegroups.com

Hi Erena –

 

An R script that creates new FASTQ files with barcodes is barcode_faker.R, found here. Note that after running the script, you will need to create a key file from the data it exports. See the script for details.
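
For cases where running R is awkward, the general idea of faking barcodes can be sketched in Python. This is not the barcode_faker.R implementation, only a minimal illustration under stated assumptions: each sample sits in its own FASTQ file, every read in that file gets the same made-up barcode prepended, and the quality line is padded with a placeholder score (`I` here) so that sequence and quality strings stay the same length. The function names, the barcode, and the file names are hypothetical.

```python
import gzip
from itertools import islice


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def add_fake_barcode(lines, barcode, qual_char="I"):
    """Yield FASTQ lines (without newlines) with `barcode` prepended to each read.

    The quality string is padded with `qual_char` (an assumed placeholder
    score) so that it stays as long as the sequence.
    """
    it = iter(lines)
    while True:
        # A FASTQ record is four lines: header, sequence, separator, quality.
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, plus, qual = record
        yield header
        yield barcode + seq
        yield plus
        yield qual_char * len(barcode) + qual


def tag_file(in_path, out_path, barcode):
    """Stream one sample's FASTQ file to a new file, one record at a time."""
    with open_fastq(in_path) as src, open_fastq(out_path, "wt") as dst:
        for line in add_fake_barcode(src, barcode):
            dst.write(line + "\n")


# Hypothetical usage: one made-up barcode per sample file.
# tag_file("sample1.fastq.gz", "sample1.barcoded.fastq.gz", "ACGT")
```

Because the file is streamed record by record, memory use stays small regardless of file size. The barcode assigned to each sample still has to be recorded in a key file for the pipeline.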

 

Lynn

 


Erena Edae

Oct 30, 2023, 5:22:17 PM
to tas...@googlegroups.com
Hi Lynn,

I saw that R script, but it did not work with R on the remote server (a Unix machine). I do not have enough space to download the sequence data and run it on a local machine with RStudio.

Thanks,
Erena.



--
----------
Erena A. Edae, PhD
Researcher, Department of Plant Pathology, University of Minnesota
1551 Lindig st, St. Paul, MN 55108
Skype: erena.a.edae

Lynn Carol Johnson

Oct 31, 2023, 6:36:49 AM
to tas...@googlegroups.com

Hi Erena –

 

I’m sorry that doesn’t work for you.  Perhaps someone in this group has a suggestion.

 

Lynn

 

Erena Edae

Oct 31, 2023, 6:34:21 PM
to tas...@googlegroups.com
Hi Lynn,

It seems to be working with RStudio Server, but it has completed only half of the files in the last 18 hours.

Thanks,
Erena.

Erena Edae

Nov 5, 2023, 12:24:33 PM
to tas...@googlegroups.com
Hi Lynn,
I was trying to run "FastqToTagCountPlugin". Why does "ERROR" appear after each barcode? Is a message like this something to worry about?

"[pool-1-thread-1] ERROR net.maizegenetics.analysis.gbs.FastqToTagCountPlugin - Good Barcodes Read: 14510"

Thanks,
Erena.




Lynn Carol Johnson

Nov 6, 2023, 3:15:41 PM
to tas...@googlegroups.com

Hi Erena -

 

There should be a stack trace after that message.  Will you send me the full log file?
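
One way to check a captured stderr/stdout file for a stack trace is sketched below in Python. It assumes the usual Java conventions (frames indented and starting with `at`, exception class names ending in `Exception` or `Error`); the function name and patterns are illustrative, not part of TASSEL. A line that is merely logged at the ERROR level, such as a "Good Barcodes Read" count, would not be flagged.

```python
import re

# Indented "at package.Class.method(" frames, as printed in Java stack traces.
FRAME = re.compile(r"^\s+at\s+[\w.$<>]+\(")
# Class names ending in Exception or Error (case-sensitive, so the
# log level "ERROR" on its own does not match).
THROWN = re.compile(r"\b[\w.$]*(?:Exception|Error)\b")


def find_stack_trace_lines(lines):
    """Return (line_number, text) pairs that look like part of a stack trace."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if FRAME.search(line) or THROWN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Hypothetical usage on a job's stderr file:
# with open("job.err") as fh:
#     for number, text in find_stack_trace_lines(fh):
#         print(number, text)
```

If this finds nothing, the ERROR-level lines are likely just progress messages rather than failures.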

 

Lynn

 


Erena Edae

Nov 6, 2023, 5:36:50 PM
to tas...@googlegroups.com
Hi Lynn,
I have attached the error and output file. I do not have a log file.

Thanks,
Erena.

GBS_SNP_calling-160058989.err
GBS_SNP_calling-160058989.out

Lynn Carol Johnson

Nov 7, 2023, 1:08:50 PM
to tas...@googlegroups.com

Hi Erena –

 

It looks like you are using the older GBS pipeline, GBSv1. Is there a reason not to use the GBSv2 pipeline? The v2 pipeline is more recent and easier to debug, and may not have the same issues. We don’t have anyone around who supports GBSv1.

 

You can find information on GBSv2 here:
https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline

 

Lynn

 

Erena Edae

Nov 7, 2023, 7:09:16 PM
to tas...@googlegroups.com
Thank you. I will try the new version. It has been a long time since I used the GBS pipeline.

Thanks,
Erena.

Erena Edae

Nov 11, 2023, 11:06:27 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Hi Lynn,
I have issues with the "SNPQualityProfilerPlugin". I have attached the logfiles.
GBS_SNP_calling-160164468.out

Lynn Carol Johnson

Nov 13, 2023, 10:11:51 AM
to tas...@googlegroups.com

What values did you give for the taxa and tname parameters? If you provide values for these parameters, “taxa” must be a file with names in taxa list format, one per line, e.g.

<NAME>
M0034:C05F2ACXX:5:250021031
M0035:C05F2ACXX:5:250021014
M0077:C05F2ACXX:5:250021034
M0079:C05F2ACXX:5:250021046
M0322:C05F2ACXX:5:250021058
M0323:C05F2ACXX:5:250021070


The log file shows a value of “none”. If you don’t want to give a value, leave that parameter off your command line. The same goes for the “tname” parameter: if you don’t want to give a name, e.g. “IBM”, remove that parameter from the command line.
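
The two points above (the taxa list layout and leaving unused parameters off the command line) can be sketched in Python. This is a hedged illustration, not TASSEL code: the helper names are made up, and the flag names (`-db`, `-statFile`, `-taxa`, `-tname`) are written as I recall them from the GBSv2 documentation, so treat them as assumptions and check them against the wiki.

```python
def taxa_list_text(names):
    """Return taxa list file content: a <NAME> header, then one name per line."""
    return "\n".join(["<NAME>", *names]) + "\n"


def profiler_args(db, stat_file, taxa_file=None, tname=None):
    """Build plugin arguments, omitting -taxa and -tname when no value is given.

    Flag names are assumptions based on the GBSv2 documentation.
    """
    args = ["-SNPQualityProfilerPlugin", "-db", db, "-statFile", stat_file]
    if taxa_file:
        args += ["-taxa", taxa_file]
    if tname:
        args += ["-tname", tname]
    args.append("-endPlugin")
    return args


# Hypothetical usage: no taxa subset, so neither optional flag appears.
print(" ".join(profiler_args("gbs.db", "snp_stats.txt")))
```

The design point is that an optional parameter is dropped entirely rather than passed a placeholder such as "none", which the plugin would otherwise try to read as a file name.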

 

 

Erena Edae

Nov 13, 2023, 4:57:51 PM
to tas...@googlegroups.com
Thank you. I figured it out.

Erena.

Erena Edae

Nov 13, 2023, 5:07:19 PM
to tas...@googlegroups.com
Hi Lynn,

I have issues with -ProductionSNPCallerPluginV2. It seems to run, but the genotype calls are "N" for almost all SNPs and taxa.
I have attached the log file and SNP output file.

Thanks,
Erena.

GBS_SNP_calling-160170181.out

Lynn Carol Johnson

Nov 14, 2023, 12:14:43 PM
to tas...@googlegroups.com

Hi Erena –

 

Can you look at your input and at the data stored in the db, and verify that both are correct? Were you able to determine anything from the SNPQualityProfilerPlugin output?

The genotype table is built from the tag data stored in the database. Perhaps experiment with some of the plugin's parameters: set “mnQS” lower and see if it makes a difference.

 

As a programmer (and not a biologist) I don’t have the same insight you might get from some of your colleagues.  Is there a biologist with whom you can consult?

 

Lynn

 

