PASA issues with genome-guided/de novo assembly


jvi...@cub.uca.edu

Feb 13, 2017, 3:49:17 PM
to pasapipeline-users
I am having issues getting the PASA alignment to work. I keep getting the error:


" ## Processing CMD: 1/23
14:43:32    CMD: /usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi -M localhost -t X47P09.transcriptome.concatenated.fasta  -f NULL
DBD::mysql::db do failed: Table 'localhost.cdna_info' doesn't exist at /usr/local/bin/PASApipeline-2.0.2/scripts/..//PerlLib/Mysql_connect.pm line 162, <$filehandle> line 1.
failed query: < insert cdna_info (cdna_acc, is_assembly, is_fli, is_TDN, length, header) values (?,?,?,?,?,?) >    values: TRINITY_DN31359_c0_g1_i1 0 0 0 213 TRINITY_DN31359_c0_g1_i1 len=213 path=[191:0-212] [-1, 191, -2]
Errors: Table 'localhost.cdna_info' doesn't exist
 at /usr/local/bin/PASApipeline-2.0.2/scripts/..//PerlLib/Mysql_connect.pm line 173, <$filehandle> line 1.
    Mysql_connect::RunMod(Mysql_connect=HASH(0x1253650), " insert cdna_info (cdna_acc, is_assembly, is_fli, is_TDN, len"..., "TRINITY_DN31359_c0_g1_i1", 0, 0, 0, 213, "TRINITY_DN31359_c0_g1_i1 len=213 path=[191:0-212] [-1, 191, -2]") called at /usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi line 97
Issuing rollback() due to DESTROY without explicit disconnect() of DBD::mysql::db handle localhost:localhost at /usr/share/perl/5.22/Carp.pm line 167.


ERROR: The following command died with exit code (6400):
/usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi -M localhost -t X47P09.transcriptome.concatenated.fasta  -f NULL

Must re-run pipeline starting at index [1] via running Launch_PASA_pipeline with parameter '-s 1' (exclude param '-C' since db already created) "

However, even when I try using -s and excluding -C I get the same error.

Attached is the terminal output.

Any help would be much appreciated.

On a side note, will future Trinity releases have an option for a hybrid approach?

James

Screenshot from 2017-02-13 14-43-53.png

Brian Haas

Feb 13, 2017, 4:20:59 PM
to jvi...@cub.uca.edu, pasapipeline-users
Hi James,

It seems it's not allowing you to create the mysql database.  

For the mysql connection to work right, the 

    pasa_conf/conf.txt

file needs to have your connection information, and your 

  alignAssembly.conf 

file will indicate the PASA database name to create and use.

'localhost' is generally the name of the server, if you're running mysql locally.  For your database, maybe create a name like 'my_pasa_db' or something.  It needs to be unique to each target data set being analyzed.

Finally, be sure to run the sample data pipeline through to ensure it's all working (in case you're trying to run your own data first).

best,

~brian



--
You received this message because you are subscribed to the Google Groups "pasapipeline-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pasapipeline-users+unsub...@googlegroups.com.
To post to this group, send email to pasapipeline-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pasapipeline-users/03c46934-fb2f-4960-bc76-4c9da677a596%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

jvi...@cub.uca.edu

Feb 14, 2017, 1:58:38 PM
to pasapipeline-users
First off, thank you for your help.

I went through the PASA instructions and have written out everything I did to try to get it set up correctly. I am still getting the same error when running the sample data. Attached are the commands I ran and the error output I ran into.

I'm not entirely sure if I am setting up PASA and mysql correctly.

Any help would be much appreciated.

Thanks,
James
cmds

Brian Haas

Feb 14, 2017, 9:12:07 PM
to jvi...@cub.uca.edu, pasapipeline-users
Hi James,

It looks like this step failed:

  Configuring DBD-mysql-4.041 ... N/A

! Configure failed for DBD-mysql-4.041. See /home/loki/.cpanm/work/1487096150.8464/build.log for details.

and so the mysql connection isn't going to work.

It looks like there are some other errors too that you might need to revisit.

I'll admit that getting this set up and running can be a bit challenging, and I don't have a lot of time to support it unfortunately.





jvi...@cub.uca.edu

Feb 14, 2017, 9:13:06 PM
to pasapipeline-users
Sorry, that was supposed to be a .txt file. Here it is in the proper format.
cmds.txt

jvi...@cub.uca.edu

Feb 14, 2017, 11:36:44 PM
to pasapipeline-users
Thank you, Brian. I have found a fix for the DBD::mysql part. Now the error looks a bit more manageable. The error message from the run with the -s 1 parameter suggests it actually started to do something with the sequences.

As best I can tell, mysql seems to be working now:
mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| phpmyadmin         |
| sample_mydb_pasa   |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

##############################################
# with -C parameter:
##############################################

sudo ../scripts/Launch_PASA_pipeline.pl -c alignAssembly.config -C -r -R -g genome_sample.fasta -t all_transcripts.fasta.clean -T -u all_transcripts.fasta -f FL_accs.txt --ALIGNERS blat,gmap --CPU 2 -N 2 --TDN tdn.accs  --IMPORT_CUSTOM_ALIGNMENTS_GFF3 custom_alignments.gff3


## Processing CMD:
22:23:12    CMD: /usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/create_mysql_cdnaassembly_db.dbi -c alignAssembly.config -S /usr/local/bin/PASApipeline-2.0.2/scripts/..//schema/cdna_alignment_mysqlschema -r
DBD::mysql::db do failed: Can't drop database 'sample_mydb_pasa'; database doesn't exist at /usr/local/bin/PASApipeline-2.0.2/scripts/..//PerlLib/Mysql_connect.pm line 162.
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1067 (42000) at line 111 in file: '/usr/local/bin/PASApipeline-2.0.2/schema/cdna_alignment_mysqlschema': Invalid default value for 'date'
CMD: /usr/bin/mysql  -uroot -ppassword -hlocalhost -Dsample_mydb_pasa -e 'source /usr/local/bin/PASApipeline-2.0.2/scripts/..//schema/cdna_alignment_mysqlschema' failed.
Use of uninitialized value $index in concatenation (.) or string at ../scripts/Launch_PASA_pipeline.pl line 981.
Use of uninitialized value $index in concatenation (.) or string at ../scripts/Launch_PASA_pipeline.pl line 981.


ERROR: The following command died with exit code (256):
/usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/create_mysql_cdnaassembly_db.dbi -c alignAssembly.config -S /usr/local/bin/PASApipeline-2.0.2/scripts/..//schema/cdna_alignment_mysqlschema -r

Must re-run pipeline starting at index [] via running Launch_PASA_pipeline with parameter '-s ' (exclude param '-C' since db already created)

###############################################
# With parameter -s 1:
###############################################

sudo ../scripts/Launch_PASA_pipeline.pl -c alignAssembly.config -s 1 -r -R -g genome_sample.fasta -t all_transcripts.fasta.clean -T -u all_transcripts.fasta -f FL_accs.txt --ALIGNERS blat,gmap --CPU 2 -N 2 --TDN tdn.accs  --IMPORT_CUSTOM_ALIGNMENTS_GFF3 custom_alignments.gff3


## Processing CMD: 1/40
22:26:20    CMD: /usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi -M sample_mydb_pasa -t all_transcripts.fasta.clean -T tdn.accs -f FL_accs.txt
DBD::mysql::db do failed: Table 'sample_mydb_pasa.cdna_info' doesn't exist at /usr/local/bin/PASApipeline-2.0.2/scripts/..//PerlLib/Mysql_connect.pm line 162, <$filehandle> line 1.
failed query: < insert cdna_info (cdna_acc, is_assembly, is_fli, is_TDN, length, header) values (?,?,?,?,?,?) >    values: gi|1019751|gb|F15262.1|F15262 0 0 0 221 gi|1019751|gb|F15262.1|F15262 ATTS5510 Gif-SiliqueB Arabidopsis thaliana cDNA clone YBY179 3' similar to Serine-type carboxypeptidase, mRNA sequence
Errors: Table 'sample_mydb_pasa.cdna_info' doesn't exist

 at /usr/local/bin/PASApipeline-2.0.2/scripts/..//PerlLib/Mysql_connect.pm line 173, <$filehandle> line 1.
    Mysql_connect::RunMod(Mysql_connect=HASH(0x21d3d40), " insert cdna_info (cdna_acc, is_assembly, is_fli, is_TDN, len"..., "gi|1019751|gb|F15262.1|F15262", 0, 0, 0, 221, "gi|1019751|gb|F15262.1|F15262 ATTS5510 Gif-SiliqueB Arabidops"...) called at /usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi line 97
Issuing rollback() due to DESTROY without explicit disconnect() of DBD::mysql::db handle sample_mydb_pasa:localhost at /usr/share/perl/5.22/Carp.pm line 167.



ERROR: The following command died with exit code (6400):
/usr/local/bin/PASApipeline-2.0.2/scripts/..//scripts/upload_transcript_data.dbi -M sample_mydb_pasa -t all_transcripts.fasta.clean -T tdn.accs -f FL_accs.txt

Must re-run pipeline starting at index [1] via running Launch_PASA_pipeline with parameter '-s 1' (exclude param '-C' since db already created)

Thanks,
James



Brian Haas

Feb 15, 2017, 7:49:05 AM
to jvi...@cub.uca.edu, pasapipeline-users
Great!

This issue looks to be related to the version of MySQL being used.  Which version are you using here?

-Brian
(by iPhone)


jvi...@cub.uca.edu

Feb 15, 2017, 8:56:19 AM
to pasapipeline-users, jvi...@cub.uca.edu
mysql  Ver 14.14 Distrib 5.7.17, for Linux (x86_64) using  EditLine wrapper

James

Brian Haas

Feb 15, 2017, 9:08:41 AM
to jvi...@cub.uca.edu, pasapipeline-users
Gotcha.

OK - this morning I'll see what I can do to quickly modernize the system so it'll work better for everyone.

more later,

~b


jvi...@cub.uca.edu

Feb 15, 2017, 9:50:10 AM
to pasapipeline-users, jvi...@cub.uca.edu
Thank you Brian. I cannot express enough how much I appreciate your work.

-James

Brian Haas

Feb 15, 2017, 11:06:46 AM
to jvi...@cub.uca.edu, pasapipeline-users

I was able to make a few changes and get it working with the latest version of mysql.

Can you try pulling the code:


and see if this latest code works for you?


best,

~b

jvi...@cub.uca.edu

Feb 15, 2017, 5:07:25 PM
to pasapipeline-users, jvi...@cub.uca.edu
This update got it working through the sample pipeline; however, PASA could not find the fasta program, which I renamed from fasta36. It is in my /usr/local/bin, and my $PATH output is:

loki@loki-bio:/usr/local/bin/PASApipeline-master/sample_data$ echo $PATH
/home/loki/bin:/home/loki/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/bin/fasta-36.3.8e/bin/fasta:/usr/local/bin/fasta-36.3.8e/bin/

I have had this issue before with programs in /usr/local/bin not being found on the path, even on fresh Ubuntu installs. I admit I am new to Linux, and I have looked for solutions online, but I have yet to find one that works.

## Processing CMD:
15:57:43    CMD: /usr/local/bin/PASApipeline-master/scripts/..//scripts/cDNA_annotation_comparer.dbi -G genome_sample.fasta --CPU 2 -M sample_mydb_pasa  > pasa_run.log.dir/sample_mydb_pasa.annotation_compare.20110.out
Thread 1 terminated abnormally: Cannot find program fasta
Thread 2 terminated abnormally: Cannot find program fasta
ERROR, thread 1 exited with error Cannot find program fasta

ERROR, thread 2 exited with error Cannot find program fasta

Thread 4 terminated abnormally: Cannot find program fasta
Thread 3 terminated abnormally: Cannot find program fasta
ERROR, thread 3 exited with error Cannot find program fasta

ERROR, thread 4 exited with error Cannot find program fasta

Thread 5 terminated abnormally: Cannot find program fasta
Thread 6 terminated abnormally: Cannot find program fasta
ERROR, thread 5 exited with error Cannot find program fasta

ERROR, thread 6 exited with error Cannot find program fasta

Thread 7 terminated abnormally: Cannot find program fasta
Thread 8 terminated abnormally: Cannot find program fasta
ERROR, thread 7 exited with error Cannot find program fasta

ERROR, thread 8 exited with error Cannot find program fasta

Thread 9 terminated abnormally: Cannot find program fasta
Thread 10 terminated abnormally: Cannot find program fasta
ERROR, thread 9 exited with error Cannot find program fasta

ERROR, thread 10 exited with error Cannot find program fasta

Thread 12 terminated abnormally: Cannot find program fasta
Thread 11 terminated abnormally: Cannot find program fasta
ERROR, thread 11 exited with error Cannot find program fasta

ERROR, thread 12 exited with error Cannot find program fasta

Thread 13 terminated abnormally: Cannot find program fasta
ERROR, thread 13 exited with error Cannot find program fasta

Thread 15 terminated abnormally: Cannot find program fasta
Thread 16 terminated abnormally: Cannot find program fasta
ERROR, thread 15 exited with error Cannot find program fasta

ERROR, thread 16 exited with error Cannot find program fasta

Thread 17 terminated abnormally: Cannot find program fasta
Thread 18 terminated abnormally: Cannot find program fasta
ERROR, thread 17 exited with error Cannot find program fasta

ERROR, thread 18 exited with error Cannot find program fasta

Thread 20 terminated abnormally: Cannot find program fasta
ERROR, thread 20 exited with error Cannot find program fasta

Thread 22 terminated abnormally: Cannot find program fasta
Thread 21 terminated abnormally: Cannot find program fasta
ERROR, thread 21 exited with error Cannot find program fasta

ERROR, thread 22 exited with error Cannot find program fasta

Thread 25 terminated abnormally: Cannot find program fasta
ERROR, thread 25 exited with error Cannot find program fasta

Thread(29)    SUCCESS    CMD: unknown    Time to complete: 0 seconds
Thread(30)    SUCCESS    CMD: unknown    Time to complete: 1 seconds
Error, there were 21 threads (contig jobs) that failed...  See error messages above in order to troubleshoot further at /usr/local/bin/PASApipeline-master/scripts/..//scripts/cDNA_annotation_comparer.dbi line 311.

Do you have any idea how to force Ubuntu to look in /usr/local/bin? I thought just having it in the path would be sufficient, but I must be missing something here.

Thank you,
James

Thank you for this update, I'm excited to get it working with my data.

Brian Haas

Feb 15, 2017, 7:56:13 PM
to jvi...@cub.uca.edu, pasapipeline-users
Ah - hopefully this is an easy one.

Just set your PATH:

   export PATH=${PATH}:/usr/local/bin

You can put this in your ~/.bashrc file in order to make it permanent.

~b


jvi...@cub.uca.edu

Feb 15, 2017, 10:39:31 PM
to pasapipeline-users, jvi...@cub.uca.edu
I think it wasn't the path but something I messed up while trying to set up the seqclean bundled with PASA. I got around this by installing the most recent seqclean source code. After the fact, I learned from getting the blat binary working that you sometimes have to grant +x permissions.

The PASA pipeline appears to be running fine now.
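
For anyone hitting the same "Cannot find program" message: the two causes discussed in this thread (the program's directory is not on PATH, or the file is there but lacks execute permission) can be told apart with a short check. The sketch below is only an illustration; the helper names check_path_entries and find_program are made up here and are not part of PASA. It also flags PATH entries that are not directories, since an entry that names a binary itself (as /usr/local/bin/fasta-36.3.8e/bin/fasta in the PATH above appears to) is not searched as a directory.

```python
import os
import shutil


def check_path_entries(path_value):
    """Return the PATH entries that are not existing directories.

    PATH is searched directory by directory, so an entry that points
    at a file (or at nothing) contributes no programs.
    """
    bad = []
    for entry in path_value.split(os.pathsep):
        if entry and not os.path.isdir(entry):
            bad.append(entry)
    return bad


def find_program(name, path_value=None):
    """Return the full path of an executable called `name`, or None.

    shutil.which only reports files that are executable, so a binary
    that exists but lacks +x is reported as not found.
    """
    return shutil.which(name, path=path_value)


if __name__ == "__main__":
    path_value = os.environ.get("PATH", "")
    for entry in check_path_entries(path_value):
        print("not a directory:", entry)
    # "fasta" is the program name PASA reported as missing in this thread.
    print("fasta resolves to:", find_program("fasta", path_value))
```

If find_program returns None even though the file sits in a listed directory, missing execute permission (fixed with chmod +x) is a likely culprit.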

Thank you for all the help,
James

Brian Haas

Feb 16, 2017, 7:02:44 AM
to jvi...@cub.uca.edu, pasapipeline-users
nice job getting it set up!   

I'll aim to put out a new official release shortly that includes these latest updates.

~b


jvi...@cub.uca.edu

Feb 16, 2017, 8:01:42 AM
to pasapipeline-users, jvi...@cub.uca.edu
Thank you for all the help!

I'm curious whether the compreh_init_build.fasta output looks normal:

>asmbl_9996
CGCTAAAGAAACTTGAAGAAATTATTGAAGCATCAGATGGAATCATGGTTGCACGGGGTGACCTTGGAGTCGAAATACCACTAGAACAGGTCCCAAATATTCAGGAAGAAATAATCGATCTTTGCAGGCAACTAAACAAGCCAGTTATCATAGCTTCACAGCTTCTTGAGTCAATGATCGAATATCCTACCCCAACTCGAGCTGAGGTTGCTGATGTTTCTGAGGCAGTTAGACAGTGTGCTGATGCCTTGATGCTATCTGGTGAATCAGCTATGGGATCATATGGTCACAAGGCTATATCTGTTTTAAGAACGACTAGCTCCCGAATGGAGCTATGGAATCGTGAGATGAACCAACAAATACTTCTTCCTCAACTTGGAGATTCATTGC
>asmbl_9997
GGAATGGCTACTTTTGAAGTCATTGAAATGATTGGAAATGACTTACGTTGCAAGTGTACAGACCCCGGTTTACTTTTACCTCGAGCTAAACTGAGTTTCTGGAGAAATGGGAGACTTGTTGAAAAACATCATGAGCTCCCAACTTTATCTGCAAAGGATTGGTCTGATATCGAGTTTGGAATTTCAGAAGGGGTTGATTTTGTTGCTGTTTCATTTGTAAAAGATGCTGCTGCCATCAATCATTTAAAG
>asmbl_9998
TTTTAATCTGACCCAACGCCCAAAATCTTGTATTCAACCCATGGCTTCTGTGACTGCTGATTTCGCTCTCATGTCACCAGGGTTAAATTTCAAAAGAAGGGTATTTAAATCTCCAGTTTCGCCTATTAGAGTAGGCTTTAGGACAAAACTGAATGGTTTAGTGGCTAAGGCTGTGTTGCAAGAAGAGTTAAAAGATATTGAGAAGATAGAAGTTTTGGGATTTGA
>asmbl_9999
CCCCAATTCAAATTCATTCAATCTTTCATATTTCTTCTCTATATTGAAAAATAATCTGGAATTTATAATGTCGAATATCGGATCAAATCGGTCTCCTTCTCCTCTTTCTTCAAGAAACTGTAGAAACTCGGAATCGAATTCAACCATGAGAATGAGTTTCAATGAAAACCCTTTCGGTTCTAGACCTAATGCACTCATAAACCCTAAAAAAAACTTAAACCCACCTACCCCTGCTAACACCCCTACTACTACTACAGATCATATGAAGAGGAATTCAATCACTCGTAAAAGCGTTCCGATGTTTCAAGATGGGAAAGAGAACCATAAAGATGCACTTAGATCACCTGCAAAAGGGGGTTCAAAGAACTTCATGGCACCGACGATTTCGGCTGCTTCGAAGTTCACTCCATCTCCAAGAAAGAAAATCTTGGGTGAGAAAAATGATCTCACAAGAACTTCAATTCAATTTTCGGGTAAAGATTCTGATTTGAAATGTGATTCTATGAATTTTGATCAAATTGTTGAACAAAAGGAAGTTGTTATTGAAACCCCTGCCGTTAAAAAGGTAACTTTCAGTTCAACTAATGATGAAAAGGAAATGACTTCTGATGTTACTGAGGATAATACCGATTTTGTTAAGATTAGGCCTTTTTGTTGTTCTCCTATAACTTCACCGCTTATTGCCCCTCTTGATGCTGACCCTTATGTTCCTCCATATGACCCGAAAAAGAATTTTCTTTCGCCTAGACCTCAGTATCTTCGTTATAAGCCAAACCCTAGAATCGATCATTTACTAAATAAAGAAGATTATGGAGAAGATGATGTCACTAGACTTGAAGAGAGCTTTAATCTGTCTCTAAGTTCATTAGAAAGTGAAGAACAAGTAGGAAAGAATGAAGCCGAATTTGATGATTCCGTTAATGGGTCATCAGAAGATATATCTGAAGTAGGTTCAGAACCCGATTTTGAGGTGGAAAAAGCATCAAAGCCTCTGTCATCAGACGATTCGTCTGAGATAGTTTCAGAAGAAAAACATCCGGATTTTGAGGTGGTAAAAGCATCTAAGCCTCTGTCATCAGACGATTCGTCTGAGATAGTTTCAGAAGAAAAACAGTATGATTCAATGGTGAAAAAAGCATCAAAACCTCTGTTTTTTACACGAACCAAGACTGTTTCGTTTATTTTTGTGATGTTCTTGGTAGCATGTTTTTCACTCTCATTTACCGAATCTCCACCCATAGATTTGCCAATCTATAATGACATTGGCTTCCCAGATGTTTATCATGAATCTTTGAAATTTGCTGCATTTGCTAGACAATCTTTCGGTGATGTCGTTGCAAATGTTAAACAGTGGTCAATTGATTTCGTTTCTTATCTCTCCAACCAAAAATCTCATCTGTTCCCAACCCAAAAGACTAGCCCAATACAGTTCTTTAACTTGAGCACTTCATCAATACAAGAAGAATTTTTGTTCAATCGCCACATTGGGACAGATTACATTCATGATATTCATGAATATGATGAAGAAATGGAAGATGAAACTGAAACAATTGAAGATGTTTATGAAGAAATGGAAGTTGATGATGAGATTGATGTAGTAGATGAAGTTGTGTCTGAAGAAGAACTCACTGACGTTCAAAGTAATAATGACGGTGCTGTGGAGTCGAACCCCAATCTTGAAGATGAGTTTGTAAGCAACTCTGAAGCTGCTCAAACAGAAACTCAAAGTGATGGTGATAGTTTTGTGGAGTCGAAAACGAATTATAAAGATGAGACTTTAAGCAACTCTGAAGTTCAAACAGATATTCAAACCGATGTTAAAGAAGGGTATTTGGCAAATTCTGGTGATTACTCACAAGTTTTGAGTTCTTTGTTGACAATATCAGGCAACACAGTCTGTTTGATTGTTTGTTCGATGGTGATAGCAGCAGCTTCAGCTATTTTCTACATGAAAAAGGCCAAGTCTCAGACTGCAAAAACCACCAAAGTTTGTATTGGTG
AGGATAACGTGTGCAGAGACTCGGGTTCTGCAGAATCGAGCAGTGCTCAGAATGGTTACAAGAAGAGAGCTAACAACAACAAACGAGAGTCGTTGGCATCGTCATCAGATTTCTCAATGGGTTCTCCTTCTTATGGAAGCTTCACTACATTTGAAAGAATTCCCATTAAGGGAGATGAAGTAATGGTAACACCGATTAGGAGATCAAGCAGACTGCTGAAAAAGTGAAAAACCATCAAGGGTGGTGAACTGATGCTGACACCAAATTAGGAATATCAAGCAGGCTGCTGAAGAATCAAAGTCACCTCTTCATGACTATCTTCGTGTGCTAGGATCAATTTTTAACTTTTTTCTTCAGATAGTTAAAGTTTTTTTTTTCTTTAGACTTATGTAGTTGGTTTCTTTTGATTGTTACAATTTATTTTTCTTCAGACTTATTGTAGTTAGTTTAAACTCGTTAATTGATTCTTTCGAAG
>TRINITY_DN31386_c0_g1_i1
GTCAGGAGTCCAAAGATCCATTATTCCAAAACCATGTGCCATGTCCGCCTTTTCATCACTCTTTTTAGACCCTATGTAAGATTGTTCTAGACCAGCTTGCCCAATCAAGACCTCTTTGACTGCATATGGAGCATCAAAAACTTGGTCAAGCCTAGAATATAGACCGATGGAGGACATCTCATGAGATATATGCAGTGGTTCTGGATTGCGACTTTCATCAGCCACATGATTTATAACTGGAACATCAAATGACCTAAGGTTCTTGAATAGAAAATCACGGTCAGCAGAATCCTGGGTGATAAAAGCCTTCCACATGTAATATGGAACATGACTCTCTAAGTAGGCAGCATGAAGTTGATTTGAG
>TRINITY_DN31325_c0_g1_i1
GAAATAACAAACAACCACCAATGTATCTTGTACAAACTTTCATTATTAGTTAAATTATTATACATATTTGCATACACAATATCCACTCCCTCGCGTCACGATTTAGGTACTTGCCGATATCATCGGATGGAAAATTTTACTTTGTAATGGAGGAAACCAACTTAAGGTCAATATCTTCTTTTGCTTTCGCGTCAAAAGGTTCATCACCAATATATAAATCCAAAATGGAACGACATAAGAGTGGACTTTGGACACTTCCAACCTCCTTGCCATCAATAGTTGTCCGAAGCACGTATCCCTTTTCCTTGGAAAGGTCTATTATAGACCCACGAGGAATCTTGTATTCGTCACTAAACTGAGAAGTGAACTTTTGAAGTAACTCTTTGTTTTCAGATCCTCCAAACTTCTTTAATCTACTTCCAACCGAATCTTCAAATGCATTACGAACAGAACGAATGCTAAGTTTTCCGTACACTATCTGCAGCCTAACAGTAACACTTACATCACTTTCCATCAAATCTTGAGTGACATCCTTTTGCTTCAGTTCAGTAGCTGAAAGTGTATCGTATTTCTCACTTAAAAAGTTATTGAGATCATTATCATCCGCATATACACCGAAGGCGTAAACATCGATGTTTTTAATACCAAACACAGCTTTTTTTCTCAACCCAATTCCCAAAAGGCTCTGAGAATCCTTCAGGGCTGCTGGAAACAAAACCCCGGTTTTCGATTCCGTAACGAGGGGCGCGTCGGAGAGTGACAGCGAGCCGAAAAGATGAGAAGATGGTTGGAAATTGGAAAGGAAGGAGTTGATTGGGTTTTGAGAGGTGGAAATGGCGGCGGAAATGCCGGCACCGACTGCGGCGGAGACGGCAACGGAGACAGCGATTGTTGAGAAGGGGTTGGAAGGGAGGTGTGGAGTGGTGGTTACGGCTGGGTTAGAGAAGGAGAATGGGAATCGGAATGAGACCATTCTATTGTGTGTTTTTTTTGTGTGTTGTGTTCTCCGGTGAAGTTAATTTTTGCG
>TRINITY_DN31324_c0_g1_i1
CTAAGATCTTCTTCAACATAACCATGTAGAGATTCTAGAATGCAAAATCGGTGCTCCTTACATAATTACATTTGTTAAAAAAAATCTCTAGGAGATGGATATAAATTGATGCATGTATAAGTGAAGATCAATACATAAAGATTTTGAAACAGATAATCGATGATAGGGTTTCACTCATGAAAGGAAAAAGGAACTGTTGTGACGGTGATGGATATAGATCACAATTTGTAACGAGCTGTTAAACTCGAACCCGGCCAAAAGCATCGTCAAAATTGAGGCGGCTGAGTGGGTTAGTGACGGTTAGTGACAACAACGGCAGCGGCGGTGACGGGGTTGCCAGCACCAACAACTGTTATGGCGGCGGTGGTGGCGCTTGGCTGGTTGCTGTGGTGGTGATAAGATTTATTTGGTTCGGATAATGTTGATTTGATCATCGTTTAAATTAAAGAACAAAAATTGCAAAGAGGATCTTTATAACATATCTTGAATTCTTGATATTCCTGCTAAAAAGATAG
>TRINITY_DN31331_c0_g1_i1
AGCAATCATTGACAACCAAAAACTGATTATGATAATACCCATCCATCTCTCAAAAACACACACACATCATAACTACTACTTAGAGCTAGCTTCAATCTTCAATGGCTGTTTCTGGTTTTGAAGGCTTTGAAAAACGCCTTGAACTTCAATTCTCTAGTGACAATAATACTCCTATCGGTATCGGTCTCCGAGAAATCGATTTCGAATCGGTTGAACAAGTTCTTCATTTAGTACAATGTACTGTTGTATCCGCCCTTGGCAACCAACACTTTGATTCTTATGTCCTCTCTGAATCCTCTCTCTTCATTTACCCAACTAAAATCATTATCAAAACATG
>TRINITY_DN31336_c0_g1_i1
TGGAAAGGGTTCTAAAATGGTCACTAAGATACCCGTTAGGCCGCGAGGCGAAAAACTCGGCCTACCCGGGTAACCTGAACAGGAAAAACCTCATCTAATTACCCAGCAACATAATCATGTTATCATTTGAAAACTAGCATGAACATATGGCATCAAACAATACAAGATAAAATTGATTAAAATAATAAACATTAATTAATAAATACTAAAAAA

If I understand correctly, the asmbl identifiers indicate PASA-generated transcripts, and the Trinity identifiers indicate transcripts from the de novo Trinity assembly that didn't map to the genome. I tried using TrinityStats.pl to get an idea of how the comprehensive assembly compares to the de novo and genome-guided assemblies, but TrinityStats.pl couldn't recognize the asmbl identifiers.
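
To see how many records of each kind the file contains, the headers can be tallied by prefix. This is a rough sketch that assumes the reading above is right (asmbl_ marks PASA assemblies, TRINITY_ marks de novo transcripts kept as-is); the function names and the category labels are invented for illustration.

```python
from collections import Counter


def classify_header(header):
    """Label one FASTA header line by the prefix of its identifier."""
    fields = header.lstrip(">").split()
    name = fields[0] if fields else ""
    if name.startswith("asmbl_"):
        return "pasa_assembly"      # illustrative label, not a PASA term
    if name.startswith("TRINITY_"):
        return "trinity_denovo"     # illustrative label, not a PASA term
    return "other"


def count_sources(fasta_lines):
    """Count FASTA records per category, looking only at header lines."""
    return Counter(
        classify_header(line) for line in fasta_lines if line.startswith(">")
    )


if __name__ == "__main__":
    example = [
        ">asmbl_9996",
        "CGCTAAAGAAACTTG",
        ">TRINITY_DN31386_c0_g1_i1",
        "GTCAGGAGTCCAAAG",
    ]
    for label, count in sorted(count_sources(example).items()):
        print(label, count)
```

On a real file, pass an open file handle instead of the example list, for instance count_sources(open("compreh_init_build.fasta")).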

Is there a way to convert the asmbl identifiers into Trinity identifiers?

Will these two different headers mess up future Trinotate annotation?

Thank you,
James

Brian Haas

Feb 16, 2017, 8:19:29 AM
to jvi...@cub.uca.edu, pasapipeline-users
right - there are some compatibility issues w/ different scripts.  The PASA / Trinity integration is mostly complete but there are still a few minor hiccups like TrinityStats.

If I remember correctly, it should be generating a gene-to-trans map file, and as long as you have that, you can continue on with Trinotate just fine.

best,

~b


jvi...@cub.uca.edu

Feb 16, 2017, 8:50:45 AM
to pasapipeline-users, jvi...@cub.uca.edu
That about covers it all. I'll read up on quality assessment and look at alternatives to N50 like Detonate and BLASTP hit coverage. I do have a gene-to-trans map and am excited to annotate my new assemblies.

Thank you for taking the time to help and for your generous contributions to the scientific community.

James

Brian Haas

Feb 16, 2017, 8:52:55 AM
to jvi...@cub.uca.edu, pasapipeline-users
thanks for being a happy user!  

best of luck,

~b


jvi...@cub.uca.edu

Mar 4, 2017, 2:59:17 PM
to pasapipeline-users, jvi...@cub.uca.edu
Quick question: does anyone have any suggestions (perhaps even third-party scripts/tools) for calculating the N50 and other metrics produced by TrinityStats.pl for a PASA hybrid assembly?

Thanks,
James

jvi...@cub.uca.edu

Mar 16, 2017, 1:19:36 PM
to pasapipeline-users, jvi...@cub.uca.edu
Please disregard my last question.

I found TransRate produces most of these statistics.
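
For a quick sanity check without any extra tools, N50 can also be computed straight from the FASTA file, whatever the identifiers look like. The following is a minimal sketch, not a replacement for TransRate or TrinityStats.pl; the function names are hypothetical, and it uses the usual definition of N50 (the length L such that sequences of length L or longer contain at least half of all bases).

```python
def read_fasta_lengths(lines):
    """Yield (identifier, sequence_length) for each FASTA record.

    Sequences may span several lines; blank lines are ignored.
    """
    header, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, length
            fields = line[1:].split()
            header, length = (fields[0] if fields else ""), 0
        else:
            length += len(line)
    if header is not None:
        yield header, length


def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    records = [
        ">asmbl_1", "ACGTAC", "GT",           # 8 bases over two lines
        ">TRINITY_DN1_c0_g1_i1 len=4", "ACGT",
        ">asmbl_2", "AC",
    ]
    lengths = [length for _, length in read_fasta_lengths(records)]
    print("sequences:", len(lengths), "N50:", n50(lengths))
```

Because the identifier is only carried along and never parsed, asmbl and Trinity headers are treated the same way.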

I do wish it could tell me the isoform portion of the Trinity statistics, but I am not even sure whether PASA-generated contigs have isoforms, as I do not see any identifying information in the asmbl IDs.

Thank you Brian for this amazing tool,

James

