ERROR, Specified key was too long; max key length is 767 bytes

Nadal Tegstrom

Jan 27, 2021, 3:46:34 PM
to pasapipeline-users

Hi Brian,

Thanks for your continuous support for PASA.

I am new to PASA, and now I am trying to use it to build a comprehensive transcriptome database. When I ran the command:

Launch_PASA_pipeline.pl -c alignAssembly.config -C -R -g Fraxinus_americana_v.0.1_CLC_SSPACE_GAPCLOSER.fasta -t Trinity-GG.fasta --ALIGNERS blat,gmap --transcribed_is_aligned_orient --CPU 8

I got the following error:
* [Wed Jan 27 12:13:31 2021] Running CMD: /home/P910/software/PASApipeline.v2.4.1//scripts/create_mysql_cdnaassembly_db.dbi -c alignAssembly.config -S '/home/P910/software/PASApipeline.v2.4.1//schema/cdna_alignment_mysqlschema'

ERROR 1071 (42000) at line 194 in file: '/home/P910/software/PASApipeline.v2.4.1/schema/cdna_alignment_mysqlschema': Specified key was too long; max key length is 767 bytes

PASA did not stop at that point, though. It went on to finish the mapping steps, until it got stuck at:
[Wed Jan 27 17:30:30 2021] Running CMD: /home/P910/software/PASApipeline.v2.4.1//scripts/set_spliced_orient_transcribed_orient.dbi -M 'chko_Famer' > pasa_run.log.dir/setting_aligned_as_transcribed_orientation.output

When I interrupted it, I got this message:
* [Wed Jan 27 17:30:30 2021] Running CMD: /home/P910/software/PASApipeline.v2.4.1//scripts/set_spliced_orient_transcribed_orient.dbi -M 'chko_Famer' > pasa_run.log.dir/setting_aligned_as_transcribed_orientation.output

^CError, cmd: /home/P910/software/PASApipeline.v2.4.1//scripts/set_spliced_orient_transcribed_orient.dbi -M 'chko_Famer' > pasa_run.log.dir/setting_aligned_as_transcribed_orientation.output died with ret 2 No such file or directory at /home/P910/software/PASApipeline.v2.4.1//PerlLib/Pipeliner.pm line 187.
    Pipeliner::run(Pipeliner=HASH(0x17a7e80)) called at ./Launch_PASA_pipeline.pl line 1044

I don't know if the SQL error is what caused this step to hang. I am using mysql v.15.1 distrib 10.0.24-MariaDB using readline 5.2. Do you have any suggestions for how to address it? Also, is it correct for me to follow the "leveraging RNA-Seq by PASA" pipeline and use the genome-guided assembly at this step?

Thank you very much and I am looking forward to hearing from you.

Brian Haas

unread,
Jan 28, 2021, 7:04:12 AM1/28/21
to Nadal Tegstrom, pasapipeline-users
hi,

I'm not sure what might have caused it to stall. I'd suggest trying to run it on the sample data set that comes with it to see if it can work end-to-end, if you haven't tried that already.

Regarding the max key length error, I haven't seen this happen on our earlier installations. I wonder if there's some mysql configuration on your server that's causing that. One thing you can do is to follow the output being generated at the step where it got stuck, such as via
 
 tail -f pasa_run.log.dir/setting_aligned_as_transcribed_orientation.output

so you'll see if it's just running slowly or entirely locked up for some reason.
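
If you'd rather not watch the file by eye, a rough check like the one below can tell you whether that output file is still being written to. This is only a sketch: `log_is_active` and the 10-minute threshold are made up for illustration and are not part of PASA. Also keep in mind that a quiet log does not prove a hang, since the step may be busy inside the database without printing anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PASA): succeeds if FILE was modified
# within the last MINUTES minutes (default 5), i.e. something is still
# writing to it. Fails if the file is older than that or does not exist.
log_is_active() {
    local file="$1" minutes="${2:-5}"
    # find prints the path only when its modification time is recent enough
    [ -n "$(find "$file" -maxdepth 0 -mmin "-$minutes" 2>/dev/null)" ]
}

# Example usage, with the log path from the command above:
# if log_is_active pasa_run.log.dir/setting_aligned_as_transcribed_orientation.output 10; then
#     echo "still writing output"
# else
#     echo "no new output in the last 10 minutes"
# fi
```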

You can also try the sqlite backend instead of mysql to see if that works for you.

hth,

~b

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Nadal Tegstrom

Jan 28, 2021, 9:07:54 AM
to pasapipeline-users
Hi Brian,

Thank you very much for your reply. I will try with the sample data and will get back to you.

Regarding my real dataset, I have a stranded library. Do you think it is better to use the leveraging RNASeq pipeline (instead of the alignment pipeline) with a genome-guided assembly as an input before following the comprehensive transcriptome db step? 

Sorry for the very basic question, but I am very new to bioinformatics in general.

Thanks again for your time and have a great day.

Brian Haas

Jan 28, 2021, 9:12:34 AM
to Nadal Tegstrom, pasapipeline-users
hi,

If you have a high-quality genome (e.g. human, mouse, etc.) you could just run StringTie. If you have a draft genome (lots of gaps), or your samples aren't a great match to the reference you're working with, then the comprehensive assembly approach would benefit you.

If the data are stranded, then you'll want to use all the strand-specific options that are available to you.

hth,

~b

Nadal Tegstrom

Jan 30, 2021, 2:25:01 PM
to pasapipeline-users
Hi Brian,

I have updated my MySQL server (MariaDB) to version 10.5 and now the problem is solved. :)
Thanks for all your help and have a great day!

Best regards
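
For anyone who finds this thread with the same error, a likely explanation (not confirmed anywhere in this thread) is the following. With the older COMPACT or REDUNDANT InnoDB row formats, which are the default on servers such as MariaDB 10.0, a single index key is capped at 767 bytes. With the DYNAMIC row format, the default from MariaDB 10.2 onward, the cap is 3072 bytes. The key size is the column length in characters times the maximum bytes per character of the character set, so the same schema can pass or fail depending on server defaults. The sketch below only does that arithmetic and shows one way to inspect the server. The function names and the 255-character column are assumptions for illustration; I have not checked which column line 194 of the PASA schema actually defines.

```shell
#!/usr/bin/env bash
# Illustrative helpers, not part of PASA or of MySQL/MariaDB.

# Worst-case bytes for an index on a text column:
# number of characters * maximum bytes per character of the charset.
index_key_bytes() {
    echo $(( $1 * $2 ))
}

# Succeeds if a column of CHARS characters at BYTES_PER_CHAR bytes each
# fits within LIMIT bytes. Usage: fits_key_limit CHARS BYTES_PER_CHAR LIMIT
fits_key_limit() {
    [ "$(index_key_bytes "$1" "$2")" -le "$3" ]
}

# Print the server version and the InnoDB settings that affect the limit.
# Extra arguments (e.g. -u USER -p) are passed through to the mysql client.
# A variable the server does not know simply produces no row.
show_key_settings() {
    mysql "$@" -e "SELECT VERSION();
                   SHOW VARIABLES LIKE 'innodb_large_prefix';
                   SHOW VARIABLES LIKE 'innodb_default_row_format';"
}

# Hypothetical 255-character column:
# index_key_bytes 255 3   # 3-byte utf8: 765, fits under 767
# index_key_bytes 255 4   # utf8mb4: 1020, exceeds 767 but fits under 3072
```

If upgrading the server is not an option, the sqlite backend suggested earlier in the thread avoids this limit altogether.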

Brian Haas

Jan 30, 2021, 3:44:22 PM
to Nadal Tegstrom, pasapipeline-users
Great news!!  Thx

-via googleFi
