Goby 2.3.4.1 released

Fabien Campagne

Aug 29, 2014, 11:04:19 AM

to goby-fr...@googlegroups.com

We have released a new version of Goby. This new release mostly fixes a few problems identified since the last release.

New Features:

- Add an option to the fasta-to-compact mode that will convert a set of files and concatenate the result to a single compact-reads file (see new --concat option).

- Add a mode to test that the connection from Goby to R is working (requires JRI and R built with shared library support). The mode is called test-r-connection (tcr).

Bug fixes:

- Fix a bug that caused some slices to occur within annotations despite the --annotation option being given on the command line. The chromosome index was not obtained from the genome and was always set to zero. In rare cases, this caused one annotation to be omitted from the output. Thanks go to Laurent Mesnard for reporting this problem.

- Restore STRICT_SOMATIC filter.

- Close files opened when loading Goby Alignment header and index files. This fixes a "too many open files" error that could occur when loading hundreds of alignments simultaneously.

- Allow lenient import mode for TSV files. This makes it possible to convert TSV files to lucene.index when they were created by an earlier version of Goby with a \t character as the last character of the column line.

yif...@gmail.com

Oct 23, 2014, 2:19:32 PM

to goby-fr...@googlegroups.com

Hello Fabien!

I just tried Goby-2.3.4.1 to remove redundant sequences of a multi-fasta file (~1000 entries, some are 60kb in length). The program did the job in a flash, which is amazing!
There is one problem for my analysis: the original sequence IDs are replaced with ascending numbers, so I lose track of the sequences from this point on.
So my question:
Is there an option to keep the sequence ID untouched, so that I can keep track of my analysis like which sequences are filtered or kept?
Thanks a lot!

Fabien Campagne

Oct 23, 2014, 2:46:17 PM

to goby-fr...@googlegroups.com, yif...@gmail.com

Hello,

It may have been a bit too fast in this case. Be advised that this tool is designed to remove exact duplicate reads, which may not make much sense with long read lengths (because the probability of a base with error increases).
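To put a rough number on this, here is a minimal sketch. It assumes errors are independent and occur at a fixed per-base rate; the 1% rate used below is purely illustrative, not a measured value, and the function name error_free_prob is made up for this example. Under those assumptions, the chance that a sequence of length n is entirely error-free is (1 - p)^n, and only error-free copies of the same molecule can show up as exact duplicates.

```shell
#!/bin/sh
# error_free_prob RATE LENGTH
# Prints the probability that a sequence of LENGTH bases contains no error,
# assuming (hypothetically) independent errors at a fixed per-base RATE.
error_free_prob() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.4f\n", exp(n * log(1 - p)) }'
}

error_free_prob 0.01 100      # a short read
error_free_prob 0.01 60000    # a 60 kb sequence, as in the question above
```

With these illustrative inputs, the probability for the 60 kb case is effectively zero, which is why exact-duplicate removal is of limited use on long sequences.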

Regarding your question: yes, it should be possible to preserve read names. You can do this when you convert from fasta to compact-reads with the -x option (see http://campagnelab.org/software/goby/reference-documentation/modes/fasta-to-compact/; you can also access this doc from the main project page http://goby.campagnelab.org). Names should be preserved by filtering. Let us know if this is not the case.

Hope this helps. Best,

Fabien

yif...@gmail.com

Oct 23, 2014, 4:38:36 PM

to goby-fr...@googlegroups.com, yif...@gmail.com

Thanks for your prompt reply.
I gave it a try, which did not work out.
The webpage says the option is -x|--include-identifiers, but the help printed on the command line with -H seems to say it is -y, whereas [-x <dynamic-options>] sets a dynamic option, in the format ....
When I used --include-identifiers, I got an error saying Unknown flag 'include-identifiers'.
I badly need this option to work!
Thanks again.

Fabien Campagne

Oct 23, 2014, 4:58:29 PM

to goby-fr...@googlegroups.com, yif...@gmail.com

The web documentation was out of date; this should be fixed now. Sorry about that.

The command-line help is correct and the flag is indeed --include-identifiers (or -y). I just tried it as follows and had no problem (note the flag when converting back to fasta):

cat > 1.fasta <<EOT
>!
ACTG
EOT

goby 1g fasta-to-compact --include-identifiers 1.fasta

goby 1g compact-to-fasta -i 1.compact-reads -o out.fasta --identifier-to-header

cat out.fasta

>!
ACTG
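
Once identifiers survive the round trip, one way to see which sequences were filtered out is to compare the header lines of the original and the filtered FASTA files. The following is a minimal sketch that uses only standard POSIX tools, not Goby itself. The function names list_ids and removed_ids are made up for this example, and it assumes the identifier is the first whitespace-delimited word of each header line.

```shell
#!/bin/sh
# list_ids FILE
# Prints the sorted, unique identifiers found in FILE, taking the first
# word after ">" on each header line as the identifier (an assumption).
list_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# removed_ids ORIGINAL FILTERED
# Prints identifiers present in ORIGINAL but absent from FILTERED.
removed_ids() {
    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }
    list_ids "$1" > "$before"
    list_ids "$2" > "$after"
    comm -23 "$before" "$after"   # lines unique to the first list
    rm -f "$before" "$after"
}
```

For example, removed_ids original.fasta out.fasta (hypothetical file names) would list the identifiers that did not make it into out.fasta.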

yif...@gmail.com

Oct 23, 2014, 5:09:47 PM

to goby-fr...@googlegroups.com

> (note the flag when converting back to fasta)

"--identifier-to-header"
That's the trick! Thanks a lot again!