Multi-part database volume

halocaridina

Aug 27, 2012, 3:51:35 PM
to sequenc...@googlegroups.com
Hi Yannick, Anurag, Cedric and Ben,

Many thanks for developing and maintaining SequenceServer.  It's been very nice to have an alternative to the www-blast package.

Got a quick question.  I've set up a custom .nal file using the directions in the SequenceServer FAQ to aggregate ~45 individual BLAST DBs that were created via "sequenceserver format-databases".

The structure of the all_dbs.nal file is:

TITLE  all_dbs 
DBLIST /path/to/db1 /path/to/db2 ...

After restarting Apache and revisiting my SequenceServer page, the alias isn't shown and I get the following in my httpd/error_log:

I, [2012-08-27T14:19:05.170846 #27568]  INFO -- : Found a multi-part database volume at /path/to/all_dbs  - ignoring it

All the single DBs are correctly identified and on the page, so that should be OK. 

I updated SequenceServer less than a week ago, so I don't think the issue is a dated installation.

I see that Mark Anthony Gibbins added this function and is running a fork of the project.  Have his changes been integrated into SequenceServer mainline?  If so, any suggestions on what might be the issue?

Thanks in advance,

Halocaridina

Mark Gibbins

Aug 27, 2012, 6:40:44 PM
to sequenc...@googlegroups.com
Hi Halocaridina,

Would you be able to post a few things for me:
  1. The full filenames of the database parts, and their alias file.
  2. Contents of the alias file.
  3. Any other relevant lines from the SS log.
I couldn't determine from your message, but the alias file and database parts must be in the same path. The specifications of the multipart alias file from NCBI force this.

Also, the log message is simply a notification that a multipart database was found so this looks to be a different problem.

Thanks!
Mark

Anurag Priyam

Aug 27, 2012, 7:16:21 PM
to sequenc...@googlegroups.com, Yannick Wurm
On Tue, Aug 28, 2012 at 4:10 AM, Mark Gibbins <xiy...@gmail.com> wrote:
> I couldn't determine from your message, but the alias file and database
> parts must be in the same path.

Right. But the FAQ[1] "can I use preformatted BLAST database"
contradicts this. So most likely Halocaridina's alias file is in a
separate directory, while the volumes are in the database directory. And
probably that is what the issue is: SS ignores the volumes, and never
sees the alias file which is in a separate directory.

I think we should remove that FAQ entry. For one, the alternative
solution is confusing and less preferable since SS handles it out of
the box (thanks to Mark). Second, the FAQ makes it sound like an edge
case while it is not. Third, let's not suggest to users that modifying
alias files themselves is OK. It's not. It incurs additional
administrative burden. Yannick?

> The specifications of the multipart alias
> file from NCBI force this.

I don't think so. Alias files are plain text files with a pointer to
the volumes. If the pointers are absolute paths, BLAST+ can find them
regardless of whether alias and volumes are in the same directory or
not. Yannick's solution (the FAQ) suggests the same.

[1]: http://www.sequenceserver.com/#faq

--
Anurag Priyam

Mark Gibbins

Aug 27, 2012, 7:40:11 PM
to sequenc...@googlegroups.com, Yannick Wurm


On Tuesday, 28 August 2012 00:16:52 UTC+1, Anurag Priyam wrote:
> On Tue, Aug 28, 2012 at 4:10 AM, Mark Gibbins <xiy...@gmail.com> wrote:
>> I couldn't determine from your message, but the alias file and database
>> parts must be in the same path.
>
> Right.  But the FAQ[1] "can I use preformatted BLAST database"
> contradicts this.  So most likely Halocaridina's alias file is in a
> separate directory, while the volumes are in the database directory.  And
> probably that is what the issue is: SS ignores the volumes, and never
> sees the alias file which is in a separate directory.

Agreed, it sounds like a path-related problem to me.

> I think we should remove that FAQ entry.  For one, the alternative
> solution is confusing and less preferable since SS handles it out of
> the box (thanks to Mark).  Second, the FAQ makes it sound like an edge
> case while it is not.  Third, let's not suggest to users that modifying
> alias files themselves is OK.  It's not.  It incurs additional
> administrative burden.  Yannick?

Indeed - perhaps we could warn users if a multipart database is found without a corresponding alias file? And vice versa.

And with regard to the FAQ, we could change it to emphasise that if you are using multipart databases,
the alias file must be stored under the SS db path regardless of where the parts are.

What do you think?

>> The specifications of the multipart alias
>> file from NCBI force this.
>
> I don't think so.  Alias files are plain text files with a pointer to
> the volumes.  If the pointers are absolute paths, BLAST+ can find them
> regardless of whether alias and volumes are in the same directory or
> not.  Yannick's solution (the FAQ) suggests the same.
>
> [1]: http://www.sequenceserver.com/#faq

Ah, so it does! For some reason I was sure I came across it saying they had to be in the same path as the alias file. Apologies.

> --
> Anurag Priyam
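
The warning Mark proposes above could look roughly like the following. This is a minimal Python sketch, not SequenceServer code (SequenceServer itself is written in Ruby). The function name `check_multipart` and both regular expressions are assumptions made for illustration. The check only compares file names, so a hand-written alias that aggregates ordinary databases would trigger the second warning unless its DBLIST were read as well.

```python
import re

# Assumed NCBI-style volume naming: <base>.<two digits>.<extension>, e.g. nr.00.phr
VOLUME_RE = re.compile(r"^(?P<base>.+)\.\d{2}\.[np](?:hr|in|sq)$")
# Alias files: <base>.nal (nucleotide) or <base>.pal (protein)
ALIAS_RE = re.compile(r"^(?P<base>.+)\.[np]al$")


def check_multipart(filenames):
    """Return warnings for volumes without an alias file, and vice versa."""
    volume_bases, alias_bases = set(), set()
    for name in filenames:
        volume = VOLUME_RE.match(name)
        alias = ALIAS_RE.match(name)
        if volume:
            volume_bases.add(volume.group("base"))
        elif alias:
            alias_bases.add(alias.group("base"))

    warnings = []
    for base in sorted(volume_bases - alias_bases):
        warnings.append(f"volumes for '{base}' found without an alias file")
    for base in sorted(alias_bases - volume_bases):
        warnings.append(f"alias '{base}' found without matching volumes")
    return warnings
```

Calling `check_multipart(os.listdir(db_dir))` on the database directory would give the list of messages to log at startup.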

halocaridina

Aug 28, 2012, 10:40:18 AM
to sequenc...@googlegroups.com, Yannick Wurm

Hi Mark and Anurag,

Thanks for the quick reply.  Answers to your questions are:

1) The .nal file is in the same directory as the DBs created using "sequenceserver format-databases".  

2) Each DB has six files associated with it having extensions: .nhr .nin .nog .nsd .nsi .nsq

3) For each DB in that directory, there is a symlink to the fasta file that was used to build the database and the symlink shares the same name as each DB.  Thinking that having the symlinks and DBs sharing the same name in the same directory might be the problem, I moved the symlinks out of the directory, restarted Apache and revisited the SS page.  Same "- ignoring it" message.

4) The .nal file has the following structure:

DBLIST /home/data_processed/db_4_blast/Abarenicola_pacifica /home/data_processed/db_4_blast/Alciopa_spp [absolute paths to the other 43 DBs, each separated by a space]

5) I'm using the basenames of each DB (i.e., no extensions) with an absolute path in the .nal file, which should be the correct syntax, as Anurag pointed out.

Am I just missing something obvious?  

Thanks again,

Halocaridina
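
The alias file described in point 4 is plain text, so it can be generated rather than typed by hand. Below is a minimal Python sketch, assuming only the two keywords that appear in this thread (TITLE and DBLIST). The function names `build_alias_text` and `write_alias` are made up for illustration, and paths containing spaces are not handled.

```python
from pathlib import Path


def build_alias_text(title, db_basenames):
    """Return the text of a BLAST alias file: a TITLE line and a DBLIST line."""
    if not db_basenames:
        raise ValueError("an alias file needs at least one database")
    # DBLIST entries are basenames, i.e. paths without .nhr/.nin/.nsq extensions.
    dblist = " ".join(str(name) for name in db_basenames)
    return f"TITLE {title}\nDBLIST {dblist}\n"


def write_alias(directory, alias_name, title, db_basenames, nucleotide=True):
    """Write <alias_name>.nal (nucleotide) or <alias_name>.pal (protein)."""
    extension = ".nal" if nucleotide else ".pal"
    target = Path(directory) / (alias_name + extension)
    target.write_text(build_alias_text(title, db_basenames))
    return target
```

For example, `build_alias_text("all_dbs", ["/path/to/db1", "/path/to/db2"])` produces the same two-line structure as the all_dbs.nal file from the first message.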

Mark Gibbins

Aug 28, 2012, 12:35:08 PM
to sequenc...@googlegroups.com, Yannick Wurm


On Tuesday, 28 August 2012 15:40:19 UTC+1, halocaridina wrote:
> Hi Mark and Anurag,
>
> Thanks for the quick reply.  Answers to your questions are:
>
> 1) The .nal file is in the same directory as the DBs created using "sequenceserver format-databases".
>
> 2) Each DB has six files associated with it having extensions: .nhr .nin .nog .nsd .nsi .nsq
>
> 3) For each DB in that directory, there is a symlink to the fasta file that was used to build the database and the symlink shares the same name as each DB.  Thinking that having the symlinks and DBs sharing the same name in the same directory might be the problem, I moved the symlinks out of the directory, restarted Apache and revisited the SS page.  Same "- ignoring it" message.
>
> 4) The .nal file has the following structure:
>
> DBLIST /home/data_processed/db_4_blast/Abarenicola_pacifica /home/data_processed/db_4_blast/Alciopa_spp [absolute paths to the other 43 DBs, each separated by a space]
>
> 5) I'm using the basenames of each DB (i.e., no extensions) with an absolute path in the .nal file, which should be the correct syntax, as Anurag pointed out.

I think I have an idea of the problem now, but just to confirm, would you be able to either list the databases SS shows in the web interface or upload a screenshot? imgur.com is fine if you don't mind it being public.

This probably has something to do with the way SS determines what is and isn't part of a database set.

halocaridina

Aug 28, 2012, 12:44:56 PM
to sequenc...@googlegroups.com, Yannick Wurm

Hi Mark,

Sure, the list is:

Abarenicola_pacificaAlciopa_sppAncistrosyllis_groenlandicaAphroditida_japonica
Axiothella_rubrocincta
Boccardia_proboscidea
Chaetozona_spp
Chrysopetallid_colormorph1
Clymenella_torquata
Cossura_longicirrata
Delaya_leruthi
Enchytraeus_albidus
Galathowenia_oculata
Galeolaria_caespitosa
Glycera_dibranchiata
Glycinde_armigera
Goniada_brunnea
Halosydna_brevisetosa
Heteromastus_filiformis
Leitoscolopus_robustus
Lumbrineris_crassicephana
Magelona_beckleyi
Myxicola_infundibulum
Nainereis_laevigata
Nephtys_incisa
Nereis_succinea
Ninoe_nigrens
Paramphinome_jeffreysii
Pectinaria_gouldii
Phascolosoma_agassizii
Poeobius_meseres
Pomatoleios_kraussii
Sabella_pacifica
Scalibregma_inflata
Schizobranchia_insignis
Sparganophilus_spp
Sternapsis_scutata
Sthenalanella_uniformis
Syllis_cf_halina
Terebellides_stoemi
Themiste_pyroides
Thysanocardia_nigra
Tomopteris_spp

All 43 listed here are present in the SS web interface in the exact same order and spelling.

Glad we are getting to the source of the issue and hope this helps the project.

Cheers,

Halocaridina 

halocaridina

Aug 28, 2012, 12:46:49 PM
to sequenc...@googlegroups.com, Yannick Wurm

Hi Mark,

Paste error.  The three on the first line should have been on three separate lines.

Mark Gibbins

Aug 28, 2012, 1:52:45 PM
to sequenc...@googlegroups.com, Yannick Wurm
As I suspected, this seems to be a problem with the way SS handles alias files that alias databases
that don't match the NCBI naming format for multi-part databases (nr.00, nr.01, etc.). It is an extension of the multi-part
bug: SS should really query any alias file it finds and ignore the individual databases it lists in favour of the alias file.

This is why you see the 'ignoring it' log message: that is where SS ignores a db volume in favour of using the alias.
Unfortunately, it doesn't account for using aliases as a way of aggregating a large number of individual databases!

So I'll start working on this, but in the meantime, if you want to get around the bug for now and all you want to do is blast
all the databases you listed, you can use this alias file: http://pastebin.com/8J3FTa4J

It's picked up on my local server at the top of the list, so if you want to blast all databases, just select "All Databases".
It's ugly, but it works (or should do) :)

If you have any more problems, let me know.

Thanks,
Mark

halocaridina

Aug 28, 2012, 4:12:37 PM
to sequenc...@googlegroups.com, Yannick Wurm

Hi Mark,

Appreciate the effort.  Unfortunately, the error still persists using the alias that you provided.  I actually tried the same thing yesterday (removing the absolute paths) and got the "- ignoring it" message.

I noticed one interesting behavior yesterday, which started me down the path of trying to use an alias file and might shed some light on what's going on.

The individual FASTA files that I'm building the DBs from have anywhere from 40K-200K entries.  The DBs created from them using "sequenceserver format-databases" are picked up by SS and work fine from the web interface.

Yesterday morning, I concatenated all 45 FASTA files into a single FASTA file (headers have taxon-specific tags, so easy to track) and ran "sequenceserver format-databases" against that ~4.5GB file to generate a "mega" DB. Once created in the SS DB directory where all of the other DBs are, I restarted Apache and revisited the SS page and "mega" DB wasn't listed while everything else was.  

The error was exactly the same as with the alias file: "Found a multi-part database volume at /path/to/SS/db_directory - ignoring it" 

So, two different means of generating the same error. Could it be something size related, either as a single DB or the total of multiple smaller DBs?

Cheers,

Halocaridina

Mark Gibbins

Aug 28, 2012, 6:42:32 PM
to sequenc...@googlegroups.com
On 28 Aug 2012, at 21:12, halocaridina <haloca...@gmail.com> wrote:
> Hi Mark,
>
> Appreciate the effort.  Unfortunately, the error still persists using the alias that you provided.  I actually tried the same thing yesterday (removing the absolute paths) and got the "- ignoring it" message.
>
> I noticed one interesting behavior yesterday, which started me down the path of trying to use an alias file and might shed some light on what's going on.
>
> The individual FASTA files that I'm building the DBs from have anywhere from 40K-200K entries.  The DBs created from them using "sequenceserver format-databases" are picked up by SS and work fine from the web interface.
>
> Yesterday morning, I concatenated all 45 FASTA files into a single FASTA file (headers have taxon-specific tags, so easy to track) and ran "sequenceserver format-databases" against that ~4.5GB file to generate a "mega" DB. Once created in the SS DB directory where all of the other DBs are, I restarted Apache and revisited the SS page and "mega" DB wasn't listed while everything else was.

This is very odd as any valid formatted db should be detected by SS, regardless of its size.

> The error was exactly the same as with the alias file: "Found a multi-part database volume at /path/to/SS/db_directory - ignoring it"

That message is only printed when SS finds a multi-part database with a filename that matches the regex I created, so it must be finding a pre-formatted database in your db folder.

A copy of the full log from startup would be great, that way I can trace exactly what's happening.

You may also want to try cloning the latest code from github if you haven't already.
--
You received this message because you are subscribed to the Google Groups "sequenceserver" group.
To post to this group, send email to sequenc...@googlegroups.com.
To unsubscribe from this group, send email to sequenceserve...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/sequenceserver/-/P0ZisuIEk-kJ.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

halocaridina

Aug 29, 2012, 11:29:16 AM
to sequenc...@googlegroups.com

Hi Mark,

Thanks again for the help.  I eventually got everything worked out and the alias file that you sent is now working.

I feel a little stupid since it came down to a simple renaming of the .nal file.  Originally, I had just used the same .nal file name and pasted the alias information you provided into it. It dawned on me last night after going through this thread that this might be the problem.

Specifically, I changed the filename from "All_taxa_08_27_12.nal" to "All_taxa.nal" and the alias was picked up and added to the SS list of available DBs (I also checked that it indeed worked by doing a BLAST).  To verify that the old "All_taxa_08_27_12.nal" name was really the problem, I renamed the now working "All_taxa.nal" back to "All_taxa_08_27_12.nal", restarted SS and it was ignored/disappeared from the DB list.  Renamed it again to "All_taxa.nal", restarted and again the alias works.

Is this expected behavior?

Cheers, thanks again, and apologies for wasting your time,

Halocaridina 

Mark Gibbins

Aug 29, 2012, 11:34:44 AM
to sequenc...@googlegroups.com
On 29 Aug 2012, at 16:29, halocaridina <haloca...@gmail.com> wrote:
> Hi Mark,
>
> Thanks again for the help.  I eventually got everything worked out and the alias file that you sent is now working.
>
> I feel a little stupid since it came down to a simple renaming of the .nal file.  Originally, I had just used the same .nal file name and pasted the alias information you provided into it. It dawned on me last night after going through this thread that this might be the problem.
>
> Specifically, I changed the filename from "All_taxa_08_27_12.nal" to "All_taxa.nal" and the alias was picked up and added to the SS list of available DBs (I also checked that it indeed worked by doing a BLAST).  To verify that the old "All_taxa_08_27_12.nal" name was really the problem, I renamed the now working "All_taxa.nal" back to "All_taxa_08_27_12.nal", restarted SS and it was ignored/disappeared from the DB list.  Renamed it again to "All_taxa.nal", restarted and again the alias works.
>
> Is this expected behavior?
>
> Cheers, thanks again, and apologies for wasting your time,

This isn't expected behaviour, but I'm glad you found the cause! It's exposed a problem in the way SS handles aliases so I'm glad you let us know - no wasted time at all.

I expect the reason it wasn't being found was something to do with the naming of the alias file. It most likely triggered the regular expression to match it as a volume, rather than an alias.

Once this happens, SS ignores the file completely. This explains why it wasn't showing up on the web interface and you had that message in your log about ignoring a file, despite you not having any
pre-formatted volumes in your db folder.

Glad we could help and you've got it all moving now, and thanks for exposing a pretty annoying bug. I had a feeling this wouldn't be a complete fix.

Let us know if you need any more help or run into any more problems :)

Cheers,
Mark


Anurag Priyam

Sep 2, 2012, 2:49:01 AM
to sequenc...@googlegroups.com
Halocaridina and Mark,

On Wed, Aug 29, 2012 at 9:04 PM, Mark Gibbins <xiy...@gmail.com> wrote:
> On 29 Aug 2012, at 16:29, halocaridina <haloca...@gmail.com> wrote:
>> Specifically, I changed the filename from "All_taxa_08_27_12.nal" to
>> "All_taxa.nal" and the alias was picked up and added to the SS list of
>> available DBs (I also check that it indeed worked by doing a BLAST). To
>> verify that the old "All_taxa_08_27_12.nal" name was really the problem, I
>> renamed the now working "All_taxa.nal" back to "All_taxa_08_27_12.nal",
>> restarted SS and it was ignored/disappeared from the DB list. Renamed it
>> again to "All_taxa.nal", restarted and again the alias works.

Did you have the 'nal' extension added to your alias file residing on
your disk? Or are you just using it here for clarity -- to
distinguish alias files from the volumes and other files?

> I expect the reason it wasn't being found was something to do with the
> naming of the alias file. It most likely triggered the regular expression to
> match it as a volume, rather than an alias.

Git says multipart database recognition was last touched by Ben in
commit 647041d9, which causes SS to ignore database files only if they
end in two digits, extension included. So SS won't ignore
'/home/yeban/sequences/All_taxa_08_27_12.nal', unless there is no
'nal' at the end. See: http://rubular.com/r/TYtj0BJWyl.

On Tue, Aug 28, 2012 at 8:10 PM, halocaridina <haloca...@gmail.com> wrote:
> 4) The .nal file has the following structure:
>
> DBLIST /home/data_processed/db_4_blast/Abarenicola_pacifica
> /home/data_processed/db_4_blast/Alciopa_spp [absolute paths to the other 43
> DBs, each separated by a space]

What is the need for a custom alias file? You want to search 43
sequence files in one go, right? And what if you want to compare
against a subset of the databases, say 5 out of 43 databases, listed
in your alias file?

--
Anurag Priyam
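
The disagreement above (Mark suspects the file name tripped the volume regex, while Anurag reads the regex as safe when the extension is present) can be made concrete with a small experiment. The sketch below is in Python and uses two hypothetical patterns; neither is taken from the SequenceServer source, and the thread never confirms which check, if either, was actually applied. It only shows that the observed behaviour would follow if the extension were stripped before a loose "ends in two digits" test.

```python
import re

# Hypothetical patterns, NOT taken from the SequenceServer source.
STRICT = re.compile(r"\.\d{2}$")  # NCBI-style volume suffix, e.g. "nr.00"
LOOSE = re.compile(r"\d{2}$")     # any name that ends in two digits


def strip_blast_extension(filename):
    """Drop a trailing BLAST extension such as .nal, .pal, .nhr or .psq."""
    return re.sub(r"\.[np](?:al|hr|in|sq|sd|si|og)$", "", filename)


def looks_like_volume(filename, pattern):
    """True if the name, with its extension removed, matches the pattern."""
    return pattern.search(strip_blast_extension(filename)) is not None


for name in ["All_taxa_08_27_12.nal", "All_taxa.nal", "md5nr.01.phr"]:
    print(name, looks_like_volume(name, STRICT), looks_like_volume(name, LOOSE))
```

Under the loose pattern, "All_taxa_08_27_12.nal" is classified as a volume and "All_taxa.nal" is not, which matches what halocaridina saw. With the extension left in place, as Anurag describes, the loose pattern does not match either name.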

halocaridina

Sep 2, 2012, 9:31:16 AM
to sequenc...@googlegroups.com

Hi Anurag,

First, thanks for the notification regarding the point release.  I plan to update at the beginning of next week.

Yes, the alias file ended with the .nal extension during all of my debugging with Mark last week. 

Yes, the reason for the alias file is to search all DBs in the list using SS by just clicking a single radio button. The rationale is that I've set up SS as part of a collaborative project involving 12+ research groups and the number of DBs will continue to grow as more transcriptomes (used to build the DBs) are sequenced. Having a single option to "Select all" was something requested by the group, so I pursued the route of using an alias file. It is a feature that I see other groups being interested in if they want to search across all DBs in a long list.  For those wanting to search a subset of DBs, they are content just clicking the radio buttons of interest (or I might create alias files for commonly used subsets in our case).

Cheers and thanks again,

Halocaridina  

Anurag Priyam

Sep 2, 2012, 5:35:31 PM
to sequenc...@googlegroups.com
On Sun, Sep 2, 2012 at 7:01 PM, halocaridina <haloca...@gmail.com> wrote:
> First, thanks for the notification regarding the point release. I plan to
> update at the beginning of next week.

Can you run `gem list sequenceserver` and report the output? I want
to know the version number of SS installed on your system.

> Yes, the reason for the alias file is to search all DBs in the list using SS
> by just clicking a single radio button. The rationale is that I've set up SS
> as part of a collaborative project involving 12+ research groups and the
> number of DBs will continue to grow as more transcriptomes (used to build
> the DBs) are sequenced. Having a single option to "Select all" was something
> requested by the group, so I pursued the route of using an alias file. It is
> a feature that I see other groups being interested in if they want to search
> across all DBs in a long list. For those wanting to search a subset of DBs,
> they are content just clicking the radio buttons of interest (or I might
> create alias files for commonly used subsets in our case).

Hmm. I was trying to guess if a 'select all' button would be of help.
So you list databases independently too, so a different combination
could be used. Maybe SS could do something about grouping databases
together (issue #29).

[29]: https://github.com/yannickwurm/sequenceserver/issues/29

--
Anurag Priyam

Mark Gibbins

Sep 2, 2012, 7:37:52 PM
to sequenc...@googlegroups.com
Typos courtesy of my iPad
I was actually going to suggest a database management area where you
can create/edit groups of databases. You could have it use the built-in
BLAST tool for creating aliases and aggregating DBs together, then
you could list those groups on the main page. What do you think?


Anurag Priyam

Sep 2, 2012, 8:21:48 PM
to sequenc...@googlegroups.com
Mark,

On Mon, Sep 3, 2012 at 5:07 AM, Mark Gibbins <xiy...@gmail.com> wrote:
>> Hmm. I was trying to guess if a 'select all' button would be of help.
>> So you list databases independently too, so a different combination
>> could be used. Maybe SS could do something about grouping databases
>> together (issue #29).
>>
>> [29]: https://github.com/yannickwurm/sequenceserver/issues/29
>
> I was actually going to suggest a database management area where you
> can create/edit groups of databases. You could have it use the built-in
> BLAST tool for creating aliases and aggregating DBs together, then
> you could list those groups on the main page. What do you think?

Let's take this off list so as to not spam others with development
noise. I have sent you an email summarizing my idea of database grouping.

--
Anurag Priyam

halocaridina

Sep 2, 2012, 8:50:00 PM
to sequenc...@googlegroups.com

Hi Anurag and Mark,

Anurag, here is the info on SS version #:

[19:42:51] [halocaridina@deathstar ~]$ gem list sequenceserver

*** LOCAL GEMS ***

sequenceserver (0.8.0)
-----------------------------

I literally updated to 0.8.0 a day or two before starting this thread.

I would encourage the discussion between the SS developers regarding options on DB grouping.  I think most people would like this type of feature, as well as not having the redundancy of concatenated FASTA files that take up extra space.

Please let me know if there is anything I can help with.

Cheers,

Halocaridina

Anurag Priyam

Sep 2, 2012, 10:35:56 PM
to sequenc...@googlegroups.com
On Mon, Sep 3, 2012 at 6:20 AM, halocaridina <haloca...@gmail.com> wrote:
> [19:42:51] [halocaridina@deathstar ~]$ gem list sequenceserver
>
> *** LOCAL GEMS ***
>
> sequenceserver (0.8.0)
> -----------------------------
>
> I literally updated to 0.8.0 a day or two before starting this thread.

Ok, I am out of ideas then as to why SS rejects All_taxa_08_27_12.nal :-/.

> I would encourage the discussion between the SS developers regarding options
> on DB grouping. I think most people would like this type of feature as well
> as not having the redundancy of concatenated FASTA that take up extra space.

Yep, we are on it :). Though, I will refrain from estimating the
development time for now.

--
Anurag Priyam

halocaridina

Sep 3, 2012, 12:18:52 AM
to sequenc...@googlegroups.com
Hi Anurag,

I'll keep an eye on my system and "play" around with filenames in the meantime to see if I can help shed some light on this.

Thanks, totally understandable regarding development cycles.  Now that I know the "trick" for naming alias files, it will be nothing but a little combination of grep/sed/awk to take care of populating them for my production purposes.

Cheers,

Halocaridina
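
The grep/sed/awk step mentioned above could equally be done in a few lines of Python. This is only a sketch under assumptions: each nucleotide DB is recognised by its .nsq file (one of the six extensions listed earlier in the thread), and the naming "trick" is reduced to refusing alias names that end in a digit, which is the only pattern this thread actually observed to fail. Both function names are hypothetical.

```python
def db_basenames(directory, filenames):
    """Absolute DBLIST entries for every nucleotide DB that has a .nsq file."""
    root = directory.rstrip("/")
    return sorted(
        f"{root}/{name[:-len('.nsq')]}"
        for name in filenames
        if name.endswith(".nsq")
    )


def check_alias_name(name):
    """Refuse alias names ending in a digit, the workaround found in this thread."""
    if name and name[-1].isdigit():
        raise ValueError(f"alias name {name!r} ends in a digit")
    return name
```

In practice the file list would come from `os.listdir("/home/data_processed/db_4_blast")`, and the returned basenames would be joined with spaces to form the DBLIST line.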

Bert Brutzel

Aug 28, 2013, 10:51:47 AM
to sequenc...@googlegroups.com
Good Day,

Thanks as well for developing and maintaining SequenceServer. I believe I have a similar problem. I am trying to work with the MD5nr database, which also uses an alias file:

$ cat md5nr.pal
#
# Alias file created Sun Apr  1 16:21:26 2012
#
#
TITLE md5nr
#
DBLIST md5nr.00 md5nr.01 md5nr.02 md5nr.03 md5nr.04 md5nr.05 md5nr.06 md5nr.07
#
#GILIST
#
#OIDLIST
#

The individual files are

$ ls
build_env     md5nr.00.psi  md5nr.01.psi  md5nr.02.psi  md5nr.03.psi  md5nr.04.psi  md5nr.05.psi  md5nr.06.psi  md5nr.07.psi
md5nr         md5nr.00.psq  md5nr.01.psq  md5nr.02.psq  md5nr.03.psq  md5nr.04.psq  md5nr.05.psq  md5nr.06.psq  md5nr.07.psq
md5nr.00.phr  md5nr.01.phr  md5nr.02.phr  md5nr.03.phr  md5nr.04.phr  md5nr.05.phr  md5nr.06.phr  md5nr.07.phr  md5nr_blast.tar.gz
md5nr.00.pin  md5nr.01.pin  md5nr.02.pin  md5nr.03.pin  md5nr.04.pin  md5nr.05.pin  md5nr.06.pin  md5nr.07.pin  md5nr.pal
md5nr.00.psd  md5nr.01.psd  md5nr.02.psd  md5nr.03.psd  md5nr.04.psd  md5nr.05.psd  md5nr.06.psd  md5nr.07.psd



I am also getting the "ignoring it" message:

[2013-08-28T16:36:46.204690 #6471]  INFO -- : Found a multi-part database volume at /home/jfk/Software/MD5nr/md5nr.01 - ignoring it.

When I run a sequence against the database I see the "turning wheel" but never get any results. I am not sure whether this is due to the ignored databases or whether the machine I am using is simply too old (P4 with 1 GB of RAM), so that the search takes forever given the 6 GB size of the database.

Help or pointers are much appreciated.
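
To check that an alias file like the one above really lists every volume, it helps to read it the way a tool would: skip the comment lines and collect the TITLE and DBLIST values. Below is a minimal Python sketch; the function name `parse_alias` is hypothetical, only the keywords visible in this thread are handled, and quoted DBLIST entries (for paths with spaces) are not supported.

```python
def parse_alias(text):
    """Return (title, dblist) from the text of a BLAST alias file."""
    title = None
    dblist = []
    for raw in text.splitlines():
        line = raw.strip()
        # Lines starting with '#' are comments, including unused
        # keywords such as #GILIST and #OIDLIST.
        if not line or line.startswith("#"):
            continue
        keyword, _, value = line.partition(" ")
        if keyword == "TITLE":
            title = value.strip()
        elif keyword == "DBLIST":
            dblist.extend(value.split())
    return title, dblist


sample = """#
# Alias file created Sun Apr  1 16:21:26 2012
#
TITLE md5nr
#
DBLIST md5nr.00 md5nr.01 md5nr.02 md5nr.03 md5nr.04 md5nr.05 md5nr.06 md5nr.07
#
#GILIST
#
"""
title, volumes = parse_alias(sample)
print(title, len(volumes))
```

Each name in the returned list can then be compared against the directory listing, for example by checking that a matching .phr, .pin and .psq file exists for every volume.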

Ben Woodcroft

Aug 28, 2013, 9:26:37 PM
to sequenceserver
Hi Bert,

I wonder what stage seqserv is being slow on. What happens if you run a query (and so get the spinning wheel), and then look at the command line where seqserv is running from? Also, if you use the command "top", what process is taking the most CPU?

ben



Bert Brutzel

Aug 29, 2013, 3:19:07 AM
to sequenc...@googlegroups.com
Hi Ben,

thanks for the quick reply.

Serverside output:

D, [2013-08-29T09:13:12.854989 #10647] DEBUG -- : method: blastp
D, [2013-08-29T09:13:12.855219 #10647] DEBUG -- : sequence: MTGTTGATAWR
D, [2013-08-29T09:13:12.855494 #10647] DEBUG -- : database: ["69d7ff233621b78e5ef844130befbae9"]
D, [2013-08-29T09:13:12.855640 #10647] DEBUG -- : advanced:


top:

top - 09:15:03 up 1 day, 16:50,  3 users,  load average: 0,77, 0,31, 0,15
Tasks: 144 total,   2 running, 142 sleeping,   0 stopped,   0 zombie
%Cpu(s): 20,9 us,  7,7 sy,  0,0 ni, 26,0 id, 45,1 wa,  0,0 hi,  0,3 si,  0,0 st
KiB Mem:   1017684 total,   950224 used,    67460 free,    31272 buffers
KiB Swap:  2074620 total,   370312 used,  1704308 free,   602548 cached

D, [2013-08-29T09:14:38.228354 #10707] DEBUG -- : sequence: MTGTTGATAWR
10742 jfk       20   0 2301m 308m 306m R  50,4 31,1   0:12.15 blastp
   23 root      20   0     0    0    0 S   9,3  0,0   0:54.39 kswapd0
 9096 root      20   0     0    0    0 S   0,3  0,0   0:01.67 kworker/0:2
10737 jfk       20   0 23300 1644 1136 R   0,3  0,2   0:00.12 top

The runtime is so high because the process does not seem to have stopped since my trials yesterday, despite my killing it.

Bert Brutzel

Aug 29, 2013, 3:24:00 AM
to sequenc...@googlegroups.com
Wow... it suddenly worked with a short query. The output also indicates that the full database is read. Thank you and sorry for bothering you. I simply need more powerful hardware, not some old machine I had lying around.
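
The `top` output earlier in this exchange shows 45,1 % of CPU time spent in I/O wait and `kswapd0` busy on a machine with about 1 GB of RAM, which is consistent with a 6 GB database that cannot be held in memory. A rough way to spot this in advance is to compare the on-disk size of the volumes with physical RAM. The Python sketch below makes several assumptions: the helper names are hypothetical, `os.sysconf` with these keys is available on Linux and similar POSIX systems only, and "fits in RAM" is a coarse heuristic rather than a rule BLAST itself enforces.

```python
import os


def database_bytes(directory, prefix):
    """Total size of all files named <prefix>.* (volumes, indexes, alias)."""
    total = 0
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.startswith(prefix + "."):
            total += entry.stat().st_size
    return total


def physical_ram_bytes():
    """Physical memory in bytes (POSIX only, via sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def fits_in_ram(db_bytes, ram_bytes):
    """Coarse check: can the whole database be cached in memory?"""
    return db_bytes <= ram_bytes
```

With the numbers reported here, `fits_in_ram(6 * 1024**3, 1017684 * 1024)` is false, so long searches on this machine would be expected to be dominated by disk reads.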