lsd-admin hanging without crashing, even with NWORKERS=2; lsd-query OK


Will Clarkson

Feb 14, 2016, 9:58:55 PM
to lsd-users
Hello all - I am finding that lsd-admin hangs without crashing even when attempting to create a table; however, lsd-query on an existing database seems to work fine.

I think the problem must be memory allocation as in Edouard Bernard’s thread on a similar issue (thread started 5/21/15); however, the solutions I see in that thread are not solving the problem for me; lsd-admin is hanging with NWORKERS=4 or even NWORKERS=2. 

This is with a very recent version of lsd (from the past week) installed onto a 16 GB, 8-core HP Linux workstation running Scientific Linux 7.2 (although the same problem occurred under Ubuntu 15.10). I am the only user of this machine, and the only application I am (knowingly) running is the Terminal (to launch lsd). So I don’t think the machine is under unusually high load.

An example session and output are both given below. Like Edouard, I see lsd taking a very small amount of system resources (immediately on running lsd-admin create table the CPU Usage was as high as 24% but it quickly dropped to below 1%). Unlike that thread, however, I do not see any defunct processes.

Does anyone have any suggestion for a way to get round this problem? Are there methods to reduce the memory requirements other than setting NWORKERS?

I noticed at the end of the 5/21/15 thread a suggestion to reduce the iteration block size, but I don’t see how one does that… can anyone let me know how to accomplish that please? 

Thanks all!!

Will

$ source /Users/Shared/Soft/lsd-install/environment.sh 
$ source /Users/Shared/Soft/lsd-install/lsd_environment.sh 
$ export LSD_DB=/home/wiclarks/Shared/Data/db
$ lsd-query --version
Large Survey Database, version 
$ lsd-query --bounds='beam(200, 40, 1)' 'select ra, dec, g-r as gr, r from sdss where (g - r > 0.5) & (g - r < 0.6)'
 [3 el.]# ra dec gr r
198.76639995  39.68804213   0.568  20.100
199.08998169  39.31682169   0.587  21.935
199.14925463  39.26829537   0.595  20.067
[extensive screen output clipped]

$ export NWORKERS=2
$ lsd-admin create table --schema=/home/wiclarks/Shared/Data/db_schema/galex.yaml galexTEST

[hangs. Meanwhile on another terminal...]

$ top
top - 21:16:53 up 58 min,  3 users,  load average: 0.00, 0.01, 0.05
Tasks: 229 total,   1 running, 228 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 16194656 total, 15056816 free,   438824 used,   699016 buff/cache
KiB Swap:  8191996 total,  8191996 free,        0 used. 15491160 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
17505 wiclarks  20   0  555580  71048  13556 S   0.3  0.4   0:01.22 lsd-admin
    1 root      20   0  192100   7400   2612 S   0.0  0.0   0:02.14 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.00 ksoftirqd/0
    4 root      20   0       0      0      0 S   0.0  0.0   0:00.08 kworker/0:0
    [etc.]

Bertrand Goldman

Feb 15, 2016, 3:53:06 AM
to lsd-...@googlegroups.com
Hi Will,

  More expert users may be able to help better, and you have probably thought of this already, but did you remove the special .__* files in the directories where your databases are (typically?) created? Those files prevent parallel modification or creation of tables.
  Otherwise, I had problems similar to yours, but they disappeared when the computer was not overloaded by other users. I guess the point at which lsd starts hanging depends on the computer's specifications.

  Cheers,
    Bertrand.

Eddie Schlafly

Feb 15, 2016, 1:46:41 PM
to lsd-...@googlegroups.com
That surprises me. The lsd-admin create command just makes a little
schema file, as far as I know, and shouldn't be resource intensive.
My guess is that Bertrand is right, and there's some leftover database
transaction or lock file that needs to be removed. I'm guessing the
create command is hanging as it tries to obtain that lock. A related
test would be to set your LSD_DB to something new and see if that
works.
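
A quick way to look for such leftovers is to list the hidden .__* entries in the database directory. The following is only a minimal sketch: the function name list_lsd_lock_candidates is hypothetical and not part of lsd, and it assumes the leftover files sit directly inside the directory that LSD_DB points to.

```shell
# Hypothetical helper (not part of lsd): print any hidden ".__*" entries
# found directly inside the given database directory (default: $LSD_DB).
list_lsd_lock_candidates() {
    db_dir="${1:-$LSD_DB}"
    for path in "$db_dir"/.__*; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
    return 0
}
```

Calling list_lsd_lock_candidates "$LSD_DB" prints one path per line, or nothing if the directory is clean. It only lists files; it does not delete anything.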

Will Clarkson

Feb 15, 2016, 1:59:40 PM
to lsd-users

Thanks, Eddie and Bertrand, for such rapid responses.

Indeed there were two lock-files in my LSD_DB directory; running in a pristine new LSD_DB location worked fine. I therefore removed the .__dblock.lock file and also the .__transaction file in my old LSD_DB location. lsd-admin now appears to work fine there too.

So I think my problem is solved, thank you both very much!  

Will

Eddie Schlafly

Feb 15, 2016, 2:06:20 PM
to lsd-...@googlegroups.com
> Indeed there were two lock-files in my LSD_DB directory; running in a
> pristine new LSD_DB location worked fine. I therefore removed the
> .__dblock.lock file and also the .__transaction file in my old LSD_DB
> location. lsd-admin now appears to work fine there too.

For what it's worth, these are intended to prevent multiple operations
from simultaneously editing a database, and should be automatically
removed when the db modification is complete. If the modification
hangs for whatever reason, they can get left behind and need to be
manually cleaned up. It should not be problematic to delete them if
you have killed the processes associated with the modification that
did not complete.
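
The manual cleanup described above could be sketched roughly as follows. This is an illustration under stated assumptions, not an lsd feature: the function name remove_stale_lsd_locks is hypothetical, the two file names are the ones reported in this thread, and the check assumes a running modification appears under the process name lsd-admin (as in the top output earlier in the thread). Worker processes with other names would not be detected.

```shell
# Hypothetical helper (not part of lsd): remove the two leftover files named
# in this thread, but only if no lsd-admin process appears to be running.
remove_stale_lsd_locks() {
    db_dir="$1"
    # Assumption: an unfinished modification shows up as "lsd-admin".
    if pgrep -x lsd-admin >/dev/null 2>&1; then
        echo "lsd-admin is still running; leaving $db_dir untouched" >&2
        return 1
    fi
    for name in .__dblock.lock .__transaction; do
        if [ -e "$db_dir/$name" ]; then
            rm -f -- "$db_dir/$name"
            echo "removed $db_dir/$name"
        fi
    done
}
```

Usage would be remove_stale_lsd_locks "$LSD_DB". The function refuses to delete anything while an lsd-admin process is found, which matches the advice to kill the unfinished modification first.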