Some programs have problems reading from BeeGFS


Jesper Frandsen

Jan 20, 2016, 5:32:40 AM
to beegfs-user
Hi,

We are running BeeGFS 2015.03-r9 on two instances (one with Infiniband and one on Ethernet).
All our users have their home folder on BeeGFS.

Case 1)
Unfortunately, users of the program LCMODEL for spectroscopy analysis have huge issues with this program, as it cannot read the directory list.

No files or directories show up in the "open file" selector in the program.

It is quite easy to reproduce, as a free evaluation copy of the program can be downloaded from
http://www.s-provencher.com/pages/lcm-test.shtml

As long as the "open file" selector is looking at the BeeGFS filesystem, nothing shows up. However, if one starts by looking at the local file system or at NFS-mounted drives, everything acts as expected. The problem occurs only on BeeGFS.

This is a major problem for those of our users who run LCMODEL. We have the registered version of the program, and it is locked to a machine running BeeGFS.

Case 2)
Another example is not a show-stopper, but easy to reproduce.

Installing the "Adblock" Addon in Firefox fails!
We have tested several versions of Firefox (including v43.0.4).
Start Firefox -> Add-ons -> Search for "adblock" -> Select Install

We get "There was an error installing Adblock Plus"

If the ".mozilla" folder is placed on a local filesystem and ".mozilla" is a symlink to the local folder, installation of AdBlock is completed without errors.

Case 3)
Probably the same as case 2. Users running Thunderbird as their mail client with their home folders on BeeGFS are not able to use the Address Book. No entries can be added.

Yours
  Jesper Frandsen






Ely de Oliveira

Jan 22, 2016, 12:02:53 PM
to fhgfs...@googlegroups.com
Hi Jesper,

Have you tried to use strace to check which syscalls are failing?
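
For long traces, the failing calls can be picked out mechanically. The following is a minimal Python sketch that is not part of the original thread: it assumes the trace was saved as plain strace text, and the `failed_syscalls` helper, the sample lines, and the set of errors treated as noise are all illustrative assumptions.

```python
import re

# Matches strace lines ending in "= -1 ERRNO (description)",
# with an optional leading PID as written by "strace -f -o file".
FAILED_CALL = re.compile(
    r"^(?:\d+\s+)?(?P<call>\w+)\(.*\)\s+= -1 (?P<errno>E[A-Z0-9]+) \((?P<msg>.*)\)\s*$"
)

# Errors that are often harmless noise in a trace (an assumption; adjust as needed).
USUALLY_BENIGN = {"EAGAIN", "ENOENT", "EINTR"}


def failed_syscalls(lines, skip=USUALLY_BENIGN):
    """Yield (syscall, errno_name, message) for each failing call not in `skip`."""
    for line in lines:
        match = FAILED_CALL.match(line.strip())
        if match and match.group("errno") not in skip:
            yield match.group("call"), match.group("errno"), match.group("msg")


# Hypothetical sample lines in strace format.
sample = [
    'access("/some/dir/trash", F_OK) = -1 ENOENT (No such file or directory)',
    'mkdir("/some/dir/trash", 0755) = 0',
    'rename("/some/dir/staged/a.xpi", "/some/dir/a.xpi") = -1 EBUSY (Device or resource busy)',
]
for call, err, msg in failed_syscalls(sample):
    print(f"{call}: {err} ({msg})")
```

Passing `skip=set()` reports every failing call, which is useful when a normally harmless error such as ENOENT turns out to matter.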

Best regards,

Ely


Jesper Frandsen

Jan 25, 2016, 8:05:18 AM
to beegfs-user, ely.ol...@itwm.fraunhofer.de
On Friday, January 22, 2016 at 18:02:53 UTC+1, Ely de Oliveira wrote:
> Hi Jesper,
>
> Have you tried to use strace to check which syscalls are failing?
> Best regards,

Hi Ely,

Thanks for the reply.

The traces are very long, but here are the lines that show the problems.

1) First for LCMODEL:

When browsing the local filesystem, LCMODEL is able to list the directories (example for the root of the file system):

chdir("/")                              = 0
getcwd("/", 4097)                       = 2
stat64("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
open("/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
getdents64(5, /* 24 entries */, 4096)   = 616
getdents64(5, /* 0 entries */, 4096)    = 0
close(5)                                = 0
stat64("/bin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/boot", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/dev", {st_mode=S_IFDIR|0755, st_size=4400, ...}) = 0
stat64("/etc", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/initrd.img", {st_mode=S_IFREG|0644, st_size=29851961, ...}) = 0
stat64("/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/lib64", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/lost+found", {st_mode=S_IFDIR|0700, st_size=16384, ...}) = 0
stat64("/media", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/opt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/proc", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
stat64("/root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
stat64("/run", {st_mode=S_IFDIR|0755, st_size=740, ...}) = 0
stat64("/sbin", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/srv", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/sys", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
stat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
stat64("/usr", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat64("/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0


But when browsing a folder on the BeeGFS file system, the strace output looks like this:

chdir("/mnt/beegfs/foo")                = 0
getcwd("/mnt/beegfs/foo", 4097)         = 16
lstat64("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/mnt/beegfs", {st_mode=S_IFDIR|0777, st_size=1, ...}) = 0
stat64("/mnt/beegfs/foo/", {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
open("/mnt/beegfs/foo/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
old_mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff74af000
getdents64(5, /* 35 entries */, 524288) = 1144
munmap(0xf74af000, 528384)              = 0
close(5)                                = 0
getcwd("/mnt/beegfs/foo", 4097)         = 16
lstat64("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/mnt/beegfs", {st_mode=S_IFDIR|0777, st_size=1, ...}) = 0
stat64("/mnt/beegfs/foo/", {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
open("/mnt/beegfs/foo/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0755, st_size=33, ...}) = 0
fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
old_mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff74af000
getdents64(5, /* 35 entries */, 524288) = 1144
munmap(0xf74af000, 528384)              = 0
close(5)                                = 0
chdir("/mnt/beegfs/foo/.lcmodel/bin")   = 0
write(3, "\2\30\4\0\235\0\240\2\2\0\0\0\377\377\377\0\2\0\4\0\235\0\240\2\2\0\0\0\377\377\377\0"..., 64) = 64
select(4, [3], [], [], {0, 0})          = 0 (Timeout)
write(3, "5\30\4\0\252\0\240\2\235\0\240\2O\1\27\0F\0\5\0\252\0\240\2!\0\240\2\0\0\0\0"..., 44) = 44
read(3, 0xffcf5bb0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30}\21\0\0\0\0\301\0\0\0\0\0\0\0O\1\27\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\7\0\252\0\240\0027\0\240\2\5\0\20\0\n\0fid*  FID*F\0\5\0"..., 268) = 268
select(4, [3], [], [], {0, 0})          = 0 (Timeout)
select(4, [3], [], [], {0, 0})          = 0 (Timeout)
write(3, "5\30\4\0\252\0\240\2[\0\240\2\326\0v\2+\0\1\0", 20) = 20
read(3, 0xffcf5e00, 32)                 = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\1\215\21\0\0\0\0\206\0\240\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "F\30\5\0\252\0\240\2I\0\240\2\0\0\0\0\326\0v\2F*\5\0\252\0\240\2\\\0\240\2"..., 96) = 96
read(3, 0xffcf5a10, 32)                 = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\222\21\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\5\0\252\0\240\2J\0\240\2\27\0\f\0\1\0/\0F*\5\0\252\0\240\2\\\0\240\2"..., 96) = 96
read(3, 0xffcf59a0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)
select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\227\21\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\6\0\252\0\240\2J\0\240\2\36\0\33\0\3\0mnt\0\5\0F\0\5\0\252\0\240\2"..., 100) = 100
read(3, 0xffcf5930, 32)                 = -1 EAGAIN (Resource temporarily unavailable)


It seems that LCMODEL sees the 35 files/folders in /mnt/beegfs/foo, but they are NOT shown in the file dialog (and they do not seem to be examined, as there are no "stat64" calls like those for the local filesystem).
Maybe the LCMODEL program cannot detect the 16 folders in the list of 35 entries?

2) Installing a plugin for Firefox

Here the strace file is ~30000 lines long, but this is what happens when Firefox tries to install the plugin:

First, when the Firefox profile folder is on BeeGFS:

access("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", F_OK) = -1 ENOENT (No such file or directory)
mkdir("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", 0755) = 0
access("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}", F_OK) = -1 ENOENT (No such file or directory)
access("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi", F_OK) = -1 ENOENT (No such file or directory)
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/staged/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi", {st_mode=S_IFREG|0600, st_size=1001911, ...}) = 0
write(25, "\372", 1)                    = 1
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/staged/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi", {st_mode=S_IFREG|0600, st_size=1001911, ...}) = 0
access("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions", F_OK) = 0
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions", {st_mode=S_IFDIR|0700, st_size=0, ...}) = 0
rename("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/staged/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi", "/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi") = -1 EBUSY (Device or resource busy)
mmap(NULL, 65536, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f40ad116000                                     
write(1, "1453709154270\taddons.xpi\tERROR\tF"..., 13121453709154270    addons.xpi      ERROR   Failed to move file /mnt/beegfs/foo/.mozi\
lla/firefox/1k6x37bj.default/extensions/staged/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi to /mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.de\
fault/extensions: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIFile.moveTo]"  nsresult: "0x80004005 \
(NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/addons/XPIProvider.jsm :: SIO_installFile :: line 371"  data: no] Stac\
k trace: SIO_installFile()@resource://gre/modules/addons/XPIProvider.jsm:371 < SIO_installDirEntry()@resource://gre/modules/addons/XPIPro\
vider.jsm:451 < SIO_move()@resource://gre/modules/addons/XPIProvider.jsm:472 < DirInstallLocation_installAddon()@resource://gre/modules/a\
ddons/XPIProvider.jsm:7206 < AI_startInstall/<()@resource://gre/modules/addons/XPIProvider.jsm:5557 < next()@self-hosted:706 < TaskImpl_r\
un()@resource://gre/modules/Task.jsm:330 < Handler.prototype.process()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promi\
se-backend.js:934 < this.PromiseWalker.walkerLoop()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:813 <\
 this.PromiseWalker.scheduleWalkerLoop/<()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:747 < <file:un\
known>                                                                                                                                   
) = 1312                                                                                                                                 
write(1, "1453709154271\taddons.xpi\tERROR\tF"..., 2211453709154271     addons.xpi      ERROR   Failure moving /mnt/beegfs/foo/.mozilla/f\
irefox/1k6x37bj.default/extensions/staged/{d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d}.xpi to /mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default\
/extensions                                                                                                                              
) = 221
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0                   
chmod("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", 0755) = 0                                                    
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0                   
lstat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0                  
openat(AT_FDCWD, "/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 72   
getdents(72, /* 2 entries */, 32768)    = 48                                                                                             
getdents(72, /* 0 entries */, 32768)    = 0                                                                                              
close(72)                               = 0                                                                                              
rmdir("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/extensions/trash") = 0                                                          
write(1, "1453709154273\taddons.xpi\tWARN\tFa"..., 12561453709154273    addons.xpi      WARN    Failed to install /tmp/tmp-25a.xpi from h\
ttps://addons.mozilla.org/firefox/downloads/file/387464/adblock_plus-2.7.1-sm+tb+an+fx.xpi?src=api: [Exception... "Component returned fai\
lure code: 0x80004005 (NS_ERROR_FAILURE) [nsIFile.moveTo]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://\
gre/modules/addons/XPIProvider.jsm :: SIO_installFile :: line 371"  data: no] Stack trace: SIO_installFile()@resource://gre/modules/addon\
s/XPIProvider.jsm:371 < SIO_installDirEntry()@resource://gre/modules/addons/XPIProvider.jsm:451 < SIO_move()@resource://gre/modules/addon\
s/XPIProvider.jsm:472 < DirInstallLocation_installAddon()@resource://gre/modules/addons/XPIProvider.jsm:7206 < AI_startInstall/<()@resour\
ce://gre/modules/addons/XPIProvider.jsm:5557 < next()@self-hosted:706 < TaskImpl_run()@resource://gre/modules/Task.jsm:330 < Handler.prot\
otype.process()@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:934 < this.PromiseWalker.walkerLoop()@res\
ource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:813 < this.PromiseWalker.scheduleWalkerLoop/<()@resource://g\
re/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:747 < <file:unknown>
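
The failing step in this trace is the rename of the staged .xpi file into its parent directory, which returns EBUSY. A small Python sketch along the following lines could help check whether a plain rename of that shape fails outside Firefox. It is only an approximation: the `try_staged_rename` helper and the file names are hypothetical, and Firefox may hold the file open or mapped in ways this test does not imitate, so a successful run does not rule out the problem.

```python
import errno
import os
import tempfile


def try_staged_rename(base_dir):
    """Create base_dir/staged/test.xpi and rename it into base_dir.

    Returns None on success, or the errno name (e.g. "EBUSY") on failure.
    """
    staged = os.path.join(base_dir, "staged")
    os.makedirs(staged, exist_ok=True)
    source = os.path.join(staged, "test.xpi")
    target = os.path.join(base_dir, "test.xpi")
    with open(source, "wb") as handle:
        handle.write(b"dummy payload")
    try:
        os.rename(source, target)
    except OSError as error:
        return errno.errorcode.get(error.errno, str(error.errno))
    return None


# Run once on a local temp dir; call the function with a directory on the
# file system under test to compare.
with tempfile.TemporaryDirectory() as local_dir:
    print("local:", try_staged_rename(local_dir) or "ok")
```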


And this is when the Firefox profile folder is on a local hard disk:

rename("/tmp/mozilla/firefox/1k6x37bj.default/prefs-1.js", "/tmp/mozilla/firefox/1k6x37bj.default/prefs.js") = 0                         
mkdir("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/adblockplus", 0755) = 0                                                         
futex(0x7fb3fccd6bcc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7fb3fccd6bc8, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1                            
futex(0x7fb413500040, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)                                        
futex(0x7fb413500040, FUTEX_WAKE_PRIVATE, 1) = 0  
[snip]
lstat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/adblockplus/elemhide.css", {st_mode=S_IFREG|0600, st_size=6861, ...}) = 0       
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/adblockplus/elemhide.css", {st_mode=S_IFREG|0600, st_size=6861, ...}) = 0        
open("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/adblockplus/elemhide.css", O_RDONLY) = 36                                        
stat("/mnt/beegfs/foo/.mozilla/firefox/1k6x37bj.default/adblockplus/elemhide.css", {st_mode=S_IFREG|0600, st_size=6861, ...}) = 0   


(Maybe I have cut too much from the strace file, but there was a lot of noise.)

Yours
 Jesper Frandsen

Sven Breuner

Jan 29, 2016, 8:56:40 PM
to fhgfs...@googlegroups.com, jrfra...@gmail.com
Hi Jesper,

regarding the LCMODEL strace: I can't see any wrong behavior of BeeGFS in the
strace so far.

Just as you said, in the case of the root directory, getdents64 is the syscall
to query the directory contents. The interesting thing here is that it gets
called twice: The first time it returns 24 entries and then it gets called
again, returning 0 entries (indicating the end of directory). Afterwards the
attributes of each entry are queried via stat64, which (depending on which
particular information the LCMODEL file selector needs) might only be necessary
for file systems that do not return the entry type (i.e. "file", "directory",
...) in the getdents64 result list.

BeeGFS does return entry types in the getdents64 result list, so some programs
can avoid an extra stat64() call due to this.
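
As an illustration of listing a directory with the entry types taken from the listing itself, here is a minimal Python sketch. It says nothing about how LCMODEL is implemented; the `list_directory` helper and the demo names are hypothetical. Python's `os.scandir()` uses the entry type reported by the file system where it is available and only falls back to a stat call when the type is unknown.

```python
import os
import tempfile


def list_directory(path):
    """Return sorted (name, kind) pairs using type hints from the listing."""
    result = []
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False classifies the entry itself, so a
            # symlink is reported as "other" rather than as its target.
            if entry.is_dir(follow_symlinks=False):
                kind = "directory"
            elif entry.is_file(follow_symlinks=False):
                kind = "file"
            else:
                kind = "other"
            result.append((entry.name, kind))
    return sorted(result)


# Demo on a throwaway directory (hypothetical names).
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "Data001"))
    open(os.path.join(demo, "notes.txt"), "w").close()
    print(list_directory(demo))
```

Running such a script under strace on both file systems would show whether per-entry stat calls appear in one case and not the other.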

However, in the case of the "/mnt/beegfs/foo" directory, BeeGFS is returning 35
entries in the first call to getdents64(). So far, everything looks good and
correct.
Then, instead of calling getdents64 again to query more entries, as happened
in the case of the root file system, it seems like LCMODEL decides to change into
the subdir "/mnt/beegfs/foo/.lcmodel/bin". I assume LCMODEL learned about the
existence of this subdir from the getdents64 result, so this also seems to be ok.
I guess this subdir does not exist in the case of the root dir, so that's where
the two test cases differ.
However, also the change to this subdir works fine according to the strace.
The rest of the strace is then about the file descriptors 3 and 4, for which we
don't see information in the strace part that you provided. What are they
referring to? Some network communication that is failing with timeouts maybe?

Best regards,
Sven
> [...]

Jesper Frandsen

Feb 2, 2016, 8:37:10 AM
to beegfs-user, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de
Hi again,

The LCMODEL program works for local disk and for NFS mounted disks, but not for BeeGFS mounted disks.

I have tried the tuneEarlyCloseResponse = false option for the BeeGFS client, but it made no difference.
Any suggestions are very welcome :-)

Furthermore I had a closer look at the strace files.
The first observation is that there is a difference in the return values from lstat64, which returns st_size=4096 for local disk, but other values for BeeGFS folders:


< lstat64("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
< lstat64("/mnt/beegfs", {st_mode=S_IFDIR|0777, st_size=1, ...}) = 0
< lstat64("/mnt/beegfs/foo", {st_mode=S_IFDIR|0755, st_size=35, ...}) = 0
< lstat64("/mnt/beegfs/foo/.lcmodel", {st_mode=S_IFDIR|0755, st_size=20, ...}) = 0
< lstat64("/mnt/beegfs/foo/.lcmodel/lib", {st_mode=S_IFDIR|0755, st_size=3, ...}) = 0
< lstat64("/mnt/beegfs/foo/.lcmodel/lib/tcl", {st_mode=S_IFDIR|0755, st_size=17, ...}) = 0
< access("/mnt/beegfs/foo/.lcmodel/lib/tcl/init.tcl", F_OK) = 0
< stat64("/mnt/beegfs/foo/.lcmodel/lib/tcl/init.tcl", {st_mode=S_IFREG|0644, st_size=22451, ...}) = 0
< open("/mnt/beegfs/foo/.lcmodel/lib/tcl/init.tcl", O_RDONLY|O_LARGEFILE) = 3
---
> lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
> lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
> lstat64("/tmp/foo/.lcmodel", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> lstat64("/tmp/foo/.lcmodel/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> lstat64("/tmp/foo/.lcmodel/lib/tcl", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> access("/tmp/foo/.lcmodel/lib/tcl/init.tcl", F_OK) = 0
> stat64("/tmp/foo/.lcmodel/lib/tcl/init.tcl", {st_mode=S_IFREG|0644, st_size=22451, ...}) = 0
> open("/tmp/foo/.lcmodel/lib/tcl/init.tcl", O_RDONLY|O_LARGEFILE) = 3

I do not know if that is a problem. The trace above was from LCMODEL starting and reading setup files.

The following is strace when examining a folder with 100 subfolders:

First on a local disk (again, note that lstat64 and stat64 always return st_size=4096):
getcwd("/tmp/foo/Data", 4097)           = 14
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
open("/tmp/foo/Data/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0

fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
getdents64(5, /* 102 entries */, 4096)  = 3248
brk(0x8ee1000)                          = 0x8ee1000
brk(0x8ee2000)                          = 0x8ee2000

getdents64(5, /* 0 entries */, 4096)    = 0
close(5)                                = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data001", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data002", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data003", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data004", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data005", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data006", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data007", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat64("/tmp/foo/Data/Data008", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=4096, ...}) = 0
lstat64("/tmp/foo", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat64("/tmp/foo/Data", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
[etc]


The following is a trace of LCMODEL on a BeeGFS disk when looking at a folder with 100 subfolders.
But the program does not "see" the subfolders.

getcwd("/mnt/beegfs/foo/Data", 4097)    = 21

lstat64("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat64("/mnt/beegfs", {st_mode=S_IFDIR|0777, st_size=1, ...}) = 0
lstat64("/mnt/beegfs/foo", {st_mode=S_IFDIR|0755, st_size=38, ...}) = 0
stat64("/mnt/beegfs/foo/Data/", {st_mode=S_IFDIR|0775, st_size=100, ...}) = 0
open("/mnt/beegfs/foo/Data/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 5
fstat64(5, {st_mode=S_IFDIR|0775, st_size=100, ...}) = 0

fcntl64(5, F_SETFD, FD_CLOEXEC)         = 0
old_mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7551000
getdents64(5, /* 102 entries */, 524288) = 3248
munmap(0xf7551000, 528384)              = 0
close(5)                                = 0
chdir("/mnt/beegfs/foo")                = 0
write(3, "\2\30\4\0\235\0\240\2\2\0\0\0\377\377\377\0\2\0\4\0\235\0\240\2\2\0\0\0\377\377\377\0"..., 108) = 108
read(3, "\10\3\317\7\343]|\31\301\0\0\0\233\0\240\2\0\0\0\0\206\2\227\0\213\0\22\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0\227\0\240\2\233\0\240\2\206\2\227\0\214\0\22\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0\222\0\240\2\227\0\240\2\206\2\227\0\214\0!\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0X\0\240\2\222\0\240\2\206\2\227\0\225\0&\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0W\0\240\2X\0\240\2\206\2\227\0\225\0&\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0V\0\240\2W\0\240\2\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\10\4\317\7\343]|\31\301\0\0\0U\0\240\2V\0\240\2\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\7\3\317\7\343]|\31\301\0\0\0P\0\240\2\0\0\0\0\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\1\30\335\7\0\0\0\0\301\0\0\0\0\0\0\0O\1\27\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\7\0\27\0\240\0027\0\240\2\5\0\20\0\n\0fid*  FID*F\377\5\0"..., 376) = 376
read(3, 0xffb5c0c0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\10\3\346\7\352]|\31\301\0\0\0P\0\240\2\0\0\0\0\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0U\0\240\2V\0\240\2\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0V\0\240\2W\0\240\2\206\2\227\0\231\0*\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0W\0\240\2X\0\240\2\206\2\227\0\225\0&\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0X\0\240\2\222\0\240\2\206\2\227\0\225\0&\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0\222\0\240\2\227\0\240\2\206\2\227\0\214\0!\0\0\0\0\3", 32) = 32
read(3, "\7\4\346\7\352]|\31\301\0\0\0\227\0\240\2\233\0\240\2\206\2\227\0\214\0\22\0\0\0\0\3", 32) = 32
read(3, "\7\3\346\7\352]|\31\301\0\0\0\233\0\240\2\0\0\0\0\206\2\227\0\213\0\22\0\0\0\0\3", 32) = 32
read(3, "\1\2\360\7\0\0\0\0\206\0\240\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "F\30\5\0\30\0\240\2I\0\240\2\0\0\0\0\326\0v\2F*\5\0\30\0\240\2\\\0\240\2"..., 96) = 96
read(3, 0xffb5bcd0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\365\7\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\5\0\30\0\240\2J\0\240\2\27\0\f\0\1\0/\0F*\5\0\30\0\240\2\\\0\240\2"..., 96) = 96
read(3, 0xffb5bc60, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\372\7\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\6\0\30\0\240\2J\0\240\2\36\0\33\0\3\0mnt\0\5\0F\0\5\0\30\0\240\2"..., 100) = 100
read(3, 0xffb5bbf0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\377\7\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\6\0\30\0\240\2J\0\240\2%\0*\0\6\0beegfsF\0\5\0\30\0\240\2"..., 100) = 100
read(3, 0xffb5bb80, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\4\10\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\6\0\30\0\240\2J\0\240\2,\0009\0\3\0foo\0fsF\0\5\0\30\0\240\2"..., 180) = 180
read(3, 0xffb5bb10, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\30\r\10\0\0\0\0\301\0\0\0\0\0\0\0\326\0v\2\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
write(3, "J\30\6\0\30\0\240\2K\0\240\0023\0H\0\4\0Data\0sC\0\5\0\30\0\240\2"..., 516) = 516
read(3, 0xffb5b5a0, 32)                 = -1 EAGAIN (Resource temporarily unavailable)

select(4, [3], NULL, NULL, NULL)        = 1 (in [3])
read(3, "\1\2\"\10\0\0\0\0\206\0\240\2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32
[snip]

For some reason, LCMODEL cannot see the 100 subfolders.

File descriptors 3 and 4 appear to be:
socket(PF_LOCAL, SOCK_STREAM, 0)        = 3
open("/mnt/beegfs/foo/.lcmodel/profiles/1/gui-defaults", O_RDONLY|O_CREAT|O_LARGEFILE, 0644) = 4

Any help is much appreciated!

Yours
 Jesper Frandsen






Peter Serocka

Feb 2, 2016, 9:13:03 AM
to fhgfs...@googlegroups.com, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de
st_size is supposed to give the size in bytes.
For directories that is the space of the entries table
(or whatever data structure is used).

Apparently for the “local” filesystem here that is
at least one 4KB block.

For BeeGFS we see that st_size holds the number of files/subdirs.

Without the LCMODEL source code one could not say
how exactly the application gets confused by receiving
the smaller number here. Maybe it makes the plausible
assumption that the st_size of a directory should be
at least large enough to hold all names of its entries
(files, immediate subdirs).
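
To see the difference described here on a given mount, one could compare a directory's st_size with the number of entries and the bytes needed for their names. The following Python sketch is an assumption-laden illustration, not a diagnosis of LCMODEL: the `directory_size_report` helper is hypothetical, and the demo runs on a temporary local directory.

```python
import os
import tempfile


def directory_size_report(path):
    """Compare st_size of a directory with what its entries would need."""
    names = os.listdir(path)  # excludes "." and ".."
    return {
        "st_size": os.stat(path).st_size,
        "entry_count": len(names),
        # Lower bound on the bytes needed just to store the names.
        "name_bytes": sum(len(os.fsencode(name)) for name in names),
    }


# Demo on a throwaway directory with three subfolders (hypothetical names).
with tempfile.TemporaryDirectory() as demo:
    for name in ("Data001", "Data002", "Data003"):
        os.mkdir(os.path.join(demo, name))
    print(directory_size_report(demo))
```

A directory where `st_size` is smaller than `name_bytes` would be one where the assumption suggested above does not hold.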

fwiw

— Peter

Jesper Frandsen

Feb 26, 2016, 3:45:57 AM
to beegfs-user, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de
Hi Pete et al

I'm sure that you're right, but BeeGFS still behaves differently than NFS or a local ext3/4 file system for some programs.

This is a big problem for our LCMODEL users.

We do not have access to the source code, so it is not possible for us to change the program.

But as I mentioned, the same problem (I think) can easily be reproduced with the Firefox web browser.
Running the Linux version of Firefox (we run the 64-bit version) with its ".mozilla" folder on BeeGFS, it is not possible to install the Adblock Plus extension.
This is quite easy to test, and I would very much like to know whether it works for others.

We're running 2015.03-r9 on the production system and 2015.03-r10 on a test system, and we have the same problems on both systems.
On the production system both client machine and servers run Ubuntu 14.04 64-bit server. We run Ubuntu 15.10 64-bit on the test system.

Yours,
 Jesper Frandsen

Peter Serocka

Feb 29, 2016, 11:00:00 AM2/29/16
to fhgfs...@googlegroups.com
Jesper

you could check whether LCMODEL runs successfully
with a “fixed” BeeGFS that you might create
from the published sources. Where the value
is assigned to st_size for a directory, multiply
it by the maximum possible filename length NAME_MAX.
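
The arithmetic of that experiment can be sketched as follows. This is only an illustration in Python, not the actual BeeGFS change (which would have to be made in the BeeGFS sources); `padded_dir_size` is a hypothetical helper, and the fallback of 255 is an assumption based on the usual Linux value of NAME_MAX.

```python
import os


def name_max(path):
    """Maximum filename length on the filesystem that holds `path`."""
    try:
        return os.pathconf(path, "PC_NAME_MAX")
    except (OSError, ValueError, AttributeError):
        # Assumed fallback: NAME_MAX is 255 on common Linux filesystems.
        return 255


def padded_dir_size(entry_count, max_name_len):
    """Directory size the experiment would report: entries * NAME_MAX."""
    return entry_count * max_name_len
```

For the directory with 100 subfolders and a NAME_MAX of 255, the reported size would become 25500 bytes instead of 100.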

The Firefox issue might be caused by
something else like links not being
created in other folders.

Cheers

— Peter

Sven Breuner

Mar 27, 2016, 8:18:11 PM3/27/16
to beegfs-user, Jesper Frandsen
Hi Jesper,

just a shot in the dark, but I noticed that LCMODEL (or more specifically
lcmgui) is a 32bit program. Can you check whether the problem also exists with a
64bit build? Depending on the build options of a 32bit program, this can make a
difference (e.g. not being able to deal with 64bit inode numbers and such).
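
One way to test this guess without a 64bit build is to look at the inode numbers themselves. A 32bit program built without large-file support can get EOVERFLOW from stat() or readdir() when a value does not fit into its 32bit fields. The sketch below, assuming Python 3 on the client and using made-up function names, lists the entries of a directory whose inode number needs more than 32 bits. Whether BeeGFS hands out such numbers on this system is not confirmed anywhere in this thread.

```python
import os

# Largest value that fits into an unsigned 32bit inode field.
MAX_32BIT = 0xFFFFFFFF


def fits_in_32bit(ino):
    """True if an inode number can be stored in 32 bits."""
    return 0 <= ino <= MAX_32BIT


def entries_beyond_32bit(path):
    """Return (name, inode) pairs whose inode number needs more than 32 bits."""
    big = []
    with os.scandir(path) as it:
        for entry in it:
            ino = entry.inode()
            if not fits_in_32bit(ino):
                big.append((entry.name, ino))
    return big
```

If `entries_beyond_32bit()` returns a non-empty list for the directory that LCMODEL fails to show, the 32bit explanation becomes more plausible; an empty list points elsewhere.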

(By the way, as Peter also said, I think the Firefox problem is a completely
different thing.)

Best regards,
Sven

Jesper Frandsen

Apr 11, 2016, 4:14:07 AM4/11/16
to beegfs-user, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de
OK, let's ignore LCMODEL for the time being.

But it is still not optimal that users (who all have their home folders on BeeGFS) cannot use certain add-ons in Firefox, and it seems that the Address Book in Thunderbird does not work if one's home folder is on BeeGFS.

I have attached a screenshot of the error message you get if you try to install the Adblock add-on in Firefox.

It is run on a test-setup:

Metaserver: 64-bit Ubuntu Server 16.04 Beta2
Storageserver: 64-bit Ubuntu Server 16.04 Beta2
Client: 64-bit Ubuntu Desktop 16.04 LTS

All machines are running BeeGFS 2015.03-r11. Meta server and storage server are set up according to the BeeGFS recommendations.

Does anyone else experience similar problems?

Yours,
 Jesper




BeeGFS_problem1.png
BeeGFS_problem2.png

nlm...@g.clemson.edu

Apr 21, 2016, 4:05:44 AM4/21/16
to beegfs-user, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de
Do these programs use any databases, e.g. sqlite? If I recall, the sqlite code relies on file locking, which is not available in a lot of cluster file systems designed for HPC.

Thanks,

Nick Mills
Graduate Research Assistant
Clemson University

nlm...@g.clemson.edu

Apr 22, 2016, 1:57:38 AM4/22/16
to beegfs-user, jrfra...@gmail.com, sven.b...@itwm.fraunhofer.de, nlm...@g.clemson.edu
Nope, I'm mistaken. Sqlite appears to work. Maybe it was Berkeley DB I was thinking of?

-Nick

Sven Breuner

Apr 22, 2016, 11:54:47 AM4/22/16
to fhgfs...@googlegroups.com, jrfra...@gmail.com, nlm...@g.clemson.edu
Hi Nick,

nlm...@g.clemson.edu wrote on 22.04.2016 07:57:
> Nope, I'm mistaken. Sqlite appears to work. Maybe it was Berkeley DB I was
> thinking of?
>
> -Nick
>
> On Thursday, April 21, 2016 at 4:05:44 AM UTC-4, nlm...@g.clemson.edu wrote:
>
> Do these programs use any databases, e.g. sqlite? If I recall, the sqlite
> code relies on file locking, which is not available in a lot of cluster file
> systems designed for HPC.


for BeeGFS, lockf() / flock() / fcntl(F_SETLK) work by default to synchronize
processes running on the same machine.
Enabling global locks to synchronize processes that are running on different
machines is also possible by setting "tuneUseGlobalFileLocks=true" in
beegfs-client.conf.
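
A quick way to check the single-machine case on a given mount is sketched below, assuming Python 3 on the client. The function name and the scratch file name `.locktest` are made up for illustration. It exercises `flock()` and the fcntl-based `lockf()` from one process only, so it says nothing about locks between different machines.

```python
import fcntl
import os


def try_locks(directory):
    """Try flock() and lockf() on a scratch file inside `directory`.

    Returns a dict mapping the API name to None on success,
    or to the error text if taking or releasing the lock failed.
    """
    path = os.path.join(directory, ".locktest")  # hypothetical scratch file
    results = {}
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for name, lock in (("flock", fcntl.flock), ("lockf", fcntl.lockf)):
            try:
                # Non-blocking exclusive lock, released again right away.
                lock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                lock(fd, fcntl.LOCK_UN)
                results[name] = None
            except OSError as exc:
                results[name] = str(exc)
    finally:
        os.close(fd)
        os.unlink(path)
    return results
```

Running `try_locks("/mnt/beegfs/foo")` and `try_locks("/tmp")` side by side shows whether either locking API behaves differently on the BeeGFS mount than on a local filesystem.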

Best regards,
Sven


