File descriptor of child higher than MAX_FD


Xander Cage

Apr 8, 2021, 8:58:28 AM
to help-cfengine

hi,

I see these messages in the outputs. I'm not sure, but this seems to have started when I began doing remote copies (pulling log files from clients to the policy server) with the agent backgrounded for parallelization. I found some old references from the cf2 days, but I have no segfaults or anything like that. Besides, this is AIX here, not Linux.


root@nimvie: /var/cfengine/outputs # grep descriptor *
cf_nimvie__1617872228_Thu_Apr__8_10_57_08_2021_102: warning: File descriptor 180 of child 34996634 higher than MAX_FD, check for defunct children
cf_nimvie__1617872828_Thu_Apr__8_11_07_08_2021_104: warning: File descriptor 380 of child 12255978 higher than MAX_FD, check for defunct children

wbr

chris

Aleksey Tsalolikhin

Apr 9, 2021, 7:05:24 PM
to Xander Cage, help-cfengine
Hi Christian,

Looks like you're hitting MAX_FD which is hardcoded to 128:


Here is where the error message comes from:


You said you are parallelizing work -- could it be you are spawning more than 128 children?

Best,
Aleksey

-- 
Founder
Vertical Sysadmin, Inc.
Achieve real learning.



Nick Anderson

Apr 9, 2021, 8:02:54 PM
to Aleksey Tsalolikhin, Xander Cage, help-cfengine

Xander Cage

Apr 10, 2021, 5:32:15 AM
to help-cfengine
Aleksey, I'm pulling one log file from every client (about 450) on every schedule (every 5 minutes, I think), with action => bg("1", "4") set, so I guess 128 children are easily exceeded.

Nick, I looked at the ticket. I have maxconnections set to 800 and nofiles set to 8192 in /etc/security/limits.conf, so it seems these settings have no effect.
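
One way to double-check that the operating-system limit is not the thing being hit is to print the descriptor limits a process actually sees. The following is only a minimal sketch: the helper names demo_get_nofile_limits and demo_print_nofile_limits are made up for illustration and are not part of CFEngine. It uses the POSIX getrlimit() call, and a value of RLIM_INFINITY will simply show up as a very large number.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not CFEngine code): fetch the soft and hard
 * RLIMIT_NOFILE values of the calling process.
 * Returns 0 on success, -1 if getrlimit() fails. */
static int demo_get_nofile_limits(unsigned long long *soft,
                                  unsigned long long *hard)
{
    struct rlimit rl;

    assert(soft != NULL && hard != NULL);

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
    {
        return -1;
    }

    *soft = (unsigned long long) rl.rlim_cur;
    *hard = (unsigned long long) rl.rlim_max;
    return 0;
}

/* Hypothetical helper: print both limits so they can be compared with
 * the descriptor numbers reported in the warning. */
static void demo_print_nofile_limits(void)
{
    unsigned long long soft = 0, hard = 0;

    if (demo_get_nofile_limits(&soft, &hard) == 0)
    {
        printf("RLIMIT_NOFILE soft=%llu hard=%llu\n", soft, hard);
    }
    else
    {
        perror("getrlimit");
    }
}
```

If the soft limit printed this way is well above the descriptor numbers in the warning (180 and 380 here), the warning is probably about the agent's own MAX_FD bookkeeping rather than the nofiles setting.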

Xander Cage

Apr 11, 2021, 5:09:48 AM
to help-cfengine
I looked at it a little more closely and could not find anything in cf-serverd's source code that handles MAX_FD dynamically. It seems to be a static thing.


#include <sys/resource.h>
#include <unistd.h>

long getFileDescriptorLimit(void) {
    long long sysconfResult = sysconf(_SC_OPEN_MAX);
    struct rlimit rl;
    long long rlimitResult;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
        rlimitResult = 0;
    } else {
        rlimitResult = (long long) rl.rlim_max;
    }
    long result;
    // Use the larger of the two reported limits.
    if (sysconfResult > rlimitResult) {
        result = sysconfResult;
    } else {
        result = rlimitResult;
    }
    if (result < 0) {
        // Both calls returned errors.
        result = 9999;
    } else if (result < 2) {
        // The calls reported broken values.
        result = 2;
    }
    return result;
}



Maybe CFE-2557 was closed unintentionally, or before the daily dose of caffeine ;-)

Xander Cage

Apr 12, 2021, 7:07:58 AM
to help-cfengine
Hmm... this problem is not as easy to handle as it seems. If I understand it correctly, MAX_FD is a per-thread setting, and the code already does a realloc when the descriptors are exhausted. This needs the deeper knowledge of a CFEngine expert.

static void ChildrenFDSet(int fd, pid_t pid)
{
    int new_max = 0;

    if (fd >= MAX_FD)
    {
        Log(LOG_LEVEL_WARNING,
            "File descriptor %d of child %jd higher than MAX_FD, check for defunct children",
            fd, (intmax_t) pid);
        new_max = fd + 32;
    }

    ThreadLock(cft_count);

    if (new_max)
    {
        CHILDREN = xrealloc(CHILDREN, new_max * sizeof(pid_t));
        MAX_FD = new_max;
    }

    CHILDREN[fd] = pid;
    ThreadUnlock(cft_count);
}
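
To make the grow-on-demand behaviour easier to see, here is a small stand-alone sketch of the same idea without the logging and locking. The names demo_max_fd, demo_children and demo_children_fd_set are hypothetical stand-ins, not CFEngine identifiers, and the starting size of 128 is taken from Aleksey's remark above. New slots are zeroed here for tidiness, which the snippet above does not show.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the MAX_FD and CHILDREN globals. */
static int demo_max_fd = 128;
static pid_t *demo_children = NULL;

/* Record pid under fd, growing the table when fd is out of range.
 * Returns 1 if the table had to grow (the case that logs the warning
 * in the real function), 0 if it fit, -1 on bad input or allocation
 * failure. */
static int demo_children_fd_set(int fd, pid_t pid)
{
    int grew = 0;

    if (fd < 0)
    {
        return -1;
    }

    if (demo_children == NULL)
    {
        demo_children = calloc((size_t) demo_max_fd, sizeof(pid_t));
        if (demo_children == NULL)
        {
            return -1;
        }
    }

    if (fd >= demo_max_fd)
    {
        /* Same growth rule as above: leave 32 slots of headroom. */
        int new_max = fd + 32;
        pid_t *tmp = realloc(demo_children, (size_t) new_max * sizeof(pid_t));
        if (tmp == NULL)
        {
            return -1;
        }
        for (int i = demo_max_fd; i < new_max; i++)
        {
            tmp[i] = 0; /* zero the newly added slots */
        }
        demo_children = tmp;
        demo_max_fd = new_max;
        grew = 1;
    }

    assert(fd < demo_max_fd);
    demo_children[fd] = pid;
    return grew;
}
```

With the values from the first warning, demo_children_fd_set(180, 34996634) grows the table from 128 to 212 entries and still stores the pid. In this sketch, then, exceeding the initial size costs a reallocation but does not lose the child.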

Nick Anderson

Apr 12, 2021, 4:03:20 PM
to Xander Cage, help-cfengine

Xander Cage <christia...@itsv.at> writes:

Hmm... this problem is not as easy to handle as it seems. If I understand it correctly, MAX_FD is a per-thread setting, and the code already does a realloc when the descriptors are exhausted. This needs the deeper knowledge of a CFEngine expert.

Hi Christian,

Probably it would be best if you can collect the information you have and put it into our issue tracker along with reproduction information.

https://tracker.mender.io/projects/CFE/issues

Xander Cage

Apr 14, 2021, 2:35:42 AM
to help-cfengine