error: Open connections exceed FD_SETSIZE limit (1024 >= 1024)

Markus Rexhepi-Lindberg

Nov 12, 2025, 2:25:13 AM
to help-cfengine
I have a bundle that runs on the hub and collects files from all clients that were last seen within the past 2 hours. This bundle creates a lot of open connections to the clients.

Here is a snippet of the bundle (the full bundle declares 22 files in total to be collected):

bundle agent collect_files
{
  vars:
    # get a list of all clients which have connected to the policy master within the last two hours
    "cf_clients" slist => hostsseen("2","lastseen","name");

  files:
    "/tmp/cmdb_collect"
      create => "true",
      perms => m("600");

  methods:
    "any" usebundle => filecollector("$(cf_clients)"),
       action => silent_sometimes;
}

body action silent_sometimes {
  background => "true";
  ifelapsed => "30";
  expireafter => "30";
}

bundle agent filecollector(cf_client)
{
  files:
    "/tmp/cmdb_collect/$(cf_client)"
      perms => m("600"),
      copy_from => no_backup_short_timeout_rcp("${sys.workdir}/collect/report.csv", "$(cf_client)"),
      action => immediate,
      comment => "Get cmd data from all clients",
      move_obstructions => "true";
}

body copy_from no_backup_short_timeout_rcp(from,server)
{
      servers     => { "$(server)" };
      source      => "$(from)";
      compare     => "mtime";
      copy_backup => "false";
      timeout     => "5";
}

While this bundle runs on the hub, I monitor the file descriptor count of the `cf-agent` process. As soon as the count reaches ~1024, cf-agent starts failing with the following error:

error: Open connections exceed FD_SETSIZE limit (1024 >= 1024)

I monitor the fd count using the following command:

# ls -1 /proc/$(pidof cf-agent)/fd | wc -l

I have tried increasing the fd limit:

# grep "Max open files" /proc/$(pidof cf-agent)/limits
Max open files 24000 28000 files

But I believe FD_SETSIZE is not something I can alter for the kernel.

I have maxconnections set to 13000 in cf_serverd.cf:

# grep control_server_maxconnections /var/cfengine/inputs/def.json
    "control_server_maxconnections": "13000",

I recently upgraded my environment from 3.21.2 to 3.21.7 and the hub server from Ubuntu 22.04 to 24.04 and I _believe_ that this was not an issue before.

Am I missing something or is the method I use to collect files from clients not proper?

--
Markus


Nick Anderson

Nov 12, 2025, 12:40:42 PM
to help-c...@googlegroups.com

Hey Markus,

The first thing that jumped out at me from above was how you call your bundle to copy the file:

methods:
  "any" usebundle => filecollector("$(cf_clients)"),
     action => silent_sometimes;

You reference the list of clients with $. So, the bundle will be actuated once for each client. Have you instead tried passing the list so that the bundle is called just once and the iteration on the client happens inside the bundle?

Like:

methods:
  "any" usebundle => filecollector("@(cf_clients)"),
     action => silent_sometimes;
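
For illustration, here is a minimal, untested sketch of how the receiving bundle could look once it is handed the whole list. The parameter name `clients` is a hypothetical rename to make the iteration explicit; the bodies are the ones from the original post:

```
bundle agent filecollector(clients)
{
  files:
    # "clients" is assumed to hold the whole list here, so $(clients)
    # iterates over the hosts inside this one bundle call.
    "/tmp/cmdb_collect/$(clients)"
      perms => m("600"),
      copy_from => no_backup_short_timeout_rcp("$(sys.workdir)/collect/report.csv", "$(clients)"),
      action => immediate,
      comment => "Get cmd data from all clients",
      move_obstructions => "true";
}
```

Keeping the original parameter name `cf_client` should behave the same way; the rename only makes it clearer that a list is being passed.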

Regarding the maximum number of connections: I see that you mention setting control_server_maxconnections. Note that this variable is used by body server control (cf-serverd). For your policy, where a single cf-agent is creating a bunch of connections, you might consider setting default:def.control_agent_maxconnections instead. It's set to 30 in the MPF by default.

I would link you to the docs that mention it, but it doesn't seem to be explicitly called out yet, so I just filed an issue about it here: https://northerntech.atlassian.net/browse/CFE-4602
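
For reference, the underlying setting is the `maxconnections` attribute of body agent control. Below is a minimal sketch for a policy set that does not use the MPF. The MPF already defines body agent control, so there the value should be supplied through the def variable rather than by redefining the body. The value 100 is an arbitrary illustration, not a recommendation:

```
body agent control
{
      # Upper bound on outgoing connections from cf-agent to cf-serverd.
      # 100 is an assumed example value.
      maxconnections => "100";
}
```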

Nick Anderson

Nov 12, 2025, 1:52:19 PM
to help-cfengine

Markus Rexhepi-Lindberg

3:42 AM
to help-cfengine
Hi Nick,

Thanks for the suggestion to change from $ to @ when referencing the list of clients. This helped in that the rest of the promises now get to run even when the cf-agent process hits FD_SETSIZE.

I forgot to mention it, but I did try altering the control_agent_maxconnections variable via augments and did not see any difference. Thanks for updating the documentation though!

Is FD_SETSIZE perhaps a limitation in the cf-agent program itself? Looking at the core source (with my VERY limited C knowledge), it seems that the TryConnect function [1] uses select(), which to my understanding is limited to 1024 file descriptors. I also noticed pull request 1953 [2], which suggested changing select() to poll() but was ultimately closed because Windows does not support poll().

[1] https://github.com/cfengine/core/blob/e639010a9d2453ed7d8559fb9f15d78b65ebfb52/libcfnet/net.c#L615
[2] https://github.com/cfengine/core/pull/1953