error: Open connections exceed FD_SETSIZE limit (1024 >= 1024)


Markus Rexhepi-Lindberg

Nov 12, 2025, 2:25:13 AM
to help-cfengine
I have a bundle that runs on the hub and collects files from all clients that were last seen within the past 2 hours. This bundle creates a lot of open connections towards the clients.

Here is a snippet of the bundle (in the full bundle, 22 files in total are declared to be collected):

bundle agent collect_files
{
  vars:
    # get a list of all clients which have connected to the policy master within the last two hours
    "cf_clients" slist => hostsseen("2","lastseen","name");

  files:
    "/tmp/cmdb_collect"
      create => "true",
      perms => m("600");

  methods:
    "any" usebundle => filecollector("$(cf_clients)"),
       action => silent_sometimes;
}

body action silent_sometimes {
  background => "true";
  ifelapsed => "30";
  expireafter => "30";
}

bundle agent filecollector(cf_client)
{
  files:  
      "/tmp/cmdb_collect/$(cf_client)"
      perms => m("600"),
      copy_from => no_backup_short_timeout_rcp("${sys.workdir}/collect/report.csv", "$(cf_client)"),
      action => immediate,
      comment => "Get cmd data from all clients",
      move_obstructions => "true";
}

body copy_from no_backup_short_timeout_rcp(from,server)
{
      servers     => { "$(server)" };
      source      => "$(from)";
      compare     => "mtime";
      copy_backup => "false";
      timeout     => "5";
}

While this bundle runs on the hub, I monitor the file descriptor count of the `cf-agent` process. As soon as the count reaches ~1024, cf-agent starts failing with the following error message:

error: Open connections exceed FD_SETSIZE limit (1024 >= 1024)

I monitor the fd count using the following command:

# ls -1 /proc/$(pidof cf-agent)/fd | wc -l

I have tried increasing the fd limit:

# grep "Max open files" /proc/$(pidof cf-agent)/limits
Max open files 24000 28000 files

But I believe FD_SETSIZE is not something I can change through kernel settings.

I have maxconnections set to 13000 in cf_serverd.cf:

# grep control_server_maxconnections /var/cfengine/inputs/def.json
    "control_server_maxconnections": "13000",

I recently upgraded my environment from 3.21.2 to 3.21.7 and the hub server from Ubuntu 22.04 to 24.04 and I _believe_ that this was not an issue before.

Am I missing something, or is the method I use to collect files from clients not a proper one?

--
Markus


Nick Anderson

Nov 12, 2025, 12:40:42 PM
to help-c...@googlegroups.com

Hey Markus,

The first thing that jumped out at me from above was how you call your bundle to copy the file:

methods:
  "any" usebundle => filecollector("$(cf_clients)"),
     action => silent_sometimes;

You reference the list of clients with $, so the bundle will be actuated once for each client. Have you instead tried passing the list itself, so that the bundle is called just once and the iteration over the clients happens inside the bundle?

Like:

methods:
  "any" usebundle => filecollector("@(cf_clients)"),
     action => silent_sometimes;
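
To make the difference concrete, here is a minimal sketch of the receiving side when the whole list is passed. It is the filecollector bundle from the original post with only comments added. The assumption is standard CFEngine list iteration: a parameter bound to a list is iterated implicitly wherever it is referenced as a scalar with $().

```cf3
bundle agent filecollector(cf_client)
{
  files:
      # cf_client is now bound to the whole list. Referencing it as
      # $(cf_client) makes this single promise iterate over every client
      # inside one actuation of the bundle.
      "/tmp/cmdb_collect/$(cf_client)"
        perms => m("600"),
        copy_from => no_backup_short_timeout_rcp("$(sys.workdir)/collect/report.csv", "$(cf_client)"),
        action => immediate,
        comment => "Get cmd data from all clients",
        move_obstructions => "true";
}
```

The bundle body itself does not need to change; only the way it is called does.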

Regarding the maximum number of connections: I see that you mention setting control_server_maxconnections. Note that this variable is used by body server control (cf-serverd). For your policy, where a single cf-agent creates a bunch of outgoing connections, you might consider setting default:def.control_agent_maxconnections instead. It's set to 30 in the MPF by default.

I would link you to the docs that mention it, but it doesn't seem to be explicitly called out yet so I just filed an issue about it here https://northerntech.atlassian.net/browse/CFE-4602

Nick Anderson

Nov 12, 2025, 1:52:19 PM
to help-cfengine

Markus Rexhepi-Lindberg

Nov 13, 2025, 3:42:33 AM
to help-cfengine
Hi Nick,

Thanks for the suggestion to change from $ to @ when referencing the list of clients. This helped in that the rest of the promises now get to run even when the cf-agent process hits FD_SETSIZE.

I forgot to mention it, but I did try altering the control_agent_maxconnections variable via augments and did not see any difference. Thanks for updating the documentation though!

Is FD_SETSIZE perhaps a limitation in the cf-agent program itself? Looking at the core source (with my VERY limited C knowledge), it seems that the select() function is used in the TryConnect function [1], which to my understanding is limited to 1024 file descriptors. I also noticed pull request 1953 [2], which suggested changing select() to poll() but was ultimately closed because Windows does not support poll().

[1] https://github.com/cfengine/core/blob/e639010a9d2453ed7d8559fb9f15d78b65ebfb52/libcfnet/net.c#L615
[2] https://github.com/cfengine/core/pull/1953

Nick Anderson

Nov 13, 2025, 1:47:20 PM
to help-cfengine
Hmm, this is on Ubuntu?

How many hosts are you copying from in this way? I wonder if you could slice up your list and make multiple methods calls to the bundle, one for each section of the list.
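
A rough sketch of what such slicing could look like, reusing the filecollector bundle from the original post. The bundle name collect_files_sliced, the variable names slice_a and slice_b, and the split on the first character of the host name are all hypothetical. Whether this keeps the number of open descriptors in cf-agent below FD_SETSIZE has not been confirmed in this thread; it depends on when the agent closes its connections.

```cf3
bundle agent collect_files_sliced
{
  vars:
      # Same host list as in the original collect_files bundle.
      "cf_clients" slist => hostsseen("2", "lastseen", "name");

      # Hypothetical split: host names starting with a-m go into slice_a.
      # Arguments: filter(filter, list, is_regex, invert, max_return)
      "slice_a" slist => filter("[a-m].*", cf_clients, "true", "false", 99999);

      # invert set to "true" returns every host that slice_a did not match,
      # so the two slices together cover the whole list without overlap.
      "slice_b" slist => filter("[a-m].*", cf_clients, "true", "true", 99999);

  methods:
      # One methods call per slice, each passing its list with @.
      "collect_slice_a" usebundle => filecollector("@(slice_a)");
      "collect_slice_b" usebundle => filecollector("@(slice_b)");
}
```

With more hosts, more slices would be needed, and the regular expressions would have to be chosen so that each slice stays well under 1024 clients.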

You said switching to the list allowed other policies to continue. Did they not have issues with further file copies, or do they maybe not make any more?

Devs say that FD_SETSIZE is a compile-time setting.