[feature/proposal] Amazon EFA collector

Perif

Mar 28, 2023, 12:44:36 PM
to Prometheus Developers
Hi,

We wrote a collector for Amazon EFA (Elastic Fabric Adapter), a high-speed network interface similar to InfiniBand.

This interface is used for tightly coupled applications in HPC (WRF, Ansys Fluent, Gromacs...) and distributed ML (think LLMs like BLOOM, OPT... or diffusion-based models like Stable Diffusion). The metrics are used for optimizing and troubleshooting these computational workloads. The collector we wrote is based on the InfiniBand collector and also required changes to procfs, since EFA metrics are exposed in a similar way.

Would the team be open to us creating a PR to add a new collector for this network interface?

Thanks,

Perif

Matthias Rampke

Apr 16, 2023, 7:00:20 AM
to Perif, Prometheus Developers
To clarify, you are asking about adding this to the node exporter?

I am torn between "this seems very specific" and "I guess it won't hurt anyone who doesn't need it".

IMO adding support to the procfs package makes sense, whether it's then consumed by node exporter or a more specific one.

/MR


Perif

May 8, 2023, 2:01:23 PM
to Prometheus Developers
Hello Matthias,

A PR for procfs has been submitted: https://github.com/prometheus/procfs/pull/515

Regarding the specificity for the node exporter, I was debating that as well, given that the InfiniBand collector is similarly specific. Happy to discuss it further here or on Slack.

Thanks,

PY