MPI Tool Information Interface (MPI_T), details on collective communication


Anna R.

Apr 23, 2025, 4:10:34 AM
to Open MPI users

Hi,

I'm currently working on a project that aims to perform detailed measurements of internal MPI communication, particularly based on the PML layer. For example, I would like to measure internal point-to-point communication within collective operations.

I’ve seen that using the monitoring module (enabled via --mca pml_monitoring_enable 2), Open MPI already provides summaries of the underlying point-to-point operations after program execution. This is very helpful.

My goal is to access this information dynamically through the MPI_T interface and trace the data for different MPI functions at runtime.

However, I’m running into a limitation:

  • The coll_monitoring_messages_count PVAR only shows counts for collective operations.

  • The pml_monitoring_messages_count PVAR only seems to capture activity from explicit point-to-point operations (like MPI_Send/MPI_Recv), but not from collectives such as MPI_Bcast.

My question is:
Is it currently possible to observe the internal point-to-point operations triggered by collectives (as visible in the monitoring summary) through MPI_T performance variables?

I’d greatly appreciate any insights or recommendations.

Best regards
Anna


George Bosilca

Apr 23, 2025, 11:31:14 AM
to us...@lists.open-mpi.org
Anna,

The monitoring PML tracks all activity on the PML, but it can be set to expose only the traffic the user is likely to care about, i.e. the application's own messages, and to hide the rest. This is easy in OMPI because all internal messages are generated using negative tags (which user code is not allowed to use). Basically, setting pml_monitoring_enable to 2 enables filtered monitoring: internal traffic is counted separately and is not exposed through the normal PVAR API. If you set pml_monitoring_enable to 1, all traffic is counted and exposed through the PVARs, but you will not be able to differentiate between internal and user-generated traffic. Look at the function mca_common_monitoring_record_pml for more info.
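
To illustrate the two modes, here is a toy model in plain C of the bookkeeping described above. It is not the Open MPI source: the type `toy_monitor`, the function `toy_record_pml` and the constant `NPEERS` are made up for this sketch, and only the array names are borrowed from the monitoring code.

```c
#include <stddef.h>

enum { NPEERS = 4 };  /* assumed number of peers, for illustration only */

/* Per-peer counters; array names follow the ones mentioned in this thread. */
typedef struct {
    size_t pml_count[NPEERS];           /* messages exposed via PVARs */
    size_t pml_data[NPEERS];            /* bytes exposed via PVARs */
    size_t filtered_pml_count[NPEERS];  /* internal messages, mode 2 only */
    size_t filtered_pml_data[NPEERS];   /* internal bytes, mode 2 only */
} toy_monitor;

/* mode 0: monitoring off
 * mode 1: count all traffic together
 * mode 2: count internal traffic (negative tag) separately */
static void toy_record_pml(toy_monitor *m, int mode, int peer,
                           size_t bytes, int tag)
{
    if (mode == 0 || peer < 0 || peer >= NPEERS) return;

    if (mode == 2 && tag < 0) {
        /* Internal message, e.g. a send issued inside a collective. */
        m->filtered_pml_count[peer]++;
        m->filtered_pml_data[peer] += bytes;
    } else {
        m->pml_count[peer]++;
        m->pml_data[peer] += bytes;
    }
}
```

In this model, mode 1 folds a negative-tag message into the same counters as user messages, while mode 2 routes it to the filtered arrays, which is why the PVARs stop showing it.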

Adding PVARs for the internal traffic should be fairly easy: copy the code that works on the pml_data and pml_count arrays and make it work on the filtered_pml_data and filtered_pml_count arrays.

  George.

