Probing the CPU for metrics / info


Mani Sarkar

Nov 29, 2018, 5:19:24 PM
to mechanica...@googlegroups.com
Hi,

Haven't written here for a long time. I have a query about probing CPUs (via commands and/or programs) to find out vital information about them, mostly speed/performance related.

I have put together this cheat-sheet for doing something like that for GPUs - https://gist.github.com/neomatrix369/256913dcf77cdbb5855dd2d7f5d81b84, and would like to do something similar for CPUs as well, covering all three OSes.
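For the Linux column of such a cheat-sheet, a minimal programmatic sketch (assuming the standard /proc/cpuinfo layout; lscpu and nproc cover the same ground from the shell):

    #!/usr/bin/env python3
    # Sketch: pull basic CPU facts from /proc/cpuinfo (Linux only).
    import os

    model, mhz, logical = None, None, 0
    with open("/proc/cpuinfo") as f:
        for line in f:
            key, _, val = line.partition(":")
            key, val = key.strip(), val.strip()
            if key == "model name" and model is None:
                model = val
            elif key == "cpu MHz" and mhz is None:
                mhz = val                # current, not maximum, frequency
            elif key == "processor":
                logical += 1

    print("model        :", model)
    print("MHz (current):", mhz)
    print("logical CPUs :", logical, "/ os.cpu_count() =", os.cpu_count())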

I know the GPU list is missing a good deal for macOS and Windows. Some of you might question whether Macs have GPUs at all; many do.

Any thoughts?

Cheers
Mani
--

@theNeomatrix369  |  Blog  | @adoptopenjdk @graalvm @graal @truffleruby | Dev. communities | Bitbucket  |  Github  |  Slideshare  | LinkedIn

Come to Devoxx UK 2019: http://www.devoxx.co.uk/

Don't chase success, rather aim for "Excellence", and success will come chasing after you!

Greg Young

Nov 29, 2018, 5:23:29 PM
to mechanica...@googlegroups.com



--
Studying for the Turing test

Wojciech Kudla

Nov 29, 2018, 6:08:14 PM
to mechanica...@googlegroups.com
If you're interested in capturing performance metrics per CPU (with zero overhead), I suggest looking at /proc/interrupts and /proc/softirqs (a minimal sampling sketch follows below).
For more detailed data you'd need to use the PMU to capture on- and off-core events, but that is either:
1) non-zero overhead, or
2) inaccurate for more than a few counters gathered at the same time (it would require multiplexing).
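
A minimal sampling sketch of the /proc/interrupts route, assuming Linux's standard layout (a header row of CPU columns, then one row per interrupt source):

    #!/usr/bin/env python3
    # Sketch: print per-CPU interrupt deltas over a one-second window.
    import time

    def snapshot():
        counts = {}
        with open("/proc/interrupts") as f:
            ncpus = len(f.readline().split())      # header: CPU0 CPU1 ...
            for line in f:
                fields = line.split()
                if not fields:
                    continue
                irq = fields[0].rstrip(":")
                vals = []
                for tok in fields[1:1 + ncpus]:    # per-CPU counters
                    if not tok.isdigit():          # rows like ERR: are shorter
                        break
                    vals.append(int(tok))
                counts[irq] = vals
        return counts

    before = snapshot()
    time.sleep(1)
    after = snapshot()
    for irq, vals in after.items():
        deltas = [b - a for a, b in zip(before.get(irq, []), vals)]
        if any(deltas):
            print(irq, deltas)

The same loop works for /proc/softirqs, which has an identical shape.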

Hope this helps

Mani Sarkar

Dec 1, 2018, 7:03:47 AM
to mechanica...@googlegroups.com
Thanks to everyone who responded; these are good starting points. They look like they will work on Linux or macOS machines, but what about Windows?

Do you have any pointers for GPUs?

Martin Thompson

Dec 1, 2018, 9:13:46 AM
to mechanical-sympathy

> They look like they will work on Linux or macOS machines, but what about Windows?

For general profiling on Windows, PerfView is worth trying.


There is also an excellent overview by Sasha Goldshtein.


Martin...
 

Mani Sarkar

Mar 21, 2019, 7:48:41 AM
to mechanica...@googlegroups.com
As a thank you, I have added your suggestions to this page
https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/cloud-devops-infra/README.md#cpu

I have also cited the mailing list.

Please do share it, and feel free to open PRs against it.

Peter Booth

Mar 22, 2019, 1:38:43 AM
to mechanical-sympathy
A couple of comments:

1. Brendan Gregg's homepage and his latest book are worth reading: http://www.brendangregg.com/
2. Not everything in the /proc pseudo-filesystem can be read at no cost. For processes with large, complex memory footprints, reading /proc/<pid>/smaps can be expensive and can, in some cases, even crash hosts (see the sketch below).
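
To make the cost concrete: tools like ps_mem estimate per-process memory by summing the Pss lines of /proc/<pid>/smaps, and the kernel recomputes Pss by walking the page tables of every VMA on each read, which is what makes it expensive for large, complex address spaces. A minimal sketch of such a read, assuming the standard smaps layout:

    #!/usr/bin/env python3
    # Sketch: sum proportional set size (Pss) for one process from smaps.
    # Each read forces the kernel to walk the page tables of every VMA,
    # so this is far from free for big, fragmented address spaces.
    import sys

    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    total_kb = 0
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            if line.startswith("Pss:"):
                total_kb += int(line.split()[1])   # field is in kB
    print(f"Pss for pid {pid}: {total_kb} kB")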

Peter 

Wojciech Kudla

Mar 22, 2019, 3:30:56 AM
to mechanical-sympathy
Hi Peter,

This sounds very interesting.
Could you please expand on the extra overhead and the risk of crashing the host when reading smaps?




Abhinav

Mar 22, 2019, 3:41:33 AM
to mechanica...@googlegroups.com, pboo...@gmail.com
Hey Peter,
By any chance, do you have a source for #2, or a configuration that can help reproduce the host crash?


Peter Booth

Mar 26, 2020, 3:08:13 PM
to mechanical-sympathy
Just saw this, one year later. 

I observed this about four years ago, at a different job from the one I have today, so my recollection might be imperfect.
I was working with HP dual-socket Haswell hosts with 144 GB to 384 GB of RAM, running RHEL 6.6 or 6.7. At the time I was experimenting with scripts like https://raw.githubusercontent.com/pixelb/ps_mem/master/ps_mem.py, which try to come up with better estimates of "real" memory usage per process.
The hosts had Azul Systems Zing installed, which meant a kernel module that uses a different kind of VMA than the malloc/glibc VMAs for large-heap JVM processes.
I saw that I could crash hosts almost at will by traversing /proc/<pid>/smaps for processes that were using between 5 GB and 40 GB.

At the time I found Red Hat problem reports describing similar issues, which related them to locking in the procfs implementation. This report might even refer to the issue I saw: https://access.redhat.com/solutions/441543