On 12/06/2016 10:38 PM, Smit Shah wrote:
> I started programming as many young folks do these days, diving into
> a scripting language (Ruby) and using it to build CRUD apps etc.
>
> As time went on, I started dealing with more interesting problems
> and solving them efficiently became increasingly complex. I started
> looking into more performant languages (which let you exercise more
> control), concurrent programs and distributed systems etc.
>
> However, I realized that all these complex systems are built on
> fundamental knowledge of how computers work, e.g. the TCP/UDP/IP network
> stack, disks, memory, processors, compilers etc. I think it's
> paramount to know such things to come up with projects like
> Disruptor/Aeron etc.
>
> So my question is rather a simple one, how does one start to get good
> at these things? Did you folks pick these things up from books and
> whitepapers? Where does one start basically? Also, what should I pick
> up first? There is just so much to learn, e.g. Algorithms, Data
> structures, Design Fundamentals, etc.
Hi Smit,
I agree with the advice of others: "write code and profile it" -- you
need empirical feedback about what really works.
But when you're starting out, it also helps to have a map of the
territory. All of the things that you mentioned are relevant -- some are
new, some are old, and some are changing. For the new and changing stuff
you may need to research online; for the old stuff, a second-hand book
from 20 years ago may be fine.
I read a lot of research papers when I'm starting on a new problem. That
gives me a feel for what researchers have found. Search for papers
online with Google Scholar. You can often also browse the listings of
papers at relevant conferences (e.g. find a paper that interests you,
find the conference it was published at, review other publications from
that conference, and recurse into the references of relevant papers).
I've put some book recommendations below, because I like books, but a
lot of this information is available on-line.
For TCP/IP networking, Stevens' "Unix Network Programming" is the bible.
After that you can start reading people's papers about optimizing
network stacks.
You probably want to read a book on operating systems. I don't have a
good recommendation, but any OS internals documentation will teach you
something. (Recently, I learned a lot from reading the design documents
for RSX11M, an old precursor of VMS:
http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/dec/pdp11/rsx11/ )
Same goes for compiler internals documentation. For example, I remember
reading some good papers about IBM's XL compiler implementation at one
stage.
If you haven't taken a data structures course it would be worth reading
an introductory text. Then at least implement linked lists (single,
double), a red-black or AVL tree, and a hash table (or some other
similar data structures that you're interested in). Test, profile,
optimize, and understand the performance constraints. Alternatively, once
you understand an algorithm you could compare, profile and improve other
people's implementations. For example, I was surprised by the diversity I
found when I started looking into the different ways of implementing
singly linked lists:
http://www.rossbencina.com/static/blog/singly-linked-list-data-structure-variants/singly-linked-lists.pdf
The algorithms and data structures bible, of course, is Knuth's "Art of
Computer Programming." But there's plenty of info on the web once you
know what to look for. NIST has a dictionary of algorithms and data
structures that will at least give you names for things, then you can
search for implementations, or information about how to implement them:
https://xlinux.nist.gov/dads/
When I was starting out I read a lot of books on software design. You
can go many different ways with this and I hesitate to make
recommendations. (In the '90s I went down the OO Booch, Martin, Lakos
(3 books) path, which was good because it covered micro to macro design
considerations. On the other hand, SICP is popular in some circles.) One
way or another you need to learn to design software, not just plug
components together. Knowledge in this area has been incrementally
growing and there are good books from every era.
It's nice to at least try to implement a compiler of some sort.
Memory allocation is another low-level thing you should know something
about. (e.g. different types of garbage collectors, different allocation
algorithms and their constraints). One specific example: I like this
talk about Bonwick's Slab allocator (or you could just read the original
paper):
http://paperswelove.org/2015/video/ryan-zezeski-memory-by-the-slab/
Most of the above are not really mechanical sympathy though, just basic
CS stuff from the pre-virtualized age.
Cache interactions are an important factor in mechanical sympathy. You
need to learn about how caches are structured, latencies, how they
communicate and synchronise. Any Computer systems architecture book will
have a chapter on cache coherency protocols, or you could start here:
https://en.wikipedia.org/wiki/Cache_coherence
https://en.wikipedia.org/wiki/MOESI_protocol
Also this book:
David Loshin, "Efficient Memory Programming"
(In summary: know about how data locality and access patterns interact
with caches. There are also heaps of blog posts about cache-optimized
data structures and algorithms.)
Going deeper:
Curt Schimmel, "Unix Systems for Modern Architectures" covers
kernel-level CPU cache management and multi-processor kernel architectures.
At some point it's a good idea to read CPU architecture manuals (e.g.
from Intel) and papers published about specific algorithms.
Regarding concurrency, lock-free concurrency is a popular topic, but
probably not worth getting deeply into until you've mastered the above.
There are two good books though:
Scott, "Shared-Memory Synchronization"
Herlihy, Shavit, "The Art of Multiprocessor Programming"
The Scott book is a really excellent summary/introduction to the
mechanisms of both locking and lock-free synchronization. It covers
low-level details like memory barriers and caches. I wish I had read it
15 years ago when I was struggling to learn that stuff. Highly recommended!
Nitsan Wakart collected a bunch of interesting recent papers about
lock-free algorithms; you might like to take a look, just to see how
much of them you understand:
https://github.com/JCTools/JCTools/tree/master/resources
(but there are also a bunch of older papers worth looking at).
If you're so inclined, a close-to-the-metal project like programming a
microcontroller in assembly language might be informative. It helps to
have a reason to do this, though.
A link that needs no introduction:
http://www.agner.org/optimize/
Finally, if you're practically minded, working on projects that require
maximum performance will be a powerful motivator.
Ok, enough. I hope this helps,
Good luck!
Ross.