Lock and Little's Law

Dain Ironfoot

Sep 14, 2017, 9:30:55 PM
to mechanical-sympathy
Hi Guys, 

Can someone verify whether my understanding of Little's law is correct in the context of locking?

Suppose I have a system where I acquire a lock, do some work and release it. Further, suppose that doing some "work" takes no time.


λ = L / W              (λ = throughput, L = average number of customers in a stable system, W = average time spent in the system)

λ = 1 / W              (since a lock will only allow one thread to execute, L = 1)

λ = 1 / 10 micros   (supposed average time taken to acquire and release the lock)

λ = 100,000 per second


Therefore, just by using a lock, the throughput of my system is capped at 100,000 per second.
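To make the arithmetic concrete, here is a minimal sketch of that cap (class and method names are mine, not from the thread):

```java
// Little's law: lambda = L / W. For a lock, at most L = 1 thread is inside
// the critical section, so throughput is capped at 1 / W, where W is the
// time the lock is held per operation.
public class LockThroughputCap {
    // Maximum operations per second for a critical section held for
    // 'criticalSectionMicros' microseconds, single occupancy (L = 1).
    static double maxThroughputPerSecond(double criticalSectionMicros) {
        // lambda = 1 / (micros * 1e-6 seconds) = 1e6 / micros
        return 1_000_000.0 / criticalSectionMicros;
    }

    public static void main(String[] args) {
        // 10 microseconds per acquire/work/release cycle
        System.out.println(maxThroughputPerSecond(10.0)); // prints 100000.0
    }
}
```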

Is my reasoning correct?


Thanks

Martin Thompson

Sep 16, 2017, 2:46:25 AM
to mechanica...@googlegroups.com
Little's law can be used to describe a system in steady state from a queuing perspective, i.e. arrival and leaving rates are balanced. In this case it is a crude way of modelling a system with a contention percentage of 100% under Amdahl's law, in that throughput is one over latency.

However, this is an inaccurate way to model a system with locks. Amdahl's law does not account for coherence costs. For example, if you wrote a microbenchmark with a single thread to measure the lock cost, the measured cost would be much lower than in a multi-threaded environment, where cache coherence, OS costs such as scheduling, and the lock implementation itself all need to be considered.

The Universal Scalability Law (USL) accounts for both the contention and the coherence costs.
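A sketch of the difference between the two laws (the σ and κ values below are made up for illustration, not fitted from real measurements):

```java
// Amdahl's law in USL form models contention only (sigma); the USL adds a
// coherence term (kappa) that grows as N*(N-1), so predicted throughput can
// eventually fall as workers are added, which Amdahl alone never predicts.
public class ScalingLaws {
    // Speedup with n workers under contention penalty sigma.
    static double amdahl(int n, double sigma) {
        return n / (1.0 + sigma * (n - 1));
    }

    // USL: contention sigma plus coherence penalty kappa.
    static double usl(int n, double sigma, double kappa) {
        return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1));
    }

    public static void main(String[] args) {
        double sigma = 0.05, kappa = 0.002; // hypothetical parameters
        for (int n : new int[] {1, 8, 32, 64}) {
            System.out.printf("N=%d amdahl=%.2f usl=%.2f%n",
                    n, amdahl(n, sigma), usl(n, sigma, kappa));
        }
    }
}
```

With these parameters the USL curve peaks and then declines, while the Amdahl curve only flattens toward 1/σ.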


When modelling locks it is necessary to consider how contention and coherence costs vary with how the locks are implemented. Consider how in Java we have biased locking, thin locks, fat locks, inflation, and bias revocation, which can cause safepoints that bring all threads in the JVM to a stop, with a significant coherence component.

Shripad Agashe

Sep 20, 2017, 9:41:43 AM
to mechanical-sympathy
There is usl4j by Coda Hale, which makes the USL a bit easier to use.



I have a different query on this. Since response time depends on the number of concurrent users, if there are parallel paths in the flow being measured you would need to measure the scalability characteristics of each path. Any thoughts on this?

Shripad

Martin Thompson

Sep 20, 2017, 12:40:37 PM
to mechanical-sympathy
> I have a different query on this. Since response time depends on the number of concurrent users, if there are parallel paths in the flow being measured you would need to measure the scalability characteristics of each path. Any thoughts on this?

It is often the case that a number of parallel paths can exist, making up a graph. If you map out the graph and determine the critical path, you can get the contention component for what cannot be parallelised; you can also then get the service time for queuing theory.

Even on a single thread this manifests as IPC (Instructions Per Cycle), as the CPU extracts parallelism where it is not constrained by data dependencies.
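The critical-path idea above can be sketched as a longest-path search over a DAG of tasks with service times; the path found is the serialised (contention) component, and everything off it can in principle run in parallel. Task names and times here are hypothetical:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Critical path = longest path by service time through the task DAG,
// computed with a memoised depth-first search.
public class CriticalPath {
    static double criticalPath(Map<String, Double> serviceTime,
                               Map<String, List<String>> successors,
                               String start) {
        return longest(start, serviceTime, successors, new HashMap<>());
    }

    private static double longest(String node, Map<String, Double> serviceTime,
                                  Map<String, List<String>> successors,
                                  Map<String, Double> memo) {
        Double cached = memo.get(node);
        if (cached != null) return cached;
        double best = 0.0;
        for (String next : successors.getOrDefault(node, List.of())) {
            best = Math.max(best, longest(next, serviceTime, successors, memo));
        }
        double total = serviceTime.get(node) + best;
        memo.put(node, total);
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical flow: parse -> {plan -> exec, log}; plan/exec and
        // log are parallel paths after parse.
        Map<String, Double> serviceTime =
                Map.of("parse", 2.0, "plan", 3.0, "exec", 5.0, "log", 1.0);
        Map<String, List<String>> successors = Map.of(
                "parse", List.of("plan", "log"),
                "plan", List.of("exec"));
        // Critical path is parse -> plan -> exec = 10.0; "log" is off it.
        System.out.println(criticalPath(serviceTime, successors, "parse")); // prints 10.0
    }
}
```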

Shripad Agashe

Sep 21, 2017, 1:24:48 AM
to mechanical-sympathy
Martin,
Can the critical path itself change with concurrent load if each of the paths making up the graph has different serialisation characteristics? I'm postulating that a path with the minimum service time at N=1 but higher contention may not appear to be the critical path at lower concurrency, but will manifest as such at higher concurrency.

Shripad

Kirk Pepperdine

Sep 21, 2017, 7:26:59 AM
to mechanica...@googlegroups.com
Theoretically yes. However, there may be other things going on that will lower that cap.

Kind regards,
Kirk

Martin Thompson

Sep 22, 2017, 3:23:48 AM
to mechanical-sympathy
> Can the critical path itself change with concurrent load if each of the paths making up the graph has different serialisation characteristics? I'm postulating that a path with the minimum service time at N=1 but higher contention may not appear to be the critical path at lower concurrency, but will manifest as such at higher concurrency.

I think it is too difficult to answer that generically in any sort of meaningful way.

The costs are often impacted in subtle ways by mechanical sympathy, which can become the dominant factor. For example, two threads using a database may contend on the same table of data, an obvious contention point. Then mix in whether or not they use prepared statements. If not using prepared statements on a DB like Oracle, you can end up contending on the shared pool for statement parsing, execution planning, or cursor caching. The same can be said for threads using shared data structures that happen to be sibling hyperthreads, on the same socket, or communicating across QPI links. It can be blurred whether these costs are a contention penalty, a coherence penalty, or a combination of both.

I like to break things down into the major computing resources: CPU, memory, network, storage. Measure the utilisation of each major resource under increasing load to get a scaling model. When one becomes limiting, it gives you the place to focus. If no resource is clearly the bottleneck, you have a dependency issue; for that, do a critical path analysis to find the contended component, which may not be exhibiting high utilisation but could be being used in a way it was not really designed for, e.g. a spinning disk being exercised with an arbitrary access pattern.