We discussed the 6th chapter of The Model Thinker: Power-Law Distributions.
There is a good progression in the book. In the 5th chapter, the Central Limit Theorem (CLT) states that the sum of many independent and identically distributed (IID) random variables with finite variance is approximately normally distributed. The chapter then relaxes the "identically distributed" condition: as long as no small set of the random variables accounts for most of the variation, the CLT still holds. This is called the Lindeberg condition. I wrote a blog post on it: Why do we say heights data is normally distributed?
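To make the relaxed condition concrete, here is a minimal sketch (mine, not from the book) that sums independent draws from three different distributions and checks how close the sum is to normal. The choice of distributions, the number of terms, and the helper names are all assumptions for illustration. Skewness and excess kurtosis are both 0 for a normal distribution, so values near 0 suggest the sum is close to bell-shaped.

```python
import random
import statistics

def sample_sum(rng, n_terms=60):
    """Sum independent draws that are NOT identically distributed."""
    total = 0.0
    for i in range(n_terms):
        kind = i % 3  # cycle through three different distributions
        if kind == 0:
            total += rng.uniform(0.0, 2.0)
        elif kind == 1:
            total += rng.expovariate(1.0)
        else:
            total += rng.triangular(0.0, 3.0, 1.0)  # low, high, mode
    return total

def skew_and_excess_kurtosis(xs):
    """Standardized third and fourth moments (both 0 for a normal distribution)."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    z = [(x - mean) / sd for x in xs]
    skew = statistics.fmean([v ** 3 for v in z])
    kurt = statistics.fmean([v ** 4 for v in z]) - 3.0
    return skew, kurt

rng = random.Random(0)  # fixed seed so the run is repeatable
sums = [sample_sum(rng) for _ in range(20000)]
skew, kurt = skew_and_excess_kurtosis(sums)
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

No single term dominates the variance here, which is the spirit of the Lindeberg condition. For comparison, a single exponential draw has skewness 2, so the sum is much more symmetric than its most skewed ingredient.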
The 6th chapter now relaxes the "independent" condition as well. The random variables (that are to be summed) are no longer independent: one influences another through positive feedback (the Matthew effect, where more begets more) or in some other way. This breaks the CLT, and the resulting distribution can instead be long-tailed, such as a power law.
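A rough sketch of the "more begets more" mechanism, under my own assumptions (the function names, the 2% chance of founding a new group, and the number of arrivals are all illustrative, not from the book). Each arrival either founds a new group or joins an existing one with probability proportional to its size. As a baseline, the same number of arrivals is spread over the same number of groups independently and uniformly.

```python
import random

def preferential_growth(n_arrivals, p_new, rng):
    """Each arrival founds a new group with probability p_new, otherwise
    joins an existing group chosen in proportion to its current size."""
    sizes = [1]    # sizes[g] = number of members in group g
    members = [0]  # one entry per individual, holding that individual's group
    for _ in range(n_arrivals):
        if rng.random() < p_new:
            sizes.append(1)
            members.append(len(sizes) - 1)
        else:
            # Picking a random individual picks a group proportionally to size.
            g = rng.choice(members)
            sizes[g] += 1
            members.append(g)
    return sizes

def independent_growth(n_arrivals, n_groups, rng):
    """Baseline: every arrival picks a group uniformly, ignoring group sizes."""
    sizes = [0] * n_groups
    for _ in range(n_arrivals):
        sizes[rng.randrange(n_groups)] += 1
    return sizes

rng = random.Random(1)  # fixed seed so the run is repeatable
pref = preferential_growth(50000, 0.02, rng)
indep = independent_growth(50000, len(pref), rng)
for name, sizes in (("preferential", pref), ("independent", indep)):
    ordered = sorted(sizes, reverse=True)
    print(name, "largest:", ordered[0], "median:", ordered[len(ordered) // 2])
```

In the independent baseline the group sizes cluster around their average, while with feedback a few groups become enormous and most stay small. That long tail is consistent with a power law, though this sketch does not fit an exponent.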
I recall from stats textbooks that there are many methods for testing whether two random variables are independent. Now I see the relevance of those methods.
The chapter walks through various examples where power-law distributions are observed (city sizes, traffic jams, etc.). We struggled to find ways to apply it in our own work. Others who have read the chapter, please add to this thread if you can think of any. I added a few items to the reading list:
A simple preferential attachment model explaining the web
Self-Organized Criticality visual
California on Fire: An Illustration of Self-Organized Criticality
One insight from the discussion is how project managers can use the methods from chapters 5 and 6 to estimate project cost.
This example is also mentioned in the 6th chapter.
- Break your project down into a list of sub-projects.
- Assign a distribution to each sub-project's cost. One could be a normal distribution with some mean and standard deviation; another could be log-normal. Base the choice on historical data or expert insight.
- Treat each sub-project's cost as a random variable.
- If those random variables are independent, i.e., one sub-project's cost does not affect another's, then the sum of all the sub-project costs will be approximately normally distributed, provided the Lindeberg condition holds.
- If those random variables are not independent, the total can follow a long-tailed distribution such as a power law (which means your total project cost has a higher chance of overshooting).
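The steps above can be sketched as a small Monte Carlo simulation. This is only an illustration under assumptions of my own: 20 sub-projects, log-normal costs in arbitrary units, and dependence modeled as one hypothetical shared shock (say, a delay that hits every sub-project at once). A shared shock fattens the tail of the total, but it does not by itself produce an exact power law.

```python
import math
import random

def simulate_totals(n_runs, n_sub, shared_sigma, rng):
    """Simulate the total cost of n_sub sub-projects, n_runs times.

    Each sub-project cost is log-normal. shared_sigma > 0 adds one common
    shock per run (hypothetical) that makes the costs dependent.
    """
    own_sigma = 0.3  # assumed spread of each sub-project's own cost
    totals = []
    for _ in range(n_runs):
        # One shock per run, applied to every sub-project; 0.0 means independent.
        shock = rng.gauss(0.0, shared_sigma) if shared_sigma > 0 else 0.0
        total = sum(math.exp(shock + rng.gauss(0.0, own_sigma)) for _ in range(n_sub))
        totals.append(total)
    return totals

def tail_ratio(totals):
    """95th percentile divided by the median: how far a bad outcome
    sits above a typical one."""
    ordered = sorted(totals)
    median = ordered[len(ordered) // 2]
    p95 = ordered[int(0.95 * len(ordered))]
    return p95 / median

rng = random.Random(2)  # fixed seed so the run is repeatable
independent = simulate_totals(10000, 20, 0.0, rng)
dependent = simulate_totals(10000, 20, 0.5, rng)
ratio_indep = tail_ratio(independent)
ratio_dep = tail_ratio(dependent)
print(f"independent: p95/median = {ratio_indep:.2f}")
print(f"dependent:   p95/median = {ratio_dep:.2f}")
```

With independent costs, a bad outcome sits only modestly above the typical total because overruns in one sub-project tend to be offset by savings in another. With the shared shock, overruns arrive together, so the gap between a typical and a bad outcome widens considerably.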
I think a project manager should ensure that all sub-projects are independent. That allows for more reliable total estimates and reduces cost escalation, which is pretty common in big projects.
For next week, we planned to read through the 7th chapter (Linearity) of The Model Thinker. The call will tentatively be on Sunday, August 31st, at 10 AM. Please RSVP to the GMeet invite when you receive it.
Please add anything I missed or misrepresented.
Best,