I didn't study Might-Darais closely -- Russ Cox did, and I'm relying on his observations.
You can (and I do) think of packratting and backtracking as the same approach; the only decision is whether to take the efficiency hit in space or in time.
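To make the trade-off concrete, here is a minimal sketch (my own toy grammar and code, not any particular library): a backtracking recursive-descent parser re-parses the same suffixes over and over, and memoizing each position -- the packrat table -- converts that repeated time cost into space.

```python
# Toy grammar: expr -> term '+' expr | term, over the input below.
# Without the cache, expr() is a plain backtracking recursive-descent
# recognizer; with it, each position is parsed at most once (packrat),
# paying O(positions) space instead of repeated time.
from functools import lru_cache

TEXT = "1+1+1"

@lru_cache(maxsize=None)          # the packrat memo table
def expr(pos):
    """Try to recognize an expr starting at pos; return end position or None."""
    end = term(pos)
    if end is not None and end < len(TEXT) and TEXT[end] == "+":
        rest = expr(end + 1)      # uncached, this recursion re-parses suffixes
        if rest is not None:
            return rest
    return end

def term(pos):
    return pos + 1 if pos < len(TEXT) and TEXT[pos].isdigit() else None

print(expr(0) == len(TEXT))  # True: the whole string is recognized
```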
My view is that, if you want a strong parser, you should use a parse engine designed (like Earley's) to be strong from the ground up. Almost everybody else starts with a more or less weak parser and hacks on the power, knowing that's worst-case inefficient, but hoping it will work out in enough cases of interest. Since Earley's is O(n) for every deterministic CFG, I don't see why folks insist on going for the weaker algorithms. But my approach is definitely the minority one so far.
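For readers who haven't seen it, the shape of Earley's algorithm fits in a few dozen lines. This is my own bare-bones recognizer sketch (not Marpa's implementation, and with none of its optimizations): a chart of items per input position, driven by the classic predict/scan/complete steps. The toy grammar is deterministic, which is the case where Earley's runs in O(n).

```python
# Minimal Earley recognizer. Grammar: S -> S '+' 'n' | 'n'
# An item (lhs, rhs, dot, origin) means: rhs[:dot] of a lhs-rule has been
# recognized, starting at input position `origin`.
GRAMMAR = {"S": [["S", "+", "n"], ["n"]]}

def recognize(tokens, start="S"):
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(len(tokens) + 1):
        added = True
        while added:                      # run predict/complete to a fixpoint
            added = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in GRAMMAR:       # predict
                    for prod in GRAMMAR[rhs[dot]]:
                        new = (rhs[dot], tuple(prod), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); added = True
                elif dot == len(rhs):                            # complete
                    for plhs, prhs, pdot, porig in list(chart[origin]):
                        if pdot < len(prhs) and prhs[pdot] == lhs:
                            new = (plhs, prhs, pdot + 1, porig)
                            if new not in chart[i]:
                                chart[i].add(new); added = True
        if i < len(tokens):                                      # scan
            for lhs, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] not in GRAMMAR \
                        and rhs[dot] == tokens[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any((start, tuple(r), len(r), 0) in chart[len(tokens)]
               for r in GRAMMAR[start])

print(recognize(["n", "+", "n", "+", "n"]))  # True
print(recognize(["n", "+"]))                 # False
```

On this deterministic grammar each chart entry stays bounded in size, which is where the linear-time behavior comes from; an ambiguous grammar would grow the chart entries and degrade gracefully instead of failing.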
Re "2 seconds to parse a 31-line Python file" -- I really don't pay much attention to specific speed claims in theoretical papers, as opposed to big-O worst-case results. With Marpa, you don't have to hope your grammar is O(n) -- you can know for sure. To my mind, resorting to timings for specific grammars on specific processors shows the approach has run out of steam.