Yes, this should be doable with a few tools.
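librosa has a time-varying tempo estimator (`librosa.feature.tempo` with `aggregate=None`, or `librosa.beat.tempo` in older releases), so you should be able to measure the tempo changes reasonably well.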
To "straighten" the metrical grid, you can use either pytsmod or (py)rubberband. Both packages provide tools that let you map anchor-point times in the input to desired output times, see e.g.:
It'll be on you to decide exactly how to construct that mapping from the tempo estimates. I haven't compared the two packages head-to-head, but their interfaces are similar enough that it should be easy for you to try both.
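As a rough starting point, here is one possible way to build such a mapping. Note that this is a sketch under my own assumptions, not a tested recipe: it uses detected beat times as the anchor points (rather than the frame-wise tempo curve), and the helper name `straighten_time_map` is made up for illustration. It places each beat on an evenly spaced grid and leaves the audio before the first beat and after the last beat at its original length.

```python
def straighten_time_map(beat_times, sr, n_samples, target_bpm=None):
    """Return (source_sample, target_sample) anchors that put beats on a uniform grid.

    beat_times : beat positions in seconds (e.g. from a beat tracker)
    sr         : sampling rate in Hz
    n_samples  : length of the input signal in samples
    target_bpm : desired constant tempo; if None, the average beat period
                 is used, so the overall duration stays the same
    """
    # Convert to sample indices, drop duplicates and out-of-range beats.
    beats = sorted({int(round(t * sr)) for t in beat_times})
    beats = [b for b in beats if 0 <= b < n_samples]
    if len(beats) < 2:
        raise ValueError("need at least two beats inside the signal")

    first, last = beats[0], beats[-1]
    if target_bpm is None:
        period = (last - first) / (len(beats) - 1)  # mean beat period, in samples
    else:
        period = 60.0 * sr / target_bpm

    # Beat k moves to first + k * period; the first beat stays where it is.
    time_map = [(b, first + int(round(k * period))) for k, b in enumerate(beats)]

    # Final anchor maps the end of the input to the end of the output,
    # keeping the tail after the last beat unstretched.
    time_map.append((n_samples, time_map[-1][1] + (n_samples - last)))
    return time_map


# Possible usage (not run here; "track.wav" is a placeholder):
#   import librosa, pyrubberband
#   y, sr = librosa.load("track.wav")
#   _, beats = librosa.beat.beat_track(y=y, sr=sr, units="time")
#   y_out = pyrubberband.timemap_stretch(y, sr, straighten_time_map(beats, sr, len(y)))
```

If you'd rather work from the tempo curve directly, you could integrate it to get beat phase over time and pick anchors at integer phase values instead; the rest of the mapping would stay the same.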