$ python -i linearregression.py
>>> cost_function(0, 0, training_set)
49541
>>> gradient_descent_step(0.001, 0, 0, training_set)
(0.29625, 482.8045, 276892161805.86005)
>>> cost_function(0.29625, 482.8045, training_set)
276892161805.86005
By the way, Wolfram Alpha fits a linear regression to the example data as theta0 = -42.7833, theta1 = 0.22962, so the gradient descent step above is clearly out of whack: instead of moving toward those values, a single step sent the cost soaring from 49541 to roughly 2.8 × 10^11.
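For context, here's a minimal sketch of what the two functions being called might look like. The real linearregression.py isn't reproduced in this section, so the standard mean-squared-error cost, the simultaneous theta update, and the assumption that training_set is a list of (x, y) pairs are all inferred from the call signatures and return shapes in the session above (note the third element of the returned tuple matches the cost at the new thetas, as lines 5-7 of the transcript show):

```python
def cost_function(theta0, theta1, training_set):
    # J(theta0, theta1) = (1 / 2m) * sum((theta0 + theta1*x - y)^2)
    # assuming training_set is a list of (x, y) pairs.
    m = len(training_set)
    return sum((theta0 + theta1 * x - y) ** 2
               for x, y in training_set) / (2 * m)

def gradient_descent_step(alpha, theta0, theta1, training_set):
    # Compute both partial derivatives from the *current* thetas,
    # then update them simultaneously.
    m = len(training_set)
    grad0 = sum(theta0 + theta1 * x - y for x, y in training_set) / m
    grad1 = sum((theta0 + theta1 * x - y) * x for x, y in training_set) / m
    new_theta0 = theta0 - alpha * grad0
    new_theta1 = theta1 - alpha * grad1
    # Return the new thetas plus the cost they produce, matching
    # the 3-tuple seen in the REPL session.
    return (new_theta0, new_theta1,
            cost_function(new_theta0, new_theta1, training_set))
```

If the real implementation looks anything like this, the blow-up is the classic symptom of a learning rate that's too large for the scale of the data: grad1 is weighted by x, so with unscaled features theta1 overshoots wildly and each step increases the cost instead of decreasing it. Shrinking alpha or normalizing the features is the usual remedy.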