Understanding runs per win (redux)

Robert Glass

Sep 10, 1999, 3:00:00 AM

A week or so ago I asked for some help in understanding how runs per win are
calculated. Michael Wolverton and Arne Olson independently directed me to an
article by Clay Davenport at the Baseball Prospectus Web site, for which I
heartily thank them both. The former might regret having given me a hand,
seeing how we've wound up on opposite sides of the Great Koufax Debate. All
I can say is that, from what I've seen of the arguments here at r.s.bb.,
we'll wind up as allies another time.

To recap, I asked about the merits of two different formulas for calculating
runs per win. One was a simplified version of that used by Total Baseball:

RPW = 10*sqrt((RPG, both teams)/9)

The other was the Pythagorean formula, favored here at r.s.bb.:

RPW = (4*LgRuns)/(N*LgGames)

where N = 1.83
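
As a quick check on the arithmetic, the two formulas can be sketched in Python. This is only a sketch under stated assumptions: the function names are hypothetical, "RPG, both teams" is read as combined runs per game, and LgRuns/LgGames is read as runs per team per game (half the combined figure). The league totals in the example are made up for illustration.

```python
import math

def rpw_total_baseball(rpg_both_teams):
    """Simplified Total Baseball form: 10 * sqrt(RPG for both teams / 9)."""
    return 10 * math.sqrt(rpg_both_teams / 9)

def rpw_pythagorean(lg_runs, lg_games, n=1.83):
    """Fixed-exponent Pythagorean form: (4 * LgRuns) / (N * LgGames).

    Assumption: lg_games counts team-games, so lg_runs / lg_games is
    runs per team per game.
    """
    return (4 * lg_runs) / (n * lg_games)

# Made-up league: 4.5 runs per team per game, i.e. 9.0 RPG for both teams.
print(round(rpw_total_baseball(9.0), 2))      # 10.0
print(round(rpw_pythagorean(4500, 1000), 2))  # 9.84
```

Under this reading the Pythagorean form is the same as 2*(RPG, both teams)/N, which makes it easy to compare with the Total Baseball form on a common input.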

I've been using the Pythagorean formula, but I gradually came to suspect
that the use of a fixed value for N gave it a bias towards low-offense eras,
inflating the value of great performances in low-offense seasons and
diminishing the value of similar performances in high-offense seasons. Clay
Davenport's article proposed what I'll call the revised Pythagorean formula,
in which

N = 1.5*log10(RPG, both teams) + .45

This gives a result of 1.83 at around 8.32 RPG. I believe the closest we've
come to that this century was in the National League in 1974.
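
The 1.83 crossover is easy to verify with a few lines of Python. The sketch below assumes the logarithm is base 10, which is the reading that reproduces the 1.83 figure at 8.32 RPG; the function names are hypothetical, and `rpw_revised` further assumes runs per team per game is half of combined RPG.

```python
import math

def revised_exponent(rpg_both_teams):
    # Assumption: "log" in the formula is the base-10 logarithm.
    return 1.5 * math.log10(rpg_both_teams) + 0.45

def rpw_revised(rpg_both_teams):
    # 4 * (runs per team-game) / N, taking runs per team-game as RPG / 2.
    return 2 * rpg_both_teams / revised_exponent(rpg_both_teams)

print(round(revised_exponent(8.32), 2))  # 1.83
```

Below 8.32 RPG the exponent falls under 1.83 and above it the exponent rises, so the fixed and revised versions agree only near that scoring level.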

I've been running the revised Pythagorean formula through its paces and it
seems to give significantly better results than the unrevised version,
though I admit this conclusion is based more on gut feeling than on any
substantial or systematic research. With the unrevised version, a
disproportionate number of monster seasons came in low-offense years; with
the revised version, there's a more even distribution of such seasons among
low-offense, high-offense, and average years. When I apply the formula to my
four-year peak-value list for pitchers, Grove moves up dramatically to
number two (where I thought he should be all along); the other pitchers of
the '20s and '30s (Vance, Hubbell) also move up; the pitchers of the '90s
(Clemens, Maddux) move up a few decimal points; and the pitchers of the '60s
(alas, poor Koufax!) lose a point or two.

Compared to the TB formula, the revised Pythagorean formula gives almost the
same result in the highest-scoring seasons (a .05 difference in the NL of
1930), but the two gradually diverge as scoring drops, leading to a .69
difference in the NL of 1908. There the revised Pythagorean formula gives
the lower RPW figure and thus the greater value to achievements in
low-offense years. For those who feel the TB formula undervalues such
performances, this is as it should be.
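
The divergence between the TB formula and the revised Pythagorean formula can be tabulated with a short Python sketch. It rests on the same assumptions as the formulas quoted earlier (base-10 log, runs per team per game equal to half of combined RPG), the function names are hypothetical, and the RPG values are round illustrative numbers rather than actual league figures.

```python
import math

def rpw_total_baseball(rpg):
    # Simplified Total Baseball form.
    return 10 * math.sqrt(rpg / 9)

def rpw_revised(rpg):
    n = 1.5 * math.log10(rpg) + 0.45  # assumes base-10 log
    return 2 * rpg / n                # assumes RPG / 2 runs per team-game

# Illustrative scoring levels, from a low-offense to a high-offense league.
rows = [(rpg, rpw_total_baseball(rpg), rpw_revised(rpg))
        for rpg in (6.5, 8.0, 9.5, 11.0)]

print("  RPG      TB  revised   gap")
for rpg, tb, rev in rows:
    print(f"{rpg:5.1f}  {tb:6.2f}  {rev:7.2f}  {tb - rev:4.2f}")

# Gap between the two formulas at each scoring level.
gaps = [tb - rev for _, tb, rev in rows]
```

Under these assumptions the gap shrinks steadily as scoring rises, with the revised formula giving the lower RPW at every level in the table.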

The standard disclaimer applies: I'm not a statistician, just a guy with a
calculator and a heap of curiosity about baseball history, and I apologize
if everything I've said here is absolute rubbish. I'm posting this in the
hope that someone might find it useful, and if I've made any sort of
ghastly mistake, I hope someone will point it out to me.

The revised list of the top four-consecutive-year peaks for pitchers of the
twentieth century follows (I haven't gotten around to the nineteenth-century
ones yet). Koufax-haters will be pleased to note that he's slipped a little.

RG


Walter Johnson     1912-15  27.4
Lefty Grove        1929-32  23.1
Greg Maddux        1992-95  22.1
Sandy Koufax       1963-66  21.9
Cy Young           1901-04  21.0
Pete Alexander     1914-17  20.8
Hal Newhouser      1944-47  20.5
Carl Hubbell       1933-36  19.8
Christy Mathewson  1908-11  19.7
Ed Walsh           1907-10  19.5
Lefty Grove        1935-38  19.4
Mordecai Brown     1906-09  19.0
Walter Johnson     1916-19  18.8
Roger Clemens      1989-92  18.4
Bob Gibson         1968-71  18.2
Tom Seaver         1968-71  18.0
Dazzy Vance        1927-30  17.7
Juan Marichal      1963-66  17.6
Rube Waddell       1902-05  17.6
Robin Roberts      1951-54  17.4
Christy Mathewson  1902-05  17.0
Jim Palmer         1975-78  16.5


--
Robert Glass
Military historian, film buff, and (alas) Minnesota Twins fan
Remove "harlech" from my address to reach me by e-mail


Two excesses: to exclude reason, to admit nothing but reason.
--Pascal, Pensées
