RC for Pitchers?

James Tuttle

Apr 26, 1998, 3:00:00 AM

Has anybody ever thought about calculating RC for pitchers?

The full-blown formula for hitters uses hits, walks, HBP, TB, CS, GIDP,
IBB, SB, SH, SF, and PAs -- 11 pieces of data.

Some of this is easily available for pitchers -- hits, walks, HB, TB and
IBB. PAs are batters faced, and that must be available. The stolen
base data SB and CS must also be available for pitchers. That leaves
only GIDP, SH and SF. I've never heard of those three being maintained
for pitchers. Does anybody know?

WP data is also maintained, and maybe that would go into any RC formula
as well.

Anyway, it's just a thought. For pitchers, RC wouldn't be called RC I
suppose ... Runs Allowed (RA) sounds better but that's already in use.

My guess is that an RC calculation like this for pitchers ought to be
comparable in value to RC for batters, and an RC calculation for
pitchers is one way to get rid of the inherited runner problem in the
allocation of regular runs allowed.

Can anybody comment?
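
For reference, the 11 inputs listed above match the commonly cited "technical" version of Runs Created. A minimal Python sketch of applying it to a pitcher's allowed totals follows. The function name is hypothetical, the coefficients (0.26 and 0.52) are the ones usually quoted for that version rather than anything stated in this thread, and batters faced is assumed to stand in for plate appearances. GIDP, SH and SF default to zero for the case where they are not tracked for pitchers.

```python
def rc_allowed(h, bb, hbp, tb, bf, sb=0, cs=0, ibb=0, gidp=0, sh=0, sf=0):
    """Runs Created against a pitcher, using the commonly cited
    'technical' RC form: A * B / C.

    Assumption: batters faced (bf) is used as the denominator C,
    i.e. it plays the role of plate appearances for hitters.
    """
    if bf <= 0:
        raise ValueError("batters faced must be positive")
    # A: baserunners, net of those erased by CS and GIDP
    a = h + bb + hbp - cs - gidp
    # B: advancement, with partial credit for walks and small-ball events
    b = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)
    return a * b / bf
```

Leaving the three untracked categories at zero will overstate the estimate a little for a pitcher who gets many double plays, so any comparison with actual runs allowed should keep that bias in mind.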

Clifford Blau

Apr 26, 1998, 3:00:00 AM

James Tuttle <j...@daedal.net> wrote:

>Has anybody ever thought about calculating RC for pitchers?

>My guess is that an RC calculation like this for pitchers ought to be
>comparable in value to RC for batters, and an RC calculation for
>pitchers is one way to get rid of the inherited runner problem in the
>allocation of regular runs allowed.
>
>Can anybody comment?

I've done this. I found that the average error between actual runs
allowed and RC was about 10% of actual runs.

------------
Clifford Blau
http://pw2.netcom.com/~proboy/orb.htm

James Tuttle

Apr 26, 1998, 3:00:00 AM

Clifford Blau wrote:

>
> James Tuttle wrote:
>
> >Has anybody ever thought about calculating RC for pitchers?
>
> >My guess is that an RC calculation like this for pitchers ought to
> >be comparable in value to RC for batters, and an RC calculation for
> >pitchers is one way to get rid of the inherited runner problem in
> >the allocation of regular runs allowed.
>
> I've done this. I found that the average error between actual runs
> allowed and RC was about 10% of actual runs.

Interesting. Without looking at a statistics book, this means that if
the median calculated value coincides with the actual mean, then half
the time the calculated value is higher than actual, and half the time
it's lower. Since the average error is 10% of actual, more than 50% of
the calculated results are within plus or minus 10% -- the central part
of the bell curve.

What I don't know without looking at a book is the relationship between
mean error (absolute value, or maybe RMS) and the standard deviation.

Seems that this might provide a measure of the accuracy of RC.
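
If the errors are assumed to be normally distributed around zero (an assumption, not something established in this thread), the relationship asked about above has a closed form: the mean absolute error equals the standard deviation times sqrt(2/pi), roughly 0.80. A short Python sketch with hypothetical function names:

```python
import math

def sd_from_mean_abs_error(mae):
    """Standard deviation implied by a mean absolute error,
    assuming zero-mean normally distributed errors."""
    return mae * math.sqrt(math.pi / 2.0)

def share_within(limit, sd):
    """Fraction of zero-mean normal errors with magnitude below `limit`."""
    return math.erf(limit / (sd * math.sqrt(2.0)))

# The 10% figure quoted above, treated as a mean absolute error.
sd = sd_from_mean_abs_error(0.10)
share = share_within(0.10, sd)
print(f"implied SD: {sd:.3f}, share within +/-10%: {share:.3f}")
```

Under that assumption the implied standard deviation is about 12.5% of actual runs, and a bit more than half of the estimates land within plus or minus 10%.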

Chris Dial

Apr 26, 1998, 3:00:00 AM

Clifford Blau wrote in message
<3543ac5e...@dfw-ixnews3.ix.netcom.com>...

>James Tuttle <j...@daedal.net> wrote:
>
>>Has anybody ever thought about calculating RC for pitchers?
>
>>My guess is that an RC calculation like this for pitchers ought to be
>>comparable in value to RC for batters, and an RC calculation for
>>pitchers is one way to get rid of the inherited runner problem in the
>>allocation of regular runs allowed.
>>
>>Can anybody comment?

>
>I've done this. I found that the average error between actual runs
>allowed and RC was about 10% of actual runs.


This is what I use to evaluate pitchers for Strat. Of course, I do it for
lefty-righty, based on what the Strat card looks like.

IIRC, 10% is the estimate for unearned runs. Most of those wouldn't show up
in the RA.

I don't understand why RC would be used for batters and not for pitchers.
The closest explanation anyone has attempted to give me is that ERA is
readily available and/or that there are "clutch" pitchers (pitching better
with men on base).

I have made cursory glances, and there aren't many pitchers (with lots of
BF) that do. David Cone and Roger Clemens. Clemens really is about the
same. And if "everybody" hits better with ROB, then all pitchers must pitch
worse (on average).

So why isn't RA or Opponents' OPS used to evaluate pitchers?

Chris Dial

Gregory Bunimovich

Apr 27, 1998, 3:00:00 AM

Chris Dial (acdial<nospam>@intrex.net) wrote:

: I don't understand why RC would be used for batters and not for pitchers.
: The closest explanation anyone has attempted to give me is that ERA is
: readily available and/or that there are "clutch" pitchers (pitching better
: with men on base).

: I have made cursory glances, and there aren't many pitchers (with lots of
: BF) that do. David Cone and Roger Clemens. Clemens really is about the
: same. And if "everybody" hits better with ROB, then all pitchers must pitch
: worse (on average).

: So why isn't RA or Opponents' OPS used to evaluate pitchers?

I would argue that Runs Created does not measure exact value. It is an
estimate, a good one, but an estimate nonetheless. It is not absolute truth.
There's also the whole linear weights approach. ERA/RA measures more precisely
what we want to know, that is how many runs did the players contribute to the
team (vs. average or the mythical replacement level). Now, RC might have a
lower variance than ERA/RA, so that would be an advantage, but it's not as
true a measure.

BTW, I assume you meant RC, not RA, in the last line of your post. ERA vs.
RA is a whole other discussion.

Greg

User Name

Apr 27, 1998, 3:00:00 AM

I think this is a very interesting idea and am somewhat surprised
it hasn't been pursued more seriously. Stats like WHIP are
calculated but little is done with them.

Certainly, as noted, RA is a better measure of what a pitcher has
actually accomplished than RC would be. It doesn't really matter,
when assessing the past, whether a pitcher got out of a bases loaded
jam with a hanging curve ball (that was popped up) or an unhittable
slider.

But when trying to predict the future, we do want to know exactly
how well a pitcher pitched. I don't think very many people know
just how high the variance in ERA is. For a pitcher who throws
about 80 innings, I think the standard deviation is about 1.0.
(It would be somewhat less for a pitcher who seldom gave up
more than one run in an inning and higher for a pitcher prone
to big innings.) Assuming this, if you believed a pitcher should
have an ERA of 3.0 and still believed this after watching him
pitch for a season, ERA would be unlikely to discredit your judgment.
To reject the null hypothesis, on the basis of 80 innings, one
would need an ERA under 1.0 or over 5.0 -- and this doesn't
even take error from inherited runners into account.
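
The 1.0-to-5.0 band above is a two-standard-deviation interval around 3.0. The Python sketch below reproduces it and shows how the band would narrow with more innings. The function names are hypothetical, and the 1/sqrt(innings) scaling assumes innings behave like independent draws, which is an assumption rather than a measured result.

```python
import math

def era_sd(innings, sd_at_80=1.0):
    """ERA standard deviation over `innings`, scaled from the roughly
    1.0 figure suggested above for 80 innings (independence assumed)."""
    return sd_at_80 * math.sqrt(80.0 / innings)

def plausible_era_range(believed_era, innings, z=2.0):
    """ERAs within z standard deviations of the believed true level."""
    sd = era_sd(innings)
    return believed_era - z * sd, believed_era + z * sd

low_80, high_80 = plausible_era_range(3.0, 80)      # one reliever season
low_320, high_320 = plausible_era_range(3.0, 320)   # four times the innings
print(f"80 IP:  {low_80:.2f} to {high_80:.2f}")
print(f"320 IP: {low_320:.2f} to {high_320:.2f}")
```

Quadrupling the innings only halves the width of the band, which is the sample-size problem in concrete form.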

RC should have a much lower variance since the number of batters
faced is much larger than the number of innings -- I haven't
tried to estimate it though. The downside is that some pitchers
may give up more or fewer runs than RC suggests. There are factors
like how well pitchers hold runners on, frequency of double plays,
when they give up their walks, etc. that might give RC a consistent
error one way or the other for a particular pitcher.

Ray Heitmann

Paul G. Wenthold

Apr 27, 1998, 3:00:00 AM

User Name wrote:
>
> gbun...@fas.harvard.edu (Gregory Bunimovich) wrote:
> >Chris Dial (acdial<nospam>@intrex.net) wrote:
>
> >
> >: So why isn't RA or Opponents' OPS used to evaluate pitchers?
> >
> >I would argue that Runs Created does not measure exact value. It is an
> >estimate, a good one, but an estimate nonetheless. It is not absolute truth.
> >There's also the whole linear weights approach. ERA/RA measures more precisely
> >what we want to know, that is how many runs did the players contribute to the
> >team (vs. average or the mythical replacement level). Now, RC might have a
> >lower variance than ERA/RA, so that would be an advantage, but it's not as
> >true a measure.
> >
> >BTW, I assume you meant RC, not RA, in the last line of your post. ERA vs.
> >RA is a whole other discussion.
>
> I think this is a very interesting idea and am somewhat surprised
> it hasn't been pursued more seriously. Stats like WHIP are
> calculated but little is done with them.

Believe it or not, one of the first things I did when I
got into analysis was to look at sort-of-LW calculation
of ERA. You can imagine my surprise when, no matter what
I tried, I couldn't get ERA to depend on strike out rate!
Every approach I used had SO either non-contributing or
positively correlating with ERA. Certainly an eye opener.

Personally, I prefer using models to estimate pitchers'
contribution, because I think they are a better predictor
of future performance as the bullpen and ER/R effects will tend
to even out. However, in today's era of including situational
effects (James has put batting with runners on base into the
new RC) in order to get a more precise reflection of what the
players have done, people will continue to rely heavily on ERA.
I don't necessarily agree that this is the way it should go; I
think the more detailed approach to the past gives too much
credit/penalty for things that are beyond a player's control,
but I don't worry too much about it.

paul
--
Invention is 93% perspiration, 6% electricity, 4% inspiration,
and 2% butterscotch ripple --- Willie Wonka

Don Malcolm

Apr 27, 1998, 3:00:00 AM

User Name wrote:
>
> gbun...@fas.harvard.edu (Gregory Bunimovich) wrote:
> >Chris Dial (acdial<nospam>@intrex.net) wrote:
>
> >
> >: So why isn't RA or Opponents' OPS used to evaluate pitchers?
> >
> >I would argue that Runs Created does not measure exact value. It is an
> >estimate, a good one, but an estimate nonetheless. It is not absolute truth.
> >There's also the whole linear weights approach. ERA/RA measures more precisely
> >what we want to know, that is how many runs did the players contribute to the
> >team (vs. average or the mythical replacement level). Now, RC might have a
> >lower variance than ERA/RA, so that would be an advantage, but it's not as
> >true a measure.
> >
> >BTW, I assume you meant RC, not RA, in the last line of your post. ERA vs.
> >RA is a whole other discussion.
>
> I think this is a very interesting idea and am somewhat surprised
> it hasn't been pursued more seriously. Stats like WHIP are
> calculated but little is done with them.

[more technical discussion snipped]

Ray, just to let you and others participating in this thread know:
BBBA has been providing RC/G data for pitchers since 1989. It's been
part of the data we purchase from STATS. Some of it is currently
available at the BBBA web site; the rest of it will be added over
the rest of this year.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Available now from Masters Press @ 1-800-9SPORTS
THE 1998 BIG BAD BASEBALL ANNUAL
--Off-beat, on-target, overflowing with data--
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1998 seasonal coverage available three times a week in
"Thoughts Out of Season" at BBBA web site--
http://www.backatcha.com

User Name

Apr 27, 1998, 3:00:00 AM

Don Malcolm <dmal...@backatcha.com> wrote:
>Ray, just to let you and others participating in this thread know:
>BBBA has been providing RC/G data for pitchers since 1989. It's been
>part of the data we purchase from STATS. Some of it is currently
>available at the BBBA web site; the rest of it will be added over
>the rest of this year.
>

Don,

Have you or anyone else studied the predictive value of RC/G?
To be specific, if you want to predict a player's ERA (or RA)
in a given season, is it more useful to know the ERA from the
preceding season or the RC/G from the preceding season?
[paralleling James's study that ERA is a better predictor
of W-L than W-L is]
In short, is it better to have the more accurately measured
indicator even when it doesn't quite measure the right thing?

Ray Heitmann

User Name

Apr 27, 1998, 3:00:00 AM

"Paul G. Wenthold" <went...@dorothy.chem.ttu.edu> wrote:
>Believe it or not, one of the first things I did when I
>got into analysis was to look at sort-of-LW calculation
>of ERA. You can imagine my surprise when, no matter what
>I tried, I couldn't get ERA to depend on strike out rate!
>Every approach I used had SO either non-contributing or
>positively correlating with ERA. Certainly an eye opener.

You might be running into a biased sample problem here. High
strikeouts correspond to good pitching -- but not by as much
as people believe. So, marginal players with high strikeout
totals make the majors while similarly talented pitchers
languish in AAA.

It's strange trying to figure out what makes a good pitcher.
I saw a supposedly good prospect in a high school all-star
game in Denver and he didn't do so well. He really had
marvelous stuff and most of his pitches were unhittable -
so daunting in fact that the opposing batters kept their bats
on their shoulders, even for the strikes. But some of his pitches
were mistakes and they swung at those. In contrast, Rick Reuschel
convinced batters that they could hit his good pitches --
and they did hit them, just not where they wanted.


>
>Personally, I prefer using models to estimate pitchers'
>contribution, because I think they are a better predictor
>of future performance as the bullpen and ER/R effects will tend
>to even out. I don't necessarily agree that this is the way
>it should go, and that the more detailed past approach gives
>too much credit/penalty for things that are beyond a player's
>control, but I don't worry too much about it.

My concern is not the ER/R or bullpen effects. One thing beyond
a pitcher's control is just what the batter will do with a
pitch once he's thrown it. Good pitches are hit over the wall
and bad ones are lined right at somebody. ERA, which weights
crucial situations more heavily, is more affected by sample size
problems than RC is.

Ray Heitmann

Geoffrey E. Caveney

Apr 28, 1998, 3:00:00 AM

Chris Dial (acdial<nospam>@intrex.net) wrote:
:
: I don't understand why RC would be used for batters and not for pitchers.
: The closest explanation anyone has attempted to give me is that ERA is
: readily available and/or that there are "clutch" pitchers (pitching better
: with men on base).

: I have made cursory glances, and there aren't many pitchers (with lots of
: BF) that do. David Cone and Roger Clemens. Clemens really is about the
: same. And if "everybody" hits better with ROB, then all pitchers must pitch
: worse (on average).

: So why isn't RA or Opponents' OPS used to evaluate pitchers?

Whether all pitchers pitch worse with ROB isn't the question. The
question is *how much worse* they pitch with ROB than they do with bases
empty. If the league average is to allow, say, .030 higher OPS with ROB,
then a pitcher who allows .010 higher OPS with ROB is a "clutch pitcher";
in other words, his overall OPS allowed or RC allowed is *not* an accurate
measure of his value. This is why it's best to stick with some form of
ERA. (For starters, that is. For relievers, I rather like the "change in
win expectancy" computations that someone has been posting to this ng.)
Nolan Ryan is often cited as an example of a pitcher with misleadingly
good OPS allowed stats. He consistently allowed more runs than one would
expect from the OPS allowed, because his pitching with ROB suffered
relatively more compared to other top pitchers.

Ron Johnson

Apr 28, 1998, 3:00:00 AM

In article <6i3uc6$7sm$1...@hirame.wwa.com>,

Geoffrey E. Caveney <cav...@wwa.com> wrote:

> Nolan Ryan is often cited as an example of a pitcher with misleadingly
>good OPS allowed stats. He consistently allowed more runs than one would
>expect from the OPS allowed, because his pitching with ROB suffered
>relatively more compared to other top pitchers.


But this doesn't follow.

OPS works as an estimate because the things that it doesn't take into
account *tend* to come out in the wash.

Stolen bases aren't terribly important, so you'll usually do OK
if you ignore them. Not in Ryan's case though. He was very easy
to run on.

Likewise DP support. It's basically a function of runners on base
and number of ground balls. Ryan was an extreme flyball pitcher
and therefore had poor DP support.

In addition, OPS is really made up of 3 components: BA, ISO and OBP.
BA is over-valued in OPS. Again in most cases you don't need to pay
too much attention to this but Ryan is an extreme case here too.

Combine these factors and it's easy to see that Ryan need not have
pitched poorly with runners on to allow more runs than OPS alone
would predict.

We've only got situational stats for Ryan from 1984 on. I don't
recall them being poor (or even relatively poor) but I will check.

--
RNJ

Chris Dial

Apr 28, 1998, 3:00:00 AM

Geoffrey E. Caveney wrote in message <6i3uc6$7sm$1...@hirame.wwa.com>...

>Chris Dial (acdial<nospam>@intrex.net) wrote:
>:
>: I don't understand why RC would be used for batters and not for pitchers.
>: The closest explanation anyone has attempted to give me is that ERA is
>: readily available and/or that there are "clutch" pitchers (pitching
>: better with men on base).
>
>: I have made cursory glances, and there aren't many pitchers (with lots of
>: BF) that do. David Cone and Roger Clemens. Clemens really is about the
>: same. And if "everybody" hits better with ROB, then all pitchers must
>: pitch worse (on average).
>
>: So why isn't RA or Opponents' OPS used to evaluate pitchers?
>
> Whether all pitchers pitch worse with ROB isn't the question. The
>question is *how much worse* they pitch with ROB than they do with bases
>empty. If the league average is to allow, say, .030 higher OPS with ROB,
>then a pitcher who allows .010 higher OPS with ROB is a "clutch pitcher";

Except that the league average is just .015 higher in OPS w/ ROB. Now the
pitcher has to have an OPS w/ ROB lower than his usual.

Shouldn't that be the same for clutch hitters? And then is that difference
not random variation or within a std. deviation?

>in other words, his overall OPS allowed or RC allowed is *not* an accurate
>measure of his value. This is why it's best to stick with some form of
>ERA. (For starters, that is. For relievers, I rather like the "change in
>win expectancy" computations that someone has been posting to this ng.)

I have heard that, but I haven't seen that. You say it isn't an accurate
measure of his value, but your explanation isn't convincing.


> Nolan Ryan is often cited as an example of a pitcher with misleadingly
>good OPS allowed stats. He consistently allowed more runs than one would
>expect from the OPS allowed, because his pitching with ROB suffered
>relatively more compared to other top pitchers.

Could you show me his ranking, and the tables from whence this came?

And I don't know how that is different from hitters.

The last three years, Bonds' OPS with ROB (1.140) is higher than his regular
OPS (1.039)(more so than one might expect). Would he be a clutch hitter?
Shouldn't this be part of his value? 100 points of OPS is a lot, and it
isn't all IBBs, his OBP is +58, and his SLG is +43.

Chris Dial

John Clay Davenport

Apr 28, 1998, 3:00:00 AM

In article <6i4igd$7...@cosmos.ccrs.emr.ca>,

Ron Johnson <joh...@cosmos.ccrs.emr.ca> wrote:
>In article <6i3uc6$7sm$1...@hirame.wwa.com>,
>Geoffrey E. Caveney <cav...@wwa.com> wrote:
>
>> Nolan Ryan is often cited as an example of a pitcher with misleadingly
>>good OPS allowed stats. He consistently allowed more runs than one would
>>expect from the OPS allowed, because his pitching with ROB suffered
>>relatively more compared to other top pitchers.
>
>
>But this doesn't follow.

Sure it does, although it is *not* conclusive proof.

>OPS works as an estimate because the things that it doesn't take into
>account *tend* to come out in the wash.
>
>Stolen bases aren't terribly important, so you'll usually do OK
>if you ignore them. Not in Ryan's case though. He was very easy
>to run on.
>
>Likewise DP support. It's basically a function of runners on base
>and number of ground balls. Ryan was an extreme flyball pitcher
>and therefore had poor DP support.

Both true. One of the other things that a model like OPS implicitly assumes,
though, is that the various events are randomly distributed. If you have a
pitcher, or team, for whom the productive events are not randomly distributed,
but are instead clumped together, you'll get more runs.

Extreme example: suppose, in a game, a team had five singles and nothing else.
EQR would say this team would average about 0.70 runs; RC says .78; LW can't
really handle this problem, and gives a value of about 0.10.

All of them say "good chance of shutout", which we'd expect.

But we have no knowledge of how clumped or spread those singles are, and the
~.75 estimate is the result of some sort of "average clumping". Give us the
knowledge that all five hits came in one inning, and I'd say 2.5 runs off the
top of my head (RC says 3.1, EQR 2.8, LW 2.1 for a 5 singles in one inning
situation).

Concentrate your good events and you'll get more runs out of them.
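
The RC figures above appear consistent with the basic Runs Created formula, (H + BB) * TB / (AB + BB), if at-bats are taken as outs plus hits. The Python sketch below works through that arithmetic. That this is the version behind the quoted numbers is an assumption, and the function name is illustrative.

```python
def basic_rc(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)

# Five singles spread over a nine-inning game: 27 outs + 5 hits = 32 AB.
game_rc = basic_rc(hits=5, walks=0, total_bases=5, at_bats=27 + 5)

# The same five singles packed into one inning: 3 outs + 5 hits = 8 AB.
inning_rc = basic_rc(hits=5, walks=0, total_bases=5, at_bats=3 + 5)

print(f"whole game: {game_rc:.2f}, single inning: {inning_rc:.2f}")
```

The numerator is identical in both cases; only the number of outs in the denominator changes, which is how a multiplicative formula like RC rewards concentrating the same events into fewer outs.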

>In addition, OPS is really made up of 3 components. BA, ISO and OBP.
>BA is over-valued in OPS. Again in most cases you don't need to pay
>too much attention to this but Ryan is an extreme case here too.

Disagree. I'd say the biggest problem for OPS is that ISO is weighted too high.

>Combine these factors and it's easy to see that Ryan need not have
>pitched poorly with runners on to allow more runs than OPS alone
>would predict.

True. But to get to the magnitude of the difference, I think you have to
include it.

>We've only got situational stats for Ryan from 1984 on. I don't
>recall them being poor (or even relatively poor) but I will check.

I recall them being horrid, but don't have the materials here to check.

At the risk of having my stathead card revoked, I think there's a good chance
that, from the windup, Nolan Ryan was the greatest pitcher ever.

From the stretch, though, he was nothing special, and so the total package
only comes out as "pretty good", nowhere near the top.

Clay D.
--
Clay Davenport cdave...@nesdis.noaa.com Meteorologist
NESDIS/NOAA, 5200 Auth Rd Rm 601, Camp Springs, MD 20746
Phone 301-763-8251 x36
Author, Baseball Prospectus 1998 www.baseballprospectus.com

Nelson Lu

Apr 28, 1998, 3:00:00 AM

In article <6i51hg$5nj$1...@hovis.rdc.noaa.gov>,

John Clay Davenport <cl...@orbit.nesdis.noaa.gov> wrote:

>At the risk of having my stathead card revoked, I think there's a good chance
>that, from the windup, Nolan Ryan was the greatest pitcher ever.

In '84-'92, Ryan was .273/.293 with bases empty, and .316/.347 with runners on.
I think that hardly qualifies as horrible. In fact, it's a significant
difference but not so big as to strongly support the notion of clutch pitching.

===============================================================================
GO ANAHEIM ANGELS!
===============================================================================
Nelson Lu (n...@cs.stanford.edu)

DougP001

Apr 28, 1998, 3:00:00 AM

In article <6i542g$dro$1...@nntp.Stanford.EDU>, n...@Xenon.Stanford.EDU (Nelson Lu)
writes:

>>At the risk of having my stathead card revoked, I think there's a good
>>chance that, from the windup, Nolan Ryan was the greatest pitcher ever.
>
>In '84-'92, Ryan was .273/.293 with bases empty, and .316/.347 with
>runners on. I think that hardly qualifies as horrible. In fact, it's a
>significant difference but not so big as to strongly support the notion
>of clutch pitching.

Moreover, there's no need to resort to terms like "clutch pitching" to
describe this phenomenon. Pitching from the windup is physiologically
different enough from pitching from the stretch -- especially for a power
pitcher like Ryan, with a slow delivery -- as to constitute a separate but
related skill.
Doug Pappas

London David

Apr 28, 1998, 3:00:00 AM

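In article <6i2vmv$qor$1...@geraldo.cc.utexas.edu> User Name
<user...@mail.utexas.edu> writes:
>
>Have you or anyone else studied the predictive value of RC/G?
>To be specific, if you want to predict a player's ERA (or RA)
>in a given season, is it more useful to know the ERA from the
>preceding season or the RC/G from the preceding season?
>[paralleling James's study that ERA is a better predictor
>of W-L than W-L is]
>In short, is it better to have the more accurately measured
>indicator even when it doesn't quite measure the right thing?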

And I'd like to add to this request. What is the best tool for
predicting future pitching performance? There are a bunch of stats out
there which can be used: ERA/RA, OOPS (or RC/G), K/W ratio, QMAX. Has
anyone ever looked at how these stats correlate with pitching
performance from year to year?

For example, I'd always heard that a young pitcher's K rate, or K/W
ratio, was the best predictor of future performance. But the BBBA
pushes QMAX. QMAX basically measures how well the pitcher prevents
hits and walks (at the moment, it doesn't include how well the pitcher
prevents extra-base hits). So this is a quite different measure from
the K/W rate. If a young finesse pitcher prevents hits and walks, but
doesn't K a lot of hitters, he'll rank much better in QMAX than in K/W
ratio. OTOH, a power pitcher may give up a lot of hits and walks, and
even have a high(ish) ERA. But if he has a good K rate, he may have a
good K/W ratio, despite not having a good QMAX rating. So these are
quite different ways of assessing pitching performance, but one
probably correlates better with future performance than the
other. Presumably the guys at the BBBA have looked at the QMAX data --
how well does it do?

This type of question is, of course, important when one tries to rate
young pitchers in the minors. I believe that the current baseball
wisdom is to look at the K/W rate, as well as the ERA (W/L records
seem to be of less importance, even for baseball insiders). Thus,
power pitchers get a lot of chances, whereas young control pitchers
often never get promoted, even if their ERAs are excellent. Is this
conventional wisdom correct?

It may well be, given the size of the variance in pitching performance
due to injuries, confidence, etc., that none of the measures do a
particularly good job in correlations of year-to-year performance. But
that would also be interesting to know.

David London


Keri Olsen and Arne Olson

Apr 28, 1998, 3:00:00 AM


Chris Dial wrote:

> > Whether all pitchers pitch worse with ROB isn't the question. The
> >question is *how much worse* they pitch with ROB than they do with bases
> >empty. If the league average is to allow, say, .030 higher OPS with ROB,
> >then a pitcher who allows .010 higher OPS with ROB is a "clutch pitcher";
>
> Except that the league average is just .015 higher in OPS w/ ROB. Now the
> pitcher has to have an OPS w/ ROB lower than his usual.

Seems like an awfully small difference to me. Source?

> Shouldn't that be the same for clutch hitters? And then is that difference
> not random variation or within a std. deviation?

The difference is that you have huge sample sizes if all you're trying to do is
establish that the league, as a whole, hits better with runners on base. Hence,
even a small coefficient like .015 can be significant. I assume this has been
accounted for in the clutch hitting studies.
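
A rough Python sketch of why the same .015 gap can be significant for a league and meaningless for one player. It treats the split as a difference between two per-plate-appearance averages, which OPS only loosely is, and the per-plate-appearance standard deviation and the plate-appearance counts are hypothetical values chosen for illustration, not measured ones.

```python
import math

def split_z_score(diff, per_pa_sd, n_empty, n_rob):
    """z-score for a difference between two sample means
    (bases empty vs. runners on), assuming equal spread in both."""
    standard_error = per_pa_sd * math.sqrt(1.0 / n_empty + 1.0 / n_rob)
    return diff / standard_error

PER_PA_SD = 1.0  # hypothetical spread of a per-PA outcome on an OPS-like scale

# Hypothetical sample sizes: a league season vs. one regular's season.
league_z = split_z_score(0.015, PER_PA_SD, n_empty=80_000, n_rob=80_000)
player_z = split_z_score(0.015, PER_PA_SD, n_empty=300, n_rob=300)
print(f"league z: {league_z:.2f}, single player z: {player_z:.2f}")
```

With these made-up inputs the league-wide gap sits about three standard errors from zero, while the same gap for a single player is well inside the noise.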

> >in other words, his overall OPS allowed or RC allowed is *not* an accurate
> >measure of his value. This is why it's best to stick with some form of
> >ERA. (For starters, that is. For relievers, I rather like the "change in
> >win expectancy" computations that someone has been posting to this ng.)
>
> I have heard that, but I haven't seen that. You say it isn't an accurate
> measure of his value, but your explanation isn't convincing.

OPS assumes that events are distributed randomly, i.e., that no hitter possesses
the ability to hit better with runners on. Since no hitter has been shown to
possess that ability to a greater degree than any other, the slight deviation
from pure randomness doesn't hurt. There appears to be evidence that pitchers
do possess this skill to varying degrees, and OPS against would penalize those
pitchers. Actual runs allowed would not (though it has other problems as a
performance measure).

> The last three years, Bonds' OPS with ROB (1.140) is higher than his regular
> OPS (1.039)(more so than one might expect). Would he be a clutch hitter?
> Shouldn't this be part of his value? 100 points of OPS is a lot, and it
> isn't all IBBs, his OBP is +58, and his SLG is +43.

First off, three years may not be enough data to establish a pattern. Second,
if everybody hits higher with runners on, doesn't that make everybody a clutch
hitter?

Arne

David Grabiner

Apr 29, 1998, 3:00:00 AM

Keri Olsen and Arne Olson <keri...@ix.netcom.com> writes:

> Chris Dial wrote:



> > Shouldn't that be the same for clutch hitters? And then is that difference
> > not random variation or within a std. deviation?

> The difference is that you have huge sample sizes if all you're trying
> to do is establish that the league, as a whole, hits better with
> runners on base. Hence, even a small coefficient like .015 can be
> significant. I assume this has been accounted for in the clutch
> hitting studies.

I allow for a similar effect in my clutch hitting study. The average
player loses 21 points of OPS in the late innings of close games,
because more of the pitching is done by good starters and relief aces.



> Second, if everybody hits higher with runners on, doesn't that make
> everybody a clutch hitter?

I would say no; a clutch hitter is a hitter with a special ability to
take advantage of the situation. There are several other reasons why
everyone would hit better with runners on base. Runners are more likely
to be on base with a bad pitcher on the mound. The defense also adjusts
to runners on base in ways that help the batter, in order to keep runners
from advancing: holding a runner on first base, playing the infield
in or at double-play depth, and throwing pitchouts.

--
David Grabiner, grab...@math.lsa.umich.edu
http://www.math.lsa.umich.edu/~grabiner
Shop at the Mobius Strip Mall: Always on the same side of the street!
Klein Glassworks, Torus Coffee and Donuts, Projective Airlines, etc.

Don Malcolm

Apr 29, 1998, 3:00:00 AM

London David wrote:

> In article <6i2vmv$qor$1...@geraldo.cc.utexas.edu> User Name
> <user...@mail.utexas.edu> writes:
> >
> >Don Malcolm <dmal...@backatcha.com> wrote:
> >>Ray, just to let you and others participating in this thread know:
> >>BBBA has been providing RC/G data for pitchers since 1989. It's been
> >>part of the data we purchase from STATS. Some of it is currently
> >>available at the BBBA web site; the rest of it will be added over
> >>the rest of this year.
> >>
> >
> >Don,
> >
> >Have you or anyone else studied the predictive value of RC/G?
> >To be specific, if you want to predict a player's ERA (or RA)
> >in a given season, is it more useful to know the ERA from the
> >preceding season or the RC/G from the preceding season?
> >[paralleling James's study that ERA is a better predictor
> >of W-L than W-L is]
> >In short, is it better to have the more accurately measured
> >indicator even when it doesn't quite measure the right thing?
>
> And I'd like to add to this request. What is the best tool for
> predicting future pitching performance? There are a bunch of stats out
> there which can be used: ERA/RA, OOPS (or RC/G), K/W ratio, QMAX. Has
> anyone ever looked at how these stats correlate with pitching
> performance from year to year?

Sounds like a job for Ron Johnson, actually. :-)

Seriously, I don't know of anyone who's made a correlation
study of all of the stats/methods you've listed. It's important,
though, to separate these out in terms of what they measure
and how they measure it. There is no one answer here, and
the issues involved in pitching performance are multivariate
enough to produce a lot of noise.

Also keep in mind that QMAX is designed to work only with
starting pitchers. (I'm sure you're already aware of that,
David, but others may not be.)

> For example, I'd always heard that a young pitcher's K rate, or K/W
> ratio, was the best predictor of future performance. But the BBBA
> pushes QMAX. QMAX basically measures how well the pitcher prevents
> hits and walks (at the moment, it doesn't include how well the pitcher
> prevents extra-base hits). So this is a quite different measure from
> the K/W rate. If a young finesse pitcher prevents hits and walks, but
> doesn't K a lot of hitters, he'll rank much better in QMAX than in K/W
> ratio. OTOH, a power pitcher may give up a lot of hits and walks, and
> even have a high(ish) ERA. But if he has a good K rate, he may have a
> good K/W ratio, despite not having a good QMAX rating. So these are
> quite different ways of assessing pitching performance, but one
> probably correlates better with future performance than the
> other. Presumably the guys at the BBBA have looked at the QMAX data --
> how well does it do?

Those gambling buddies from my Vegas days (one of whom was *not*
Joe Sneed, BTW) tipped me off on the fact that hits/walks seemed
to work best for projecting pitching performance. (Those of you
who've actually read Sneed's comments about his stats--BTW, you
should *always* be concerned when someone names his statistics
after himself--may recall that he mentions the same thing; it
turns out that Sneed has some pretty hefty Vegas ties.)

Note that they didn't say total bases and walks, just hits and
walks. That was one of the original starting points for QMAX:
let the runs take care of themselves in the distribution.

Some of the initial studies we did prior to launching the tool
in the 1997 book looked at K rates, K/W rates, WHIP, etc. as
a way to gauge pitching performance. Of these, the WHIP data
was the best, but it was merely deterministic. We wanted something
that was more probabilistic in nature. Hence a two-dimensional
matrix with value ranges: a quality matrix (QMAX).

QMAX was tested against a lot of the measures listed above
(though RC/G wasn't one of them). We found that pitchers were
a lot more clustered in their ISO data than hitters, which made
it possible to bypass (for the most part, at least) the XB
component. The most interesting data I recall off the top of
my head involved the test of QMAX "1S" games with high-K
games (where pitchers strike out at least a batter an inning).
The "1S" games (outings where the starter pitches at least
four more innings than the number of hits allowed) beat the
high-K games in winning percentage (~.770 to ~.680).

Interestingly, when we took out the common games (low hits
*and* high strikeouts), the gap between the two widened ("1S"
games with less than a K per IP had a ~.750 winning percentage;
high-K games outside the "1S" range had a ~.620 WPCT).
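
The split described above can be sketched in code. This is only an illustration of the two rules as stated in this post (a "1S" game is a start with at least four more innings than hits allowed; a high-K game has at least one strikeout per inning). The function name and the sample pitching lines are made up, and the full QMAX matrix is not reproduced here.

```python
def classify_start(innings, hits, strikeouts):
    """Label one start using the two rules quoted in the post.

    innings is a plain number (6 1/3 innings = 6 + 1/3), not the
    box-score "6.1" shorthand.
    """
    is_1s = innings - hits >= 4        # "1S": IP exceeds hits by 4 or more
    is_high_k = strikeouts >= innings  # at least a strikeout per inning
    if is_1s and is_high_k:
        return "1S and high-K"
    if is_1s:
        return "1S only"
    if is_high_k:
        return "high-K only"
    return "neither"


# Hypothetical pitching lines: (IP, H, K)
starts = [(8, 3, 9), (7, 3, 4), (6, 7, 8), (5, 6, 2)]
for ip, h, k in starts:
    print(ip, h, k, classify_start(ip, h, k))
```

Separating the "only" groups from the overlap is what allows the comparison with the common games taken out.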

Now as far as identifying "power" and "finesse" pitchers per
your discussion above: it's difficult to do this in the same
way in QMAX, because we don't worry about the K rate. Not at
the individual game level, that is. There's no question that
Bill James was correct in his judgment that pitchers with
higher K rates have longer careers; pitchers with lower
QMAX "S" rates (hit prevention) do, too. (Consider, for example,
Nolan Ryan). But we weren't trying to replicate Bill's insight:
we were hoping to stumble on something that had more predictive
value due to its probabilistic elements. There's still a lot to be
done--the data we need is at the game level, not the season
level.

> This type of question is, of course, important when one tries to rate
> young pitchers in the minors. I believe that the current baseball
> wisdom is to look at the K/W rate, as well as the ERA (W/L records
> seem to be of less importance, even for baseball insiders). Thus,
> power pitchers get a lot of chances, whereas young control pitchers
> often never get promoted, even if their ERA's are excellent. Is this
> conventional wisdom correct?

Rating young pitchers is complicated by the existence of the
PCL. (And this has been added to by the expansion of the PCL to
include a large portion of the AA: there's something a bit
surreal about reading ESPNet and discovering that the Cardinals
have optioned Manny Aybar to Memphis of the PCL. That's Memphis,
*Tennessee* we're talking about.) Sean Forman has taken an
interesting stab at this problem at his web site, where he
uses QMAX a bit differently to try to measure potential upside.
But we're all still scratching the surface of this.



> It may well be, given the size of the variance in pitching performance
> due to injuries, confidence, etc., that none of the measures do a
> particularly good job in correlations of year-to-year performance. But
> that would also be interesting to know.

I think many people already believe this to be true, and it
might well be the case. That shouldn't dissuade us from trying
to generate a measure that provides a better correlation,
however.

Keri Olsen and Arne Olson

Apr 29, 1998, 3:00:00 AM


David Grabiner wrote:

> Keri Olsen and Arne Olson <keri...@ix.netcom.com> writes:
>
> > Chris Dial
>
> > > Shouldn't that be the same for clutch hitters? And then is that difference
> > > not random variation or within a std. deviation?
>
> > The difference is that you have huge sample sizes if all you're trying
> > to do is establish that the league, as a whole, hits better with
> > runners on base. Hence, even a small coefficient like .015 can be
> > significant. I assume this has been accounted for in the clutch
> > hitting studies.
>
> I allow for a similar effect in my clutch hitting study. The average
> player loses 21 points of OPS in the late innings of close games,
> because more of the pitching is done by good starters and relief aces.
>
> > Second, if everybody hits higher with runners on, doesn't that make
> > everybody a clutch hitter?
>
> I would say no; a clutch hitter is a hitter with a special ability to
> take advantage of the situation. There are several other reasons why
> everyone would hit better with runners on base. Runners are more likely
> to be on base with a bad pitcher on the mound. The defense also adjusts
> to runners on base in ways which will help the batter to prevent runners
> from advancing, by holding a runner on first base, playing the infield
> in or at double-play depth, and throwing pitchouts.

I agree. I was assuming that the definition of "clutch hitter" would be exclusive
in some way; by demonstrating that everybody possesses this ability, I was trying
to imply that a clutch hitter would have to improve significantly more with runners
on than the average hitter.


Arne


vi...@baseball.org

Apr 29, 1998, 3:00:00 AM

User Name <user...@mail.utexas.edu> writes:

> Have you or anyone else studied the predictive value of RC/G?
> To be specific, if you want to predict a player's ERA (or RA)
> in a given season, is it more useful to know the ERA from the
> preceding season or the RC/G from the preceding season?
> [paralleling James's study that ERA is a better predictor
> of W-L than W-L is]

I've read in this newsgroup that looking at Opponents' OPS is a better
predictor of future pitching performance than ERA. In fact OOPS is a
better predictor of future *ERA* than past ERA is.

Chris Dial

Apr 29, 1998, 3:00:00 AM

Keri Olsen and Arne Olson wrote in message
<3546B9FB...@ix.netcom.com>...

>Chris Dial
>
>> > Whether all pitchers pitch worse with ROB isn't the question. The
>> >question is *how much worse* they pitch with ROB than they do with bases
>> >empty. If the league average is to allow, say, .030 higher OPS with ROB,
>> >then a pitcher who allows .010 higher OPS with ROB is a "clutch pitcher";
>>
>> Except that the league average is just .015 higher in OPS w/ ROB. Now the
>> pitcher has to have an OPS w/ ROB lower than his usual.
>
>Seems like an awfully small difference to me. Source?


You know, I'm sure that came from somewhere, but re-checking, it is more
like .030. Guess I should have checked then... This is what happens when
our selective memory kicks in.

>
>> Shouldn't that be the same for clutch hitters? And then is that difference
>> not random variation or within a std. deviation?
>
>The difference is that you have huge sample sizes if all you're trying to do is
>establish that the league, as a whole, hits better with runners on base. Hence,
>even a small coefficient like .015 can be significant. I assume this has been
>accounted for in the clutch hitting studies.
>

>> >in other words, his overall OPS allowed or RC allowed is *not* an accurate
>> >measure of his value. This is why it's best to stick with some form of
>> >ERA. (For starters, that is. For relievers, I rather like the "change in
>> >win expectancy" computations that someone has been posting to this ng.)
>>
>> I have heard that, but I haven't seen that. You say it isn't an accurate
>> measure of his value, but your explanation isn't convincing.
>
>OPS assumes that events are distributed randomly, i.e., that no hitter possesses
>the ability to hit better with runners on.

They *all* hit better.

>Since no hitter has been shown to
>possess that ability to a greater degree than any other, the slight deviation
>from pure randomness doesn't hurt. There appears to be evidence that pitchers
>do possess this skill to varying degrees, and OPS against would penalize those
>pitchers. Actual runs allowed would not (though it has other problems as a
>performance measure).


As I asked before, where is this evidence? Grabiner explained some of this
to me, but I haven't seen any good evidence of ERA being better than OOPS in
describing how well a pitcher pitches.


>
>> The last three years, Bonds' OPS with ROB (1.140) is higher than his regular
>> OPS (1.039)(more so than one might expect). Would he be a clutch hitter?
>> Shouldn't this be part of his value? 100 points of OPS is a lot, and it
>> isn't all IBBs, his OBP is +58, and his SLG is +43.
>
>First off, three years may not be enough data to establish a pattern.

May not? It is ~800 PAs. Pretty damn impressive.

>Second, if everybody hits higher with runners on, doesn't that make
>everybody a clutch hitter?


Arne, this sentence is the opposite of what you said about pitchers. No, if
everybody hits better with ROB, then the guys a std dev above them would be
clutch hitters (or some other line).
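
As a rough sketch of that "std dev above" line: the function below flags a hitter whose gain with runners on exceeds the league-wide gain by some number of standard deviations. The function name, the choice of one standard deviation, and the .040 spread are placeholders for illustration only; nobody in this thread has posted a measured spread.

```python
def is_clutch(rob_ops, overall_ops, league_gap, league_sd, k=1.0):
    """True if the hitter's ROB gain beats the league gap by k std devs."""
    gain = rob_ops - overall_ops
    return gain > league_gap + k * league_sd


# Bonds' figures as quoted above; league_gap is the ~.030 mentioned in
# this thread; league_sd = .040 is a placeholder, not a measured value.
print(is_clutch(1.140, 1.039, league_gap=0.030, league_sd=0.040))
```

Where the line actually falls depends entirely on the real spread of individual ROB gains, which would have to come from split data.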

CDial

vi...@baseball.org

Apr 29, 1998, 3:00:00 AM

"Chris Dial" <acdial<nospam>@intrex.net> writes:
> Clifford Blau wrote in message
> >James Tuttle <j...@daedal.net> wrote:

> >>Has anybody ever thought about calculating RC for pitchers?

> This is what I use to evaluate pitchers for Strat. Of course, I do it for
> lefty-righty, based on what the Strat card looks like.

I use the same thing for Diamond Mind (actually, I just use OPS).
First, it helps me spot pitchers vs. lefties or righties (I juggle my
rotation a lot to get better matchups). Second, the outcomes in DMB
are based on the events allowed by a pitcher, which are more
accurately measured by OPS than by runs allowed (i.e., a pitcher's 1997
OOPS gives me a better idea of how he'll do in 1997 DMB than his 1997
ERA will). Third, OOPS correlates more with future performance than
does ERA (I don't have an exact source, but I know I've seen that in
this group more than once).

> I don't understand why RC would be used for batters and not for pitchers.
> The closest explanation anyone has attempted to give me is that ERA is
> readily available and/or that there are "clutch" pitchers (pitching better
> with men on base).

Not necessarily "clutch" pitchers. Pitching is different with men on
base (holding runners, stretch vs. windup, etc).

Additionally, Diamond Mind has a "Jam" rating, which measures pitching
with runners on base; if a pitcher has a neutral Jam rating, I can
neglect this factor.

> So why isn't RA or Opponents' OPS used to evaluate pitchers?

In this year's STATS Scoreboard, they calculate Predicted ERA (OBA *
SA * 31). They note that a pitcher whose ERA greatly outperformed his
PERA is a good bet to falter in '98, and vice versa.
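
For anyone who wants to try it, the Predicted ERA arithmetic quoted above is a one-liner. This sketch uses only the formula as given in this post (OBA * SA * 31); the function names and the sample pitcher are hypothetical.

```python
def predicted_era(opp_oba, opp_sa, scale=31):
    """Predicted ERA as quoted in the post: opponents' OBA * SA * 31."""
    return opp_oba * opp_sa * scale


def era_gap(actual_era, opp_oba, opp_sa):
    """Negative gap: actual ERA was better than the components suggest."""
    return actual_era - predicted_era(opp_oba, opp_sa)


# Hypothetical pitcher: 3.20 ERA, opponents' OBA .320 and SA .400
print(round(predicted_era(0.320, 0.400), 2))
print(round(era_gap(3.20, 0.320, 0.400), 2))
```

A large negative gap marks the kind of pitcher the post describes as a good bet to falter.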

In the STATS All-Time Handbook, they calculate Component ERA
(opponents' RC) for every pitcher-season. Rob Neyer discussed it in a
column last week, I think. Somebody on STATS' web site had a column
about it (I forget who; he noted that most of the underachievers are
recent pitchers and the overachievers are old-timers).

So I think those measures are used fairly often, even if they're not
regularly posted to this group.

Ron Johnson

Apr 30, 1998, 3:00:00 AM

In article <3547AE...@backatcha.com>,

Don Malcolm <dmal...@backatcha.com> wrote:
>London David wrote:
>
>> In article <6i2vmv$qor$1...@geraldo.cc.utexas.edu> User Name
>> <user...@mail.utexas.edu> writes:
>> >
>> >Don,
>> >
>> >Have you or anyone else studied the predictive value of RC/G?
>> >To be specific, if you want to predict a player's ERA (or RA)
>> >in a given season, is it more useful to know the ERA from the
>> >preceding season or the RC/G from the preceding season?
>> >[paralleling James's study that ERA is a better predictor
>> >of W-L than W-L is]
>> >In short, is it better to have the more accurately measured
>> >indicator even when it doesn't quite measure the right thing?
>>
>> And I'd like to add to this request. What is the best tool for
>> predicting future pitching performance? There are a bunch of stats out
>> there which can be used: ERA/RA, OOPS (or RC/G), K/W ratio, QMAX. Has
>> anyone ever looked at how these stats correlate with pitching
>> performance from year to year?
>
>Sounds like a job for Ron Johnson, actually. :-)

As it happens, I've just started a study to see if I can get a first
cut estimate of how much impact team defense has on pitching.

The same data can be used to take a stab at this. Though the results
I get here won't be close to being conclusive. (Smallish study since
I'm only using pitchers with 100+ innings who only appeared for one
team in a year and only using 1995-97)

--
RNJ

jshe...@ucla.edu

Apr 30, 1998, 3:00:00 AM

In article <lfp11.3036$av.45...@carnaval.risq.qc.ca>,

lon...@ERE.UMontreal.CA (London David) wrote:
>
> In article <6i2vmv$qor$1...@geraldo.cc.utexas.edu> User Name
> <user...@mail.utexas.edu> writes:
> >
> >Don Malcolm <dmal...@backatcha.com> wrote:
> >>Ray, just to let you and others participating in this thread know:
> >>BBBA has been providing RC/G data for pitchers since 1989. It's been
> >>part of the data we purchase from STATS. Some of it is currently
> >>available at the BBBA web site; the rest of it will be added over
> >>the rest of this year.
> >>
> >
> >Don,
> >
> >Have you or anyone else studied the predictive value of RC/G?
> >To be specific, if you want to predict a player's ERA (or RA)
> >in a given season, is it more useful to know the ERA from the
> >preceding season or the RC/G from the preceding season?
> >[paralleling James's study that ERA is a better predictor
> >of W-L than W-L is]
> >In short, is it better to have the more accurately measured
> >indicator even when it doesn't quite measure the right thing?
>
> And I'd like to add to this request. What is the best tool for
> predicting future pitching performance? There are a bunch of stats out
> there which can be used: ERA/RA, OOPS (or RC/G), K/W ratio, QMAX. Has
> anyone ever looked at how these stats correlate with pitching
> performance from year to year?
>
> For example, I'd always heard that a young pitcher's K rate, or K/W
> ratio, was the best predictor of future performance. But the BBBA
> pushes QMAX. QMAX basically measures how well the pitcher prevents
> hits and walks (at the moment, it doesn't include how well the pitcher
> prevents extra-base hits). So this is a quite different measure from
> the K/W rate. If a young finesse pitcher prevents hits and walks, but
> doesn't K a lot of hitters, he'll rank much better in QMAX than in K/W
> ratio. OTOH, a power pitcher may give up a lot of hits and walks, and
> even have a high(ish) ERA. But if he has a good K rate, he may have a
> good K/W ratio, despite not having a good QMAX rating. So these are
> quite different ways of assessing pitching performance, but one
> probably correlates better with future performance than the
> other. Presumably the guys at the BBBA have looked at the QMAX data --
> how well does it do?
>
> This type of question is, of course, important when one tries to rate
> young pitchers in the minors. I believe that the current baseball
> wisdom is to look at the K/W rate, as well as the ERA (W/L records
> seem to be of less importance, even for baseball insiders). Thus,
> power pitchers get a lot of chances, whereas young control pitchers
> often never get promoted, even if their ERA's are excellent. Is this
> conventional wisdom correct?
>
> It may well be, given the size of the variance in pitching performance
> due to injuries, confidence, etc., that none of the measures do a
> particularly good job in correlations of year-to-year performance. But
> that would also be interesting to know.
>
> David London
>
>
Here's a simple, basic study I did for a college Intro Stats course. It has
flaws, but probably isn't totally worthless...

The question I seek to answer is this: What statistics best predict a
pitcher's ERA? I examined statistics for all Major League pitchers for the
years 1989-1992 who pitched at least 35 innings; I ask what data for the three
years 1989-1991 best predicts ERA for 1992?
I did not correct for park or league effects, mostly due to time
considerations.
The independent variables looked at were:
Earned Runs (ER), Earned Run Average (ERA=ER*9/IP), Run Average (RA=R*9/IP),
Innings Pitched (IP), Strikeouts (K, SO), Walks Allowed (BB, W), K/IP, BB/IP,
K/BB, Hits Allowed (H), H/IP, (W+H)/IP, age, Home Runs Allowed (HRs), HR/IP.
I look at these variables for 1991, 1990, 1989, 1990-91, and 1989-91. Thus I
see whether the past one, two, or three years' performance best predicts
current year performance, and what the relative importance of the different
past statistics is on current statistics.
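
The rate variables in that list can be computed from the raw season totals as in the sketch below. The formulas are the ones given above (ERA=ER*9/IP, RA=R*9/IP, and so on); the function name and the sample season line are hypothetical.

```python
def rate_stats(ip, er, r, k, bb, h, hr):
    """Rate statistics from one season's raw totals."""
    return {
        "ERA": er * 9 / ip,
        "RA": r * 9 / ip,
        "K/IP": k / ip,
        "BB/IP": bb / ip,
        "K/BB": k / bb,
        "H/IP": h / ip,
        "WHIP": (bb + h) / ip,   # (W+H)/IP
        "HR/IP": hr / ip,
    }


# Hypothetical season: 200 IP, 80 ER, 90 R, 150 K, 50 BB, 180 H, 20 HR
for name, value in rate_stats(200, 80, 90, 150, 50, 180, 20).items():
    print(f"{name:6s} {value:.3f}")
```

Multi-year versions (1990-91, 1989-91) would be built from summed totals in the same way; that is an assumption about how the composites were formed.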
I had about 240 pitchers with 35 or more innings pitched for 1992.
Since this project is essentially exploratory, the first thing I did was to
correlate all the variables with each other. Variables significantly
correlated with 1992 ERA were :
KBB91       -.3460**    ERA9190      .3389**    ERATOT       .3367**
ERA91        .3350**    KBB9190     -.3265**    KBBTOT      -.3190**
RA91         .3167**    KIP91       -.3025**    WHIP91       .3024**
WHIP9190     .2965**    KIP9190     -.2715**    KBB90       -.2541**
KTOT        -.2303*     HIP91        .2264**    BBIP9190     .2162*
BBIP90       .2009*     WHIP90       .2004*     BBIP91       .1952*
HIP9190      .1895*     KIP90       -.1845*

*  = significant at the .01 level (one-tailed).
** = significant at the .001 level (one-tailed).
Many interesting things can be said about this table. First, the
strongest individual predictor of 1992 ERA is 1991 K/BB ratio, followed by the
various ERA variables for the different time periods, and then the K/BB ratios
for time periods other than 1991. Thus it seems reasonable to conclude that
previous ERA and K/BB ratio are the two most important factors in predicting
1992 ERA.
Next, notice that no data pertaining solely to 1989 is found to be
significantly related to 1992 ERA. This is particularly surprising
considering that only pitchers of some quality pitch 35+ innings in four
successive years. Apparently, however, there is little relation between their
performance in the first and last of those years.
Notice also that K/BB for any period is a better predictor than either
K/IP or BB/IP for that period. Since the relation between these three is so
strong, I only consider K/BB for the rest of the analysis.
Similarly, I drop RA since it is highly correlated with ERA, but less
well related to 1992 ERA. Actually, the same logic holds for all the
variables except for K/BB and previous ERA: none of them have significant
effects on 1992 ERA independent of K/BB and previous ERA. Thus the "correct"
model is some combination of one of the K/BB variables with one of the ERA
variables.
Note that even the strongest correlations are not exceptionally good.
1991 K/BB only explains 12% of the variance in 1992 ERA. Can we find a
multiple variable model that does any better?
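
The 12% figure is just the square of the simple correlation. A small sketch of that conversion, using a few of the correlations from the table (the function name is made up for illustration):

```python
def variance_explained(r):
    """Share of variance explained by one predictor: r squared."""
    return r * r


# Simple correlations with 1992 ERA, copied from the table above
correlations = {
    "KBB91": -0.3460,
    "ERA9190": 0.3389,
    "ERA91": 0.3350,
    "WHIP91": 0.3024,
}
for name, r in correlations.items():
    print(f"{name:8s} r = {r:+.4f}   r^2 = {variance_explained(r):.3f}")
```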
The first model I examined took the two variables with the highest
simple correlations to 1992 ERA: KBB91 and ERA9190. The results of this
regression were:
Multiple R           .39146
R Square             .15324
Adjusted R Square    .14266
Standard Error      1.02840

Analysis of Variance
              DF    Sum of Squares    Mean Square
Regression     2          30.62402       15.31201
Residual     160         169.21611        1.05760

F = 14.47806    Signif F = .0000

Variable            B        SE B        Beta        T    Sig T
KBB91        -.236770     .087947    -.228188   -2.692    .0079
ERA9190       .355351     .135764     .221849    2.617    .0097
(Constant)   2.946023     .611695                4.816    .0000
This looks like a robust regression. The overall model has an F of
14, significant at the .001 level. The variable coefficients are both
significant at the .01 level, and the signs are in the expected direction.
This model explains 14.3% of the variance in 1992 ERA, a slight increase over
either of the two variables alone.
The coefficients tell us that an increase of one K/BB91 lowers 1992
ERA by .24 (net ERA9190), and an increase in ERA9190 of one run per nine
innings (net K/BB91) increases 1992 ERA by .36. The standardized beta weights
tell us that a one S.D. change in K/BB91 has roughly the same impact on 1992
ERA as a one S.D. change in ERA9190.
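
The summary lines in that output follow from the analysis-of-variance table by standard formulas, which makes a handy sanity check. The sketch below recomputes them from the sums of squares and degrees of freedom reported above; the function names are illustrative, and the sample size is inferred from the degrees of freedom rather than stated in the post.

```python
def r_squared(ss_reg, ss_res):
    """R^2 = regression sum of squares / total sum of squares."""
    return ss_reg / (ss_reg + ss_res)


def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for k predictors fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(ss_reg, df_reg, ss_res, df_res):
    """F = regression mean square / residual mean square."""
    return (ss_reg / df_reg) / (ss_res / df_res)


# Values from the first regression (KBB91 and ERA9190)
ss_reg, df_reg = 30.62402, 2
ss_res, df_res = 169.21611, 160
n = df_reg + df_res + 1  # inferred: 163 pitchers in this fit

r2 = r_squared(ss_reg, ss_res)
print(f"R Square          {r2:.5f}")
print(f"Adjusted R Square {adjusted_r_squared(r2, n, df_reg):.5f}")
print(f"F                 {f_statistic(ss_reg, df_reg, ss_res, df_res):.3f}")
```

The recomputed values agree with the reported .15324, .14266 and 14.478 to within rounding.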
I then ran a regression using K/BB91 and ERA91. The results were:
Multiple R           .40693
R Square             .16560
Adjusted R Square    .15772
Standard Error      1.06349

Analysis of Variance
              DF    Sum of Squares    Mean Square
Regression     2          47.58521       23.79260
Residual     212         239.77273        1.13100

F = 21.03672    Signif F = .0000

Variable            B        SE B        Beta        T    Sig T
ERA91         .289986     .084942     .233811    3.414    .0008
KBB91        -.292352     .079382    -.252227   -3.683    .0003
(Constant)   3.284627     .416106                7.894    .0000
We see that this model is also quite robust. The F for the overall
model is 21, significant at the .001 level. This model explains slightly more
variance than the first one, at 16%. The independent variables are both
highly significant, at the .001 level.
Looking at the coefficients tells us that a one run per nine innings
increase in ERA91 raises ERA92 by .29 runs (net K/BB91), while an increase of
one K/BB91 decreases ERA92 by .29 runs (net ERA91). Looking at the standardized
betas, we notice that in this model K/BB seems to have a slightly stronger
effect on ERA92 than ERA91 does, both overall and compared to the first model.
Due to the slightly higher R^2, the higher level of significance on
the independent variables, and the fact that it doesn't force us to
calculate composite ERAs, I prefer the second model over the first.
Before we accept this model, however, an examination of residuals is
in order. First let's look at a histogram of standardized residuals:
N Exp N (* = 1 Cases, . : = Normal Curve)
0 .17 Out
2 .33 3.00 **
2 .84 2.67 :*
3 1.92 2.33 *:*
4 3.92 2.00 ***:
6 7.19 1.67 ******.
4 11.80 1.33 **** .
18 17.34 1.00 ****************:*
17 22.83 .67 ***************** .
34 26.93 .33 **************************:*******
31 28.46 .00 ***************************:***
28 26.93 -.33 **************************:*
23 22.83 -.67 **********************:
20 17.34 -1.00 ****************:***
13 11.80 -1.33 ***********:*
3 7.19 -1.67 *** .
5 3.92 -2.00 ***:*
2 1.92 -2.33 *:
0 .84 -2.67 .
0 .33 -3.00
0 .17 Out

These seem to be more or less normally distributed, which is good.
Next we should look at a plot of the residuals vs the predicted values. This
plot should be roughly scattered, with no discernible pattern. The plot
indeed looks fairly randomly scattered around the zero, zero point.
In conclusion, the "best" model is ERA92 = .29*ERA91 - .29*K/BB91 +
3.28. This model is significant at the .001 level, as are both coefficients.
We must note, however, that even though it is the "best" model, it still only
explains about 16% of the variance in ERA92. It seems the folk wisdom of
pitchers' unpredictability is, in general, correct. This study has found two
factors that can help anyone trying to predict ERA's. Neither is very
surprising. It is surprising that no other statistical indicator I examined
adds any information for predictive purposes.
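
To see what the second model predicts for a given pitcher, the coefficients can be plugged in directly. This sketch takes them from the ERA91/KBB91 regression table above (constant 3.284627, ERA91 .289986, KBB91 -.292352); the function name and the sample pitcher are hypothetical, and with roughly 16% of the variance explained the output is a very loose estimate.

```python
def predict_era92(era91, kbb91):
    """1992 ERA from the second regression's fitted coefficients."""
    constant = 3.284627
    b_era91 = 0.289986    # higher prior ERA raises the prediction
    b_kbb91 = -0.292352   # higher prior K/BB lowers the prediction
    return constant + b_era91 * era91 + b_kbb91 * kbb91


# Hypothetical pitcher: 4.00 ERA and a 2.0 K/BB ratio in 1991
print(round(predict_era92(4.00, 2.0), 2))
```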


User Name

Apr 30, 1998, 3:00:00 AM

Don Malcolm <dmal...@backatcha.com> wrote:
>
>> It may well be, given the size of the variance in pitching performance
>> due to injuries, confidence, etc., that none of the measures do a
>> particularly good job in correlations of year-to-year performance. But
>> that would also be interesting to know.
>
>I think many people already believe this to be true, and it
>might well be the case. That shouldn't dissuade us from trying
>to generate a measure that provides a better correlation,
>however.

I think the variance in pitching performance due to simple luck
tends to be greatly underestimated. OPS based systems seem more
attractive because of the larger sample size. My best guess --
without looking at lots of data -- was a standard deviation in ERA
for a typical reliever of about 1.0 and maybe 0.7 for a starter.
This would allow a starter who pitched at a 3.5 level and deviated from
the norm by two standard deviations to either win the Cy Young or
disappear from baseball (if baseball people relied solely on ERA).
Players really do have statistical offyears.
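
The arithmetic behind that range, as a sketch. The 3.5 level and the 0.7 and 1.0 standard deviations are the guesses stated above, not measured values, and the function name is made up for illustration.

```python
def era_band(true_era, sd, n_sd=2):
    """ERA range spanned by n_sd standard deviations of luck."""
    spread = n_sd * sd
    return true_era - spread, true_era + spread


# Guessed figures from the post: starter SD 0.7, reliever SD 1.0
for role, sd in [("starter", 0.7), ("reliever", 1.0)]:
    low, high = era_band(3.5, sd)
    print(f"{role:8s} {low:.2f} to {high:.2f}")
```

For the starter, the same underlying 3.5 level covers single seasons anywhere from about 2.10 to 4.90.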

I see rating young players as a slightly different problem
because they are less mature. Goals like maximize strikeouts
and minimize WHIP are much easier to aim at and seem better for
prospects to make their goals. So those prospects who achieve
these goals would likely be the best prospects. A pitcher who
can attain a low ERA despite a high WHIP or a good W-L record
with a relatively high ERA is usually either lucky or has excellent
game management skills. I would not expect to see such skills
in a pitcher under 30 and certainly never in a minor leaguer.

Ray Heitmann


Jonathan Bernstein

May 1, 1998, 3:00:00 AM

vi...@baseball.org wrote:

: I've read in this newsgroup that looking at Opponents' OPS is a better
: predictor of future pitching performance than ERA. In fact OOPS is a
: better predictor of future *ERA* than past ERA is.

I recall speculation to that effect, but not anyone saying it backed by
evidence. Am I remembering wrong?

JHB

