
How well do you understand randomness?


David Grabiner

Aug 7, 1991, 3:48:55 PM

Since there is a current discussion going on about clutch hitting,
choking, and winning streaks being due to luck or ability, I'd like to
challenge the net to understand luck. How likely is it that a team,
just by random chance, will have a long winning streak? How likely is
it that a player with no special ability to hit in the clutch will have
a great clutch performance?

First, write down what you consider a random sequence of 100 H's and
T's, one that might be obtained by flipping a coin.

Then, flip a coin 100 times, and write down that sequence.

Call one sequence A and the other B, and mail both to me. I should be
able to recognize the real coin flip most of the time.

Here's my example.

A:
HTTHH HTHTT TTHHT THHTH HHHTH THTTH TTTTH HHTHT TTTHT HTHTT
HHHTT THTTT HTTHH TTHTH THHHH TTTHT HHTTT TTHTH HTHHH TTHTH

B:
HTHHH THTTH THTTH THTHH THHTT HTHTH HTTTT HHHHH HTTHH HTTTH
TTTHT HTTHH HHTTT HTHHT TTHHT THTTH TTHHH HTHTT HTHHT THTTT

I would guess this one correctly; you can mail your guess with your
sequence.
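
One statistic that often separates invented sequences from real ones is the length of the longest streak. The sketch below is only an illustration, not necessarily the test used to pick between A and B: `longest_run`, `simulate_longest_runs`, and `share_at_least` are hypothetical helper names, and the fair coin and fixed seed are assumptions.

```python
import random
from itertools import groupby


def longest_run(seq: str) -> int:
    """Length of the longest block of identical symbols, ignoring whitespace."""
    flips = "".join(seq.split())
    return max((len(list(group)) for _, group in groupby(flips)), default=0)


def simulate_longest_runs(n_flips: int = 100, trials: int = 10_000, seed: int = 0) -> list[int]:
    """Longest run in each of `trials` simulated sequences of fair coin flips."""
    rng = random.Random(seed)  # fixed seed (an assumption) so results are repeatable
    return [
        longest_run("".join(rng.choice("HT") for _ in range(n_flips)))
        for _ in range(trials)
    ]


def share_at_least(runs: list[int], k: int) -> float:
    """Fraction of simulated sequences whose longest run is at least k."""
    return sum(r >= k for r in runs) / len(runs)
```

For example, `share_at_least(simulate_longest_runs(), 6)` estimates how often 100 fair flips contain a streak of six or more, and `longest_run` can be applied to sequences A and B for comparison.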

--
David Grabiner, grab...@zariski.harvard.edu
"We are sorry, but the number you have dialed is imaginary."
"Please rotate your phone 90 degrees and try again."
Disclaimer: I speak for no one and no one speaks for me.

David M Tate

Aug 9, 1991, 12:10:04 AM

OK, I've got the bugs worked out (I think) in the MLV formula. Here's the
gist, for those of you trying desperately to forget:

Team run-scoring can be estimated quite accurately by the
formula

R ~ OBP(team) * SLG(team) * AB(team)

For example, for last year's American League, this formula predicts
an average of 714 runs per team; the actual figure was 696.

MLV attempts to estimate the value of a player by comparing
the number of runs a league-average team would score over 162
games with the number it would score if you were to replace a
league-average member of the lineup with the player in question.
Both figures are computed by plugging OBP(player), SLG(player),
BA(player), OBP(league), SLG(league), and BA(league) into the
approximate run formula, with and without the player in question.

First some results, then the formulae.

Using the latest numbers I have for AL batters, we get the following MLVs:

Player              season   +/- league

Danny Tartabull      786        +72
Frank Thomas         782        +68
Rafael Palmeiro      769        +55
Harold Baines        762        +48
Joe Carter           761        +47
Wade Boggs           760        +46
Jose Canseco         759        +45
Ken Griffey, Jr.     758        +44
Juan Gonzalez        754        +40
Kirby Puckett        754        +40
Ruben Sierra         750        +36
Cecil Fielder        748        +34
Dan Pasqua           748        +34
Dave Henderson       746        +32

Dave Winfield        740        +26

Mel Hall             753        (plus 39, if the league only had
                                right-handed pitchers...)

How about that Dan Pasqua? Is he platooning?

Danny and Frank are both producing above league average at TWICE the rate
of Cecil Fielder.

Rafael Palmeiro is in a class by himself. Unfortunately, Danny and Frank are
in a class *above* that. Not to take anything away from Raffy; he's having a
fantastic year...


All right, you want to know how to compute these things? Brace yourself.
I'll give it to you in small pieces.

Predicted Runs = OBP(team) * SLG(team) * AB(team)

OBP(team) = ( 8 * OBP(league) + OBP(player) ) / 9

SLG(team) = ( 8 * SLG(league) * AB(league) + SLG(player) * AB(player) ) / AB(team)

AB(league) = (PA/9) * (1 - OBP(league)) / (1 - BA(league))

AB(player) = (PA/9) * (1 - OBP(player)) / (1 - BA(player))

PA = 25.5 * 162 / (1 - OBP(team))

AB(team) = 8 * AB(league) + AB(player)

(BA is batting average, OBP is on-base percentage, SLG is slugging average.)

This version actually takes into account the fact that a high-OBP player will
get a smaller fraction of his team's at bats than a low-OBP player, which can
have a noticeable effect in extreme cases (like Frank Thomas or Ozzie Guillen).
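
The pieces above can be chained in dependency order: OBP(team) first, then PA, then the at-bat figures, then SLG(team). The following is a minimal sketch under stated assumptions, not the code used for the table: `Line`, `predicted_runs`, and `mlv` are hypothetical names, the constant 25.5 is read here as batting outs per game (the post does not say), and any stat lines passed in are up to the reader.

```python
from dataclasses import dataclass

OUTS_PER_GAME = 25.5  # constant from the post; reading it as outs per game is an assumption
GAMES = 162


@dataclass(frozen=True)
class Line:
    """A batting line: batting average, on-base percentage, slugging average."""
    ba: float
    obp: float
    slg: float


def predicted_runs(player: Line, league: Line) -> float:
    """Runs for eight league-average lineup slots plus `player`, over 162 games."""
    obp_team = (8 * league.obp + player.obp) / 9
    pa = OUTS_PER_GAME * GAMES / (1 - obp_team)
    # At-bats per lineup slot: a high-OBP hitter turns fewer of his PA into AB.
    ab_league = (pa / 9) * (1 - league.obp) / (1 - league.ba)
    ab_player = (pa / 9) * (1 - player.obp) / (1 - player.ba)
    ab_team = 8 * ab_league + ab_player
    slg_team = (8 * league.slg * ab_league + player.slg * ab_player) / ab_team
    return obp_team * slg_team * ab_team


def mlv(player: Line, league: Line) -> float:
    """Runs added by swapping `player` in for one league-average hitter."""
    return predicted_runs(player, league) - predicted_runs(league, league)
```

With this layout, `predicted_runs(player, league)` corresponds to the "season" column and `mlv(player, league)` to the "+/- league" column; a league-average hitter comes out at zero by construction.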

--
David M. Tate | FLAMBEAU: How do you know all this!? Are you a
dt...@unix.cis.pitt.edu | devil?
Owns more Steeleye Span | FATHER BROWN: I am a man, and therefore have all
albums than Chad Jackson | devils in my heart...

Ami A. Silberman

Aug 9, 1991, 3:59:26 AM

Another predicted runs formula is given by Bill James. It comes in
two versions: one includes stolen bases, double plays, etc., and the
other doesn't. I prefer the simpler version for most purposes. It is
given as follows:

RC = (total bases)*(hits + walks)/(plate appearances)

If you like, you can multiply it by 100/(plate appearances) to get
RC per 100 PA.
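
As a minimal sketch of the simple version and its per-100-PA scaling (the function names are hypothetical, and any counting stats fed in are the reader's own):

```python
def runs_created(total_bases: float, hits: float, walks: float,
                 plate_appearances: float) -> float:
    """Simple Runs Created: (total bases) * (hits + walks) / (plate appearances)."""
    return total_bases * (hits + walks) / plate_appearances


def runs_created_per_100_pa(total_bases: float, hits: float, walks: float,
                            plate_appearances: float) -> float:
    """Runs Created scaled by 100 / (plate appearances)."""
    rc = runs_created(total_bases, hits, walks, plate_appearances)
    return 100 * rc / plate_appearances
```

The rate version makes it possible to compare hitters with different amounts of playing time.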
--
ami silberman - janitor of lunacy
sil...@cs.uiuc.edu

Glenn R. Waugaman

Aug 9, 1991, 11:09:33 AM

In article <162...@unix.cis.pitt.edu>, dt...@unix.cis.pitt.edu (David M Tate) writes...

>Using the latest numbers I have for AL batters, we get the following MLVs:
>
> Player season +/- league
>
> Danny Tartabull 786 +72
> Frank Thomas 782 +68
> Rafael Palmeiro 769 +55
> Harold Baines 762 +48
> Joe Carter 761 +47
> .
> .
> .

Where's Cal Ripken? A couple of weeks ago (stats through July 21) I used
Palmer's Linear Weights to arrive at essentially the same kind of result, a
run differential (except that formula is more exact in the slugging weights
and doesn't normalize to number of games), and Ripken was still second in
the league to Frank Thomas. I'm sure Tartabull and Palmeiro have passed
him, and probably a couple of others, but even though Ripken's batting average
has slipped since the All-Star break (.348 to .322), he's still been
banging doubles and a few home runs. He hasn't fallen that far, has
he?

---
Glenn Waugaman
Digital Equipment Corporation
Littleton, MA
g_wau...@nac.enet.dec.com
---

Roger Lustig

Aug 9, 1991, 10:14:38 AM

In article <162...@unix.cis.pitt.edu> dt...@unix.cis.pitt.edu (David M Tate) writes:
>OK, I've got the bugs worked out (I think) in the MLV formula. Here's the
>gist, for those of you trying desperately to forget:

> Team run-scoring can be estimated quite accurately by the
> formula

> R ~ OBP(team) * SLG(team) * AB(team)

> For example, for last year's American League, this formula predicts
> an average of 714 runs per team; the actual figure was 696.

You might want to mention that this is equivalent to Bill James' Runs
Created, Formula I, i.e., OBA * TB.
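
The equivalence can be checked in one line, assuming the simplified definitions OBP = (H + BB)/PA and SLG = TB/AB (hit-by-pitch and sacrifice flies are ignored here, which is an assumption rather than something stated in the thread):

```latex
\begin{align*}
\mathrm{OBP} \times \mathrm{SLG} \times \mathrm{AB}
  &= \frac{H + BB}{PA} \times \frac{TB}{AB} \times AB \\
  &= \frac{TB \,(H + BB)}{PA}
\end{align*}
```

The last expression is the simple Runs Created form, total bases times on-base average.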

[good stuff follows, regarding marginal value over league average]

Roger

David M Tate

Aug 9, 1991, 3:26:31 PM

In article <1991Aug9.0...@m.cs.uiuc.edu> sil...@m.cs.uiuc.edu (Ami A. Silberman) writes:
>
>Another predicted runs formula is given by Bill James. [...]

>
>RC = (total bases)*(hits + walks)/(plate appearances)

This isn't "another" formula, this is THE SAME formula, without the
correction for the fact that there is only one Frank Thomas, not 9.
In other words, a step backwards. This formula is part of the *problem*
that MLV attempts to correct for.

David M Tate

Aug 9, 1991, 3:29:29 PM

In article <25...@shlump.lkg.dec.com> g_wau...@nac.enet.dec.com (Glenn R. Waugaman) writes:
>
>In article <162...@unix.cis.pitt.edu>, dt...@unix.cis.pitt.edu (David M Tate) writes...
>>Using the latest numbers I have for AL batters, we get the following MLVs:
>
>Where's Cal Ripken?

I didn't have a recent OBP for Cal, so I couldn't do it. I'll publish a list
of leaders after I get my McWeekly this weekend.

To which end: could someone compute the AL league BA/OBP/SLG so far this year
for me? I'm using last year's numbers, and I think there's been a pretty big
shift since then.
