
America's No. 1 sport: Football has replaced baseball as the national past


Ron Johnson

Sep 10, 2002, 1:58:55 PM
cross-posted to asbbo because I don't think Dennis Shea lurks in RSB
and he can add a lot to this discussion.

In article <m3u1l6b...@mac.com>,
Dale J. Stephenson <daleste...@mac.com> wrote:
>James Withrow <wit...@midway.uchicago.edu> writes:
>
>> "David J. Grabiner" wrote:
>> >
>[marginal win worth more to small market team?]
>> > It is probably worth more in attendance. Conversely, it is worth more
>> > to a large market team in TV revenue. These two effects may well
>> > balance out.
>> >
>>
>> But they probably don't balance out. David, you
>> cite Zimbalist's 1992
>> conclusions over and over, but I don't even think
>> Zimbalist himself
>> believes that marginal revenue per win for each
>> team is so similar
>> anymore.
>>
>But he hasn't done another study to show how (and where) they are
>different. Has anyone?
>

Finally got around to including the 2001 data. Here's what I get
for 1995 to 2001 (minus 1998 -- no Forbes revenue data. Also does
not include Arizona or Tampa Bay because I figured that their revenue
dynamics were going to be different.)

I've knocked out at least a dozen things that don't turn out to be
statistically significant, including last year's winning percentage.
The model's a fair amount different than the one I had in the BBBA.


SUMMARY OUTPUT

Regression Statistics
Multiple R           .966
R Square             .933
Adjusted R Square    .929
Standard Error       10.1
Observations         168

            Coefficients   Standard Error   t Stat
Intercept       21.5            10.8           2.0
W%             -69.0            19.4          -3.6
Trend            5.7             0.5          12.2
NS              23.7             2.3          10.5
PCI              0.7             0.2           3.2
MW%              0.3             0.0           7.5
PO               5.5             2.8           2.0
POLY             7.7             2.2           3.6
PW%              1.4             0.1          11.9
WW              10.5             5.2           2.0
WWL             14.2             5.6           2.5

W% winning percentage
Trend year - 1994
NS a dummy variable for one of the new stadiums (an area that can be
improved upon I think. Dennis Shea has pointed out that a paper he's
seen says that this starts at ~ 40 million and drops off about a
million a year. From what I can tell, the effect more or less goes
away after 7 years)
PCI Per Capita income (using 1996 Census data -- in thousands)
MW% Mike Jones' market size estimate * winning percentage (this works
a little better than the Census data. There's some indication that
he overestimates the impact of a shared market. A dummy variable
for a shared market comes back positive but insignificant)
PO dummy variable. 1 if the team made the playoffs
POLY dummy variable. 1 if the team made the playoffs in the previous season
PW% Opening day payroll * winning percentage
WW dummy variable. 1 if the team won the World Series
WWL dummy variable. 1 if the team won the World Series in the previous
season

It's a pain to work with and I know it doesn't quite capture the different
value of wins. Still, assuming that the Forbes revenue model is close to
being accurate, it's a pretty good explanation of revenue.
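
For anyone who wants to see how the variable list turns into one row of regression inputs, here is a minimal sketch. The class and function names are hypothetical, invented only for illustration. The Yankees 2001 inputs are the ones quoted later in this thread; the payroll figure is an assumption, back-derived as 65.2 / .594 from the PW% value given there.

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    """One team-year of inputs (hypothetical container, not from the post)."""
    year: int
    win_pct: float
    new_stadium: bool
    per_capita_income_k: float   # PCI, in thousands of dollars
    market_size: float           # Mike Jones' market size estimate
    payroll_musd: float          # opening day payroll, millions of dollars
    made_playoffs: bool
    made_playoffs_last_year: bool
    won_ws: bool
    won_ws_last_year: bool


def regressors(s: TeamSeason) -> dict:
    """Build the ten explanatory variables defined in the post."""
    return {
        "W%": s.win_pct,
        "Trend": s.year - 1994,
        "NS": int(s.new_stadium),
        "PCI": s.per_capita_income_k,
        "MW%": s.market_size * s.win_pct,      # market size x winning pct
        "PO": int(s.made_playoffs),
        "POLY": int(s.made_playoffs_last_year),
        "PW%": s.payroll_musd * s.win_pct,     # payroll x winning pct
        "WW": int(s.won_ws),
        "WWL": int(s.won_ws_last_year),
    }


# Yankees 2001; payroll is back-derived (assumption), not a reported figure.
nyy_2001 = TeamSeason(2001, 0.594, False, 33.3, 262, 65.2 / 0.594,
                      True, True, False, True)
row = regressors(nyy_2001)
print(row)
```

The two interaction terms (MW% and PW%) are why the effect of winning is spread over several coefficients rather than sitting in W% alone.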

--
RNJ

Dennis Shea

Sep 10, 2002, 9:42:38 PM

Ron Johnson wrote:

> cross-posted to asbbo because I don't think Dennis Shea lurks in RSB
> and he can add a lot to this discussion.


So many newsgroups, so little time


> In article <m3u1l6b...@mac.com>,
> Dale J. Stephenson <daleste...@mac.com> wrote:
>
>>James Withrow <wit...@midway.uchicago.edu> writes:
>>
>>>"David J. Grabiner" wrote:
>>>
>>[marginal win worth more to small market team?]
>>
>>>>It is probably worth more in attendance. Conversely, it is worth more
>>>>to a large market team in TV revenue. These two effects may well
>>>>balance out.
>>>>
>>>>
>>>But they probably don't balance out. David, you
>>>cite Zimbalist's 1992
>>>conclusions over and over, but I don't even think
>>>Zimbalist himself
>>>believes that marginal revenue per win for each
>>>team is so similar
>>>anymore.
>>>
>>>
>>But he hasn't done another study to show how (and where) they are
>>different. Has anyone?


John Burger from Loyola College in Baltimore has a study, not yet
published I think, that did have different MRW, but IIRC, the larger
market teams had higher MRWs. His study also differs somewhat from
Ron's in the way he deals with new stadiums and also whether the team is
a "contender"---MRW is higher for potential contenders because of the
prospect of play-off money, offering one reason why they over-pay for
marginal talent at mid-season. There were also the back-of-the-envelope
calculations in Doug Pappas's series in Baseball Prospectus on the
marginal cost per marginal win that are interesting to look at.

Burger also had a recent paper cited by Ken Rosenthal on the MR of a
player that seemed to be very different from prior estimates. I have to
try and get back to him and get both papers.


Since the effect of winning percentage is hidden in several variables,
it helps if you do a little table. What are your data showing, Ron?

James Withrow

Sep 11, 2002, 9:50:29 AM

Ok, what does this mean?

Withrow

Ron Johnson

Sep 11, 2002, 2:15:33 PM
In article <3D7F4A25...@midway.uchicago.edu>,

James Withrow <wit...@midway.uchicago.edu> wrote:
>
>Ok, what does this mean?
Quick and dirty explanations:

The important parts of the first bit are standard error and adjusted
r squared. The latter tells you how much of the variation in team
revenue is explained by the variables (~93% in this case) and the
former tells you roughly how far off a typical estimate is (about
$10 million here, since revenue is in millions).

The second part is the equation.

Team revenue = 21.5 - (69*winning percentage) + (5.7*(year-1994)) ...

T stat is a test for the significance of the variable. You want a high
absolute value.

If you run the equation for the 168 team years you'll get a .966
correlation between the results and the revenue estimates.
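
The summary numbers hang together in a way that is easy to check. The sketch below uses the standard textbook formula for adjusted R squared and the usual t = coefficient / standard error; the function name is made up for illustration, and the count of 10 predictors is read off the coefficient table.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjustment of R^2 for the number of predictors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


multiple_r = 0.966
r_squared = multiple_r ** 2                      # R Square is Multiple R squared
adj = adjusted_r_squared(0.933, n_obs=168, n_predictors=10)
t_wpct = -69.0 / 19.4                            # t Stat for W%

print(round(r_squared, 3), round(adj, 3), round(t_wpct, 1))
```

The 168 observations are consistent with 28 teams over 6 seasons (1995 to 2001 minus 1998, without Arizona and Tampa Bay). Some printed t stats will not match coefficient / standard error exactly because the table shows rounded values.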

Here's a quick look at how well it does on a team by team basis:

Cor is the correlation between the team's estimated revenue and what
Forbes reported.

Est is the estimated revenue over the period of the study

Rev is the total revenue reported over the course of the study

The last column is Rev/Est


Team Cor Est Rev Rev/Est
CHC .977 483.6 569.8 1.18
STL .935 474.5 539.5 1.14
COL .844 601.4 652.8 1.09
KCR .980 325.9 350.2 1.07
PIT .971 351.8 375.8 1.07
BOS .980 619.0 649.4 1.05
CHW .934 457.9 475.7 1.04
LAN .992 624.6 648.4 1.04
TOR .920 430.2 444.8 1.03
NYY .997 951.7 974.9 1.02
SDP .955 385.7 393.5 1.02
CLE .919 725.2 734.3 1.01
HOU .982 513.4 515.3 1.00
FLA .944 398.7 399.4 1.00
SEA .959 601.8 600.0 1.00
BAL .919 719.2 710.7 .99
MIL .986 363.8 356.4 .98
NYM .994 683.7 666.9 .98
SFG .989 534.7 520.7 .97
DET .993 460.3 440.4 .96
MON .958 293.0 277.8 .95
ANA .981 456.8 432.2 .95
OAK .965 394.0 372.6 .95
ATL .971 761.4 720.1 .95
TEX .946 684.0 625.2 .91
PHI .987 439.9 393.7 .89
CIN .969 411.4 360.1 .88
MIN .938 351.9 298.8 .85
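
The last column can be recomputed from the Est and Rev columns. A small sketch using a handful of rows copied from the table; the 10% cutoff for flagging a team is an arbitrary assumption chosen only for illustration, not something from the study.

```python
# (Est, Rev) in millions of dollars, copied from the table above.
rows = {
    "CHC": (483.6, 569.8),
    "STL": (474.5, 539.5),
    "NYY": (951.7, 974.9),
    "MON": (293.0, 277.8),
    "CIN": (411.4, 360.1),
    "MIN": (351.9, 298.8),
}

# Rev/Est above 1 means the team took in more than the model expects.
ratios = {team: rev / est for team, (est, rev) in rows.items()}

# Hypothetical threshold: flag teams the model misses by more than 10%.
flagged = sorted(team for team, r in ratios.items() if abs(r - 1.0) > 0.10)

for team, r in ratios.items():
    print(team, round(r, 2))
print("off by more than 10%:", flagged)
```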

A few things that stand out: Neither the Yankees nor the Expos appear
to be outliers. They both seem to take in pretty much what you'd expect
given team quality and market size.

The one team that the model doesn't work worth a damn on is Colorado.
It seriously overestimates the Twins' revenue. Lease.

No idea why it misses so badly on the Reds.

I know from other studies that the Cardinals' attendance is the least
sensitive to team quality. Doesn't surprise me that they'd make
consistently more than expected. More or less the same story for the Cubs.

The other conclusion is that good players are worth paying for. I like
the idea that payroll * winning percentage is both positive and significant.

Plugging the numbers in, I get the 2001 version of Jason Giambi worth
~11 million to the A's if he makes no difference to their playoff prospects.

The Yankees? Paying Giambi ~9 million more than Tino Martinez (better
than replacement level last year, but not much) rates to increase
their revenue by at least $18 million (a bunch more than that if the
difference between Giambi and Martinez wins them the World Series.
Doubtful, but Giambi's a lot better than Martinez)

As I said elsewhere, a pain to work with.

--
RNJ

James Withrow

Sep 11, 2002, 6:29:46 PM

Ron Johnson wrote:
>
> In article <3D7F4A25...@midway.uchicago.edu>,
> James Withrow <wit...@midway.uchicago.edu> wrote:
> >
> >Ok, what does this mean?
> Quick and dirty explanations:
>
> The important parts of the first bit are standard error and adjusted
> r squared. The latter tells you how much of the variation in team
> revenue is explained by the variables (~93% in this case) and the
> former tells you the range of results.
>
> The second part is the equation.
>
> Team revenue = 21.5 - (69*winning percentage) + (5.7*(year-1994)) ...
>

I don't know if I'm missing something or what, but
I didn't see the equation in the previous posts or
in this one.

<snip>

> The one team that model doesn't work worth a damn on is Colorado.

I've generally preferred to leave out all the 90s
expansion teams when running these things.

> It seriously overestimates the Twins' revenue. Lease.
>
> No idea why it misses so badly on the Reds.
>
> I know from other studies that the Cardinals' attendance is the least
> sensitive to team quality. Doesn't surprise me that they'd make
> consistently more than expected. More or less the same story for the Cubs.
>

Well, people like a consistent product. The Cards
are usually over .500 and the Cubs rarely are.

Seriously, though, Wrigley Field probably
functions like a new ballpark in the equation.
It's a mystery to me why Wrigley is so
well-liked. I still haven't had good seats and
I've tried everywhere except the center field
bleachers and I'd be less scared of walking home
from Comiskey than sitting out in the Wrigley
bleachers. Especially when I go to Cubs-Cards
games wearing my Fredbird suit.

I'm not convinced that the Cards' attendance is
less sensitive to team quality. From what I can
see, you're working with time intervals of only a
year or two when fans might have longer memories
than that. Obviously, I don't have any study to
refute yours; it's just a guess on my part.

> The other conclusion is that good players are worth paying for. I like
> the idea the payroll * winning percentage is both positive and significant.
>

I like to tell people that the invisible hand is a
thing of beauty. Usually.

> Plugging the numbers in, I get the 2001 version of Jason Giambi worth
> ~11 million to the As if he makes no difference to their playoff prospects.
>
> The Yankees? Paying Giambi ~9 million more than Tino Martinez (better
> than replacement level last year, but not much) rates to increase
> their revenue by at least $18 million (a bunch more than that if the
> difference between Giambi and Martinez wins them the World Series.
> Doubtful, but Giambi's a lot better than Martinez)
>

No kidding. Tino does all the little things
well. It's just the big things he needs to
improve-- like OBP and SLG. And the Cards traded
away just about the only hitting prospect in their
system (and a first baseman to boot) in Luis
Garcia for a couple months of Chuck Finley. But I
digress.

> As I said elsewhere, a pain to work with.

And I'd like to share your pain. Did I miss the
equations somewhere?

Withrow

Paul G. Wenthold

Sep 11, 2002, 6:42:30 PM
James Withrow wrote:
>
> Seriously, though, Wrigley Field probably
> functions like a new ballpark in the equation.
> It's a mystery to me why Wrigley is so
> well-liked.

Thank you. Someone needed to say it.

I had the great opportunity to get some corporate
season tickets for earlier this year. I'm thinking,
"Great, decent seats to a Cubs game finally."
Uh-uh. Yeah, they were in the infield. Half
way under the friggin overhang. Anything higher than
a line drive, and it's lost.

Fortunately, it was rainy and cold, so we could move out
to where we could actually see everything. Unfortunately,
it was rainy and cold, and that's the game that Bonds
didn't play.

I went later this summer and got some not as lousy
seats in the right field corner. Far better there than
around the infield.

paul

Voros McCracken

Sep 11, 2002, 8:35:02 PM
joh...@ccrs.nrcan.gc.ca (Ron Johnson) wrote in message news:<allbsv$2...@gcpdb.ccrs.nrcan.gc.ca>...

Okay, I guess I'll comment here. As much as I appreciate Mike's work
and its quality, I question its utility here. I have concerns about
the direction of causation.

In particular, Jones went through a process of guesstimating what
percentage of a market a team draws from. This really isn't a problem
except that, to an extent, it sets up a scenario where the effect
(additional revenue) precedes the cause (percentage a team draws from
a market). Or more simply, the Cubs drawing additional revenue over
the White Sox led Mike to conclude that they are able to draw a larger
percentage from their market. Whether this is correct or incorrect
isn't relevant, since the ability of a team to draw from a market is
exactly what it is you're trying to measure here, so you wind up with
the system projecting an advantage for the Cubs over the White Sox,
based on the results you are trying to predict with that projection.
It can be circular here.

I think even if the accuracy drops a little, you're far better off
simply using the Census numbers or the raw Nielsen numbers for each
area and not trying to apportion things.

> PO dummy variable. 1 if the team made the playoffs
> POLY dummy variable. 1 if the team made the playoffs in the previous season
> PW% Opening day payroll * winning percentage

Okay and this I also have a problem with. There's a very good chance
that the relationship between payroll and revenues here is mostly a
case of correlation not implying causation. A team, presumably, has a
series of relevant information not contained in the model above from
which to base their payroll. They probably have a good idea of what
their expected revenues can be at a certain base of various win
levels. This payroll figure could actually be an indication of the
team's knowledge of revenue sources not included in the model, so that
a high relative payroll would indicate that the team expects to make
more revenue than the other variables would suggest. The model,
though, suggests that the payroll itself is driving the increase in
revenues. Or, more clearly, the model suggests that if you decide to
jack your payroll up $20 million, your revenues will increase
accordingly. I don't think simple correlation here shows that. I think
that's a tough sell without additional information in that regard.

When I did my model, I in fact considered using opening day payroll
and eventually decided not to based on the above rationale.

> WW dummy variable. 1 if the team won the World series
> WWL dummy variable. 1 if the team won the World series in the previous
> season

> It's a pain to work with and I know it doesn't quite capture the different
> value of wins. Still, assuming that the Forbes revenue model is close to
> being accurate, it's a pretty good explanation of revenue.

I agree, but with the two comments I made, I'm worried that to a small
extent the dependent variable you're trying to estimate already
exists in a few of the variables you're using. Or that, in a
roundabout way, you've either got effects preceding their causes, or
unknown third variables being the causative agent currently attributed
to another variable.

YMMV.

--
Voros McCracken

j...@socrates.berkeley.edu

Sep 12, 2002, 12:09:01 AM
In rec.sport.baseball Voros McCracken <vo...@baseballprimer.com> wrote:
:> MW% Mike Jones' market size estimate * winning percentage (this works

:> a little better than the Census data. There's some indication that
:> he overestimates the impact of a shared market. A dummy variable
:> for a shared market comes back positive but insignificant)

: Okay, I guess I'll comment here. As much as I appreciate Mike's work
: and its quality, I question its utility here. I have concerns about
: the direction of causation.

: In particular, Jones went through a process of guesstimating what
: percentage of a market a team draws from. This really isn't a problem
: except that, to an extent, it sets up a scenario where the effect
: (additional revenue) precedes the cause (percentage a team draws from
: a market). Or more simply, the Cubs drawing additional revenue over
: the White Sox led Mike to conclude that they are able to draw a larger
: percentage from their market. Whether this is correct or incorrect
: isn't relevant, since the ability of a team to draw from a market is
: exactly what it is you're trying to measure here, so you wind up with
: the system projecting an advantage for the Cubs over the White Sox,
: based on the results you are trying to predict with that projection.
: It can be circular here.

Jumping in before Mike does...I don't think that's what Mike did for
shared market teams. I think he looked for structural reasons that one of
the teams in a shared market had an advantage regardless of either team's
behavior. That's easiest to do with the California examples; in both the
Bay Area and Greater LA cases, the NL team was there first and is located
where the center of the local media are located. I have no question in my
mind at all that, all things being equal, the Giants and Dodgers will get
more media attention, better broadcast and cable contracts, and better
revenue from attendance. I don't know that he got the proportions right,
but that's IIRC how he went about doing it.

: I think even if the accuracy drops a little, you're far better off


: simply using the Census numbers or the raw Nielsen numbers for each
: area and not trying to apportion things.

I actually think you can make a better case against some of the peripheral
markets in Mike's study. It's pretty easy to give Colorado Springs to the
Rox, but I suspect that some of the places in OH, IL, PA, NJ, CT, etc. may
well have been influenced by prior successful marketing to be Pirates or
Reds or Phillies towns. OTOH, you have to apportion them somehow, since
ignoring them is probably as dangerous as including them.


: I agree, but with the two comments I made, I'm worried that to a small
: extent the dependent variable you're trying to estimate already
: exists in a few of the variables you're using. Or that, in a
: roundabout way, you've either got effects preceding their causes, or
: unknown third variables being the causative agent currently attributed
: to another variable.

: YMMV.

I share Voros' concern about his second comment.

JHB

Ron Johnson

Sep 12, 2002, 1:06:49 PM
In article <635f97d9.02091...@posting.google.com>,

Voros McCracken <vo...@baseballprimer.com> wrote:
>joh...@ccrs.nrcan.gc.ca (Ron Johnson) wrote in message news:<allbsv$2...@gcpdb.ccrs.nrcan.gc.ca>...
>> Model's a fair amount different than the one I had in the BBBA

>> PW% Opening day payroll * winning percentage


>
>Okay and this I also have a problem with. There's a very good chance
>that the relationship between payroll and revenues here is mostly a
>case of correlation not implying causation. A team, presumably, has a
>series of relevant information not contained in the model above from
>which to base their payroll. They probably have a good idea of what
>their expected revenues can be at a certain base of various win
>levels. This payroll figure could actually be an indication of the
>team's knowledge of revenue sources not included in the model, so that
>a high relative payroll would indicate that the team expects to make
>more revenue than the other variables would suggest. The model,
>though, suggests that the payroll itself is driving the increase in
>revenues. Or, more clearly, the model suggests that if you decide to
>jack your payroll up $20 million, your revenues will increase
>accordingly. I don't think simple correlation here shows that. I think
>that's a tough sell without additional information in that regard.

I did go through this in the BBBA article.

Historically, perception of team quality is a fair amount more important
than actual team quality in explaining team revenue. That's why surprise
winners don't draw as much as many people expect.

Last year's winning percentage served as a pretty good representation
of the public's perception of team quality (and I think the most
significant thing your model does is demonstrate that it's more than
just the last season that matters) and it still works reasonably well.

But as soon as you introduce payroll into the equation, last year's
record pretty much drops into insignificance.

Best I can tell, the perception question now basically boils down to

did they make the playoffs last year
did they win the World Series
what's the payroll

My major concern about including payroll is more on the order of where
diminishing returns start.

--
RNJ

Ron Johnson

Sep 12, 2002, 2:56:25 PM
In article <3D7FC3DA...@midway.uchicago.edu>,

James Withrow <wit...@midway.uchicago.edu> wrote:
>
>
>Ron Johnson wrote:
>>
>> In article <3D7F4A25...@midway.uchicago.edu>,
>> James Withrow <wit...@midway.uchicago.edu> wrote:
>> >
>> >Ok, what does this mean?
>> Quick and dirty explanations:
>>
>> The important parts of the first bit are standard error and adjusted
>> r squared. The latter tells you how much of the variation in team
>> revenue is explained by the variables (~93% in this case) and the
>> former tells you the range of results.
>>
>> The second part is the equation.
>>
>> Team revenue = 21.5 - (69*winning percentage) + (5.7*(year-1994)) ...
>>
>
>I don't know if I'm missing something or what, but
>I didn't see the equation in the previous posts or
>in this one.

That's what I get for attempting to be brief.

The equation is each coefficient * its value, summed together, plus the
intercept.

And I hear you asking for a translation of the translation.

An example probably would help. Here's the Yankees 2001

            Coefficients    Value    Impact
Intercept       21.5
W%             -69.0         .594     -41.0
Trend            5.7        7          40.1
NS              23.7        0           0.0
PCI              0.7       33.3        24.7
MW%              0.3      155.6        43.7
PO               5.5        1           5.5
POLY             7.7        1           7.7
PW%              1.4       65.2        91.8
WW              10.5        0           0.0
WWL             14.2        1          14.2
Total                                 208.2

Intercept is what everybody starts with.
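
As a quick sanity check, the Total in the table above is just the intercept plus the Impact column. A short sketch (variable names are made up for illustration):

```python
# Intercept plus the Impact column for the 2001 Yankees, millions of dollars.
impacts = {
    "Intercept": 21.5,
    "W%": -41.0,
    "Trend": 40.1,
    "NS": 0.0,
    "PCI": 24.7,
    "MW%": 43.7,
    "PO": 5.5,
    "POLY": 7.7,
    "PW%": 91.8,
    "WW": 0.0,
    "WWL": 14.2,
}

total = sum(impacts.values())
print(round(total, 1))

# The printed coefficients are rounded, so Impact / Value gives a closer
# look at the coefficient actually used, e.g. for MW%:
implied_mw_coef = impacts["MW%"] / 155.6
print(round(implied_mw_coef, 3))
```

Multiplying the rounded coefficients by the values will not reproduce every Impact exactly (0.7 * 33.3 is not 24.7, for instance), which is presumably just rounding in the printed table.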

The Yankees went 95-65. (Yeah it looks weird to see the record come out
negative -- as in costing the Yankees 41 million. But remember that
winning percentage is a component of other parts)

Trend is 2001-1994. Captures the strong upward trend in revenue. Worth
~40 million.

The Yankees don't have a New Stadium.

The high per-capita income in the New York area is worth 24.7 million

MW% is Mike Jones' market size estimate * the winning percentage.
262*.594 = 155.6, times the .281 coefficient: $43.7 million total. A KC
team of the same quality would bring in $6.3 million. (38*.594*.281 --
I know it says .3)

Now an extra win (over 162 games instead of 160 -- I shouldn't have
picked a team that didn't play 162 games) would make this 44.2 for
the Yankees and 6.4 for the Royals. Thus I get the marginal value of
a win for the Yankees at $454,000 and $66,000 for the Royals.

Except of course winning percentage affects other parts of the equation.
So it's accurate to say that a random win is worth ~$390,000 more for
the Yankees. It's not accurate to say that it's worth nearly 7 times
as much. And of course these days $390,000 may as well have the same
value as zero in the decision making process.
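
The MW% arithmetic above can be reproduced in a few lines. This is only a sketch of that one term: it uses the unrounded .281 coefficient and the market sizes of 262 and 38 quoted in the post, treats one extra win as 1/162 of winning percentage, and the function name is hypothetical.

```python
MW_COEF = 0.281   # unrounded MW% coefficient quoted in the post
GAMES = 162


def mw_value_per_win(market_size: float) -> float:
    """Extra revenue (millions) from one more win, via the MW% term only."""
    return MW_COEF * market_size / GAMES


nyy = mw_value_per_win(262)   # Yankees market size estimate
kc = mw_value_per_win(38)     # Royals market size estimate

print(round(nyy * 1000), round(kc * 1000))   # thousands of dollars
print(round((nyy - kc) * 1000))              # the gap in dollars (thousands)
print(round(nyy / kc, 1))                    # the ratio, for this term alone
```

The ratio is large only because this term scales directly with market size; the other winning-percentage terms shrink it once they are added in.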

PO, POLY, WW and WWL are all dummy variables (1 if true). So the
playoff success over 2000 and 2001 were worth $27.4 million to the
Yankees' bottom line in 2001 (and some more this year)

PW% is opening day payroll * winning percentage * 1.409. 91.8 million.

Put it all together and, assuming no change in payroll or in whether
they make the playoffs, an extra win for the Yankees would
have been worth about $983,000. A KC team with the same payroll would
have picked up about $595,000.
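
Those two figures come from adding up every term that contains winning percentage. A sketch under stated assumptions: the coefficients are the -69.0, .281 and 1.409 quoted in this thread, the payroll is back-derived as 65.2 / .594 (it is not stated directly), one win is 1/162 of winning percentage, and the playoff and World Series dummies are held fixed. The function name is made up.

```python
COEF_WPCT = -69.0   # W%
COEF_MW = 0.281     # MW% (market size * winning pct), unrounded
COEF_PW = 1.409     # PW% (payroll * winning pct), unrounded
GAMES = 162


def revenue_per_extra_win(market_size: float, payroll_musd: float) -> float:
    """Change in estimated revenue (millions) from one additional win."""
    d_wpct = 1.0 / GAMES
    return d_wpct * (COEF_WPCT + COEF_MW * market_size + COEF_PW * payroll_musd)


payroll = 65.2 / 0.594   # assumption: implied by PW% value 65.2 at .594

nyy = revenue_per_extra_win(262, payroll)
kc = revenue_per_extra_win(38, payroll)

print(round(nyy * 1000), round(kc * 1000))   # thousands of dollars
```

Note how much of the per-win value comes from the payroll term rather than from market size.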

Doesn't come close to covering a player's salary, right? But from
what I can tell much of the salary you pay a player comes back as
increased revenue. It helps to think of salary as very effective
advertising.

--
RNJ

Voros McCracken

Sep 12, 2002, 6:16:25 PM
joh...@ccrs.nrcan.gc.ca (Ron Johnson) wrote in message news:<alqhj9$9...@gcpdb.ccrs.nrcan.gc.ca>...

> Best I can tell, the perception question now basically boils down to
>
> did they make the playoffs last year
> did they win the World Series
> what's the payroll
>
> My major concern about including payroll is more on the order of where
> diminishing returns start.

I understand the logic behind it and I agree to an extent it could
reasonably be a factor (to put it simply, a high payroll is good
advertising).

But I'm not sure how much of the amount the model ascribes to payroll
is due to this factor and how much is simply an indicator for other
factors that drive revenue that the model doesn't cover.

I can't imagine how one would go about studying this in a way to
separate the two (I suppose changes in payroll against changes in
revenue might be a better way, but it by no means would eliminate the
possibility of the "third-variable" problem).

Again, it's not really a problem with the model, but the utility of
it. I think the payroll element might be a situation where it works as
a way to drive revenues as long as it isn't done specifically for that
purpose. Or simply put, payroll increases might drive revenue as long
as revenue increases aren't the sole end of doing so. Possibly if a GM
increases payroll for this advertising purpose, the relationship might
break down significantly, or the "diminishing returns" scenario you
mention.

--
Voros McCracken

Dennis Shea

Sep 12, 2002, 9:20:04 PM

Voros McCracken wrote:

> joh...@ccrs.nrcan.gc.ca (Ron Johnson) wrote in message news:<alqhj9$9...@gcpdb.ccrs.nrcan.gc.ca>...
>
>>Best I can tell, the perception question now basically boils down to
>>
>>did they make the playoffs last year
>>did they win the World Series
>>what's the payroll
>>
>>My major concern about including payroll is more on the order of where
>>diminishing returns start.
>>
>
> I understand the logic behind it and I agree to an extent it could
> reasonably be a factor (to put simply, a high payroll is good
> advertising).
>
> But I'm not sure how much of the amount the model ascribes to payroll
> is due to this factor and how much is simply an indicator for other
> factors that drive revenue that the model doesn't cover.
>
> I can't imagine how one would go about studying this in a way to
> separate the two (I suppose changes in payroll against changes in
> revenue might be a better way, but it by no means would eliminate the
> possibility of the "third-variable" problem).


What you would probably want to do is specify it as a simultaneous
equation model. You'd include the other variables that impact both
revenues and payroll as independent variables that could appear both in
the equation that determines revenue and the one that determines
payroll. As long as you have variables that determine one and not the
other, the equation can be estimated.

If you don't want to go to that trouble, you can do a two-stage
instrumental variable approach. You'd estimate payroll as a function of
the independent variables, then substitute its linear predictor rather
than its actual values into the revenue equation for the estimation. To
simplify a lot, as long as the linear predictor is closely correlated
with the actual value (you get a good r-squared in the first stage)
you'll get a good estimate in the second stage.
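
A minimal sketch of the two-stage idea, on made-up data. Everything below is synthetic and invented purely for illustration (the variable names, the effect sizes, the "instrument"); none of it is the Forbes data or anyone's actual model. A hidden factor drives both payroll and revenue, so the naive regression overstates the payroll effect, while substituting the first-stage prediction recovers something close to the true value.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
n = 5000
TRUE_EFFECT = 1.0

# Synthetic data (assumption): 'instrument' moves payroll but has no direct
# effect on revenue; 'hidden' moves both and is not observed by the modeler.
instrument = rng.normal(size=n)
hidden = rng.normal(size=n)
payroll = instrument + hidden + rng.normal(scale=0.5, size=n)
revenue = TRUE_EFFECT * payroll + 2.0 * hidden + rng.normal(scale=0.5, size=n)

ones = np.ones(n)

# Naive one-stage regression: payroll soaks up the hidden factor.
naive = ols(np.column_stack([ones, payroll]), revenue)[1]

# Stage 1: predict payroll from the instrument.
Z = np.column_stack([ones, instrument])
payroll_hat = Z @ ols(Z, payroll)

# Stage 2: regress revenue on the predicted payroll, not the actual values.
two_stage = ols(np.column_stack([ones, payroll_hat]), revenue)[1]

print("naive:", round(naive, 2), "two-stage:", round(two_stage, 2))
```

One caveat: doing the second stage by hand like this gives the right point estimate but not the right standard errors, so a proper two-stage least squares routine is the better choice for significance tests.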

David H.

Sep 12, 2002, 10:31:36 PM
"Dennis Shea" <dgs...@adelphia.net> wrote in message
news:3D813D2B...@adelphia.net...

Uh, why don't you take this to alt.sports.I-don't give-a-Shit?

I'd rather talk about how insignificant the Democratic Party is going to be
in about 6 months....if we're going to diss the Orioles.

David H.


Roger Moore

Sep 13, 2002, 12:37:54 AM
joh...@ccrs.nrcan.gc.ca (Ron Johnson) writes:

>Historically, perception of team quality is a fair amount more important
>than actual team quality in explaining team revenue. That's why surprise
>winners don't draw as much as many people expect.

>Last year's winning percentage served as a pretty good representation
>of the public's perception of team quality (and I think the most
>significant thing your model does is demonstrate that it's more than
>just the last season that matters) and it still works reasonably well.

>But as soon as you introduce payroll into the equation, last year's
>record pretty much drops into insignificance.

>Best I can tell, the perception question now basically boils down to

>did they make the playoffs last year
>did they win the World Series
>what's the payroll

>My major concern about including payroll is more on the order of where
>diminishing returns start.

It depends a lot on what you're trying to achieve with your model. If
the goal is only to predict how much money a team is making, including
payroll is fine. But it frankly isn't very interesting to have the most
accurate possible estimate of team revenues. That's great if you're an
accountant or an industry analyst, but what most people looking at team
finances these days want to know is how external factors affect team
revenues.

Specifically, as a fan I really want to know:

1) How big is the permanent advantage of a large market team over a small
market team?
2) How much does winning help any team?
3) Do large market teams have a relatively larger increase in revenue for
each win than small market teams?

So a useful revenue model should include terms that represent market size,
the impact of winning, and the interaction between winning and market
size. That will give a model that answers the big questions facing the
sport.

Including payroll can mess all of that up. It confuses things because
payroll can be influenced by many factors such as:

1) Teams with large fixed revenue bases (i.e. large market teams) may be
more willing to carry a large payroll even when they're doing badly.
Thus payroll may implicitly include information about market size.
2) Teams may set their payrolls to match expected revenues. This is not
classically rational market behavior, but it does seem that many teams
have budgets that are fixed largely by expected revenues. Thus including
payroll may implicitly include information derived from inherently more
accurate, team specific revenue models.
3) Teams can adjust their payrolls in mid season by either taking on
additional expensive contracts if they're doing well or dumping expensive
contracts if they're hurting for money. Thus payroll will tend to track
actual changes in in-season income.

Again, all of these things are just great if your only goal is to maximize
the accuracy of your prediction. But adding in payroll can muddy the
waters if you're trying to figure out whether large market teams have an
advantage. In essence, payroll is partially collinear with some of your
other terms, and adding a collinear term tends to wreck the analytical
utility of a model, for which you want to have terms that are as
orthogonal as possible (except for deliberately included cross terms, like
market size times winning percentage). Adding a collinear term that has no
real bearing on the questions that you're trying to answer is a big
mistake.
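
One common way to put a number on this kind of overlap is the variance inflation factor: regress each term on all the others and compute 1 / (1 - R^2). The sketch below uses synthetic data invented for illustration only (a "payroll" column built to track "market size"), not the actual revenue data, and the helper name is made up.

```python
import numpy as np


def vif(X, j):
    """Variance inflation factor of column j: 1 / (1 - R^2) from regressing
    column j on the remaining columns plus a constant."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 / (1.0 - r2)


rng = np.random.default_rng(1)
n = 2000

# Synthetic columns (assumption): payroll mostly follows market size,
# winning percentage is unrelated to either.
market = rng.normal(size=n)
win_pct = rng.normal(size=n)
payroll = 0.9 * market + 0.3 * rng.normal(size=n)

X = np.column_stack([market, win_pct, payroll])
vifs = [vif(X, j) for j in range(X.shape[1])]

for name, v in zip(["market", "win_pct", "payroll"], vifs):
    print(name, round(v, 1))
```

A value near 1 means the term is close to orthogonal to the rest; a large value means its coefficient is hard to interpret separately, which is the concern being raised about payroll and market size.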

--
Roger Moore | Master of Meaningless Trivia | (r...@alumni.caltech.edu)

Sponsor of the Walter Johnson, Home Run Baker, and Bob Caruthers pages at
http://www.baseball-reference.com

Voros McCracken

Sep 13, 2002, 3:04:18 AM
"David H." <dhag...@erols.com> wrote in message news:<alrim7$a99$1...@bob.news.rcn.net>...

> Uh, why don't you take this to alt.sports.I-don't give-a-Shit?

Congratulations on your new job as group moderator.

--
Voros McCracken

James Withrow

Sep 13, 2002, 10:37:05 AM

Roger Moore wrote:

> 2) Teams may set their payrolls to match expected revenues. This is not
> classically rational market behavior, but it does seem that many teams
> have budgets that are fixed largely by expected revenues. Thus including
> payroll may implicitly include information derived from inherently more
> accurate, team specific revenue models.

It's hard to tell whether it's spin or actual
behavior, but it does look like teams tend to do
this. The Cardinals' ownership talks like they
set their payroll budget to match expected
revenues.

If they really do this, it may be partially for
tax reasons. A guy on the Cardinals listserver
makes an interesting case that the owners try to
shift revenues from operating profits to capital
gains because capital gains taxes are lower. I'm
not a tax attorney, so I can't verify the details
of his theory, but it might explain this
behavior. Teams spend as much as they take in so
that goodwill and revenues are increased, thus
increasing the value of the franchise they're
going to sell in a few years.

Withrow

Ron Johnson

Sep 13, 2002, 3:54:29 PM
In article <alrq32$esd$1...@naig.caltech.edu>,

I agree up to this point. But I also want to know whether any given signing
makes sense. And I don't think I'm alone in this. (Mind you doing what ifs
with the model I have isn't the easiest thing in the world. I think
Voros' model will generally lead to the same conclusion as to whether
a signing is a good one and his is way easier to work with.)

I'd also like to build as "accurate" (I share the concerns that payroll
may be acting as a dummy variable for something else -- but if it is
that something is *very* important) a model as possible because there
are some things I'd like to work backwards on.

For instance, given that there's a reasonable revenue model, how large
is every market acting as?


>
>So a useful revenue model should include terms that represent market size,
>the impact of winning, and the interaction between winning and market
>size. That will give a model that answers the big questions facing the
>sport.
>
>Including payroll can mess all of that up. It confuses things because
>payroll can be influenced by many factors such as:
>
>1) Teams with large fixed revenue bases (i.e. large market teams) may be
>more willing to carry a large payroll even when they're doing badly.
>Thus payroll may implicitly include information about market size.

Possible and worth looking at further.

>2) Teams may set their payrolls to match expected revenues. This is not
>classically rational market behavior, but it does seem that many teams
>have budgets that are fixed largely by expected revenues. Thus including
>payroll may implicitly include information derived from inherently more
>accurate, team specific revenue models.
>3) Teams can adjust their payrolls in mid season by either taking on
>additional expensive contracts if they're doing well or dumping expensive
>contracts if they're hurting for money. Thus payroll will tend to track
>actual changes in in-season income.

Precisely why I opted to use opening day payroll.


>
>Again, all of these things are just great if your only goal is to maximize
>the accuracy of your prediction. But adding in payroll can muddy the
>waters if you're trying to figure out whether large market teams have an
>advantage. In essence, payroll is partially colinear with some of your
>other terms, and adding a colinear term tends to wreck the analytical
>utility of a model, for which you want to have terms that are as
>orthogonal as possible (except for deliberately included cross terms, like
>market size times winning percentage). Adding a colinear term that has no
>real bearing on the questions that you're trying to answer is a big
>mistake.
--

RNJ

Ronald L Matthews

Sep 14, 2002, 4:33:37 PM
Voros McCracken <vo...@baseballprimer.com> trolled:
> "David H." <dhag...@erols.com> wrote...

>> Uh, why don't you take this to alt.sports.I-don't give-a-Shit?

> Congratulations on your new job as group moderator.

Was I fired? If not, why don't you just bugger off on out of
here. rec.sport.baseball.anal is the place for you and you know
it, we know it, everybody knows it.

cordially, as always,

rm

David H.

Sep 14, 2002, 6:44:22 PM

"Ronald L Matthews" <r...@nospam.com> wrote in message
news:3d839...@corp-news.newsgroups.com...

My head hurts just trying to remember the thread.

Cross-Posting can be hazardous to your mental health.

And, Dennis Shea no less.

Now, *THAT'S* a "Coupe" de'grace!! ;^)

David H.


Steven Wallace

Sep 14, 2002, 9:57:53 PM

"Ronald L Matthews" <r...@nospam.com> wrote in message
news:3d839...@corp-news.newsgroups.com...


Voros can stay cuz he was quoted by Bill James agreeing that balls hit by
batters aren't impacted by pitchers.

it woke us all up.


Ronald L Matthews

Sep 15, 2002, 1:17:40 AM
Steven Wallace <swalla...@attbi.com> trolled:

> "Ronald L Matthews" <r...@nospam.com> wrote in message
>> Voros McCracken <vo...@baseballprimer.com> trolled:
>>> "David H." <dhag...@erols.com> wrote...

>>>> Uh, why don't you take this to alt.sports.I-don't give-a-Shit?

>>> Congratulations on your new job as group moderator.

>> Was I fired? If not, why don't you just bugger off on out of
>> here. rec.sport.baseball.anal is the place for you and you know
>> it, we know it, everybody knows it.

> Voros can stay cuz he was quoted by Bill James agreeing that
> balls hit by batters aren't impacted by pitchers.

That's all the more reason for him to be posting in the anal
group.

> it woke us all up.

You must be really light sleepers.

cordially, as always,

rm

Roger Moore

Sep 15, 2002, 4:29:19 PM
joh...@ccrs.nrcan.gc.ca (Ron Johnson) writes:

>I agree up to this point. But I also want to know whether any given signing
>makes sense. And I don't think I'm alone in this. (Mind you doing what ifs
>with the model I have isn't the easiest thing in the world. I think
>Voros' model will generally lead to the same conclusion as to whether
>a signing is a good one and his is way easier to work with.)

I guess that makes sense. If there is some factor by which payroll
exerts a direct impact on attendance that would be very interesting.

>I'd also like to build as "accurate" (I share the concerns that payroll
>may be acting as a dummy variable for something else -- but if it is
>that something is *very* important) a model as possible because there
>are some things I'd like to work backwards on.

What I'm not so much worried about is whether payroll acts as a dummy
variable for something else that you're not including. What I'm
particularly worried about is that payroll is acting partly as a dummy for
something else but also as a dummy variable for several other things that
you are including in your model. Whenever you include two variables that
cover the same thing, you can wind up with a model that gives accurate
predictions but whose individual coefficients are not necessarily
meaningful. I'm not terribly worried about the results of
your predictions, but whether you can use it as a model of how, say,
market size affects a team's performance is questionable.

The other thing that may be a problem with a model like this is that if
payroll is serving as a dummy variable, then it may not be very accurate
if one of the owners starts changing his behavior. If, for instance, the
payroll variable is really a dummy for the fact that large market owners
are more willing to maintain a high payroll with a weak team than small
market owners, it won't be very accurate if a small market owner decides
to emulate them.

Dennis Shea

Sep 15, 2002, 8:04:13 PM

Roger Moore wrote:

> joh...@ccrs.nrcan.gc.ca (Ron Johnson) writes:
>
>
>>I agree up to this point. But I also want to know whether any given signing
>>makes sense. And I don't think I'm alone in this. (Mind you doing what ifs
>>with the model I have isn't the easiest thing in the world. I think
>>Voros' model will generally lead to the same conclusion as to whether
>>a signing is a good one and his is way easier to work with.)
>>
>
> I guess that makes sense. If there is some factor by which payroll
> exerts a direct impact on attendance that would be very interesting.
>
>
>>I'd also like to build as "accurate" (I share the concerns that payroll
>>may be acting as a dummy variable for something else -- but if it is
>>that something is *very* important) a model as possible because there
>>are some things I'd like to work backwards on.
>>
>
> What I'm not so much worried about is whether payroll acts as a dummy
> variable for something else that you're not including. What I'm
> particularly worried about is that payroll is acting partly as a dummy for
> something else but also as a dummy variable for several other things that
> you are including in your model. Whenever you include two variables that
> cover the same thing, you can wind up with a model that gives accurate
> predictions but whose individual coefficients are not necessarily
> meaningful. I'm not terribly worried about the results of
> your predictions, but whether you can use it as a model of how, say,
> market size affects a team's performance is questionable.


I'm not sure I understand your worry. Including variables that may be
collinear might harm the precision of your estimates, but it does not
bias them (which seems to be just the opposite of what you are
saying: with collinear variables you get unbiased but imprecise
estimates). And it's usually worse to exclude a variable that you think
might be important, because that does cause bias.
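
A small Monte Carlo sketch of this distinction, on synthetic data. The
effect sizes (2.0 for market size, 1.0 for payroll) and the strength of
the market/payroll link are assumptions for illustration; none of them
come from the actual revenue model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_trials = 168, 2000
true_market, true_payroll = 2.0, 1.0  # hypothetical true effects on revenue

def ols(X, y):
    """Least-squares coefficients with an intercept prepended (index 0)."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

full, dropped = [], []
for _ in range(n_trials):
    market = rng.normal(size=n_obs)
    # Payroll is strongly collinear with market size.
    payroll = 0.9 * market + 0.3 * rng.normal(size=n_obs)
    revenue = (true_market * market + true_payroll * payroll
               + rng.normal(size=n_obs))
    # Market coefficient with payroll included, then with payroll omitted.
    full.append(ols(np.column_stack([market, payroll]), revenue)[1])
    dropped.append(ols(market, revenue)[1])

full, dropped = np.array(full), np.array(dropped)
print("payroll included: mean %.2f, spread %.2f" % (full.mean(), full.std()))
print("payroll omitted:  mean %.2f, spread %.2f" % (dropped.mean(), dropped.std()))
```

With payroll included, the market coefficient should center on the
assumed true value of 2.0 but with a wide spread. With payroll omitted,
the spread is tighter but the estimate should center near 2.9
(2.0 + 1.0 x 0.9), because market size absorbs payroll's effect.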

Voros' concern, that payroll is endogenous, is more serious, because
if it is, then it does bias all the estimates in the model.
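
The endogeneity concern can be sketched the same way. This assumes,
purely for illustration, that payroll has no effect on revenue at all
but that teams budget against revenue news the analyst cannot observe.
The coefficients 3.0, 0.8 and 0.5 are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000  # large synthetic sample, so any bias is not sampling noise

market = rng.normal(size=n)
shock = rng.normal(size=n)      # revenue news the team sees, the analyst does not
true_payroll_effect = 0.0       # assumption: payroll itself earns nothing
# Budget set to match expected revenue, plus unrelated noise.
payroll = market + 0.8 * shock + 0.5 * rng.normal(size=n)
revenue = 3.0 * market + true_payroll_effect * payroll + shock

X = np.column_stack([np.ones(n), market, payroll])
coef = np.linalg.lstsq(X, revenue, rcond=None)[0]
print("market coefficient  (true 3.0):", coef[1])
print("payroll coefficient (true 0.0):", coef[2])
```

Because payroll is correlated with the error term, the payroll
coefficient should come out well above zero and the market coefficient
well below 3.0, and more data does not fix either one.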

> The other thing that may be a problem with a model like this is that if
> payroll is serving as a dummy variable, then it may not be very accurate
> if one of the owners starts changing his behavior. If, for instance, the
> payroll variable is really a dummy for the fact that large market owners
> are more willing to maintain a high payroll with a weak team than small
> market owners, it won't be very accurate if a small market owner decides
> to emulate them.


That's true for any model where there are regime changes.

Voros McCracken

Sep 16, 2002, 2:25:08 AM
Dennis Shea <dgs...@adelphia.net> wrote in message news:<3D851F51...@adelphia.net>...

> Roger Moore wrote:
> > What I'm not so much worried about is whether payroll acts as a dummy
> > variable for something else that you're not including. What I'm
> > particularly worried about is that payroll is acting partly as a dummy for
> > something else but also as a dummy variable for several other things that
> > you are including in your model.
>
> I'm not sure I understand your worry.

Not to speak for Roger, but as an example. Ron ran the regressions
with payroll included and then ran some with other variables like
previous seasons' Win%. He found a non-significant relationship...

...but if there is a collinear relationship between win% in previous
years and payroll, the reason previous years' win% is coming up
insignificant is because its effects are already largely covered in
the payroll variable.
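
A sketch of how that masking can happen, again on synthetic data. It
assumes previous win% is the real driver of revenue and payroll is only
a close proxy for it; the coefficients and noise levels are invented.

```python
import numpy as np

def t_stats(X, y):
    """OLS t statistics, with an intercept added as column 0."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta / se

rng = np.random.default_rng(3)
n_obs, n_trials = 168, 500
sig_alone = sig_with_payroll = 0
for _ in range(n_trials):
    prev_win = rng.normal(size=n_obs)
    payroll = prev_win + 0.2 * rng.normal(size=n_obs)  # proxy, no causal effect
    revenue = 1.0 * prev_win + 3.0 * rng.normal(size=n_obs)
    # Count how often previous win% clears |t| > 2 in each specification.
    sig_alone += abs(t_stats(prev_win, revenue)[1]) > 2
    both = np.column_stack([prev_win, payroll])
    sig_with_payroll += abs(t_stats(both, revenue)[1]) > 2

frac_alone = sig_alone / n_trials
frac_with_payroll = sig_with_payroll / n_trials
print("previous win% significant, payroll excluded:", frac_alone)
print("previous win% significant, payroll included:", frac_with_payroll)
```

Under these assumptions previous win% should test as significant in
nearly every trial on its own, and only occasionally once payroll is in
the model, even though payroll has no causal role here.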

It may be that Opening day payroll does a fairly decent job as an
_indicator_ for the long-term popularity of the team, but if it isn't
actually causative, Roger and I would argue for removing it in
favor of factors (like previous win%) that may be less indicative of
team popularity than payroll, but happen to have a significant causal
element in that relationship that payroll does not have.

IOW, Ron's model might be the very best _explanatory_ model, but it
might not be as effective at modeling effective revenue generating
strategies or effective payroll expenditure strategies, since a
variable like payroll might have a limited causative effect.

I think the biggest issue with all of the revenue estimating formulae
is that they all suffer from a slight GIGO problem. In many cases,
teams might embark on strategies that might not make sense in the
specific case of the team's revenues, but do make sense in the bigger
picture of the owner's net worth or the overall worth of other
enterprises. Therefore when teams pursue strategies that a model
suggests are clearly unwise, it isn't necessarily because the model is
estimating revenues incorrectly but rather because the team's revenues
aren't the only factor in determining the strategy.

The suggestion about doing some studies with regard to attendance is a
good one, since that would give us larger samples with which to work,
but there is the issue that as you go farther back in time, the
relationships between the various factors and attendance may shift due
to things like cable television, night baseball and a host of other
possibilities. IOW, the things that drove attendance in 1973 might not
be the same things that drive it now, or they may do so in differing
proportions.

--
Voros McCracken
