Help/Advice

31 views
Skip to first unread message

Ben Horrow

unread,
Nov 11, 2013, 11:36:18 AM11/11/13
to baseball-sq...@googlegroups.com
Hey guys,

I'm a bit of a novice at SQL, but in attempting to help a friend out who was writing a piece on voting for MLB awards. I tried to use the Lahman database to isolate in one query pitchers who received votes (they do pts in the DB) for the Cy Young and players receiving votes for the MVP. 

I used the standard Concat(namefirst, " ",namelast) As Name command along with a simple inner join to connect the master sheet that has the first name and last name of players with the awards list sheet that has the information I wanted via the playerID column. As usual that worked fine, but I also wanted to see the teams that the players listed played for. TeamID isn't listed in either the master or the awards sheet, but it is listed in the pitching and hitting lists. I was wondering how I could add in the team names into my query. 

I attempted to do this myself by using a Where clause to focus only on pitchers (aka Cy Young award) and then added a second join to link the teamID in from the pitching sheet via playerID. That didn't work.

Was wondering if anyone can help. 

Thanks,
Ben Horrow

John Choiniere

unread,
Nov 11, 2013, 1:03:23 PM11/11/13
to baseball-sq...@googlegroups.com
I'm similarly amateurish with this, but I think this should work:

select
concat(b.nameFirst," ",b.nameLast) AS name,
a.yearID AS year,
a.awardID AS award,
a.pointsWon AS points,
c.teamID AS team
from
AwardsSharePlayers a
inner join master b ON a.playerID = b.playerID
inner join Fielding c ON a.playerID = c.playerID
where
a.awardID = ([Cy Young id] OR [MVP id]) AND
c.POS = "P"

You have to fill in whatever the db uses for the cy young and the mvp awards.  Also, I'm not sure what this will return for a teamID in the case of someone pitching for multiple teams in the same year.  I can't test this myself, being at work right now, but I think it should work correctly.  I'm guessing the key difference between this and yours is my use of the Fielding table, which allows for both getting the team *and* filtering by position. I *think* by using AwardsSharePlayers as the "a" table you don't need to use any groupings, but if this doesn't work you could switch master to a, awardsshareplayers to b, and try GROUP BY year, award, name.  That part's almost always trial and error for me, so it might take slightly different grouping if you have to go that route.

If you post your original query I/someone else can try to see where it went wrong, too.  If you'd like.

Good luck!
-John

Matthew Dickinson

unread,
Nov 16, 2013, 11:55:52 AM11/16/13
to baseball-sq...@googlegroups.com
I tackled this very similarly to John:

SELECT
CONCAT(master.namefirst," ", master.namelast) AS name,
awards.yearID AS year,
awards.awardID AS award,
awards.pointsWon AS points,
teams.name AS team,
fielding.stint AS stint,
SUM(fielding.G) AS games_played

FROM
AwardsSharePlayers AS awards

INNER JOIN
Master
ON awards.playerID = master.playerID

INNER JOIN
Fielding
ON awards.playerID = fielding.playerID
AND awards.yearID = fielding.yearID
INNER JOIN
Teams
ON fielding.teamID = teams.teamID
AND awards.yearID = teams.yearID

WHERE AwardID IN ('MVP', 'Cy Young')

GROUP BY
CONCAT(master.namefirst," ", master.namelast),
awards.yearID,
awards.awardID,
awards.pointsWon,
fielding.stint
ORDER BY
year,
award,
points DESC,
stint
;



I added in team order and games played to account for players traded during the season.  The SUM and GROUP BY accounts for the multiple records you would get back due to players fielding multiple positions. If you do not want Games Played you could omit the field and the group by and instead use a DISTINCT clause to account for the duplicates. I agree with John that I think the 'fielding' table works best for the link for linking players to teams.  It accounts for DH, where I'm not sure how well the 'batting' table accounts for pitchers who have no plate appearances.  Also, it often useful to be able to get their position, games, and starts for many queries. 

Ben Horrow

unread,
Nov 16, 2013, 1:18:03 PM11/16/13
to Matthew Dickinson, baseball-sq...@googlegroups.com
Thanks so much for all the help, I've been able to figure it out. 
Reply all
Reply to author
Forward
0 new messages