Importing a CSV file

53 views
Skip to first unread message

Jerry Woodfield

unread,
Dec 15, 2013, 12:58:15 PM12/15/13
to baseball-sq...@googlegroups.com
Hey everyone, I'm pretty familiar with using different programming languages to do data analysis, but I work almost exclusively in SAS for my day job and only use SQL within SAS for Oracle DB pulls and smaller data manipulation.  So I am really having to hone my SQL skills with using these baseball data sources.  Needless to say, getting the various data into SQL has been more of a challenge than I anticipated.  I'm trying to import CSV files I've gotten from B-Ref and Fangraphs, but I have not found a way of importing these files into SQL other than via creating the table first and then running the LOAD DATA LOCAL INFILE.  The problem I have with this is that it is extremely time consuming if your data has a lot of fields, as the B-Ref WAR datasets do (roughly 50 columns).  So I am asking the group if people know of much easier and less time consuming method of importing CSV files?
Thanks
Jerry

stuart wallace

unread,
Dec 24, 2013, 9:28:45 AM12/24/13
to baseball-sq...@googlegroups.com
Hey Jerry,

I am running into the same issues -- is there an import wizard type of approach out there that will take the first row of your csv and create the table from there? I know SQLYog has this, but only for the Enterprise (paid) version...

Additional info: I am using MySQL Workbench.

Something I am also poking at that might be useful for you Jerry is using the sqldf library in R -- still playing with it, so can't say if using it over a standard sql database is better or worse.

Stu

David Taylor

unread,
Apr 7, 2014, 1:15:44 PM4/7/14
to baseball-sq...@googlegroups.com
PHPMyAdmin will let you import a CSV to your database as well.

Bryan Cole

unread,
Apr 8, 2014, 12:15:20 PM4/8/14
to baseball-sq...@googlegroups.com
Don't know if you have Excel, but if you do, you should look into the "MySQL for Excel" wizard, which I think shows up in the Data toolbar in Excel 2010.
http://www.mysql.com/why-mysql/windows/excel/

You have to be a little careful with it, but I really like it.

David Taylor

unread,
Apr 9, 2014, 4:16:14 PM4/9/14
to baseball-sq...@googlegroups.com
I have tried to install it but installer keeps saying can't find the file.  Is there a direct file install for it?

Jonathan Cram

unread,
Apr 9, 2014, 5:01:40 PM4/9/14
to David Taylor, baseball-sq...@googlegroups.com
Sorry, for got to include the group.


On Wed, Apr 9, 2014 at 2:59 PM, Jonathan Cram <jonath...@gmail.com> wrote:
I'm sure there is an easier way but I just create the tables manually since I know the data types will be correct. I save the scripts so I can easily re-create the tables if needed. Here's a script for Bref's standard batting (change the file paths obv)

/*
step 0: the lahman database is probably a much better choice for
analyzing these stats with sql

step 1: choose standard batting stats for the player of your choice

step 2: click on the 'csv' hyperlink to render as comma separated
text

step 3: copy the csv formatted text to the text editor of your
choice and save the file as batStand_aaronha01_career.csv
replacing aaronha01 with the player id (found in the url) and if
necessary change career to a date range (no spaces)

step 4: open this .sql file and then find/replace aaronha01 with
the appropriate player_id and if necessary find/replace career
with the appropriate date range

step 5: navigate to the appropriate database and run the script!
*/

-- creates table
drop table if exists batStand_aaronha01_career;
create table batStand_aaronha01_career
(
  `Year` varchar(4),
  Age int(2),
  Tm char(3),
  Lg char(3),
  G int(3),
  PA int(3),
  AB int(3),
  R int(3),
  H int(3),
  2B int(3),
  3B int(3),
  HR int(3),
  RBI int(3),
  SB int(3),
  CS int(3),
  BB int(3),
  SO int(3),
  BA float(4),
  OBP float,
  SLG float,
  OPS float,
  `OPS+` int(3),
  TB int(3),
  GDP int(3),
  HBP int(3),
  SH int(3),
  SF int(3),
  IBB int(3),
  POS varchar(20),
  Awards varchar(40)
);

-- loads csv file's data into the table created above
load data infile '/_dev/bbdb/data/batStand_aaronha01_career.csv'
into table batStand_aaronha01_career
fields terminated by ','
lines terminated by '\n';

-- adds the player-id useful for joining tables
alter table batStand_aaronha01_career
add player_id varchar(10) first;
update batStand_aaronha01_career SET player_id = 'aaronha01';

-- uncomment line below if you wnat to see the data
-- select * from batStand_aaronha01_career;
Reply all
Reply to author
Forward
0 new messages