Strange characters in 'TailNum' in 2001.csv and 2002.csv?

Rob

Aug 19, 2008, 9:08:52 AM
to ASA Data Expo 2009
Wondering if others are seeing this same problem, or maybe I've got
corrupted files?

I'm seeing some strange characters in the TailNum field, just in the
2001.csv and 2002.csv files (all other files seem to be ok, with no
strange characters).

Here is an example of what I'm seeing:

$ head 2001.csv
Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
2001,1,17,3,1806,1810,1931,1934,US,375,N700äæ,85,84,60,-3,-4,BWI,CLT,361,5,20,NA,0,NA,NA,NA,NA,NA
2001,1,18,4,1805,1810,1938,1934,US,375,N713äæ,93,84,64,4,-5,BWI,CLT,361,9,20,NA,0,NA,NA,NA,NA,NA
2001,1,19,5,1821,1810,1957,1934,US,375,N702äæ,96,84,80,23,11,BWI,CLT,361,6,10,NA,0,NA,NA,NA,NA,NA
2001,1,20,6,1807,1810,1944,1934,US,375,N701äæ,97,84,66,10,-3,BWI,CLT,361,4,27,NA,0,NA,NA,NA,NA,NA
2001,1,21,7,1810,1810,1954,1934,US,375,N768äæ,104,84,62,20,0,BWI,CLT,361,4,38,NA,0,NA,NA,NA,NA,NA
2001,1,22,1,1807,1810,1931,1934,US,375,N722äæ,84,84,61,-3,-3,BWI,CLT,361,12,11,NA,0,NA,NA,NA,NA,NA
2001,1,23,2,1802,1810,1924,1934,US,375,N732äæ,82,84,61,-10,-8,BWI,CLT,361,5,16,NA,0,NA,NA,NA,NA,NA
2001,1,24,3,1804,1810,1922,1934,US,375,N737äæ,78,84,60,-12,-6,BWI,CLT,361,4,14,NA,0,NA,NA,NA,NA,NA
2001,1,25,4,1812,1810,1925,1934,US,375,N767äæ,73,84,52,-9,2,BWI,CLT,361,6,15,NA,0,NA,NA,NA,NA,NA

$ head 2002.csv
Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
2002,1,13,7,2231,2235,2342,2353,US,723,N709äæ,71,78,55,-11,-4,PIT,CLT,366,3,13,NA,0,NA,NA,NA,NA,NA
2002,1,14,1,2230,2235,2347,2353,US,723,N733äæ,77,78,60,-6,-5,PIT,CLT,366,3,14,NA,0,NA,NA,NA,NA,NA
2002,1,15,2,2230,2235,2342,2353,US,723,N758äæ,72,78,55,-11,-5,PIT,CLT,366,3,14,NA,0,NA,NA,NA,NA,NA
2002,1,16,3,2230,2235,2340,2353,US,723,N707äæ,70,78,57,-13,-5,PIT,CLT,366,3,10,NA,0,NA,NA,NA,NA,NA
2002,1,17,4,2227,2235,2345,2353,US,723,N713äæ,78,78,60,-8,-8,PIT,CLT,366,5,13,NA,0,NA,NA,NA,NA,NA
2002,1,18,5,2227,2235,2346,2353,US,723,N722äæ,79,78,59,-7,-8,PIT,CLT,366,5,15,NA,0,NA,NA,NA,NA,NA
2002,1,19,6,2234,2235,2347,2353,US,723,N705äæ,73,78,56,-6,-1,PIT,CLT,366,4,13,NA,0,NA,NA,NA,NA,NA
2002,1,20,7,2233,2235,2346,2353,US,723,N749äæ,73,78,59,-7,-2,PIT,CLT,366,5,9,NA,0,NA,NA,NA,NA,NA
2002,1,21,1,2228,2235,2341,2353,US,723,N701äæ,73,78,57,-12,-7,PIT,CLT,366,4,12,NA,0,NA,NA,NA,NA,NA

(I'm working on x64 Linux. I downloaded the data using the Mozilla
browser and uncompressed it using bunzip2.)
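
For anyone who wants to see how widespread the stray characters are rather than just eyeballing `head`, here is a minimal sketch, assuming Python 3.7 or later and the column name `TailNum` from the header shown above. The function name `count_non_ascii_tailnums` is made up for illustration. The files are opened as latin-1 only so that every byte decodes without error; that is an assumption for scanning purposes, not a claim about the files' real encoding.

```python
import csv
import sys
from collections import Counter


def count_non_ascii_tailnums(lines, column="TailNum"):
    """Return (total_rows, Counter of values in `column` containing non-ASCII)."""
    reader = csv.DictReader(lines)
    total = 0
    bad = Counter()
    for row in reader:
        total += 1
        value = row.get(column) or ""
        # str.isascii() is False if any character is outside 0-127.
        if not value.isascii():
            bad[value] += 1
    return total, bad


if __name__ == "__main__":
    # Usage: python check_tailnum.py 2001.csv 2002.csv
    for path in sys.argv[1:]:
        # latin-1 maps every byte to a character, so decoding never fails.
        with open(path, encoding="latin-1", newline="") as f:
            total, bad = count_non_ascii_tailnums(f)
        print(f"{path}: {sum(bad.values())} of {total} rows have a non-ASCII TailNum")
        for value, n in bad.most_common(5):
            print(f"  {value!r}: {n}")
```

Running it on each year's file should show whether the problem really is limited to 2001.csv and 2002.csv.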

hadley wickham

Aug 19, 2008, 9:26:23 AM
to data-e...@googlegroups.com
Hi Rob,

It seems to be there in the original data - I'll get in touch with the
DOT people and see what the problem is.

Hadley
--
http://had.co.nz/