Example of reading fixed width format data in readr & possible bug


Bob

Jun 16, 2015, 9:50:19 AM
to manip...@googlegroups.com
Hi All,

Here's an example of how to use readr's read_fwf function to read data with no delimiters like this:

011f1151
022f2141
031f2243
042 31 3
051m3524
062m5455
071m5344
082m4555

Each variable takes only one column except for the first one, which takes 2. You can specify how wide each variable is using the form below. If you're used to using R's built-in read.fwf function, note that you cannot skip variables by specifying a negative value, but that's easy to work around.

> read_fwf("mydataFWF.txt", 
+          fwf_widths( c(2, 1, 1, 1, 1, 1, 1) ) )

  X1 X2 X3 X4 X5 X6 X7
1  1  1  f  1  1  5  1
2  2  2  f  2  1  4  1
3  3  1  f  2  2  4  3
4  4  2     3  1 NA  3
5  5  1  m  3  5  2  4
6  6  2  m  5  4  5  5
7  7  1  m  5  3  4  4
8  8  2  m  4  5  5  5

That's handy, but if you get a single width wrong, all the following variables will be wrong too. So it's safer to specify the starting and ending columns for each variable like this:

read_fwf("mydataFWF.txt", 
         fwf_positions( c(1,3,4,5,6,7,8),
                        c(2,3,4,5,6,7,8)))
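
One possible workaround for skipping variables, sketched here rather than taken from the help file: since fwf_positions() takes explicit start and end columns, leaving a column out of both vectors should simply leave it unread. The temporary file below is a stand-in for mydataFWF.txt, and the name subset_df is only illustrative. How the columns are typed (for example, whether "01" comes back as a number or as text) may depend on the readr version.

```r
library(readr)

# Write the sample data from this post to a temporary file
# (a stand-in for mydataFWF.txt).
path <- tempfile(fileext = ".txt")
writeLines(c("011f1151", "022f2141", "031f2243", "042 31 3",
             "051m3524", "062m5455", "071m5344", "082m4555"), path)

# Read only columns 1-2, column 4, and column 8.
# Columns 3 and 5-7 are skipped by not listing them.
subset_df <- read_fwf(path,
                      fwf_positions(start = c(1, 4, 8),
                                    end   = c(2, 4, 8)))

print(subset_df)
```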

The help file says you can specify variable names using the col_names argument, but either there's a bug or I'm missing something about how to use it:

> read_fwf("mydataFWF.txt", 
+          fwf_widths( c(2, 1, 1, 1, 1, 1, 1) ),
+          col_names = c("id","workshop","gender","q1","q2","q3","q4") )

Error in read_fwf("mydataFWF.txt", fwf_widths(c(2, 1, 1, 1, 1, 1, 1)),  : 
  unused argument (col_names = c("id", "workshop", "gender", "q1", "q2", "q3", "q4"))

I get the same error using fwf_positions.

Cheers,
Bob


Hadley Wickham

Jun 16, 2015, 10:07:16 AM
to Bob, manipulatr
Hi Bob,

col_names is an argument of fwf_widths()/fwf_positions(), not read_fwf().

Hadley

--
http://had.co.nz/

Bob

Jun 16, 2015, 10:17:49 AM
to manip...@googlegroups.com, muen...@tennessee.edu
Thanks!  OK, so here are the col_names working in my example:

> read_fwf("mydataFWF.txt", 
+          fwf_positions( 
+            c(1,3,4,5,6,7,8),
+            c(2,3,4,5,6,7,8),
+            col_names = c("id","workshop","gender","q1","q2","q3","q4") ))
  id workshop gender q1 q2 q3 q4
1  1        1      f  1  1  5  1
2  2        2      f  2  1  4  1
3  3        1      f  2  2  4  3
4  4        2         3  1 NA  3
5  5        1      m  3  5  2  4
6  6        2      m  5  4  5  5
7  7        1      m  5  3  4  4
8  8        2      m  4  5  5  5
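
Going by Hadley's reply, the same should work with fwf_widths(). Here is a minimal, self-contained sketch of that form. The temporary file stands in for mydataFWF.txt, and the name survey is only illustrative. The printed column types may differ from the output above depending on the readr version.

```r
library(readr)

# Write the sample data from this thread to a temporary file
# (a stand-in for mydataFWF.txt).
path <- tempfile(fileext = ".txt")
writeLines(c("011f1151", "022f2141", "031f2243", "042 31 3",
             "051m3524", "062m5455", "071m5344", "082m4555"), path)

# col_names is passed to fwf_widths(), not to read_fwf().
survey <- read_fwf(path,
                   fwf_widths(c(2, 1, 1, 1, 1, 1, 1),
                              col_names = c("id", "workshop", "gender",
                                            "q1", "q2", "q3", "q4")))

print(survey)
```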
