mutate_if and convert to numeric


Gouri Shankar Mishra

Jan 14, 2017, 1:06:35 AM
to Davis R Users' Group
Hello, I have a data frame with a large number of columns. All columns are coded as factor vars. Some of the columns end with "Sum" and I want to convert these columns to numeric variables. 


The following code gives me an error: 

df <- df %>% mutate_if(ends_with("Sum"), as.numeric(.))
Error in get(as.character(FUN), mode = "function", envir = envir) : 
  object 'p' of mode 'function' was not found

Thanks for your time. 

Michael Hannon

Jan 14, 2017, 6:50:58 PM
to davi...@googlegroups.com
Can you send us a small subset of your data frame? Thanks.

-- Mike
> --
> Check out our R resources at http://d-rug.github.io/
> ---
> You received this message because you are subscribed to the Google Groups
> "Davis R Users' Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to davis-rug+...@googlegroups.com.
> Visit this group at https://groups.google.com/group/davis-rug.
> For more options, visit https://groups.google.com/d/optout.

Brandon Hurr

Jan 14, 2017, 8:36:21 PM
to davi...@googlegroups.com
My guess was that the select functions weren't returning what mutate_if() wanted and I think I'm right. 

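The ".predicate" argument wants either a function that is called on each column and returns TRUE or FALSE, or a logical vector with one element per column. The select helpers such as ends_with() return neither; as far as I can tell they are meant to be used inside select() or vars(). I got the same errors as you on my test (below).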
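
To make the two forms of ".predicate" concrete, here is a minimal sketch on a made-up data frame (toy and its column names are only for illustration, not the poster's data). It assumes a dplyr version that provides mutate_if().

```r
library(dplyr)

# Hypothetical toy data: two factor columns and one integer column
toy <- data.frame(
  a_Sum = factor(c("10", "20", "30")),
  grp   = factor(c("x", "y", "x")),
  n     = 1:3
)

# Form 1: a predicate function, called on each column's contents
by_function <- toy %>% mutate_if(is.factor, as.character)

# Form 2: a logical vector with one element per column
by_vector <- toy %>% mutate_if(grepl("Sum$", names(toy)), as.character)

# Compare which columns changed class in each case
sapply(by_function, class)
sapply(by_vector, class)
```

The first form picks columns by what they contain, the second by position, which is why a test on the column names has to be turned into a logical vector first.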

In the end I used grepl() to return a logical vector on the names of the input dataframe and it did what I think you wanted. 

library(tidyverse)
library(wakefield)

df <- 
r_data_frame(
    n = 30,
    id,
    race,
    age,
    sex,
    hour,
    iq,
    height,
    died,
    Scoring = rnorm,
    Smoker = valid
)


df %>% 
mutate_if(grepl("Sex", names(.)), as.numeric)

# A tibble: 30 × 10
      ID     Race   Age   Sex        Hour    IQ Height  Died     Scoring Smoker
   <chr>   <fctr> <int> <dbl> <S3: times> <dbl>  <dbl> <lgl>       <dbl>  <lgl>
1     01    White    20     2    00:30:00   118     67 FALSE  0.97199823   TRUE
2     02 Hispanic    29     1    01:00:00   111     63 FALSE -0.80400056  FALSE
3     03 Hispanic    25     1    01:00:00    87     68  TRUE  0.12456142   TRUE
4     04 Hispanic    32     1    01:30:00    91     70  TRUE  0.17311414  FALSE
5     05    White    26     2    01:30:00   101     72 FALSE  0.01846329  FALSE
6     06    White    35     1    03:00:00   107     69 FALSE  1.63521692  FALSE
7     07    White    22     2    03:30:00   108     73 FALSE -0.02257594  FALSE
8     08    Black    27     2    04:30:00    97     66  TRUE  0.82289439  FALSE
9     09    White    32     1    04:30:00   106     67  TRUE  0.40257795   TRUE
10    10   Native    31     1    04:30:00    91     70  TRUE -0.97988855  FALSE
# ... with 20 more rows

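You can see that Sex is converted to numeric. Note that as.numeric() on a factor returns the integer level codes (1, 2, ...), not the labels, so for factor columns that hold numbers you would want as.numeric(as.character(x)) instead.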
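
For the original question (factor columns whose names end in "Sum"), here is a minimal sketch on made-up data. The data frame toy, its column names, and the helper factor_to_numeric() are hypothetical. It uses mutate_at() with vars(ends_with("Sum")), which is a place where the select helpers are accepted, and goes through as.character() so the result holds the values rather than the level codes. Newer dplyr versions would express the same thing with across().

```r
library(dplyr)

# Hypothetical data: factor columns whose labels are numbers
toy <- data.frame(
  id       = factor(c("a", "b", "c")),
  cost_Sum = factor(c("10.5", "200", "10.5")),
  time_Sum = factor(c("3", "1", "2"))
)

# as.numeric() alone returns the level codes, not the labels
as.numeric(toy$cost_Sum)

# Going through as.character() recovers the values
factor_to_numeric <- function(x) as.numeric(as.character(x))

# Select columns by name with ends_with() inside vars()
converted <- toy %>%
  mutate_at(vars(ends_with("Sum")), factor_to_numeric)

str(converted)
```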


Next time, please try to send an example data frame with similar or actual data; it's quite hard for us to come up with fake data. wakefield makes it a little easier, but I'm still unsure whether it matches your data or not.

HTH
B


Gouri Shankar Mishra

Jan 15, 2017, 10:55:09 PM
to Davis R Users' Group
Thanks - this works well for me. Will attach a DF next time. Good to also know about wakefield.


Matt Espe

Jan 16, 2017, 1:48:58 AM
to Davis R Users' Group
Hi all,

For those interested, below is a method using base R (with mtcars as the example dataset).

# Create index of columns with "cyl" or "hp" in colnames
i = grep("cyl|hp", colnames(mtcars))

# Use that index to subset the dataframe, and apply() to convert those columns
# (I convert to character since the columns start as numeric, but the idea is the same)
mtcars[,i] = apply(mtcars[,i], 2, as.character)
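
Applied to the original question, the same idea might look like the sketch below. The data frame toy and its column names are made up for illustration. It swaps apply() for lapply(), since apply() first coerces the data frame to a matrix while lapply() works column by column, and it goes through as.character() so that factor columns yield their labels rather than their level codes.

```r
# Hypothetical data frame with factor columns, as in the original question
toy <- data.frame(
  id       = factor(c("a", "b", "c")),
  cost_Sum = factor(c("10.5", "200", "10.5")),
  time_Sum = factor(c("3", "1", "2"))
)

# Index of columns whose names end in "Sum"
i <- grep("Sum$", colnames(toy))

# Convert only those columns; the other columns are left untouched
toy[i] <- lapply(toy[i], function(x) as.numeric(as.character(x)))

str(toy)
```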


Matt

Michael Hannon

Jan 16, 2017, 5:08:54 PM
to davi...@googlegroups.com
Thanks, Matt. I think it's useful to look at alternate solutions. To
that end, I've appended a slight variation on your solution.

-- Mike

# Create index of columns with "cyl" or "hp" in colnames
i = grep("cyl|hp", colnames(mtcars))

# Use that index to subset the dataframe, and apply() to convert those columns
# (I convert to character since the columns start as numeric, but the
# idea is the same)

mt_subset_1 <- apply(mtcars[,i], 2, as.character)
head(mt_subset_1)

cols_of_interest <- c("cyl", "hp")

mt_subset_2 <- sapply(cols_of_interest, function(next_col) {
    as.character(mtcars[ , next_col])
})
head(mt_subset_2)

all.equal(mt_subset_1, mt_subset_2)