Hi,
I request help with the following:
INPUT: A data frame in which column "Lower" is a character column with numeric values embedded in the text. The number of values differs from row to row (mostly 2).
> dput(dd)
structure(list(State = c("Alabama", "Alaska", "Arizona", "Arkansas",
"California"), Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
"R 36–24", "R 64–35, 1 Ind.", "D 52–28"), Upper = c("R 26–8, 1 Ind.",
"R/Coalition 15(14 R, 1 D)–5 D", "R 18–12", "R 24–11", "D 26–14"
)), .Names = c("State", "Lower", "Upper"), row.names = c(NA,
5L), class = "data.frame")
PROBLEM: I need to extract all the numeric values in each row and sum them. There are a few exceptions, like row 2, but these can be ignored and will be fixed manually.
SOLUTION SO FAR:
str_extract_all(dd[[2]], "[[:digit:]]+") returns a list with one character vector per row, holding the numbers as strings. I am unable to unlist it, because unlist() mixes the values from all rows together, ...
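To make the list structure concrete, here is a minimal sketch, assuming the stringr package is installed. The names lower and row_sums are made up for illustration. Summing inside each list element, rather than calling unlist() on the whole list, keeps the rows separate.

```r
library(stringr)

# The "Lower" column, copied from the dput() output above
lower <- c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
           "R 36–24", "R 64–35, 1 Ind.", "D 52–28")

# One character vector of digit runs per input string
nums <- str_extract_all(lower, "[[:digit:]]+")

# Sum within each list element, so values from different rows never mix
row_sums <- vapply(nums, function(x) sum(as.numeric(x)), numeric(1))
row_sums
```

Row 2 is still summed over all five of its numbers, so it would need the manual fix mentioned above.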
I have received a non-dplyr solution from the R mailing list, as follows:
> z <-gsub("[^[:digit:]]"," ",dd$Lower)
> sapply(strsplit(z," +"),function(x)sum(as.numeric(x),na.rm=TRUE))
[1] 105 67 60 100 80
I was wondering if there is a "dplyr" way of doing it ...
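One possible shape for this, as a rough sketch only and assuming both dplyr and stringr are installed: wrap the per-row sum in mutate(). The column name LowerTotal is hypothetical, and only the State and Lower columns are reproduced here to keep the example short.

```r
library(dplyr)
library(stringr)

# Subset of the data frame from the dput() output above
dd <- data.frame(
  State = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  Lower = c("R 72–33", "R/Coalition 27(23 R, 4 D)–12 D, 1 Ind.",
            "R 36–24", "R 64–35, 1 Ind.", "D 52–28"),
  stringsAsFactors = FALSE
)

# LowerTotal is a hypothetical column name holding the per-row sum
result <- dd %>%
  mutate(LowerTotal = sapply(str_extract_all(Lower, "[[:digit:]]+"),
                             function(x) sum(as.numeric(x))))
result
```

The same mutate() call could take a second, analogous expression for the "Upper" column.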
Thanks