I have a dataframe with words and their durations in speech (reproducible data below):
test1
d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
10 0.103 0.168 0.198 0.188 0.359 0.343 0.064 0.075 0.095 0.367 And I thought oh no Sarah do n't do it
132 0.091 0.072 0.109 0.119 0.113 0.087 0.088 0.264 0.092 0.249 I du n no you ca n't see his head
784 0.152 0.341 0.117 0.108 0.123 0.263 0.083 0.095 0.099 0.098 Oh honestly I did n't touch it I did n't
The short form "n't" is treated as if it were a separate word. That's okay as long as the preceding word ends in a consonant, such as "did", but it's not okay if the preceding word ends in a vowel, such as "do" or "ca". Because that separation into different words is incorrect, the separation into different durations is incorrect too.
What I'd like to do is sum up the durations of "ca" and "n't" as well as "do" and "n't" (but leave alone the separate durations for "did" and "n't"), move the remaining durations one column to the left, and replace the last duration with NA.
I know how to select the rows where the changes need to be implemented:
test1[which(grepl("(?<=(ca|do)\\s)n't", apply(test1, 1, paste0, collapse = " "), perl = TRUE)), ]
but I'm stuck going forward.
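As a possible next step, the check could be limited to the word columns and return the position of each match instead of just the row. This is only a minimal sketch: the matrix w is my own copy of the w1..w10 columns from the reproducible data, and the names left, right, hit and pos are made up for illustration.

```r
# Word columns only, copied from the reproducible data (w1..w10).
w <- cbind(
  w1 = c("And", "I", "Oh"),    w2 = c("I", "du", "honestly"),
  w3 = c("thought", "n", "I"), w4 = c("oh", "no", "did"),
  w5 = c("no", "you", "n't"),  w6 = c("Sarah", "ca", "touch"),
  w7 = c("do", "n't", "it"),   w8 = c("n't", "see", "I"),
  w9 = c("do", "his", "did"),  w10 = c("it", "head", "n't")
)
rownames(w) <- c("10", "132", "784")

n     <- ncol(w)
left  <- w[, -n]  # words 1..9: candidate hosts
right <- w[, -1]  # words 2..10: the word following each host

# TRUE where a host "ca" or "do" is directly followed by "n't"
hit <- (left == "ca" | left == "do") & right == "n't"

# One line per match: row index and column index of the host word
pos <- which(hit, arr.ind = TRUE)
print(pos)  # row 10 matches at w7, row 132 at w6, row 784 not at all
```

Comparing adjacent word columns avoids pasting the numeric durations into the search string, and the column index says which pair of durations would have to be summed.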
The desired result would look like this:
d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
10 0.103 0.168 0.198 0.188 0.359 0.343 0.139 0.095 0.367 NA And I thought oh no Sarah do n't do it
132 0.091 0.072 0.109 0.119 0.113 0.175 0.264 0.092 0.249 NA I du n no you ca n't see his head
784 0.152 0.341 0.117 0.108 0.123 0.263 0.083 0.095 0.099 0.098 Oh honestly I did n't touch it I did n't
How can this be done? Help is much appreciated.
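One way this might be approached is row by row: find each "ca"/"do" that is followed by "n't", add the second duration to the first, drop the merged slot, and pad the end with NA. The following is a rough sketch under that assumption, not a tested general solution. The helper merge_nt and the names d_cols, w_cols and result are hypothetical, and the word columns are deliberately left untouched, as in the desired result.

```r
test1 <- structure(list(
  d1 = c(0.103, 0.091, 0.152), d2 = c(0.168, 0.072, 0.341),
  d3 = c(0.198, 0.109, 0.117), d4 = c(0.188, 0.119, 0.108),
  d5 = c(0.359, 0.113, 0.123), d6 = c(0.343, 0.087, 0.263),
  d7 = c(0.064, 0.088, 0.083), d8 = c(0.075, 0.264, 0.095),
  d9 = c(0.095, 0.092, 0.099), d10 = c(0.367, 0.249, 0.098),
  w1 = c("And", "I", "Oh"), w2 = c("I", "du", "honestly"),
  w3 = c("thought", "n", "I"), w4 = c("oh", "no", "did"),
  w5 = c("no", "you", "n't"), w6 = c("Sarah", "ca", "touch"),
  w7 = c("do", "n't", "it"), w8 = c("n't", "see", "I"),
  w9 = c("do", "his", "did"), w10 = c("it", "head", "n't")),
  row.names = c(10L, 132L, 784L), class = "data.frame")

d_cols <- paste0("d", 1:10)
w_cols <- paste0("w", 1:10)

# Hypothetical helper: d = durations of one row, w = words of the same row.
merge_nt <- function(d, w, stems = c("ca", "do")) {
  # positions j where word j is a stem and word j + 1 is "n't"
  j <- which(head(w, -1) %in% stems & tail(w, -1) == "n't")
  if (length(j) == 0) return(d)
  d[j] <- d[j] + d[j + 1]                   # add the duration of "n't" to its host
  c(d[-(j + 1)], rep(NA_real_, length(j)))  # drop merged slots, pad with NA
}

result <- test1
for (i in seq_len(nrow(test1))) {
  d <- unlist(test1[i, d_cols], use.names = FALSE)
  w <- unlist(test1[i, w_cols], use.names = FALSE)
  result[i, d_cols] <- merge_nt(d, w)
}
print(result[d_cols])
```

On the three sample rows this gives the durations shown in the desired result above. Rows with more than one "ca n't" or "do n't" would get one NA per merge at the end.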
Reproducible data:
test1 <- structure(list(d1 = c(0.103, 0.091, 0.152), d2 = c(0.168, 0.072,
0.341), d3 = c(0.198, 0.109, 0.117), d4 = c(0.188, 0.119, 0.108
), d5 = c(0.359, 0.113, 0.123), d6 = c(0.343, 0.087, 0.263),
d7 = c(0.064, 0.088, 0.083), d8 = c(0.075, 0.264, 0.095),
d9 = c(0.095, 0.092, 0.099), d10 = c(0.367, 0.249, 0.098),
w1 = c("And", "I", "Oh"), w2 = c("I", "du", "honestly"),
w3 = c("thought", "n", "I"), w4 = c("oh", "no", "did"), w5 = c("no",
"you", "n't"), w6 = c("Sarah", "ca", "touch"), w7 = c("do",
"n't", "it"), w8 = c("n't", "see", "I"), w9 = c("do", "his",
"did"), w10 = c("it", "head", "n't")), row.names = c(10L,
132L, 784L), class = "data.frame")
Many thanks in advance!
--
You received this message because you are subscribed to the Google Groups "CorpLing with R" group.
To unsubscribe from this group and stop receiving emails from it, send an email to corpling-with...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/corpling-with-r/CALFCMoW0u6am93TDBqc_CPOFExr9kY5C1LzaZOxY90FxGvWm0w%40mail.gmail.com.