I am trying to convert my data from long to wide format. I will need to do this with multiple data sets of different lengths. I believe I can alter this script to do what I want, but my attempts have been unsuccessful. Can anyone explain how I can alter this script to convert data sets (all 2 columns but variable numbers of rows) into wide format for analysis in the geomorph R package? (My first data set, landmark data, is 2686 rows and 2 columns.)

--

data <- read.table("Rafalt1outline.txt", header=FALSE) # Import data
landmarks=30 # Number of landmarks
data$specimen=as.factor(unlist(lapply(seq_len(landmarks), function(x)rep(seq_len(landmarks)[x],30))))
data$landmark=as.factor(rep(1:30,30)) # Create two factors to index the various x,y coordinates
library(reshape)
data_wide=reshape(data,idvar = "specimen",timevar = "landmark",direction = "wide") # Use reshape to convert from long to wide format
landmarknames=unlist(lapply(seq_len(landmarks), function(x)c(paste("x.",x,sep=""),paste("y.",x,sep="")))) # Create ad hoc names to use for indexing
data_wide=cbind(data_wide$specimen,data_wide$genus.1,data_wide[,landmarknames]) # Get only the useful columns
write.table(data_wide, "reformat.txt")
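One way the script above might be generalised is to stop hard-coding 30 and derive the number of specimens from the number of rows. The following is only a minimal sketch under stated assumptions: `long_to_wide()` is a hypothetical helper (not from any package), the input is assumed to have exactly two columns (x then y), and the rows are assumed to be stacked specimen by specimen with the same number of landmarks each. The `reshape()` call with `idvar` and `timevar` matches base R's `stats::reshape()`, so the sketch loads no package.

```r
# Hypothetical helper, not part of geomorph or any other package.
# Assumes: 2 columns (x, y); rows stacked specimen by specimen;
# every specimen has the same number of landmarks.
long_to_wide <- function(data, landmarks) {
  stopifnot(ncol(data) == 2, nrow(data) %% landmarks == 0)
  names(data) <- c("x", "y")
  n_specimens <- nrow(data) / landmarks

  # Index every row by specimen and by landmark within that specimen
  data$specimen <- rep(seq_len(n_specimens), each = landmarks)
  data$landmark <- rep(seq_len(landmarks), times = n_specimens)

  # Base R reshape: one row per specimen, columns x.1, y.1, x.2, y.2, ...
  wide <- reshape(data, idvar = "specimen", timevar = "landmark",
                  v.names = c("x", "y"), direction = "wide")
  rownames(wide) <- NULL
  wide
}

# Made-up toy data: 2 specimens with 3 landmarks each
toy <- data.frame(V1 = 1:6, V2 = 11:16)
toy_wide <- long_to_wide(toy, landmarks = 3)
print(toy_wide)
```

For a file holding a single outline, the call would presumably be `long_to_wide(data, landmarks = nrow(data))`. As far as I know, geomorph's `arrayspecs()` is the usual next step for turning the coordinate columns of such a table into a landmark array, but check its documentation for the expected layout.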
You received this message because you are subscribed to the Google Groups "manipulatr" group.
I would be happy to.
Brandon,

Thank you. I really appreciate your help. When I attempt to use the code on the data set I provided, I get an error that says:

Error: Column `specimen` must be a 1d atomic vector or a list.

Can you spot my issue?

library(tidyverse)

# 1 specimens
specimen <- Rafalt1outline
# 2686 landmarks
landmark <- rep(1:2686, times = 1)

# make up some coordinate data
x <- rnorm(length(specimen))
y <- rnorm(length(specimen))

# assemble into a tibble/data.frame
df <- tibble(specimen, landmark, x, y)

# make a vector of names so you can sort the columns how you want them
num_landmarks <- 2686
num_coords <- 2
coord_names <- c('x', 'y')

# paste them all together with a '.' separating them
move_cols <- paste(rep(coord_names, times = num_landmarks), rep(1:num_landmarks, each = num_coords), sep = ".")

df %>%
  gather(., key = "xy", value = "coord_value", -specimen, -landmark) %>% # make dataset even longer
  unite(., landmark.coord, c("xy", "landmark"), sep = ".") %>% # create labels for future spreading
  spread(., landmark.coord, coord_value) %>% # spread it all out
  select(specimen, move_cols) # reorder things to fit expected output
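A likely cause of the error: `specimen <- Rafalt1outline` stores the whole two-column data frame in `specimen`, so `tibble()` is handed a data frame where it expects a plain vector. In addition, `length()` of a data frame is its number of columns (2 here), so `x` and `y` would get only two random values each. The sketch below is one possible fix, not necessarily what the original code intended. It assumes `V1` holds the x coordinates and `V2` the y coordinates, uses a made-up stand-in for `Rafalt1outline`, and swaps `gather()`/`unite()`/`spread()` for tidyr's `pivot_wider()`.

```r
library(tibble)
library(tidyr)

# Stand-in for Rafalt1outline: made-up values in the same V1/V2 layout
outline <- data.frame(V1 = c(2880L, 2879L, 2879L),
                      V2 = c(1437L, 1437L, 1442L))

num_landmarks <- nrow(outline)

df <- tibble(
  specimen = 1L,                      # a single ID, recycled to every row
  landmark = seq_len(num_landmarks),  # one landmark per row
  x = outline$V1,                     # assumption: V1 is x
  y = outline$V2                      # assumption: V2 is y
)

# One row per specimen; columns come out as x.1, x.2, ..., y.1, y.2, ...
wide <- pivot_wider(df, id_cols = specimen, names_from = landmark,
                    values_from = c(x, y), names_sep = ".")

# Reorder to x.1, y.1, x.2, y.2, ... as in the original move_cols idea
move_cols <- paste(rep(c("x", "y"), times = num_landmarks),
                   rep(seq_len(num_landmarks), each = 2), sep = ".")
wide <- wide[, c("specimen", move_cols)]
print(wide)
```

With the real data, `outline` would simply be replaced by `Rafalt1outline`; the rest should not need to change as long as the file holds a single specimen.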
> str(Rafalt1outline)
'data.frame': 2686 obs. of 2 variables:
 $ V1: int 2880 2879 2879 2878 2878 2877 2877 2876 2876 2875 ...
 $ V2: int 1437 1437 1442 1442 1446 1446 1451 1451 1455 1455 ...

I'm trying to get it to 1 observation of 2686 variables.
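For a single specimen like this, the wide row can also be built in base R by flattening the two columns. Note that with an x and a y for each of the 2686 rows, one wide row would hold 2 × 2686 = 5372 coordinate columns rather than 2686. This is a minimal sketch using a made-up three-row stand-in for `Rafalt1outline`, again assuming `V1` is x and `V2` is y.

```r
# Stand-in for Rafalt1outline: made-up values in the same V1/V2 layout
outline <- data.frame(V1 = c(2880L, 2879L, 2879L),
                      V2 = c(1437L, 1437L, 1442L))

# Transpose, then flatten column by column: x1, y1, x2, y2, ...
coords <- as.vector(t(as.matrix(outline)))

# One row, two columns per landmark
wide <- as.data.frame(matrix(coords, nrow = 1))
names(wide) <- paste(c("x", "y"),
                     rep(seq_len(nrow(outline)), each = 2), sep = ".")
print(wide)
```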