Good morning,
My current study stores the subject ID in a BIDS-derived filename format, as in the example below:
`sub-{ID}_ses-{session number}_accel.csv`
These are raw ActiGraph files.
I am currently trying to add a sleep log by session, but the current `idloc` options will not let me include the session part in the ID.
I looked at the `extractID.R` function and saw that `idloc = 4` uses an `hvar` list of subject numbers. Although it is deprecated, is it possible to use it with my current setup? I could use `idloc = 6` to extract the full filename with the appended `_accel` suffix, but this would break our current rigid file-naming conventions for LSS.
**Is there any way I can extract the text preceding the *second '_'* from the filename?**
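To make the request concrete, here is a minimal base R sketch of the extraction I am after. `extract_sub_ses` is a hypothetical helper name of my own, not a GGIR function, and I am not assuming GGIR currently exposes a hook for it:

```r
# Hypothetical helper (not part of GGIR): keep everything before the second "_".
# Assumes filenames follow sub-{ID}_ses-{session number}_accel.csv.
extract_sub_ses <- function(filename) {
  # Capture "<first field>_<second field>" and drop the rest of the name.
  # Names with fewer than two underscores are returned unchanged.
  sub("^([^_]+_[^_]+)_.*$", "\\1", basename(filename))
}

extract_sub_ses("sub-8001_ses-1_accel.csv")
```

For `sub-8001_ses-1_accel.csv` this returns `sub-8001_ses-1`, which is the ID I would like to match against the sleep log.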
Below is my current full script so you can see how I am running these files in bulk. It gets hacky because I have to write the output to a subdirectory of the raw data directory, which GGIR is not a fan of.
### Full GGIR Main
```r
#!/usr/bin/env Rscript
# Usage: Rscript new_gg.R --project_dir "/Shared/vosslabhpc/Projects/BOOST/InterventionStudy/3-experiment/data/act-int-test/" --deriv_dir "derivatives/GGIR-3.2.6-test/"
library(optparse)
library(GGIR)
main <- function() {
  # Define the option list
  option_list <- list(
    make_option(c("-p", "--project_dir"), type = "character",
                default = "/mnt/nfs/lss/vosslabhpc/Projects/BOOST/InterventionStudy/3-Experiment/data/act-int-test/",
                help = "Path to the project directory", metavar = "character"),
    make_option(c("-d", "--deriv_dir"), type = "character",
                default = "/derivatives/GGIR-3.2.6-test/",
                help = "Path to the derivatives directory", metavar = "character")
  )

  # Parse the options
  opt_parser <- OptionParser(option_list = option_list)
  opt <- parse_args(opt_parser)

  # Assign variables
  ProjectDir <- opt$project_dir
  ProjectDerivDir <- opt$deriv_dir

  # Print values to verify
  print(paste("Project Directory:", ProjectDir))
  print(paste("Derivatives Directory:", ProjectDerivDir))

  # Helper functions: map a relative file path to its output and input directories
  SubjectGGIRDeriv <- function(x) {
    a <- dirname(x)
    paste0(ProjectDir, ProjectDerivDir, a)
  }
  datadirname <- function(x) {
    b <- dirname(x)
    paste0(ProjectDir, b)
  }

  # Gather subject directories (only folders whose name starts with "sub-")
  directories <- list.dirs(ProjectDir, recursive = FALSE)
  subdirs <- directories[grepl("^sub-", basename(directories))]
  print(paste("subdirs: ", subdirs))

  # Create project-specific derivatives GGIR folder if it doesn't exist
  if (!dir.exists(paste0(ProjectDir, ProjectDerivDir))) {
    dir.create(paste0(ProjectDir, ProjectDerivDir), recursive = TRUE)
  }

  # List accel.csv files (list.files() takes a regular expression, not a glob)
  filepattern <- "accel\\.csv$"
  GGIRfiles <- list.files(subdirs, pattern = filepattern, recursive = TRUE,
                          include.dirs = TRUE, full.names = TRUE, no.. = TRUE)
  print(paste("GGIR Files before splitting: ", GGIRfiles))

  # Adjust path formatting: keep the part after "//".
  # This assumes ProjectDir ends in "/" so the listed paths contain "//".
  GGIRfiles <- sapply(strsplit(GGIRfiles, "//", fixed = TRUE), function(x) paste(x[2]))
  print(paste("GGIR Files after splitting: ", GGIRfiles))

  # Ensure directory structure exists
  for (i in GGIRfiles) {
    if (!dir.exists(SubjectGGIRDeriv(i))) {
      dir.create(SubjectGGIRDeriv(i), recursive = TRUE)
    }
  }

  # Run GGIR loop, skipping files that already have output
  for (r in GGIRfiles) {
    if (dir.exists(paste0(SubjectGGIRDeriv(r), "/output_beh"))) {
      next
    } else {
      datadir <- normalizePath(datadirname(r), mustWork = FALSE)
      outputdir <- SubjectGGIRDeriv(r)
      print(paste("datadir: ", datadir))
      print(paste("outputdir: ", outputdir))
      if (!dir.exists(datadir)) {
        stop(paste("Error: datadir does not exist ->", datadir))
      }
      assign("datadir", datadir, envir = .GlobalEnv)
      assign("outputdir", outputdir, envir = .GlobalEnv)
      try({
        GGIR(
          # ==== Initialization ====
          mode = 1:6,
          datadir = datadir,
          outputdir = outputdir,
          studyname = "boost",
          overwrite = FALSE,
          desiredtz = "America/Chicago",
          print.filename = TRUE,
          idloc = 2,
          # ==== Part 1: Data loading and basic signal processing ====
          do.report = c(2, 4, 5, 6),
          epochvalues2csv = TRUE,
          do.ENMO = TRUE,
          acc.metric = "ENMO",
          windowsizes = c(5, 900, 3600),
          # ==== Part 2: Non-wear detection ====
          ignorenonwear = TRUE,
          # ==== Part 3: Sleep detection ====
          # Uncomment the below if using external sleep log:
          # loglocation = "/mnt/nfs/lss/vosslabhpc/Projects/BOOST/InterventionStudy/3-experiment/data/act-int-test/sleep.csv",
          # colid = 1,
          # coln1 = 2,
          # sleepwindowType = "SPT",
          # ==== Part 4: Physical activity summaries ====
          timewindow = c("WW", "MM", "OO"),
          # ==== Part 5: Day-level summaries ====
          hrs.del.start = 4,
          hrs.del.end = 3,
          maxdur = 9,
          threshold.lig = 44.8,
          threshold.mod = 100.6,
          threshold.vig = 428.8,
          # ==== Part 6: CR and other metrics ====
          part6CR = TRUE,
          visualreport = TRUE,
          old_visualreport = FALSE
        )
      })
    }
  }
}

# Run main if executed as script
if (!interactive()) {
  main()
}
```
### Further examples of subject IDs
- `sub-8001_ses-1_accel.csv`, `sub-8001_ses-2_accel.csv`
- `sub-7241_ses-1_accel.csv`
Any help would be awesome, thanks for all the hard work on this package!!
Best,
Zak