Duplicative dates for the same participant in part 5 cleaned day summary

Julia ZZ

Mar 7, 2024, 11:23:22 AM
to R package GGIR
Hi all,

We ran into a problem with duplicate dates for the same participant in the part 5 cleaned day summary. Device 2261 was not worn overnight from 05/23 to 05/24, which led to two wake times on 05/23, as shown in the report (Report_2261). This eventually produced two rows with the calendar date 05/25 in the part 5 cleaned day summary. Based on L5TIME and M5TIME, we think one of these rows belongs to 05/24. We need to merge other data by date in our analysis, so duplicate dates are a concern. Why does GGIR give duplicate calendar dates for those days, and how can we fix this in the GGIR code? We have attached our processing code and the part 1 milestone file for 2261 below. It would be great if anyone could help us with this.


library(GGIR)
datadir="V:/ACOI/R01 - Voucher/....../ACC_all"
outputdir="D:/....../GGIR_output_kid"

g.shell.GGIR(
  mode=c(1:5),
  datadir=datadir,
  outputdir=outputdir,
  studyname="Attempt1",
  idloc=6,
  f0=1,
  f1=279,
  print.filename=FALSE,
  storefolderstructure = FALSE,
  do.parallel=TRUE,
  overwrite=TRUE,
 
  #part 1#
  windowsizes = c(5,900,3600),
  do.anglez=TRUE,
  chunksize = c(1),
  printsummary=TRUE,

  #part 2#
  strategy = 2 , # this strategy gets rid of first data before first midnight and last midnight
  includedaycrit = 16, M5L5res = 10, #already in default
  winhr = 5,
  qwindow=c(0,24),
  qwindow_dateformat = "%d/%m/%Y",
  qlevels = c(c(1380/1440),c(1410/1440)),
  ilevels = c(0,36, 201, 707, 8000),
  mvpathreshold = c(201),
  boutcriter = 0.8,
  bout.metric = 6 ,
  epochvalues2csv=TRUE,
  mvpadur=c(1,5,10),
  iglevels = TRUE, #this function calculates intensity gradient
  #do.parallel = TRUE,

  #part3#
  timethreshold= c(5),
  acc.metric="ENMO",
  anglethreshold=5,
  ignorenonwear = TRUE,

  #part4#
  def.noc.sleep=1,
  includenightcrit = 4,
  outliers.only = TRUE,
  relyonguider = FALSE,
  criterror = 4,
  do.visual = FALSE,
  nnights = 30,

  #part 5#
  threshold.lig = c(35.6),
  threshold.mod = c(201.4),
  threshold.vig = c(707.0),
  #Hildebrand 2014 and 2016 intensity thresholds#
  excludefirstlast = FALSE,
  boutcriter = 0.8,
  boutcriter.in = 0.9,
  boutcriter.lig = 0.8,
  boutcriter.mvpa = 0.8,
  boutdur.in = c(10),
  boutdur.lig = c(1),
  boutdur.mvpa = c(1),
  timewindow = c("WW"),

  do.report=c(2,4,5),
  dofirstpage=TRUE,
  visualreport=TRUE,
  viewingwindow=1)



Thanks,
Julia

part5_daysummary_WW_L35.6M201.4V707_T5A5.csv
Report_2261.gt3x.pdf
meta_2261.gt3x.RData

Michael Rueschman

Mar 7, 2024, 12:22:37 PM
to R package GGIR
I agree that the window_number = 13 from your part5_daysummary file appears to correspond to 05/24 (the missing day in the sequence). It matches up with your file summary report.

What version of GGIR did you use to create this output? Were there any errors? Do you see the same date error (i.e., two nights corresponding to 05/25) in the Part 4 nightsummary output? 

I noted that GGIR detected a nonwear period starting precisely at midnight on 05/24.

Julia ZZ

Mar 7, 2024, 4:06:08 PM
to R package GGIR
Hi Michael,

Thank you for the response. We are using GGIR version 2.8-2 because it is consistent with the processing of our previous cohort. There were no errors when we processed the file, and I did not see the same date error (i.e., two nights corresponding to 05/25) in the Part 4 nightsummary output. Do you have any idea why the error shows up in part 5 only? By the way, I will try reprocessing the file with the newest version and keep you posted.

Julia

Vincent van Hees

Mar 8, 2024, 3:17:24 AM
to Julia ZZ, R package GGIR
Hi Julia,

You are looking at the WW report, which describes windows from waking up to waking up.
The date always reflects the date on which the window started. 

Therefore, if a person is detected to wake up before midnight then it takes the date from that day and not the next calendar day.

This is expected behaviour and has been in GGIR for many years.
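
To make the labelling rule concrete, here is a minimal sketch in base R. It is not GGIR's code: the wake-up times are invented, and `window_start` and `window_date` are hypothetical names. It only illustrates that, when each wake-to-wake window is labelled with the date on which it starts, a wake-up detected shortly before midnight gives two windows with the same date.

```r
# Hypothetical wake-up times (not taken from any real recording).
window_start <- as.POSIXct(
  c("2021-05-22 07:10:00", "2021-05-23 06:55:00", "2021-05-23 23:40:00"),
  tz = "UTC"
)

# Label each wake-to-wake window with the date on which it starts.
window_date <- as.Date(window_start, tz = "UTC")

# TRUE marks a window that repeats the date of an earlier window.
repeated_date <- duplicated(window_date)
print(data.frame(window_start, window_date, repeated_date))
```

In this toy example the third window repeats the date of the second one, and no window is labelled with the following calendar day.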

Best,

Vincent

Michael Rueschman

Mar 8, 2024, 11:19:36 AM
to R package GGIR
Thanks Vincent! When you say, "The date always reflects the date on which the window started", are you referring to the calendar_date column in the part5_daysummary?

If that's the case, then I don't see two wakeup-to-wakeup windows on Julia's file summary report or part5_daysummary that start on 5/25/2021.

Window number 12 from part5 ends at 23:59:55 on 5/23/2021. My guess for window number 13 would then be to start at 00:00:00 on 5/24/2021, however the calendar date for this window is 5/25/2021.

Note there is wonkiness in this area of the recording due to non-wear on this night, which presumably causes the pre-midnight wakeup detection. 

Side question related to the above: I can understand duplicate (or "skipped") calendar_date entries in WW files due to the nature of varying wakeup times. However, should there ever be duplicate night_number entries? Both of the 5/25/2021 entries in Julia's file are marked as night_number = 15. I pondered whether there should be parity between part4's night column and part5's night_number column. I looked through a recent batch of data I ran through GGIR and night/night_number do not always agree.
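
A quick way to list repeated calendar_date or night_number values before merging other data by date could look like the sketch below. The data frame is a toy stand-in with invented values; only the column names `window_number`, `calendar_date` and `night_number` come from this thread, and `rows_with_repeats` is a hypothetical helper. A real part 5 day summary covering several participants would need the check done within each participant.

```r
# Toy stand-in for a part 5 day summary; values are invented for illustration.
daysummary <- data.frame(
  window_number = 11:15,
  calendar_date = as.Date(c("2021-05-22", "2021-05-23", "2021-05-25",
                            "2021-05-25", "2021-05-26")),
  night_number = c(12, 13, 15, 15, 16)
)

# Hypothetical helper: return every row whose value in `column`
# occurs more than once in the data frame.
rows_with_repeats <- function(df, column) {
  values <- df[[column]]
  df[values %in% values[duplicated(values)], , drop = FALSE]
}

dup_dates <- rows_with_repeats(daysummary, "calendar_date")
dup_nights <- rows_with_repeats(daysummary, "night_number")
print(dup_dates)
print(dup_nights)
```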

Vincent van Hees

Mar 11, 2024, 4:30:52 AM
to Michael Rueschman, Julia ZZ, R package GGIR
Hi both,

Yes, I was referring to the calendar_date column. However, I now see that the code actually takes the date from the second epoch in the window, which is why 23:59:55 is treated as the following day. I am not sure whether changing this to use the first epoch would break something else, so I will leave it as it is and create an issue on GitHub to investigate at a later point in time.
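
The effect of reading the date from the second epoch can be illustrated with a few lines of base R. This is only a sketch of the mechanism, not GGIR's actual code: the variable names are hypothetical, the 5-second epoch is assumed from `windowsizes = c(5,900,3600)` in Julia's call, and the timestamp is chosen only to show the rollover, not to reproduce the dates in her file.

```r
epoch_seconds <- 5  # assumed short epoch length

# A window whose first epoch starts 5 seconds before midnight.
first_epoch <- as.POSIXct("2021-05-23 23:59:55", tz = "UTC")
second_epoch <- first_epoch + epoch_seconds

# Date label depending on which epoch is used.
date_from_first <- as.Date(first_epoch, tz = "UTC")
date_from_second <- as.Date(second_epoch, tz = "UTC")

print(date_from_first)
print(date_from_second)
```

The second epoch falls exactly on midnight, so its date is one calendar day later than that of the first epoch.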

No, there should not be duplicate night numbers. Part5 essentially merges in the part 4 results to offer a more complete description of the time use variables (names starting with dur_...). It does this by matching dates. I am not sure where those double night numbers came from.

I just ran GGIR's latest version (GitHub master branch) with part 2-5 on the test file Julia sent and see that the issues are resolved as I see no duplicate night numbers, explainable dates and no oddly formatted first date value.

Julia - if you are running a long data collection period and want to keep processing consistent, then it may be worth keeping GGIR part 1 consistent but reprocessing parts 2-5 with the latest GGIR version once in a while. GGIR part 1 has evolved relatively little, and even GGIR part 1 output from 2019 can still be used nowadays without needing to reprocess the raw data. However, parts 2-5 have been improving over time, so it can be helpful to try to keep those up to date.
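
A sketch of what such a partial rerun could look like is given below. `rerun_parts_2_to_5` is a hypothetical wrapper, not part of GGIR. It assumes that a recent GGIR version is installed, that `outputdir` already contains the part 1 milestone files from the earlier run, and that the study-specific arguments from the original call (omitted here) are added back in.

```r
# Hypothetical helper: rerun only parts 2 to 5 on top of existing part 1 output.
# Assumes GGIR is installed and that outputdir already holds the part 1
# milestone files produced by the earlier run.
rerun_parts_2_to_5 <- function(datadir, outputdir) {
  GGIR::GGIR(
    mode = 2:5,          # skip part 1 and reuse its milestone data
    datadir = datadir,
    outputdir = outputdir,
    overwrite = TRUE     # regenerate the part 2-5 milestone files
    # study-specific arguments from the original call would go here
  )
}
```

Defining the function does nothing by itself; it would be called with the same datadir and outputdir as the original run.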

For what it is worth: GGIR has been evolving a lot in the past five years, sometimes at the price of new bugs. My expectation is that the focus will now shift more towards making the existing functionality and its documentation more robust and easier to navigate for both users and contributors. For example, I (with help of others) have been drafting a full revision of the narrative documentation which we hope to put online in the upcoming months.

Best,
Vincent

Vincent van Hees

Mar 12, 2024, 6:14:39 AM
to Julia ZZ, Michael Rueschman, R package GGIR
Update:

With help from Mike I was able to spot the issue:

It is indeed a bug, but one that is very specific to merging the part 4 night-level variables with the part 5 results. So it does not affect, for example, the time use variables in part 5. It originated in August 2023 when I split g.part5, which had become very long, into several subfunctions, and one object was not passed on correctly.

I have prepared a fix for this which I plan to merge in the rest of GGIR later this week, and after that I will prepare a new GGIR release for both GitHub and CRAN. To test the fix:
remotes::install_github("wadpac/GGIR", ref = "issue1085_add_temperature_to_ts")

Thanks,
Vincent

Julia ZZ

Mar 12, 2024, 10:15:52 AM
to R package GGIR
Hi Vincent, thank you very much! I will test the fix. 

And thank you very much for your help, Michael!


Julia

Julia ZZ

Mar 13, 2024, 11:15:02 AM
to R package GGIR
Hi all,

After I installed the newest version of GGIR, I kept receiving the following error message when I tried to test the files. I can't figure out why.
Errors and warnings for 2261.gt3x
$message
[1] "'origin' must be supplied"

$call
as.POSIXlt.numeric(x)

I am using the same code as before, and the original file is a .gt3x file. It worked in version 2.8-2 but does not work in version 3.0-8. My code is attached below. Does anyone have any ideas about this? Thank you!

packageVersion("GGIR")
library(GGIR)
datadir="V:/.../DUP_TEST"
outputdir="V:/.../GGIR_output_kid_newversion_acc_nolog_dup"


g.shell.GGIR(
  mode=c(1:5),
  datadir=datadir,
  outputdir=outputdir,
  studyname="Attempt1",
  idloc=6,
  f0=1,
  f1=3,

  print.filename=FALSE,
  storefolderstructure = FALSE,
  do.parallel=TRUE,
  overwrite=TRUE,
 
  #part 1#
  windowsizes = c(5,900,3600),
  do.anglez=TRUE,
  chunksize = c(1),
  printsummary=TRUE,

  #part 2#
  strategy = 2 , # this strategy gets rid of first data before first midnight and last midnight
  includedaycrit = 16, M5L5res = 10, #already in default
  winhr = 5,
  qwindow=c(0,24),
  qwindow_dateformat = "%d/%m/%Y",
  qlevels = c(c(1380/1440),c(1410/1440)),
  ilevels = c(0,36, 201, 707, 8000),
  mvpathreshold = c(201),
options(nwarnings = 10000)
warnings()

Michael Rueschman

Mar 13, 2024, 11:55:05 AM
to R package GGIR
What sort of setup are you using to run GGIR? Operating system? R version? Etc.

I ran your code above successfully on one of my .gt3x files. I'm using Windows 10, R 4.3.2, and GGIR 3.0.8. It also ran through successfully for me on GGIR 3.0.6.

I see these recent commits that appear to center on the same error/warning you encountered (for Ubuntu users?): https://github.com/search?q=repo%3Awadpac%2FGGIR+origin+AND+supplied&type=commits

Maybe knowing your system setup will open up new ideas. 
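
One way to collect these details in a form that can be pasted into a reply is sketched below. It uses base R only; the `setup` list and its element names are just an illustration, and `sessionInfo()` would give a more complete picture.

```r
# Gather operating system, R version and (if installed) GGIR version.
setup <- list(
  os = paste(Sys.info()[["sysname"]], Sys.info()[["release"]]),
  r_version = R.version.string,
  ggir_version = if (requireNamespace("GGIR", quietly = TRUE)) {
    as.character(utils::packageVersion("GGIR"))
  } else {
    NA_character_  # GGIR is not installed in this R library
  }
)
print(setup)
```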

Julia ZZ

Mar 18, 2024, 8:17:49 AM
to R package GGIR
Thank you Michael, I was able to run this with the latest R version (4.3.3) and GGIR (3.0-8).



However, I still see the duplicate nights with the newest version; please see the attached screenshot. I did not change my code. Is there a problem with my code, or should I update to another version?

packageVersion("GGIR")
library(GGIR)
datadir="V:/ACOI/R01 - Voucher/Spring 2021/Data Collection/Mailing Accelerometers/Downloaded Files/DUP_TEST"
outputdir="V:/ACOI/R01 - Voucher/Spring 2021/GGIR_output_kid_newversion_acc_nolog_dup"
dup_nights.JPG

Michael Rueschman

Mar 18, 2024, 10:01:21 AM
to R package GGIR
Glad to hear you got GGIR 3.0-8 running.

I wonder if your particular file is still affected by this open issue about windows that start exactly at 23:59:55 -- https://github.com/wadpac/GGIR/issues/1082

Julia ZZ

Mar 18, 2024, 11:08:55 AM
to R package GGIR
Yes, it is still affected by the open issue.

Vincent van Hees

Mar 29, 2024, 1:44:25 PM
to Julia ZZ, R package GGIR
It is an open issue, which means that it is not fixed.

GitHub issues are intended to communicate awareness about a (potential) problem. If it has no person assigned to it (on the right) then it indicates that nobody is working on it at the moment, and that there is also no draft solution.

This is how open source software works. It depends on the community to help investigate and fix problems.

Vincent
On Wednesday, 27 March 2024 at 3:26 PM, Julia ZZ <julia...@gmail.com> wrote:
Hi all,

Just want to give an update. I have updated GGIR to 3.0-9 and reran the files. However, the part 5 results are still affected by the issue "calendar_date in part 5 when window starts exactly at 23:59:55 #1082". Is there a solution for this issue, or am I missing an updated version that fixes the problem?

Sincerely,
Julia

Sarah Burkart

May 6, 2024, 8:42:39 AM
to R package GGIR
Hi all,

Is there an anticipated timeline for this fix and/or will it be part of the next CRAN release?

Thank you for your hard work in resolving this!

-Sarah

Vincent van Hees

May 24, 2024, 3:36:20 AM
to Sarah Burkart, Julia ZZ, R package GGIR
Dear Sarah and Julia,

I am no longer able to reproduce the issue with the file and GGIR function call Julia sent on March 7, so I am tempted to conclude that it is fixed? The part 1 output Julia sent is from an earlier version of GGIR but for parts 2, 3, 4 and 5 I am using GGIR 3.1-0.

Inside part5_daysummary_WW_L35.6M201.4V707_T5A5.csv I see:

image.png

Also, when I remove the "+ 1" as I discussed in this issue, I get the same seemingly correct result. If you also want to test this minor modification yourself, you can install GGIR with:

remotes::install_github("wadpac/GGIR", ref = "issue1082_calendar_date")

Best,

Vincent