Times in GMT timezone near DST boundaries confuse SML4S into seeing duplicates


Dov Wasserman

Jul 18, 2024, 12:47:24 AM
to User Group | Simple ML for Sheets (Public)
I am trying to produce a time series forecast. I have a column of date-times, each exactly one hour after the previous one. When I try to forecast values using this column, it consistently fails with the error:

"Collecting examples
Generating forecasting
The task failed
Error: ProduceSortedTimeseries: Input time series has duplicated timestamp 1616900400000000 at index 646 and index 647"
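For reference, here is a minimal sketch that decodes the timestamp from the error message. It assumes the value is microseconds since the Unix epoch, which the error does not state but which fits its 16-digit magnitude. The helper name is hypothetical and is not part of Simple ML for Sheets.

```python
from datetime import datetime, timezone

# Value copied from the error message. Treating it as microseconds since
# the Unix epoch is an assumption based on its 16-digit magnitude.
DUPLICATED_TIMESTAMP_US = 1616900400000000


def decode_epoch_microseconds(value_us: int) -> datetime:
    """Convert epoch microseconds to a timezone-aware UTC datetime."""
    seconds, micros = divmod(value_us, 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)


decoded = decode_epoch_microseconds(DUPLICATED_TIMESTAMP_US)
print(decoded.isoformat())  # 2021-03-28T03:00:00+00:00
print(decoded.weekday())    # 6, i.e. Sunday
```

Under that assumption the value is 03:00 on Sunday, March 28, 2021, which is the date clocks moved forward in European time zones that year.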

I've triple-checked this column, and there are no duplicated values anywhere: each row is exactly 1 hour past the last. However, the "duplicated" indexes seem to fall right around when a DST transition would occur in many common time zones. That shouldn't be a problem, because these values are in the DST-free GMT time zone. Viewing the dates as pure numbers in Sheets confirms they are all distinct, yet Simple ML for Sheets still fails to parse them correctly.
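The "pure numbers" check can also be done outside Sheets. The sketch below is only an illustration: it assumes the cells hold standard Sheets serial date numbers (days since 1899-12-30, so 25569 corresponds to 1970-01-01), that they are read as UTC, and that they have no sub-second component. The function names and the sample values are hypothetical; the sample is not taken from the linked sheet.

```python
SHEETS_SERIAL_OF_UNIX_EPOCH = 25569  # serial number of 1970-01-01
SECONDS_PER_DAY = 86_400
US_PER_HOUR = 3_600 * 1_000_000


def serial_to_epoch_us(serial: float) -> int:
    """Convert a Sheets serial date to epoch microseconds, read as UTC.

    Rounds to whole seconds to absorb floating-point error in the
    fractional day (assumes no sub-second precision in the column).
    """
    seconds = round((serial - SHEETS_SERIAL_OF_UNIX_EPOCH) * SECONDS_PER_DAY)
    return seconds * 1_000_000


def find_irregular_steps(serials, expected_step_us=US_PER_HOUR):
    """Return (index, next_index, step_us) wherever the step is not the expected one."""
    stamps = [serial_to_epoch_us(s) for s in serials]
    return [
        (i, i + 1, stamps[i + 1] - stamps[i])
        for i in range(len(stamps) - 1)
        if stamps[i + 1] - stamps[i] != expected_step_us
    ]


# Hypothetical sample: hourly values from 01:00 to 05:00 UTC on 2021-03-28.
sample = [44283 + hour / 24 for hour in range(1, 6)]
print(find_irregular_steps(sample))  # []
```

An empty result means every row is exactly one hour after the previous one when read as UTC, so any duplicate would have to be introduced later, during parsing.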

I changed the Sheets settings to set the time zone to "GMT (no daylight savings)", restarted Simple ML, and got the same result. I then also tried setting the Cleaning mode to Low and to High, but neither helped.

Here is a sheet that reproduces the problem (it contains no confidential information):

https://docs.google.com/spreadsheets/d/1_qYt6hkTqR7eTdOHJ8vvA_YCNiLx6YMgfAAn0sbHNr4/edit

Please make it possible (and, ideally, Simple!) to forecast with such hourly values without getting snarled up by irrelevant DST boundaries. Thank you!