Twitter time stamp convertion

85 views
Skip to first unread message

Rob H-Man

unread,
Dec 25, 2015, 12:50:48 PM12/25/15
to OpenRefine

Dear OpenRefine-Community,

Unfortunately I have huge problems converting Twitter timestamps into a format which allows me to filter Tweets by a timescale e.g. between 01.01.2014 and 01.01.2015.


The column I am talking about looks at the moment for example:
Mon Aug 10 20:38:59 +0000 2015   or
Tue Jul 14 20:20:41 +0000 2013
(Day Month hh:mm:ss +0000 Year)


Filtering my data just by simple date would be enough for my work – day or time doesn’t matter.


I apologize for such a “simple” question and request for help since I guess there is plenty of guides and all but I need the data for my dissertation in politics which means I am naturally a beginner and very bad “at coding”. All my Google-Searches for an already existing method/code was unfortunately no success.


BR, thanks and Merry Christmas

Message has been deleted

Joe Wicentowski

unread,
Dec 25, 2015, 6:28:47 PM12/25/15
to openr...@googlegroups.com
Hi Rob,

A cleaner solution would be to use the GREL function, toDate, which lets you match date patterns and returns a date object, which can be used for sorting, etc.  Here's what I get if I transform your cells using the following GREL expression:

  value.toDate('EEE MMM DD HH:mm:ss Z YYYY')
=> 
  2014-12-29T14:38:59Z
  2013-01-01T14:20:41Z

The documentation for this function is at:


The patterns here (e.g., EEE) are in the link in that document, or here's the direct link:


Cheers,
Joe

On Fri, Dec 25, 2015 at 4:46 PM, Rob H-Man <kidr...@gmail.com> wrote:
ok, I think I solved it by using:
value.split(' ')[2]+"."+value.split(' ')[1]+"."+value.split(' ')[5]

after that i transform months to numbers by using e.g. for August:
edit cells: value.replace("Aug","8")
+all the other months afterwards.

if someone has a better, "cleaner" solution feel free to let me know :)

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Message has been deleted

Rob H-Man

unread,
Dec 26, 2015, 7:48:46 AM12/26/15
to OpenRefine
Dear Joe,

Thank you so much; finally I "think" I understand this function now.

Unfortunately though it is not working: When trying to edit cells (edit cells -> transform...) I get an Error: "Error: Cannot parse to date".

I have an error-screenshot: http://s18.postimg.org/q00p6eg8p/error.jpg

Trying to add "toString" was not successful, too.
value.toDate('EEE MMM DD HH:mm:ss Z YYYY').toString(“yyyy-mm-dd”)

May I ask you what exactly you did to get your shown result? beeing able to use the timeline facet filter sounds great.

Thank you so much!

Joe Wicentowski

unread,
Dec 26, 2015, 11:04:21 AM12/26/15
to openr...@googlegroups.com
Hi Rob,

Glad to hear the function is starting to make sense. Once you get the
hang of it, you can parse all sorts of dates, and you can generate
dates in tons of formats too.

> Unfortunately though it is not working: When trying to edit cells (edit cells -> transform...) I get an Error: "Error: Cannot parse to date".
>
> I have an error-screenshot: http://s18.postimg.org/q00p6eg8p/error.jpg

That's strange! Here's the corresponding screenshot from my system:

http://postimg.org/image/jruhaxsm5/

> May I ask you what exactly you did to get your shown result? beeing able to use the timeline facet filter sounds great.

I'm using OpenRefine 2.6 RC1 with Java 8 update 66. Running "java
-version" at the command line returns this:

> java version "1.8.0_66"
> Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

What are your versions of OpenRefine and Java?

Joe

Rob H-Man

unread,
Dec 26, 2015, 11:34:43 AM12/26/15
to OpenRefine
Oh, wow, that's super strange!

My Java is:
java version "1.8.0.66"
.... build 1.8.0_66-b18
Java Hotspot 64-Bit server VM (build 25.66-b18, mixed mode)

First I was using google-refine-2.5-r2407, just tried openrefine-win-2.6-beta.1 but no success either?

Tom Morris

unread,
Dec 26, 2015, 12:36:49 PM12/26/15
to openr...@googlegroups.com
Leading/trailing whitespace in the cell value, perhaps? Trying
trimming whitespace first.

Tom

Rob H-Man

unread,
Dec 26, 2015, 1:31:55 PM12/26/15
to OpenRefine
Good evening Tom,

Good idea; so I did by using the function in "common transforms", "value.trim()" and "value.escape("javascript")".

The latest screenshot: http://s10.postimg.org/3kzb34nnd/adfg.jpg

I am using Win 10 Pro 64 in case thats relevant somehow...

BR and thanks for all efforts

Joe Wicentowski

unread,
Dec 26, 2015, 2:03:42 PM12/26/15
to openr...@googlegroups.com
Hi Rob,

Hmm, let's try a little troubleshooting. Here are the exact steps I
followed to produce my results. Can you try these and let us know if
you encounter the same error as with your full data set?

1. From http://localhost:3333 select Create Project > Clipboard
2. Paste the following text into the input field (3 lines total):

created_at
Mon Aug 10 20:38:59 +0000 2015
Tue Jul 14 20:20:41 +0000 2013

3. Select Next > CSV/TSV/separator-based files > Create Project
4. On the created_at column select Edit Cells > Transform, and enter
the following into the input field (1 line total):

value.toDate('EEE MMM DD HH:mm:ss Z YYYY')

5. The preview should show the expected results:

[date 2014-12-29T14:38:59Z]
[date 2013-01-01T14:20:41Z]

Do these steps work for you? Or do you get the same error as before?

Joe

Rob H-Man

unread,
Dec 27, 2015, 5:57:07 AM12/27/15
to OpenRefine
Dear Joe,

No, unfortunately no result either: http://s24.postimg.org/qvusgdrsl/dfahdfh.jpg

BR and thanks anyways

Joe Wicentowski

unread,
Dec 27, 2015, 9:15:17 AM12/27/15
to openr...@googlegroups.com
Hi Rob,

I'm out of ideas, sorry!  Perhaps someone with a better understanding of OR internals or dependencies can provide some suggestions?

Joe

Sent from my iPhone

Rob H-Man

unread,
Dec 27, 2015, 1:03:05 PM12/27/15
to OpenRefine
Ok, too bad. Thank you anyways for all your great efforts. All the best to you, Rob

Rob H-Man

unread,
Dec 29, 2015, 10:29:03 AM12/29/15
to OpenRefine
Well, I dont know what to say, but I did:
value.replace('+0000','')
and afterwards edit cells -> common transforms -> To date
and it worked!  (...?)

The only last thing that I would like to ask is, how to filter between two certain dates as the timeline facet only shows one date but two times (time doesnt matter) = I cannot see the end-date in the filter. -> I want to filter my data e.g. for between 1. January and 3. January. How do I do that?

Screenshot: http://s11.postimg.org/6srxrdlc3/ghkxgk.jpg



Am Freitag, 25. Dezember 2015 18:50:48 UTC+1 schrieb Rob H-Man:

Tom Morris

unread,
Dec 29, 2015, 2:12:39 PM12/29/15
to openr...@googlegroups.com
On Tue, Dec 29, 2015 at 10:29 AM, Rob H-Man <kidr...@gmail.com> wrote:
Well, I dont know what to say, but I did:
value.replace('+0000','')
and afterwards edit cells -> common transforms -> To date
and it worked!  (...?)

The default parsing strategy doesn't use format strings.  One thing that occurs to me that could have messed up the previous attempt was having the locale for your computer set to something different than the language of the strings that you're trying to parse ie non-English.  You can override this by specifying a locale as well as a format string in your toDate() function.

Modifying Joe's early example steps, try using this for the transform step:

    value.toDate('en','EEE MMM DD HH:mm:ss Z YYYY')
 
The only last thing that I would like to ask is, how to filter between two certain dates as the timeline facet only shows one date but two times (time doesnt matter) = I cannot see the end-date in the filter. -> I want to filter my data e.g. for between 1. January and 3. January. How do I do that?

Screenshot: http://s11.postimg.org/6srxrdlc3/ghkxgk.jpg

That's the current behavior.  I can't remember if it used to work differently or not.  There's a limited amount of space available, but arguably it could be smarter about how it uses it.  Feel free to create an issue with the behavior that you would like to see (assuming one doesn't already exist in the issue list)

Tom

Rob H-Man

unread,
Dec 29, 2015, 2:55:24 PM12/29/15
to OpenRefine
Nice, good job and hot idea Tom - it worked! Thanks a lot!
Reply all
Reply to author
Forward
0 new messages