Fetching URL with UNIX timestamp


herve

Jul 27, 2015, 10:22:47 AM
to OpenRefine
Hi!

I need to fetch URLs with OpenRefine.

So I've created a column with my URLs, like this:


My problem is that the server requires a security parameter to answer the request: a UNIX timestamp (13 digits) appended to the URL, like this:


If the timestamp is not consistent with the current time, the request doesn't work.

So, is it possible to use the Fetch URL function of OpenRefine with something like:

  • Value+Timestamp(Current_hour_of_the_fetch_request)


Thanks for your help,


Hervé



Martin Magdinier

Jul 27, 2015, 10:53:55 PM
to openr...@googlegroups.com
Hervé

The list is moderated for new users. This is why you didn't see your first email posted right away. Your emails will now be distributed directly.

To get the current timestamp you can use datePart(now(),'time')
See here for more details: https://github.com/OpenRefine/OpenRefine/wiki/GREL-Date-Functions
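
For comparison outside OpenRefine, here is a minimal Python sketch of the same idea: take the current time in milliseconds since the UNIX epoch (13 digits today) and append it to a URL that ends with an empty parameter. The function name `with_timestamp` and the example.org URL are hypothetical, and the sketch assumes the server only expects the raw number at the end of the URL.

```python
import time
from typing import Optional


def with_timestamp(url: str, now_ms: Optional[int] = None) -> str:
    """Append a millisecond UNIX timestamp to a URL.

    `url` is assumed to end with an empty parameter such as 'ctrl='.
    `now_ms` can be passed explicitly; otherwise the current time is
    read at the moment of the call.
    """
    if now_ms is None:
        # time.time() returns seconds as a float; convert to whole milliseconds.
        now_ms = int(time.time() * 1000)
    return f"{url}{now_ms}"


# Hypothetical URL, for illustration only.
print(with_timestamp("https://example.org/author.srv?ctrl="))
```

Calling the function just before each request keeps the timestamp fresh for every URL, rather than computing it once for the whole batch.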

However, I don't know when Refine will compute the timestamp: when you press OK, or when each URL is queried.
Let us know how it goes.

Martin

herve

Jul 28, 2015, 5:19:51 AM
to OpenRefine, martin.m...@gmail.com
Hi Martin,

Thank you for your answer!

Good point: datePart(now(),'time') works like a charm and gives me the value I want.
Bad point: the fetch doesn't work, I get a 503 error from my server. I don't know if it's a cross-domain limitation or a bad timestamp.

I'll investigate.

;)

qi cui

Aug 6, 2015, 10:07:33 AM
to OpenRefine, martin.m...@gmail.com
If you can paste the output from the console when you see the 503 error, that would be helpful, along with the complete GREL expression you are using.

herve

Aug 6, 2015, 12:21:16 PM
to OpenRefine, martin.m...@gmail.com
OK, so I replaced the host name in the URL with its IP address (10.124.1.5), in order to eliminate the unreachable DNS error.

Here is the error now: java.security.cert.CertificateException: No subject alternative names matching IP address 10.124.1.5 found

My initial column is something like:

https://10.124.1.5/library/author.srv?mode=fixer&nom=(%27RISCHEBE%27,%27RISCHEBE%27)&prenom=EMMANUEL&ctrl=

So I run Fetch URL with a 5000 ms throttle delay and the expression value+datePart(now(),'time')

I don't see any problem with my syntax, so I think it's a security issue on our server.
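
A note on the CertificateException quoted above: this message typically appears when the certificate presented by the server does not list the requested IP address among its subject alternative names, which is the usual case when the certificate was issued for a host name. The Python sketch below only illustrates that matching rule; it is not OpenRefine's or Java's code. It is simplified (no wildcard handling), the host name `library.example.org` is hypothetical, and the dictionary imitates the shape returned by Python's `ssl.SSLSocket.getpeercert()`.

```python
import ipaddress


def san_matches(cert: dict, host: str) -> bool:
    """Return True if `host` appears in the certificate's subjectAltName.

    IP addresses are compared with 'IP Address' entries only, and host
    names with 'DNS' entries only. Wildcards are not handled.
    """
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # not an IP literal, so treat it as a host name

    for kind, value in cert.get("subjectAltName", ()):
        if ip is not None and kind == "IP Address":
            if ipaddress.ip_address(value) == ip:
                return True
        elif ip is None and kind == "DNS":
            if value.lower() == host.lower():
                return True
    return False


# Hypothetical certificate issued for a host name only.
cert = {"subjectAltName": (("DNS", "library.example.org"),)}

print(san_matches(cert, "10.124.1.5"))           # the IP is not listed
print(san_matches(cert, "library.example.org"))  # the host name is listed
```

Under these assumptions, requesting the server by the host name its certificate was issued for, once DNS resolution works, would avoid the mismatch.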
