Connecting dataverse to S3 storage


Michel Bamouni

May 16, 2018, 7:35:46 AM
to Dataverse Users Community
Hi,

I installed Dataverse 4.8.6 successfully and I want to replace the filesystem storage with S3 storage.
I read the configuration guide at http://guides.dataverse.org/en/latest/installation/config.html#file-storage-local-filesystem-vs-swift-vs-s3 and it seems
that configuring S3 storage in Dataverse requires using Amazon services.
I don't see in the configuration guide how to set the URL of the S3 storage.
In our company we have our own S3 storage, so I want to know how to tell Dataverse to use it.


Best regards,







Philip Durbin

May 16, 2018, 8:57:48 AM
to dataverse...@googlegroups.com
Hi Michel,

This question of non-AWS S3 storage is very similar to what was asked at https://groups.google.com/d/msg/dataverse-community/zZDA5fLpA5w/t42UZTr9AQAJ

Can you please give us some details on the S3 provider you use? We only have experience with S3 that's provided by AWS.

Thanks,

Phil


--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/2551b606-f657-4cb5-bfe3-7177584802fb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Michel Bamouni

May 16, 2018, 10:05:11 AM
to Dataverse Users Community
Hi Philip,

Thanks for the reply.

Before opening this thread, I searched this group for questions about S3, so I saw https://groups.google.com/d/msg/dataverse-community/zZDA5fLpA5w/t42UZTr9AQAJ
and I think my problem is different.

Here is what I want to do:
My ops department set up S3-based storage on a server and gave me the credentials and the URL of this S3 storage.
I also have my bucket name.
So I don't want to create an account on Amazon S3 (https://console.aws.amazon.com/).
What I want is to tell my Dataverse to use the S3 storage provided by my ops department.
Is this possible in Dataverse 4.8.6, or do we need an account on Amazon Web Services?
In addition, how does Dataverse know which storage the credentials and the bucket are associated with?


Michel



Pete Meyer

May 16, 2018, 10:10:54 AM
to Dataverse Users Community
Hi Michel,


On Wednesday, May 16, 2018 at 10:05:11 AM UTC-4, Michel Bamouni wrote:
My ops department set up S3-based storage on a server and gave me the credentials and the URL of this S3 storage.

Could you provide some additional information about which particular implementation of the S3 protocol you intend to use? This is potentially relevant, because different implementations of S3 do not always show the same behavior as the original AWS S3 protocol (at least in my limited evaluations).

Best,
Pete

Michel Bamouni

May 16, 2018, 10:28:55 AM
to Dataverse Users Community
Hi Pete,

Our local S3 storage uses CEPH technology: http://docs.ceph.com/docs/master/radosgw/s3/


Michel

Philip Durbin

May 16, 2018, 1:49:11 PM
to dataverse...@googlegroups.com
Thanks, Michel, this helps clarify the situation. Can you please open an issue at https://github.com/IQSS/dataverse/issues about supporting the CEPH flavor of S3?


Michel Bamouni

May 17, 2018, 3:09:36 AM
to Dataverse Users Community
Hi Phil,

To summarize: at this time, I can't tell Dataverse to point to my local CEPH S3 storage, and I need to create an account on Amazon Web Services?

Michel

Philip Durbin

May 17, 2018, 7:50:53 AM
to dataverse...@googlegroups.com
Sure, that's fine. The main thing is to have "CEPH S3" in the title of the issue. You could link back to this email thread for details.

http://docs.ceph.com/docs/master/radosgw/s3/java/ makes sense to me. I see "import com.amazonaws.services.s3.AmazonS3" both there and at https://github.com/IQSS/dataverse/blob/v4.8.6/src/main/java/edu/harvard/iq/dataverse/dataaccess/S3AccessIO.java so I'm hoping that it mostly just works. What I don't understand is how to configure authentication. Have you tried doing all the setup described under http://guides.dataverse.org/en/latest/installation/config.html#amazon-s3-storage such as creating a folder called ".aws" in the home directory of the user you run Glassfish as? It would be nice if it "just works" and the issue you create is just about documenting any details that are important for CEPH users.

By the way, I noticed that CEPH also has a Swift API: http://docs.ceph.com/docs/master/radosgw/swift/ . Dataverse supports both S3 and Swift.

I hope this helps,

Phil
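
For reference, an ".aws" folder of the kind mentioned above normally holds two small INI-style files, "credentials" and "config". The sketch below parses sample contents with Python's standard configparser to show which settings they carry. The key values are placeholders, and read_profile is a hypothetical helper for illustration only; it is not part of Dataverse or the AWS SDK, which does its own parsing.

```python
import configparser

# Hypothetical file contents; the key values are placeholders, not real credentials.
CREDENTIALS_TEXT = """
[default]
aws_access_key_id = EXAMPLEKEYID
aws_secret_access_key = exampleSecretKey
"""

CONFIG_TEXT = """
[default]
region = us-east-1
"""


def read_profile(credentials_text, config_text, profile="default"):
    """Collect the settings stored for one profile in the two files."""
    creds = configparser.ConfigParser()
    creds.read_string(credentials_text)
    conf = configparser.ConfigParser()
    conf.read_string(config_text)
    return {
        "access_key_id": creds[profile]["aws_access_key_id"],
        "secret_access_key": creds[profile]["aws_secret_access_key"],
        "region": conf[profile]["region"],
    }


settings = read_profile(CREDENTIALS_TEXT, CONFIG_TEXT)
print(sorted(settings))
```

In this layout the files carry an access key, a secret key, and a region, but no server URL.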


Michel Bamouni

May 17, 2018, 8:11:34 AM
to Dataverse Users Community
I followed the guide in the Dataverse documentation.
But I think I was not clear in describing my problem.
As I understand the S3 configuration in Dataverse, with the credentials I give, Dataverse will use the region ID I define in "config" under ~/.aws/ to determine the endpoint that contains the S3 bucket, which is where uploaded files are stored.
Also as I understand it, by default I must choose an Amazon Web Services region (https://docs.aws.amazon.com/general/latest/gr/rande.html).
What is blocking me is that I don't have an Amazon account, and my credentials are for a local S3 installation.
So when I try to send my files to Amazon with my local credentials, it doesn't work, which is expected.
My question is: can I put the URL of my local S3 storage in the config file?

Michel
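
The behaviour described above can be sketched roughly as follows, assuming the usual AWS naming pattern s3.<region>.amazonaws.com for standard regions. The function resolve_endpoint and its custom_endpoint parameter are hypothetical and only illustrate the kind of override being asked for; they are not Dataverse 4.8.6 settings, and ceph.example.org is a placeholder host.

```python
def resolve_endpoint(region, custom_endpoint=None):
    """Return the base URL an S3 client would talk to.

    With only a region, the client derives an Amazon host name from it.
    A custom endpoint (hypothetical parameter) would replace that host entirely.
    """
    if custom_endpoint:
        # Drop a trailing slash so later path joins stay predictable.
        return custom_endpoint.rstrip("/")
    return f"https://s3.{region}.amazonaws.com"


# Region only: requests go to Amazon, so local CEPH credentials are rejected.
print(resolve_endpoint("eu-west-1"))

# With an override, the same credentials would be sent to the local server.
print(resolve_endpoint("eu-west-1", "https://ceph.example.org:7480/"))
```

This is why local credentials fail when only a region is configured: the region selects an Amazon host, and nothing in the setup names the local server.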

Philip Durbin

May 17, 2018, 8:29:45 AM
to dataverse...@googlegroups.com
I'm sorry, I don't know how to configure the hostname of the CEPH server and http://docs.ceph.com/docs/master/radosgw/s3/commons/#bucket-and-host-name is confusing to me.

Can you please get in touch with the CEPH community to ask what to do? You are welcome to link them to the Dataverse Guides about S3 support, of course.
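
As background on the bucket and host name question, S3-style services generally accept two ways of addressing a bucket: as a subdomain of the server host (virtual-hosted style) or as the first path segment (path style). The sketch below builds both forms with Python's standard library. The function object_url, the host ceph.example.org, and the bucket and key names are hypothetical, and which style a given CEPH gateway accepts depends on how it is deployed.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def object_url(endpoint, bucket, key, path_style=True):
    """Build the URL of one object under either addressing style."""
    parts = urlsplit(endpoint)
    if path_style:
        # Path style: https://host/bucket/key
        netloc = parts.netloc
        path = "/" + bucket + "/" + quote(key)
    else:
        # Virtual-hosted style: https://bucket.host/key
        netloc = bucket + "." + parts.netloc
        path = "/" + quote(key)
    return urlunsplit((parts.scheme, netloc, path, "", ""))


print(object_url("https://ceph.example.org:7480", "mybucket", "a/b.txt"))
print(object_url("https://ceph.example.org:7480", "mybucket", "a/b.txt", path_style=False))
```

Virtual-hosted style needs DNS entries for each bucket subdomain, which is one reason self-hosted installations often use path style.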


Michel Bamouni

May 18, 2018, 7:40:57 AM
to Dataverse Users Community
Hi Phil,

I see the issue https://github.com/IQSS/dataverse/issues/4690 created by my colleague.

Thanks for all your replies.

Michel

Philip Durbin

Oct 10, 2018, 7:03:21 AM
to dataverse...@googlegroups.com
Yesterday we merged https://github.com/IQSS/dataverse/pull/5059 to provide support for S3-compatible storage on custom URLs. Thank you to Oliver Bertuch from Research Centre Jülich for this contribution!

Phil


Michel Bamouni

Oct 12, 2018, 7:06:17 AM
to Dataverse Users Community
Hi Phil,

This is nice news. In which Dataverse version will this be available?

Michel

Philip Durbin

Oct 12, 2018, 7:45:02 AM
to dataverse...@googlegroups.com
The pull request has already been merged so it will be in the next version of Dataverse. I don't know if that version will be 4.9.5 or 4.10.

While the issue was in QA we talked about cleaning up the docs a bit but didn't, so maybe you (or anyone reading this) can take a look at them now (before we release the next version of Dataverse) and open a GitHub issue if you find anything confusing: https://github.com/IQSS/dataverse/blob/f9f9be2fe494754ceed8e53a9db6039eee6de3cd/doc/sphinx-guides/source/installation/config.rst#amazon-s3-storage-or-compatible

Thanks,

Phil
