serving private datasets on cloud infra


chris...@gmail.com

Oct 18, 2019, 10:08:10 AM
to Neuroglancer
What is the recommended approach for serving a private dataset using Google's services?  I have a precomputed example made with cloud-volume up and running, but when I point the server at a cloud bucket, the objects must be publicly readable.  I don't see anything in the code that handles credentialing in this case.

My goal is to run a server on a GCE instance, accessing data stored privately in GCS buckets.

Thanks in advance,
Chris
 

Jeremy Maitin-Shepard

Oct 18, 2019, 10:25:45 AM
to chris...@gmail.com, Neuroglancer
You can use a server on GCE or extend neuroglancer to support gcs credentials.  The difficulty with using oauth credentials via a normal web-based permission request to access gcs is that the permissions are not fine-grained --- you would have to grant neuroglancer access to *all* of the gcs buckets that a given account can access, which isn't really acceptable for the public demo client.  For a client running on an origin you control, however, it could be acceptable.  You could alternatively provide the credentials for a service account in the url.  Note that none of this is implemented yet, unfortunately.

There is a trick that already works, though: use a gcs bucket with public read permission but not public list permission, and put the data under a long, randomly generated prefix.  The prefix then serves as a secret key.  Because moving the large number of files used by the precomputed format is slow, it is best to use the prefix from the start rather than moving the data later.
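For example, something along these lines would generate a suitable prefix (the bucket and dataset names below are just placeholders):

```typescript
// Sketch only: generate a long random prefix to act as the secret part of the
// object path.  The bucket and dataset names are placeholders.
import { randomBytes } from 'crypto';

// 32 random bytes -> 64 hex characters, effectively unguessable.
const secretPrefix = randomBytes(32).toString('hex');

// Upload the precomputed volume under this prefix from the start, e.g.
//   gs://my-bucket/<secretPrefix>/my_dataset/
console.log(`gs://my-bucket/${secretPrefix}/my_dataset`);
```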




Chris Roat

Oct 18, 2019, 10:46:59 AM
to Jeremy Maitin-Shepard, Neuroglancer
Security through obscurity does seem to be the easiest option.  Bummer about the coarse-grained permissions for gcs oauth.  Is the suggestion to use a different random key for each dataset?

In your first statement, you say it's possible to run on GCE or to extend neuroglancer.  The latter would be tough for me, given my lack of Node experience.  But on the first piece about GCE -- if I build the app and use Apache to serve the files on a GCE instance running under a service account, will the http data requests somehow magically work?

C

Jeremy Maitin-Shepard

Oct 18, 2019, 11:19:52 AM
to Chris Roat, Neuroglancer
Probably best to use a separate key per dataset --- although if you know that a set of datasets will always be shared together, then you could use the same key.  There is no convenient way to revoke access (and in any case, anyone who has had access could already have downloaded all the data), and you have to be careful not to leak the key accidentally, e.g. via neuroglancer urls.

If you create your own server, you will unfortunately also have to implement support for its authentication in neuroglancer.  However, I think supporting http basic auth or cookie-based auth would be fairly easy --- you would just need to modify the precomputed datasource code to tell the browser to send credentials.
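As a rough sketch (not Apache, and the credentials, allowed origin, port, and data directory below are all placeholders), the server side of such a setup might look something like this:

```typescript
// Rough sketch only: a static chunk server that requires HTTP basic auth and
// sets the CORS headers needed for credentialed cross-origin requests.
// The username/password, allowed origin, port, and data directory are placeholders.
import * as http from 'http';
import * as fs from 'fs';
import * as path from 'path';

const DATA_DIR = '/data/precomputed';
const EXPECTED_AUTH = 'Basic ' + Buffer.from('user:password').toString('base64');

http.createServer((req, res) => {
  // Credentialed CORS requires echoing a specific origin, not '*'.
  res.setHeader('Access-Control-Allow-Origin', 'https://neuroglancer.example.com');
  res.setHeader('Access-Control-Allow-Credentials', 'true');

  if (req.headers['authorization'] !== EXPECTED_AUTH) {
    res.writeHead(401, { 'WWW-Authenticate': 'Basic realm="data"' });
    res.end();
    return;
  }

  const filePath = path.join(DATA_DIR, path.normalize(req.url ?? '/'));
  fs.readFile(filePath, (err, data) => {
    if (err) {
      res.writeHead(404);
      res.end();
    } else {
      res.writeHead(200);
      res.end(data);
    }
  });
}).listen(8080);
```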

Chris Roat

Oct 18, 2019, 1:12:19 PM
to Jeremy Maitin-Shepard, Neuroglancer
Would you be able to provide guidance on the modifications needed to send credentials?  Should I model it on the brainmaps credential handling?

Chris Roat

Oct 18, 2019, 1:24:41 PM
to Jeremy Maitin-Shepard, Neuroglancer
Alternatively, would it work to mount a bucket on the filesystem with gcsfuse, and serve from the "local" filesystem?

Jeremy Maitin-Shepard

Oct 18, 2019, 2:44:44 PM
to Chris Roat, Neuroglancer
If you only need access from the local machine, you can indeed just use a local webserver and gcsfuse.

If you want to modify neuroglancer to support http basic auth or cookie auth, you need to ensure all fetch calls by the precomputed datasource set the credentials option:
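(The snippet below is only an illustration of that option, not a copy of the actual datasource code; the helper name is made up.)

```typescript
// Illustrative only: the real call sites are in the precomputed datasource
// sources.  The point is simply to pass the credentials option so the browser
// attaches HTTP basic auth credentials or cookies to each chunk request.
async function fetchChunk(url: string): Promise<ArrayBuffer> {
  const response = await fetch(url, {
    // 'include' sends credentials even on cross-origin requests;
    // 'same-origin' restricts them to the serving origin.
    credentials: 'include',
  });
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: ${response.status}`);
  }
  return response.arrayBuffer();
}
```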



Jeremy Maitin-Shepard

Oct 18, 2019, 2:46:37 PM
to Chris Roat, Neuroglancer
If you want to use oauth or something similar, then you can model it on the brainmaps datasource.