--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/8e058504-cc23-413c-9bb3-7b9ae9651400n%40googlegroups.com.
Another thought would be to set dataverse.files.&lt;id&gt;.connection-pool-size to something greater than 256. That option was introduced in v5.1.1, after Dataverse switched to using a pool of S3 connections in 5.1 for efficiency. If you are on Dataverse >= 5.1 and you are doing tests that open many S3 connections (uploads, thumbnail retrievals, etc.), and/or those connections aren't getting closed quickly, then increasing this value should help. (Slow connection cleanup is more likely on older Dataverse versions, as we have found and fixed cases where Dataverse leaves a connection open for a while; it's also possible that your Ceph implementation has a longer timeout than AWS, which would make these issues worse for you.) Increasing the pool does use some memory, but making it 10-20 times bigger should still be fine if that helps the freezing problem.
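For reference, store options like this are set as JVM options on the app server. A sketch, assuming your S3 store's id is "s3" and a standard Payara asadmin setup (adjust both for your installation):

```shell
# Raise the S3 connection pool for the store with id "s3"
# (default is 256; here we try 10x that, per the suggestion above)
./asadmin create-jvm-options "-Ddataverse.files.s3.connection-pool-size=2560"

# Restart so the new setting takes effect
./asadmin restart-domain
```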
-- Jim
The simplest solution would be to open up the CORS origin as discussed in https://guides.dataverse.org/en/latest/developers/big-data-support.html. (You can drop POST and DELETE, though.) FWIW: direct upload and download both use signed URLs that Dataverse creates only for users who should have access to a given resource, and those URLs are only valid for a short (configurable) time. Allowing * therefore does not open up access as much as it would with endpoints that offer public access or simple username/password controls, etc.
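As a sketch, a bucket CORS policy along these lines (applied with the AWS CLI, which also works against Ceph and other S3-compatible endpoints) opens GET/PUT from any origin while dropping POST and DELETE as mentioned above. Treat the bucket name and the exact header list as assumptions to verify against the guide:

```shell
# cors.json — allow browser GET/PUT from any origin (no POST/DELETE)
cat > cors.json <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["GET", "PUT"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["ETag"]
    }
  ]
}
EOF

# Apply it to the bucket backing your Dataverse S3 store
aws s3api put-bucket-cors --bucket YOUR_BUCKET --cors-configuration file://cors.json
```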
If you want to tighten things up, you may also want to look at https://github.com/GlobalDataverseCommunityConsortium/dataverse-previewers/wiki/Using-Previewers-with-download-redirects-from-S3, which discusses other security mechanisms you'd have to address (specifically, Content-Security-Policy by default prohibits the JavaScript used for direct upload from adding an Origin header, which I think is why you're seeing 'missing' in the error). If you figure out what's needed there, we'd be happy to have a PR to get that into the guides for others to follow.
Direct upload via the UI makes a second pass through the file to calculate the file hash (which can be used to verify that Dataverse, and future downloaders, have exactly the same file that was on the uploader's disk). Depending on the relative speeds of your network and machine, the first half of the progress bar (uploading to S3) and the second half (calculating the hash) can proceed at fairly different rates.
Also, as with normal upload, the files are uploaded before they are added to the dataset, i.e. it is only when you hit Save that the dataset registers the new files as part of it.
FWIW: the DVUploader combines the upload and the hash calculation in one pass through the file, so it can be somewhat faster. The decision to upload and hash sequentially in the UI was solely due to not knowing of a JavaScript library that would allow doing both in parallel (which is straightforward in Java / the DVUploader).
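The one-pass idea is easy to sketch outside of Dataverse: fold the hash update into the same loop that sends each chunk, so the file is only read once. This is illustrative Python, not Dataverse or DVUploader code:

```python
import hashlib

def upload_chunks(chunks, send):
    """One pass: send each chunk and fold it into the MD5 at the same time."""
    h = hashlib.md5()
    for chunk in chunks:
        send(chunk)      # e.g. one part of a multipart S3 upload
        h.update(chunk)  # hash the same bytes without re-reading the file
    return h.hexdigest()

# Example: "upload" two chunks to an in-memory sink
sent = []
digest = upload_chunks([b"hello ", b"world"], sent.append)
```

The two-pass UI approach would instead read the whole file once to upload it, then read it again just to hash it.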