Philipp,
API Keys are still used by Dataverse in the signed URL mechanism. Signed URLs are signed with a key made up of the dataverse.api.signature-secret, which is optional but recommended and should be set to a fairly long (32-64 byte) value, plus the user’s API key. That said, when signed URLs are used the API Key is no longer sent to the tool or otherwise exposed outside of Dataverse, which is the main advantage.
* Note that whether to use signed URLs or API keys is managed via the previewer registration mechanism. If you haven’t reinstalled the previewers or updated your DB to make your previewers use signedURLs/have a list of allowedURLs in the json, you may still be using API Keys directly and having them sent to previewers/other tools. (Simply updating Dataverse itself doesn’t switch them.)
FWIW: Including a user-specific part (the API Key) in the overall key makes it harder for an attacker to collect enough info and limits them to working to compromise one person at a time. Conversely, adding the dataverse.api.signature-secret, which isn’t known by users, makes it harder for an attacker to use signed URLs at all, since the overall key they need to find is longer. The decision to pick the API key as the user-specific part was mostly one of convenience, since it often exists already and the code will auto-generate it when needed. The signed URL mechanism could be adapted to use a second hidden user key, or the API Key could remain hidden unless/until the user requests access to it.
Even as is, signed URLs are a big improvement: the API Key is not sent to the previewers, and the only thing an attacker able to see the browser history/network traffic can get is the signed URLs. Those are short-lived and only allow reading the specific data/metadata being previewed (whatever the set of allowed URLs for a tool can do, limited to the specific datasets/files the tool is called on).
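As a rough illustration of the idea (a sketch, not Dataverse's actual code), the Python below signs a URL with a key built from a server-side secret plus the user's API key, and verifies it later. The use of HMAC-SHA512, the order in which the two key parts are joined, and the names sign_url and verify_url are all assumptions made for this example. Only the until, user, method and token parameters are taken from the signed URLs Dataverse produces.

```python
import hashlib
import hmac
from datetime import datetime, timedelta
from urllib.parse import parse_qsl, urlencode, urlsplit


def _token(unsigned_url, api_key, signature_secret):
    # Assumption: the signing key is the server secret followed by the API key.
    key = (signature_secret + api_key).encode("utf-8")
    return hmac.new(key, unsigned_url.encode("utf-8"), hashlib.sha512).hexdigest()


def sign_url(base_url, user, method, api_key, signature_secret, now,
             timeout_minutes=5):
    """Append until/user/method and a token computed over the whole URL."""
    until = (now + timedelta(minutes=timeout_minutes)).isoformat(
        timespec="milliseconds")
    separator = "&" if "?" in base_url else "?"
    unsigned = base_url + separator + urlencode(
        {"until": until, "user": user, "method": method})
    return unsigned + "&token=" + _token(unsigned, api_key, signature_secret)


def verify_url(signed_url, method, api_key, signature_secret, now):
    """Accept only an unmodified, unexpired URL used with the signed method."""
    unsigned, found, token = signed_url.rpartition("&token=")
    if not found:
        return False
    params = dict(parse_qsl(urlsplit(unsigned).query))
    if params.get("method") != method:
        return False
    try:
        until = datetime.fromisoformat(params["until"])
    except (KeyError, ValueError):
        return False
    if now > until:
        return False
    expected = _token(unsigned, api_key, signature_secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, token)
```

In this sketch the API key never appears in the URL. Someone who captures the URL gets only a token bound to one path, one method and one expiry time, and cannot forge a new one without both the server secret and that user's key.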
-- Jim
--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
dataverse-commu...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/dataverse-community/86f27fd6-4fd5-4bda-b463-fcc42de9c77en%40googlegroups.com.
Philipp,
The main concern that signedUrls address is not that the user can see their own API key while logged in. Before signedUrls, as is still the case on demo.dataverse.org, invoking a previewer on a restricted file (or one in a draft dataset) was done by using the API key directly in the URL used to launch the previewer. That is not so obvious when you just view the previewer on the file page (although it is still true), but is clear if you launch the previewer on a separate page, e.g. clicking the “Explore on View Image” button that shows up on the file page of a jpg image on demo.dataverse.org. When I do this with an image in a draft dataset I have there, the URL is
The key parameter was my API key, and it is now in my browser history and available for anyone to view if they have access to my machine. (The URL of the preview embedded in the file page doesn’t automatically appear in the browser history, but it could be visible to people who can see the network traffic, so they could still get the API key.)
In contrast, launching a previewer when signed URLs are used results in a URL like
If you know to decode that, you’d be able to find another URL: https://dv.dev-aws.qdr.org/api/v1/files/8187/metadata/11589/toolparams/5?until=2024-06-28T13:13:55.187&user=qqmyers&method=GET&token=d9fa52707314b8fa48cc53166208dcf77c387119e7d19b9d3487edd1fc0032d6e6637bc7b68ec8d63852bc4ff6a6a2a9842bd3f82612e18df13a5002c30be3ec
which, if you had gotten it within 5 (configurable) minutes (and if this specific QDR dev machine weren’t also firewalled), would have allowed you to retrieve some json with a couple more signedURLs that would let you get the dataset metadata and the file bytes if you used them quickly. Since it took more than five minutes to write this email, the fact that this URL might be in the browser history or seen on the network doesn’t matter. All you could do with it is get the response:
{"status": "ERROR","message": "Bad signed URL"} (you can’t with the specific URL above because it is from a QDR dev machine that is behind a firewall.)
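To make the expiry point concrete, here is a small Python sketch that pulls the parameters out of the signed URL quoted above and checks whether it has timed out. The function names describe and is_expired are made up for this example, and treating the until value as a plain timestamp with no time zone is an assumption.

```python
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

SIGNED_URL = (
    "https://dv.dev-aws.qdr.org/api/v1/files/8187/metadata/11589/toolparams/5"
    "?until=2024-06-28T13:13:55.187&user=qqmyers&method=GET"
    "&token=d9fa52707314b8fa48cc53166208dcf77c387119e7d19b9d3487edd1fc0032d6"
    "e6637bc7b68ec8d63852bc4ff6a6a2a9842bd3f82612e18df13a5002c30be3ec"
)


def describe(signed_url):
    """Return the path and the signed parameters carried in the query string."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    return {
        "path": parts.path,
        "until": datetime.fromisoformat(query["until"][0]),
        "user": query["user"][0],
        "method": query["method"][0],
    }


def is_expired(signed_url, now):
    """True once the 'until' timestamp in the URL has passed."""
    return now > describe(signed_url)["until"]


if __name__ == "__main__":
    info = describe(SIGNED_URL)
    print(info["user"], info["method"], info["until"])
    print("expired:", is_expired(SIGNED_URL, datetime.now()))
```

Nothing in the URL identifies the user beyond the user name, and the token is only useful to a server that can recompute it, so once the until time has passed the URL is just a record that a preview happened.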
The advantages here are that there is nothing left on the browser after the user logs out that can be used to do anything except download the public metadata and the restricted bytes of that one file, and, even for those two actions, the URLs are no good after the end of the timeout anyway. And I don’t have to be aware of or change my API key to be sure that no one can impersonate me after I leave the machine.
In contrast, if I hadn’t changed my API key on demo, you could have come to my computer several days later and picked it up from the history and then done anything I could do, including deleting my data.
While never showing the API key to the logged-in user when it is only being used for signedURLs could stop an attacker who could look over their shoulder (assuming the user viewed it at all, which they have no reason to do), it wouldn’t stop someone who could get onto the user’s machine while they were logged in (since they could then manually generate an API key and copy it). At some point, we might be able to completely replace the API key mechanism with some combination of signedURLs and OIDC offline tokens (as I talked about in Mexico). Until then, simply hiding the API key unless the user explicitly requests it could be done, but that would only stop attacks where someone can see the screen while the user is logged in and the user for some reason looks at their API key. If that is a concern, it would not take much programming to implement.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/f38870a3-b137-4096-b3dd-6f189eb2974dn%40googlegroups.com.