Large file upload and download


Paul Boon

Feb 10, 2020, 9:47:05 AM
to Dataverse Users Community
Does anyone know where 'temporary files' for upload and download are created, and whether we can change this via configuration?

When investigating upload problems I noticed that temporary files (upload__*.tmp) are generated in
/usr/local/glassfish4/glassfish/domains/domain1/generated/jsp/dataverse
Large file uploads via the UI are thus limited by the size of the /usr partition, which is rather small on our system.
To work around this I replaced the 'generated' dir with a symlink to a dir on our (large) data partition,
but there might be a better solution?
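
To see how much room is really available to those temporary files, it may help to ask the JVM which partition a directory lives on and how much usable space it has. The following is only a minimal sketch: the class name TempSpaceCheck and the 2 GiB example size are assumptions made for illustration, and nothing here is part of Dataverse, GlassFish or PrimeFaces.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Dataverse or GlassFish.
final class TempSpaceCheck {

    // Usable bytes on the file store (partition) that holds the given directory.
    static long usableBytes(Path dir) {
        try {
            return Files.getFileStore(dir).getUsableSpace();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if an upload of the given size would currently fit (ignores concurrent writers).
    static boolean fits(Path dir, long uploadBytes) {
        return usableBytes(dir) >= uploadBytes;
    }

    public static void main(String[] args) {
        // First argument: directory to inspect; falls back to the JVM temp dir.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        long exampleUpload = 2L * 1024 * 1024 * 1024; // 2 GiB, an arbitrary example size
        System.out.println(dir + ": " + usableBytes(dir) + " bytes usable");
        System.out.println("2 GiB upload fits: " + fits(dir, exampleUpload));
    }
}
```

Running it with the directory mentioned above as the argument, before and after a change such as the symlink, should show whether uploads now land on the larger partition.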

This made me worry about downloads: will zipping for download also result in temporary files here?
I can't see anything created here, though, when downloading large files (zipped).
Hopefully this is all done in a 'streaming' manner, without creating potentially large temporary files.

So I would also like to know if and where temporary download files are created, so I can take measures for handling large files.
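
For readers unfamiliar with what 'streaming' means here, the idea can be sketched as follows: the zip is written straight to the outgoing stream through a small fixed-size buffer, so no temporary file is needed regardless of total size. This is an illustration under stated assumptions only. The class StreamingZipSketch and its method names are hypothetical, and the thread does not confirm that Dataverse implements its zipped downloads this way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of streamed zipping; not Dataverse's actual download code.
final class StreamingZipSketch {

    // Writes each input as one zip entry directly to 'sink'; closes the sink when done.
    static void zipTo(OutputStream sink, Map<String, InputStream> files) {
        byte[] buffer = new byte[8192]; // the only per-request memory used for file data
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            for (Map.Entry<String, InputStream> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                try (InputStream in = file.getValue()) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, n); // compressed bytes go straight to the sink
                    }
                }
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory stand-ins for stored files and for the HTTP response stream.
        Map<String, InputStream> files = new LinkedHashMap<>();
        files.put("a.txt", new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        files.put("b.txt", new ByteArrayInputStream("world".getBytes(StandardCharsets.UTF_8)));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        zipTo(sink, files);
        System.out.println("zip size in bytes: " + sink.size());
    }
}
```

In a web application the sink would be the response output stream rather than a byte array, which is what keeps disk usage flat during a large download.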

James Myers

Feb 10, 2020, 10:15:37 AM
to dataverse...@googlegroups.com

Paul,

I think Dataverse stores its temp files in a subdir of the -Ddataverse.files.directory= that is set as a glassfish jvm-option. This is where Dataverse does its unzipping of uploaded zip files, for example.

However, Dataverse itself doesn’t use the upload__*.tmp pattern. My guess (and I do mean guess) is that those files are created by the PrimeFaces library during upload, before Dataverse itself gets to do anything. A quick search on the web hasn’t turned up where PrimeFaces stores files, but it is an interesting coincidence that the Java user.dir property on the machine I looked at is /srv/glassfish/dataverse/generated/jsp/<name of deployed app>. So my best guess is that changing that Java property may help you.
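
To compare the properties mentioned above on a given machine, a small snippet like the following could be used. It is a sketch only: the class name TempDirProperties is hypothetical, and including java.io.tmpdir is an extra assumption, since the thread does not say that property is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; prints properties of the JVM it runs in.
final class TempDirProperties {

    // Properties discussed in this thread, plus java.io.tmpdir (an added assumption).
    static final String[] KEYS = {"user.dir", "java.io.tmpdir", "dataverse.files.directory"};

    // Returns each property with its current value, or a marker if it is not set.
    static Map<String, String> snapshot() {
        Map<String, String> values = new LinkedHashMap<>();
        for (String key : KEYS) {
            values.put(key, System.getProperty(key, "(not set)"));
        }
        return values;
    }

    public static void main(String[] args) {
        snapshot().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Run standalone, this reports the values of its own JVM, not those of the application server. To inspect the running GlassFish process, the same lookups would have to execute inside the deployed application, or the JDK's jcmd tool could be pointed at the server's process id with the VM.system_properties command.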

 

-- Jim

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/fbfc2684-0be1-45d0-8231-6b4333566ffb%40googlegroups.com.

Philip Durbin

Feb 10, 2020, 1:45:31 PM
to dataverse...@googlegroups.com
There's an open issue at https://github.com/IQSS/dataverse/issues/2848 to document the various temporary directories used by Dataverse. If anyone reading this can offer a pull request, please go ahead!

Thanks,

Phil




Paul Boon

Feb 11, 2020, 3:34:26 AM
to Dataverse Users Community
Jim, 

You are right about the dataverse.files.directory; I can see there is a 'temp' subdir that also gets files during upload (with and without zipped content).
However, I don't see anything appear there when downloading large (>1 GB) zipped files while watching the 'temp' dir with 'watch -d -n 1 ls -al'.

The directory used by PrimeFaces has the application name in it, so my guess is that Glassfish is somehow providing it via the 'user.dir'. Maybe we can have PrimeFaces use the 'temp' in dataverse.files.directory instead of the user.dir, but that might need some coding. 

For the time being I will stick with the symlink solution.

Thanks for the info, 
Paul



Oliver Bertuch

Feb 11, 2020, 4:49:46 AM
to Dataverse Users Community
Morning guys,

as this would affect my containers, I looked into this, too.

It looks like we can tell PrimeFaces where to store uploads via its filter configuration: https://primefaces.github.io/primefaces/7_0/#/components/fileupload?id=filter-configuration

A quick grep through the code base reveals that we are currently not using this.
Who wants the honor of creating an issue? :-)

Cheerio,
Oliver

Oliver Bertuch

Feb 17, 2020, 9:08:45 AM
to Dataverse Users Community
Folks,

I took the liberty of creating an issue: https://github.com/IQSS/dataverse/issues/6656
Feedback on GitHub is appreciated, and please share.

Best,
Oliver