Re: [Fedora-commons-users] Large datastream ingest issue: Bad request; unable to fulfill REST API request


Shalvi, Doron (NIH/NLM) [C]

May 25, 2010, 10:06:21 AM
to fedora-com...@lists.sourceforge.net

Hi Graeme,

The checksum bug appears to be related only to ingesting Managed "M" datastreams while providing a checksum; you can read more about the bug at http://fedora-commons.org/jira/browse/FCREPO-696 . We've noticed this problem in Fedora 3.2.1 and 3.3. Datastreams of type External "E" and Redirect "R" appear to be unaffected. We store 2 GB video files in our Fedora architecture as type R, which still provides checksum validation on ingest. You may want to consider storing these large files as type E or R in any case, to avoid large data transfers through Fedora on ingest and access.
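
For readers who want to try the Redirect approach, here is a rough sketch of building the REST request URL for adding a type "R" datastream. The query parameter names (controlGroup, dsLocation, mimeType) are as I recall them from the Fedora 3.x REST API addDatastream call, so please verify them against the documentation for your version. The base URL, PID, datastream ID, and media URL are all hypothetical placeholders, and the sketch only builds the URL; it does not send the POST.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class RedirectDatastreamUrl {

    // URL-encode one path segment or query value as UTF-8.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds the URL for a POST that adds a datastream whose content lives
    // at dsLocation. Parameter names are assumed from the Fedora 3.x REST
    // API documentation and should be checked for your release.
    static String addDatastreamUrl(String baseUrl, String pid, String dsId,
                                   String controlGroup, String dsLocation,
                                   String mimeType) {
        return baseUrl
                + "/objects/" + enc(pid)
                + "/datastreams/" + enc(dsId)
                + "?controlGroup=" + enc(controlGroup)
                + "&dsLocation=" + enc(dsLocation)
                + "&mimeType=" + enc(mimeType);
    }

    public static void main(String[] args) {
        // All values below are hypothetical examples.
        String url = addDatastreamUrl(
                "http://localhost:8080/fedora",
                "demo:1",
                "VIDEO",
                "R", // Redirect: Fedora stores only the location
                "http://media.example.org/video.mp4",
                "video/mp4");
        System.out.println(url);
    }
}
```

With type R (or E) only the location is sent to Fedora, so the 2 GB payload itself never passes through the REST upload path.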

Good luck,

Doron Shalvi

-----Original Message-----

Message: 7
Date: Mon, 24 May 2010 15:58:48 +0100
From: "West, Graeme" <Graem...@gcu.ac.uk>
Subject: [Fedora-commons-users] Large datastream ingest issue: Bad
request; unable to fulfill REST API request
java.lang.NumberFormatException: For input string: "2326355355"
To: "fedora-com...@lists.sourceforge.net"
<fedora-com...@lists.sourceforge.net>
Message-ID: <D041D898-7276-454F...@gcu.ac.uk>
Content-Type: text/plain; charset="us-ascii"

Hello (again) Fedora people :)

I'm planning a repository migration which will involve storing large video files in Fedora. As such I've been asking around as to the potential issues and gotchas involved.

I understand that Fedora 3.3 contains a bug related to datastream checksumming, and I would welcome any more information about it. However, I encountered a separate problem while ingesting a 2.2GB file into Fedora as a Managed Content datastream via the Flash/Flex admin interface on my local machine.

The full error traceback is below, but the gist of it is:

> WARN 2010-05-24 15:28:59.282 [http-8080-2] (DatastreamResource) Bad request; unable to fulfill REST API request
> java.lang.NumberFormatException: For input string: "2326355355"


"2326355355" happens to be, more or less, the size of the file involved (my Mac reports that the file is 2,326,355,123 bytes, but I suppose Fedora may also be counting the XML of the datastream too).

I therefore wondered whether there's an issue simply with Tomcat's memory allocation or whether this is something more fundamental. It could be an offshoot of the checksumming bug, though the problem also occurs when I specify that the datastream should not be checksummed in any way.
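
A note on the question above: the stack trace in this message shows the exception coming from Integer.parseInt, called from RestUtil.getRequestContent, and the rejected string is larger than Integer.MAX_VALUE (2,147,483,647). That points to a 32-bit integer limit in parsing the value rather than to Tomcat's memory allocation. The sketch below only reproduces the parsing behaviour; the class and method names are made up for illustration and are not Fedora code, and reading the value as a request length is an assumption based on how close it is to the file size.

```java
public class ContentLengthDemo {

    // Returns true if the decimal string fits in a 32-bit signed int,
    // i.e. Integer.parseInt accepts it without throwing.
    static boolean parsesAsInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String reported = "2326355355"; // the input string from the log

        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("parses as int:  " + parsesAsInt(reported));
        // A 64-bit long has ample room for the same value.
        System.out.println("parses as long: " + Long.parseLong(reported));
    }
}
```

Since the same string parses cleanly as a long, any upload over roughly 2 GiB would be expected to hit this error on that code path, while smaller files would not, regardless of the checksum setting.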

I'm running Mac OS X 10.6.3, and using the latest Apple Java JDK on a Core 2 Duo (64 bit) machine with 2GB RAM. The Java process seems to be running in 64-bit mode.

Thanks in advance for any advice.

Regards,


Graeme West

Digital Repository Developer
Glasgow Caledonian University
graem...@gcu.ac.uk



> WARN 2010-05-24 15:28:59.282 [http-8080-2] (DatastreamResource) Bad request; unable to fulfill REST API request
> java.lang.NumberFormatException: For input string: "2326355355"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
> at java.lang.Integer.parseInt(Integer.java:461)
> at java.lang.Integer.parseInt(Integer.java:499)
> at fedora.server.rest.RestUtil.getRequestContent(RestUtil.java:107)
> at fedora.server.rest.DatastreamResource.addOrUpdateDatastream(DatastreamResource.java:372)
> at fedora.server.rest.DatastreamResource.addDatastream(DatastreamResource.java:321)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at com.sun.jersey.server.impl.model.method.dispatch.EntityParamDispatchProvider$ResponseOutInvoker._dispatch(EntityParamDispatchProvider.java:157)
> at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:67)
> at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:124)
> at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:111)
> at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:71)
> at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:111)
> at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:63)
> at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:555)
> at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:514)
> at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:505)
> at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:359)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterRestApiFlash.doFilter(FilterRestApiFlash.java:65)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterSetup.doFilter(FilterSetup.java:234)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterSetup.doFilter(FilterSetup.java:234)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterSetup.doFilter(FilterSetup.java:234)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterSetup.doFilter(FilterSetup.java:234)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at fedora.server.security.servletfilters.FilterSetup.doFilter(FilterSetup.java:234)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
> at java.lang.Thread.run(Thread.java:637)


West, Graeme

May 25, 2010, 11:35:15 AM
to Shalvi, Doron (NIH/NLM) [C], fedora-com...@lists.sourceforge.net

Hi Doron,

Thanks for your input. It's interesting that the checksum bug only affects managed content, and good to hear that other people are also using Fedora to manage large datastreams.

About the choice of managed content: we're attempting to store everything this way to gain the most from Fedora's storage abstraction architecture. Basically, we want our entire repository stored in one logical directory, and not to have to rely on the presence of external web servers to provide the content.

Also, we won't be allowing public access to >2GB files, so hopefully any performance tradeoff will be limited. Having them available to repository administrators and software tools without relying on external web servers has compelling advantages for our purposes.

On the REST file upload bug, I've updated FCREPO-704 with a patch against trunk, written following Steve's instructions, which seems to fix the issue:
http://www.fedora-commons.org/jira/browse/FCREPO-704

I've tested it with >2GB files with good results, and more than acceptable speed from Fedora (~20MB/s ingest). I don't have a complete build environment set up yet but hopefully the diff file will help someone integrate the change.

Regards,

Graeme