[DuraSpace JIRA] (FCREPO-1020) Out-of-memory exceptions due to FeSL PEP buffering datastreams in memory

Stephen Bayliss (Updated) (DuraSpace JIRA)

Oct 24, 2011, 10:05:04 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephen Bayliss updated FCREPO-1020:
------------------------------------

Description:
See report from Greg Jansen here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473

The code looks like it is buffering the datastream in memory, which is clearly a bad thing. It should be passing the stream through.

was:
See report here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473

The code looks like it is buffering the datastream in memory, which is clearly a bad thing. It should be passing the stream through.


> Out-of-memory exceptions due to FeSL PEP buffering datastreams in memory
> ------------------------------------------------------------------------
>
> Key: FCREPO-1020
> URL: https://jira.duraspace.org/browse/FCREPO-1020
> Project: Fedora Repository Project
> Issue Type: Bug
> Components: FeSL
> Reporter: Stephen Bayliss
>
> See report from Greg Jansen here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473
> The code looks like it is buffering the datastream in memory, which is clearly a bad thing. It should be passing the stream through.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira


Stephen Bayliss (Created) (DuraSpace JIRA)

Oct 24, 2011, 10:05:04 AM
to fcrepo-...@googlegroups.com

Out-of-memory exceptions due to FeSL PEP buffering datastreams in memory
------------------------------------------------------------------------

Key: FCREPO-1020
URL: https://jira.duraspace.org/browse/FCREPO-1020
Project: Fedora Repository Project
Issue Type: Bug
Components: FeSL
Reporter: Stephen Bayliss


See report here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473

Greg Jansen (Commented) (DuraSpace JIRA)

Oct 24, 2011, 11:09:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22881#comment-22881 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

It looks like the problem stems from the way the PEP enforces both before and after the remaining parts of the filter chain. Looking at the contract for RESTFilter, there is a handleResponse method that can return a RequestCtx that may inform enforcement. It looks like the response is wrapped and buffered in its entirety so that this context object can be used for enforcement...


/**
 * Handles the response path and returns a RequestCtx if necessary.
 *
 * @param request
 *            the servlet request
 * @param response
 *            the servlet response
 * @return the RequestCtx if one is needed, or else null
 * @throws IOException
 * @throws ServletException
 */
public RequestCtx handleResponse(HttpServletRequest request,
                                 HttpServletResponse response)
        throws IOException, ServletException;

This additional enforcement step impacts API-A only, as any API-M operations will have been called by this point in the filter chain. In practice it looks like all RESTFilter implementations return a null for handleResponse. Perhaps this was meant as a future extension point, but I'm having trouble imagining the use case. To me it looks as if the Response wrapper can go away and the second enforcement step can be completely removed, so that all enforcement is before the rest of the filter chain is called.
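The difference between buffering a response in its entirety and passing it through can be sketched with plain java.io streams. This is only an illustration under stated assumptions: it does not use the servlet API or any FeSL class, and the class and method names (StreamCopySketch, bufferThenWrite, passThrough) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; not code from the FeSL PEP.
class StreamCopySketch {

    /** Collects the whole source in memory before writing it out; heap use grows with content size. */
    public static long bufferThenWrite(InputStream in, OutputStream out) throws IOException {
        ByteArrayOutputStream whole = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            whole.write(chunk, 0, n);
        }
        byte[] all = whole.toByteArray(); // a full copy of the content lives on the heap here
        out.write(all);
        out.flush();
        return all.length;
    }

    /** Forwards each chunk as soon as it is read; heap use stays at one fixed-size buffer. */
    public static long passThrough(InputStream in, OutputStream out) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];
        ByteArrayOutputStream buffered = new ByteArrayOutputStream();
        ByteArrayOutputStream streamed = new ByteArrayOutputStream();
        System.out.println(bufferThenWrite(new ByteArrayInputStream(data), buffered));
        System.out.println(passThrough(new ByteArrayInputStream(data), streamed));
    }
}
```

Both methods deliver the same bytes; only the first needs memory proportional to the size of the datastream, which is the behaviour this issue describes.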




Stephen Bayliss (Commented) (DuraSpace JIRA)

Oct 24, 2011, 11:26:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22882#comment-22882 ]

Stephen Bayliss commented on FCREPO-1020:
-----------------------------------------

I believe there is a use case around handling search results - to ensure that only "permitted" resources are in the result set. I've not looked at the code in detail though; from memory there is an implementation for the basic search. I believe there was a similar intent for RISearch results (and potentially this could apply to the Relationships API methods). It is probably also worth checking methods such as list datastreams, as these may have some response filtering too.
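The use case of keeping only "permitted" resources in a result set can be sketched roughly as below. This is an assumption-laden illustration, not the FeSL implementation: the class name ResultPruningSketch, the method prunePids, and the use of a Predicate as a stand-in for the policy decision are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; the Predicate stands in for a real policy evaluation.
class ResultPruningSketch {

    /** Returns the pids for which the policy check permits access, preserving their order. */
    public static List<String> prunePids(List<String> pids, Predicate<String> isPermitted) {
        List<String> permitted = new ArrayList<>();
        for (String pid : pids) {
            if (isPermitted.test(pid)) {
                permitted.add(pid);
            }
        }
        return permitted;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("demo:1", "demo:2", "demo:3");
        // Assumed policy for the demo: everything except demo:2 is permitted.
        System.out.println(prunePids(results, pid -> !pid.equals("demo:2")));
    }
}
```

Filtering of this kind has to see the search response before it is sent to the client, so it is the one case where holding the response back is needed.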




Greg Jansen (Commented) (DuraSpace JIRA)

Oct 24, 2011, 3:06:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22887#comment-22887 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

Yes, I eventually saw that use case, the filtering of search results. I would expect that on basic search, but I would be surprised if RISearch results were filtered that way.
Maybe there is some way that post-processed responses can be the exception rather than the rule. (That seems like an easy way to speed up most response times, too.)




Greg Jansen (Commented) (DuraSpace JIRA)

Oct 24, 2011, 3:44:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22890#comment-22890 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

Looking at SearchFilter, where it post-processes the results of the search: there's a lot of knowledge about the search results format embedded in there, but its main function seems to be the batch evaluation of policy and pruning of the pids. Since the filters are an integral part of the PEP, it seems like a new PostProcessingRESTFilter could be a subclass of a pre-processing-only RESTFilter interface. Then SearchFilter could implement that interface and the PEP would know when it needs to wrap the response.

Unless someone is already working on this, I could give it a shot.
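The idea in this comment could look roughly like the following. It is a sketch under stated assumptions, not the actual FeSL interfaces: the method signatures are simplified to strings instead of servlet request/response objects and RequestCtx, and the stub classes and the needsResponseWrapper helper are hypothetical.

```java
// Hypothetical, simplified stand-ins for the interfaces discussed in the comment.
class FilterHierarchySketch {

    /** Pre-processing only: enforcement happens before the rest of the filter chain runs. */
    interface RESTFilter {
        boolean handleRequest(String path);
    }

    /** Opt-in extension for the few filters that must inspect the response as well. */
    interface PostProcessingRESTFilter extends RESTFilter {
        String handleResponse(String path, String responseBody);
    }

    /** Stub for a filter that never looks at the response, e.g. one guarding datastream content. */
    static class DatastreamFilterStub implements RESTFilter {
        public boolean handleRequest(String path) {
            return true;
        }
    }

    /** Stub for a filter that post-processes search results. */
    static class SearchFilterStub implements PostProcessingRESTFilter {
        public boolean handleRequest(String path) {
            return true;
        }

        public String handleResponse(String path, String responseBody) {
            return responseBody; // a real filter would prune non-permitted pids here
        }
    }

    /** The PEP would wrap (and therefore buffer) the response only when this returns true. */
    public static boolean needsResponseWrapper(RESTFilter filter) {
        return filter instanceof PostProcessingRESTFilter;
    }

    public static void main(String[] args) {
        System.out.println(needsResponseWrapper(new DatastreamFilterStub()));
        System.out.println(needsResponseWrapper(new SearchFilterStub()));
    }
}
```

With a split like this, the type of the filter tells the PEP whether buffering is required, so datastream downloads could be passed straight through.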




Greg Jansen (Commented) (DuraSpace JIRA)

Oct 24, 2011, 3:44:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22891#comment-22891 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

But first let me know if this is the right idea pls.. thx.




Greg Jansen (Issue Comment Edited) (DuraSpace JIRA)

Oct 24, 2011, 3:46:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22890#comment-22890 ]

Greg Jansen edited comment on FCREPO-1020 at 10/24/11 7:44 PM:
---------------------------------------------------------------

Looking at SearchFilter, where it post-processes the results of the search: there's a lot of knowledge about the search results format embedded in there, but its main function seems to be the batch evaluation of policy and pruning of the pids. Since the filters are an integral part of the PEP, it seems like a new PostProcessingRESTFilter could be a subclass of a pre-processing-only RESTFilter interface. Then SearchFilter could implement that interface and the PEP would know when it needs to wrap the response.

Unless someone is already working on this, I could give it a shot.



Greg Jansen (Commented) (DuraSpace JIRA)

Oct 24, 2011, 4:22:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22893#comment-22893 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

Looks like the ObjectsFilter is a filter that parses the servlet path in a more fine-grained way to find and execute another RESTFilter. The class is a barrier to the strategy outlined above, in that it would turn a whole range of servlet paths into ResponseHandlingRESTFilters because of the search implemented at /objects.

Putting RESTFilters within RESTFilters makes this code a bit hard for me to understand. Might be clearer just to put the /objects path logic from ObjectsFilter.getObjectsHandler(req) in the PEP.getFilter(path) method. You would need to have the request method added to the signature, or the whole request object.
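A lookup that takes the request method as well as the path might look something like this. It is purely illustrative: the Kind enum, the class name, and the routing rule (GET on the /objects root is treated as the search, anything below it as an object-level operation) are assumptions made for the example, not the behaviour of the real PEP.getFilter or ObjectsFilter.getObjectsHandler.

```java
// Hypothetical illustration only; the routing rule below is an assumption.
class FilterLookupSketch {

    /** Hypothetical categories; a real PEP would return RESTFilter instances instead. */
    enum Kind { SEARCH, OBJECT, NONE }

    /** Picks a filter category from both the servlet path and the HTTP method. */
    public static Kind getFilter(String path, String method) {
        final String root = "/objects";
        if (path == null || !path.startsWith(root)) {
            return Kind.NONE;
        }
        String rest = path.substring(root.length());
        if (!rest.isEmpty() && !rest.startsWith("/")) {
            return Kind.NONE; // e.g. "/objectsfoo" is not under /objects
        }
        boolean collectionRoot = rest.isEmpty() || rest.equals("/");
        if (collectionRoot && "GET".equalsIgnoreCase(method)) {
            return Kind.SEARCH; // assumed: only this case needs response post-processing
        }
        return Kind.OBJECT;
    }

    public static void main(String[] args) {
        System.out.println(getFilter("/objects", "GET"));
        System.out.println(getFilter("/objects/demo:1/datastreams/DS1/content", "GET"));
    }
}
```

The path alone cannot separate the search at /objects from other operations on the same path, which is why the method (or the whole request) would have to be part of the signature.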




Greg Jansen (Issue Comment Edited) (DuraSpace JIRA)

Oct 24, 2011, 4:26:03 PM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22890#comment-22890 ]

Greg Jansen edited comment on FCREPO-1020 at 10/24/11 8:24 PM:
---------------------------------------------------------------

Looking at SearchFilter, where it post-processes the results of the search: there's a lot of knowledge about the search results format embedded in there, but its main function seems to be the batch evaluation of policy and pruning of the pids. Since the filters are an integral part of the PEP, it seems like a new ResponseHandlingRESTFilter could be a subclass of a pre-processing-only RESTFilter interface. Then SearchFilter could implement that interface and the PEP would know when it needs to wrap the response.

Unless someone is already working on this, I could give it a shot.



Greg Jansen (Commented) (DuraSpace JIRA)

Oct 25, 2011, 9:18:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22897#comment-22897 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

Here is my attempt at a fix: https://github.com/UNC-Libraries/fcrepo/tree/FCREPO-1020/maintenance-3-4
Maybe someone with FeSL expertise can review it. I'm able to use the patched fcrepo-security-pep jar with existing configuration files. Memory is stable and file downloads begin immediately.
I'm not sure that I've created my fork in the most convenient way possible; I'm still learning git.




Chris Wilper (Updated) (DuraSpace JIRA)

Oct 25, 2011, 11:25:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Wilper updated FCREPO-1020:
---------------------------------

Status: Open (was: Received)



> Out-of-memory exceptions due to FeSL PEP buffering datastreams in memory
> ------------------------------------------------------------------------
>
> Key: FCREPO-1020
> URL: https://jira.duraspace.org/browse/FCREPO-1020
> Project: Fedora Repository Project
> Issue Type: Bug
> Components: Fedora, FeSL
> Affects Versions: Fedora 3.6
> Reporter: Stephen Bayliss
>
> See report from Greg Jansen here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473

Chris Wilper (Updated) (DuraSpace JIRA)

Oct 25, 2011, 11:25:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Wilper updated FCREPO-1020:
---------------------------------

Priority: Major (was: Minor)
Assignee: Stephen Bayliss



> Out-of-memory exceptions due to FeSL PEP buffering datastreams in memory
> ------------------------------------------------------------------------
>
> Key: FCREPO-1020
> URL: https://jira.duraspace.org/browse/FCREPO-1020
> Project: Fedora Repository Project
> Issue Type: Bug
> Components: Fedora, FeSL
> Affects Versions: Fedora 3.6
> Reporter: Stephen Bayliss
> Assignee: Stephen Bayliss
> Priority: Major
>
> See report from Greg Jansen here: http://sourceforge.net/mailarchive/message.php?msg_id=28272473

Chris Wilper (Updated) (DuraSpace JIRA)

Oct 25, 2011, 11:25:04 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Wilper updated FCREPO-1020:
---------------------------------

Component/s: Fedora
Affects Version/s: Fedora 3.6




Greg Jansen (Commented) (DuraSpace JIRA)

Oct 25, 2011, 11:33:03 AM
to fcrepo-...@googlegroups.com

[ https://jira.duraspace.org/browse/FCREPO-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22902#comment-22902 ]

Greg Jansen commented on FCREPO-1020:
-------------------------------------

Note that the fix above was written against the 3.4 maintenance branch. Not sure if the REST Summer of Code stuff would change the fix.



