[Dspace-tech] Unwanted errors in Cocoon.log

Ian Boston

Aug 26, 2015, 10:15:08 AM
to DSpac...@lists.sourceforge.net
Hi,
I've been doing some load testing recently using a JMeter spidering
test, and I notice the webapp node becomes disk IO bound long before
anything else runs out, which I think is odd since the user is
anonymous and I would not have expected huge amounts of disk IO.

In trying to locate the source of the problem I have noticed thousands
of large tracebacks appearing in the cocoon.log whenever a 404 is
generated. Each traceback is just under 400 lines long and always
starts with this:

2012-12-11 01:02:51,820 ERROR cocoon.handled - Failed to process pipeline
at <map:serialize type="xml"> -
jndi:/localhost/xmlui/aspects/aspects.xmap:96:31
at <map:generate type="file"> -
jndi:/localhost/xmlui/aspects/aspects.xmap:95:55
at <map:serialize type="xml"> -
resource://aspects/ViewArtifacts/sitemap.xmap:218:52


with a base cause of:

Caused by: org.apache.cocoon.ResourceNotFoundException: Page cannot be found
at org.dspace.app.xmlui.aspect.general.PageNotFoundTransformer.addBody(PageNotFoundTransformer.java:168)
at org.dspace.app.xmlui.wing.AbstractWingTransformer.startElement(AbstractWingTransformer.java:223)
... 309 more


The ResourceNotFoundException is probably reasonable, but I would like
to eliminate the traceback without disabling the "cocoon.handled"
logger completely.

I would also like to simply pass the 404 to the front-end httpd with
zero content so that it can deliver an appropriate response, rather
than confusing users with a Java + Cocoon stack trace.

Any pointers gratefully received.

Ian


(BTW, this probably isn't the source of the IO since the file only gets
to 100M in the 5 minute test, but I would like to eliminate the
growth).

helix84

Aug 26, 2015, 10:15:15 AM
to Ian Boston, DSpac...@lists.sourceforge.net
Hi Ian,

I don't know about shortening the stacktrace, but that's a pretty
generic Java question so you may try googling around.

To answer at least a part of your question, this is where 404s are
handled in Cocoon. You may want to edit the XSL to filter what you
send to the user for specific exceptions and leave the rest (the
unexpected ones) as stacktraces.

https://github.com/DSpace/DSpace/blob/dspace-3_x/dspace-xmlui/src/main/webapp/sitemap.xmap#L673


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Ian Boston

Aug 26, 2015, 10:15:16 AM
to hel...@centrum.sk, DSpac...@lists.sourceforge.net
On 12 December 2012 20:15, helix84 <hel...@centrum.sk> wrote:
> Hi Ian,
>
> I don't know about shortening the stacktrace, but that's a pretty
> generic Java question so you may try googling around.

I don't think there is a way other than changing the Cocoon code,
blocking the logger, or not throwing the exception and instead setting
a 404 status code in the DSpace code. I might just do the latter.

>
> To answer at least a part of your question, this is where 404s are
> handled in Cocoon. You may want to edit the XSL to filter what you
> send to the user for specific exceptions and leave the rest (the
> unexpected ones) as stacktraces.
>
> https://github.com/DSpace/DSpace/blob/dspace-3_x/dspace-xmlui/src/main/webapp/sitemap.xmap#L673
>

Ok, thanks for the pointer.

Ian

Tim Donohue

Aug 26, 2015, 10:15:37 AM
to Ian Boston, DSpac...@lists.sourceforge.net
Hi Ian,

Just now saw this thread about the ugly 404 error page & logs in DSpace
XMLUI / Cocoon.

You may already be aware of when/why this came about. But, I figured I'd
fill in some "history" just in case you (or others) are not.

Essentially, prior to DSpace 1.8.x, DSpace XMLUI actually had a nicer
looking XMLUI "Page Not Found" Error page. It just simply said "page not
found" and looked like every other page in your XMLUI theme.

However, then we discovered that Apache Cocoon was responding with "200
OK" on *every error page* in the XMLUI, see this ticket
https://jira.duraspace.org/browse/DS-768

Essentially, the only resolution we were able to come up with was to
actually patch Apache Cocoon to throw the proper 404 error. So we
created a patched copy of the Cocoon code which was throwing 200 instead
of 404. That patched code is maintained here:
https://github.com/DSpace/dspace-cocoon-servlet-service-impl

Unfortunately though, in our patching of this Cocoon bug, we were not
able to get our nicer looking "Page Not Found" error page to display
again...instead, we were left with the ugly error page you see now.
That patch to Cocoon may have also been the cause of the messages you
are seeing in the Cocoon.log.

There may be ways to fix this & clean up both the "Page Not Found" error
page and limit the output in the Cocoon.log. I haven't looked into it in
some time. But, if you find anything that works for you, I'd highly
encourage you to send us a Pull Request or a patch -- I think this is
something we'd all like to see fixed.

Not sure if this info will be of help. But at least now you know the
full story of where this issue began.

- Tim

Ian Boston

Aug 26, 2015, 10:15:43 AM
to Tim Donohue, DSpac...@lists.sourceforge.net
Hi Tim,
Thank you.
Could you give me a pointer to where the DSpace page is rendered?

Looking at the Cocoon code it should be done by [1], but in DSpace
that doesn't take the blindest bit of notice of the setting
org.apache.cocoon.manageexceptions, which if set to false should just
set the status.

Incidentally, in the code I am looking at the status is set correctly
when managed, at least in the Cocoon codebase. A
ResourceNotFoundException appears as a 404 with HTML content (see [2]),
and the log message is issued at warn level.

I am looking at the section of the stack below where I think the
ResourceNotFoundException is handled (line 189 in
org.apache.cocoon.servlet.RequestProcessor):


at org.apache.cocoon.servlet.RequestProcessor.process(RequestProcessor.java:351)
at org.apache.cocoon.servlet.RequestProcessor.service(RequestProcessor.java:169)
at org.apache.cocoon.sitemap.SitemapServlet.service(SitemapServlet.java:84)


Thanks
Ian



[1] org.apache.cocoon.servlet.RequestUtil.manageException(HttpServletRequest,
HttpServletResponse, Environment, String, int, String, String, String,
Exception, ServletSettings, boolean, Object)

[2] org.apache.cocoon.servlet.RequestProcessor line 200:

    RequestUtil.manageException(request, res, env, uri,
                                HttpServletResponse.SC_NOT_FOUND,
                                "Resource Not Found",
                                "Resource Not Found",
                                "The requested resource \"" +
                                request.getRequestURI() + "\" could not be found",
                                e,
                                this.servletSettings,
                                getLogger(), this);
    return;

Ian Boston

Aug 26, 2015, 10:15:45 AM
to Tim Donohue, DSpac...@lists.sourceforge.net
Hi Tim, Helix,
Thanks for your help. I have fixed this by setting the cocoon.handled
level to FATAL and changing the XSL as you suggested. Since the errors
are Cocoon-handled, they will still appear in the UI where appropriate,
which is better than filling the logs up. I managed to create almost 2G
of logs yesterday.
Thanks
Ian
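
For readers applying the same workaround, here is a minimal sketch of the logger override described above. It assumes XMLUI logging is configured through log4j 1.x properties (in a stock install this is usually a log4j.properties file under the DSpace config directory; that location is an assumption, not something stated in this thread). The logger name cocoon.handled is taken from the log excerpt earlier in the thread.

```properties
# Assumption: log4j 1.x properties syntax.
# Keep only FATAL messages from the cocoon.handled logger, so the
# ERROR-level "Failed to process pipeline" tracebacks are not written.
log4j.logger.cocoon.handled=FATAL
```

Because the override names only cocoon.handled, other cocoon.* loggers keep whatever level they already have, which is narrower than raising the whole Cocoon log to FATAL.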

Tim Donohue

Aug 26, 2015, 10:16:13 AM
to Ian Boston, DSpac...@lists.sourceforge.net
The DSpace "Page Not Found" page is rendered from the main sitemap, here:
https://github.com/DSpace/DSpace/blob/master/dspace-xmlui/src/main/webapp/sitemap.xmap#L673

- Tim

Bill T

Sep 10, 2015, 11:06:19 AM
to DSpace Technical Support, DSpac...@lists.sourceforge.net, ib...@cam.ac.uk
I just revisited this old post; I had encountered this problem before, and "solved" it by setting my cocoon log level to FATAL. I was reminded of this while working on a dev site where I am the only user right now. With the log level set to INFO or DEBUG, I see that this error is logged for every page I visit. Apart from this error in the cocoon log, everything seems fine: the page renders completely, and there are no client-side warnings.

The dev site is DSpace 5.3, but clearly it is happening with 4.x as well... It occurs to me that changing the log level is just a band-aid, and that there is something deeper happening.

Is there any way to determine exactly what is not being found?

Regards,
Bill

Bill T

Sep 11, 2015, 1:40:02 PM
to DSpace Technical Support, DSpac...@lists.sourceforge.net, ib...@cam.ac.uk
Vaguely of interest: I have discovered that whenever a request results in a server response of 302 (in my case /login, /shibboleth-login, and /logout), this "Failed to process pipeline" error is generated in the cocoon logs. Apart from generating a whole lot of text, the error is harmless. I'm still going to set the log level to FATAL to avoid logging all this, but I'm still looking for a way to add the culprit URL to the error message to help me decide whether it's really a meaningful error or not...
Cheers!
Bill

Chris Wilper

Sep 12, 2015, 1:37:10 PM
to DSpace Technical Support, DSpac...@lists.sourceforge.net, ib...@cam.ac.uk
Hi Bill, 

The redirect-related noisiness is a known issue [1]. I see it all the time and it's bothered me too, so I looked into it and submitted a pull request for what I think is a reasonable code fix, linked from the issue in JIRA.

I couldn't find a JIRA issue for the related problem alluded to above, which is that overly verbose log messages appear in the Cocoon logs when pages are not found (HTTP 404). So I created one [2]... hopefully it's not a dupe. As I think Tim mentioned earlier in the thread, an ultimate fix for the 404 noisiness might require some changes to one of the Cocoon libraries. I recently made an attempt at fixing it solely within the DSpace code, but my approach had undesirable side effects with caching. I may give it another shot soon. In the meantime, even with the log level set to FATAL, you should be able to see evidence of 404s in the Tomcat logs ($TOMCAT_HOME/logs/localhost_access_log.*).

- Chris
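
As a rough illustration of the access-log suggestion above, the sketch below counts 404 responses per request path. It assumes Tomcat is writing its access log in the default "common" pattern (%h %l %u %t "%r" %s %b); the class name NotFoundCounter and its methods are hypothetical helpers written for this example and are not part of DSpace or Tomcat.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: counts 404 responses per request path in an access log. */
public class NotFoundCounter {

    // Assumption: Tomcat "common" access log pattern: %h %l %u %t "%r" %s %b
    // Group 1 is the request path, group 2 is the HTTP status code.
    private static final Pattern LINE = Pattern.compile(
        "^\\S+ \\S+ \\S+ \\[[^\\]]*\\] \"\\S+ (\\S+)[^\"]*\" (\\d{3}) \\S+$");

    /** Returns the request path if the line records a 404, otherwise null. */
    public static String pathIf404(String line) {
        Matcher m = LINE.matcher(line);
        if (m.matches() && "404".equals(m.group(2))) {
            return m.group(1);
        }
        return null;
    }

    /** Counts 404 responses per request path, sorted by path. */
    public static Map<String, Integer> count404s(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String path = pathIf404(line);
            if (path != null) {
                counts.merge(path, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of one access log file as the only argument.
        if (args.length != 1) {
            System.err.println("usage: java NotFoundCounter <access-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        count404s(lines).forEach((path, n) -> System.out.println(n + "\t" + path));
    }
}
```

Run it against a single file matching $TOMCAT_HOME/logs/localhost_access_log.* to get one line per missing path with its hit count. Lines that do not match the assumed pattern are silently skipped, so a customized access log pattern would need a different regular expression.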
