I have a servlet Filter:
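import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class CrawlServlet implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        HttpServletRequest httpRequest = (HttpServletRequest) request;
        String requestQueryString = httpRequest.getQueryString();
        if ((requestQueryString != null) && (requestQueryString.contains("_escaped_fragment_"))) {
            String str = "http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!article";
            // This URL works fine: opening it in a browser shows the article.
            final WebClient webClient = new WebClient();
            HtmlPage page = webClient.getPage(str);
            PrintWriter out = response.getWriter();
            out.println(page.asXml());
        } else {
            // Not a crawler request: pass it down the chain unchanged.
            chain.doFilter(request, response);
        }
    }

    @Override public void init(FilterConfig filterConfig) {}

    @Override public void destroy() {}
}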
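The filter hardcodes the "#!" URL it hands to HtmlUnit. As a minimal sketch only, the URL could instead be rebuilt from the incoming query string. The class name EscapedFragmentUrls and the method toHashBangUrl are hypothetical names chosen for illustration, and the sketch assumes that _escaped_fragment_ is the last parameter in the query string:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not part of GWT or HtmlUnit.
public final class EscapedFragmentUrls {
    private static final String PARAM = "_escaped_fragment_=";

    /**
     * Rebuilds the "#!" URL from a base URL and the raw query string.
     * Returns null when the query has no _escaped_fragment_ parameter.
     * Assumes _escaped_fragment_ is the last parameter in the query.
     */
    public static String toHashBangUrl(String baseUrl, String query) {
        if (query == null) {
            return null;
        }
        int start = query.indexOf(PARAM);
        if (start < 0) {
            return null;
        }
        // Everything before the parameter stays in the query string;
        // drop the separator ("&" or a stray "?") that preceded it.
        String rest = query.substring(0, start);
        if (rest.endsWith("&") || rest.endsWith("?")) {
            rest = rest.substring(0, rest.length() - 1);
        }
        String fragment;
        try {
            // The fragment value arrives URL-encoded, so decode it.
            fragment = URLDecoder.decode(query.substring(start + PARAM.length()), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
        return baseUrl + (rest.isEmpty() ? "" : "?" + rest) + "#!" + fragment;
    }

    public static void main(String[] args) {
        System.out.println(toHashBangUrl(
                "http://127.0.0.1:8888/Myproject.html",
                "gwt.codesvr=127.0.0.1:9997&_escaped_fragment_=article"));
        // prints http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!article
    }
}
```

In the filter, the base URL could come from httpRequest.getRequestURL().toString() and the query from httpRequest.getQueryString().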
I then ran my app in Eclipse and opened the URL http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997?_escaped_fragment_=article
I got this error in the Eclipse console:
.GWTUpld .gwt-Button:HOVER, .GWTUpld .DecoratedFileUpload .gwt-Button-over, .GWTUpld .DecoratedFileUpload .gwt-Anchor-over, .GWTUpld .DecoratedFileUpload .gwt-Label-over { color: #af6b29; }
......
.GWTUpld input[type="file"] { cursor: pointer; }
': null
java.util.EmptyStackException
    at java.util.Stack.peek(Unknown Source)
    at java.util.Stack.pop(Unknown Source)
    at com.steadystate.css.parser.CSSOMParser$CSSOMHandler.endDocument(CSSOMParser.java:271)
    at com.steadystate.css.parser.AbstractSACParser.handleEndDocument(AbstractSACParser.java:456)
    at com.steadystate.css.parser.SACParserCSS3.styleSheet(SACParserCSS3.java:56)
    at com.steadystate.css.parser.AbstractSACParser.parseStyleSheet(AbstractSACParser.java:284)
    at com.steadystate.css.parser.SACParserCSS3.parseStyleSheet(SACParserCSS3.java:23)
    at com.steadystate.css.parser.CSSOMParser.parseStyleSheet(CSSOMParser.java:146)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.parseCSS(CSSStyleSheet.java:818)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.<init>(CSSStyleSheet.java:179)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.loadStylesheet(CSSStyleSheet.java:321)
    at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLLinkElement.getSheet(HTMLLinkElement.java:130)
    at com.gargoylesoftware.htmlunit.javascript.host.css.StyleSheetList.item(StyleSheetList.java:151)
    at com.gargoylesoftware.htmlunit.javascript.host.Window.getComputedStyle(Window.java:1601)
    at com.gargoylesoftware.htmlunit.javascript.host.Element.getCurrentStyle(Element.java:545)
    at com.gargoylesoftware.htmlunit.html.DomNode.isDisplayed(DomNode.java:712)
    at com.gargoylesoftware.htmlunit.WebClient$CurrentWindowTracker.webWindowContentChanged(WebClient.java:1671)
    at com.gargoylesoftware.htmlunit.WebClient.fireWindowContentChanged(WebClient.java:742)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:440)
    at com.gargoylesoftware.htmlunit.WebClient.loadDownloadedResponses(WebClient.java:2024)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:712)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.access$500(JavaScriptEngine.java:92)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:679)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:602)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:507)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:616)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:591)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:985)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptFunctionJob.runJavaScript(JavaScriptFunctionJob.java:53)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptExecutionJob.run(JavaScriptExecutionJob.java:102)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptJobManagerImpl.runSingleJob(JavaScriptJobManagerImpl.java:328)
    at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:162)
    at java.lang.Thread.run(Unknown Source)
May 16, 2014 1:36:41 PM com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet parseCSS
SEVERE: Error parsing CSS from '/**
 * The file contains styles for GWT widgets in the standard theme.
 *
 * In order to maintain cross-browser compatibility, the following syntax is
 * used to create IE6 specific style rules:
 *    .gwt-Widget {
 *      property: rule applies to all browsers
 *      -property: rule applies only to IE6 (overrides previous rule)
 *    }
 *
 *    html .gwt-Widget {
 *      property: rule applies to all versions of IE
 *    }
 */
.......
': null
java.util.EmptyStackException
    at java.util.Stack.peek(Unknown Source)
    at java.util.Stack.pop(Unknown Source)
    at com.steadystate.css.parser.CSSOMParser$CSSOMHandler.endDocument(CSSOMParser.java:271)
    at com.steadystate.css.parser.AbstractSACParser.handleEndDocument(AbstractSACParser.java:456)
    at com.steadystate.css.parser.SACParserCSS3.styleSheet(SACParserCSS3.java:56)
    at com.steadystate.css.parser.AbstractSACParser.parseStyleSheet(AbstractSACParser.java:284)
    at com.steadystate.css.parser.SACParserCSS3.parseStyleSheet(SACParserCSS3.java:23)
    at com.steadystate.css.parser.CSSOMParser.parseStyleSheet(CSSOMParser.java:146)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.parseCSS(CSSStyleSheet.java:818)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.<init>(CSSStyleSheet.java:179)
    at com.gargoylesoftware.htmlunit.javascript.host.css.CSSStyleSheet.loadStylesheet(CSSStyleSheet.java:321)
    at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLLinkElement.getSheet(HTMLLinkElement.java:130)
    at com.gargoylesoftware.htmlunit.javascript.host.css.StyleSheetList.item(StyleSheetList.java:151)
    at com.gargoylesoftware.htmlunit.javascript.host.Window.getComputedStyle(Window.java:1601)
    at com.gargoylesoftware.htmlunit.javascript.host.Element.getCurrentStyle(Element.java:545)
    at com.gargoylesoftware.htmlunit.html.DomNode.isDisplayed(DomNode.java:712)
    at com.gargoylesoftware.htmlunit.WebClient$CurrentWindowTracker.webWindowContentChanged(WebClient.java:1671)
    at com.gargoylesoftware.htmlunit.WebClient.fireWindowContentChanged(WebClient.java:742)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:440)
    at com.gargoylesoftware.htmlunit.WebClient.loadDownloadedResponses(WebClient.java:2024)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:712)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.access$500(JavaScriptEngine.java:92)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:679)
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:602)
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:507)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:616)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:591)
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:985)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptFunctionJob.runJavaScript(JavaScriptFunctionJob.java:53)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptExecutionJob.run(JavaScriptExecutionJob.java:102)
    at com.gargoylesoftware.htmlunit.javascript.background.JavaScriptJobManagerImpl.runSingleJob(JavaScriptJobManagerImpl.java:328)
    at com.gargoylesoftware.htmlunit.javascript.background.DefaultJavaScriptExecutor.run(DefaultJavaScriptExecutor.java:162)
    at java.lang.Thread.run(Unknown Source)
Is this because I have a /*** ***/ comment in the CSS, and that is why HtmlUnit could not parse it?
How can I fix this issue?
Also, how can we make sure HtmlUnit will work with all kinds of HTML?
Note: I am using htmlunit-2.14
In !stack.empty() && stack.peek(), it is stack.peek() that throws this
exception. It may be that, after the current thread has checked !empty(),
another thread has already consumed the element before peek() runs (a
check-then-act race). Maybe the check and the peek need to be done
together in a synchronized block?
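The check-then-act race suggested above can be sketched on a plain java.util.Stack. This is only an illustration of the general pattern, not the CSS parser's actual code; SafePeek and peekOrNull are hypothetical names. Because Stack's own methods synchronize on the stack instance, holding the same monitor around both calls keeps another thread from popping between the check and the peek:

```java
import java.util.Stack;

// Hypothetical helper illustrating an atomic "check, then peek".
public final class SafePeek {
    /**
     * Returns the top element, or null if the stack is empty.
     * The empty() check and the peek() run under the stack's own monitor,
     * so a concurrent pop() cannot slip in between them.
     */
    public static <T> T peekOrNull(Stack<T> stack) {
        synchronized (stack) {
            return stack.empty() ? null : stack.peek();
        }
    }

    public static void main(String[] args) {
        Stack<String> stack = new Stack<String>();
        System.out.println(peekOrNull(stack)); // prints null
        stack.push("rule");
        System.out.println(peekOrNull(stack)); // prints rule
    }
}
```

This only helps if every thread that touches the stack goes through the same monitor; it does not show whether the parser in the stack trace is actually shared between threads.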
When I open http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997?_escaped_fragment_=article, I can see that Chrome shows the raw HTML page like this