Re: GWT Crawlable (SEO)


Jens

Dec 28, 2012, 3:32:48 PM
to google-we...@googlegroups.com
The servlet filter exists only for the crawler, and the crawler does not navigate your app using PlaceChangeEvents. It simply requests a URL, which hits your server and therefore your servlet filter.

If the bot finds a hyperlink like "#!myPlace", it requests "http://domain.com/?_escaped_fragment_=myPlace" from your server, so the request does reach the server side. If you have two servers (a dedicated web server for static content plus an application server), you have two options. Either proxy the request to your application server whenever the URL contains the _escaped_fragment_ query parameter, so that the application server can generate an HTML snapshot on the fly, or pre-generate all possible snapshots, serve them directly from the dedicated web server, and update them regularly using a cron job.
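
To make the URL mapping concrete, here is a minimal sketch in plain Java (10 or later) of the rewrite the crawler performs and of the check a front-end server could use to decide whether to hand a request to the snapshot generator. The class and method names are hypothetical and not part of GWT or the servlet API. Note that URLEncoder escapes more characters than the crawler itself does, so the first method only approximates the crawler's output.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper; names are illustrative, not from GWT or the servlet API. */
final class EscapedFragmentUrls {
    private static final String PARAM = "_escaped_fragment_";

    private EscapedFragmentUrls() {}

    /** Rewrites a "#!" URL into the form the crawler requests from the server. */
    static String toCrawlerUrl(String prettyUrl) {
        int hashBang = prettyUrl.indexOf("#!");
        if (hashBang < 0) {
            return prettyUrl; // no crawlable fragment, nothing to rewrite
        }
        String base = prettyUrl.substring(0, hashBang);
        String fragment = prettyUrl.substring(hashBang + 2);
        // Append to an existing query string if there is one.
        String separator = base.contains("?") ? "&" : "?";
        // Assumption: URLEncoder is a stand-in; it escapes more than the crawler does.
        return base + separator + PARAM + "="
                + URLEncoder.encode(fragment, StandardCharsets.UTF_8);
    }

    /** Returns the decoded fragment if the raw query string carries the crawler parameter. */
    static Optional<String> extractFragment(String rawQuery) {
        if (rawQuery == null) {
            return Optional.empty();
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.startsWith(PARAM + "=")) {
                String value = pair.substring(PARAM.length() + 1);
                return Optional.of(URLDecoder.decode(value, StandardCharsets.UTF_8));
            }
        }
        return Optional.empty();
    }
}
```

A filter or proxy rule would call something like extractFragment on the incoming query string: an empty result means a normal browser request, while a present value names the place whose snapshot should be served.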

Basically your server needs to follow the spec described at:


-- J.

darkflame

Dec 29, 2012, 5:43:51 PM
to Google Web Toolkit
I solved this by effectively having a crudely laid out, but text/content-identical PHP system that serves the content when JavaScript isn't present and a ?_escaped_fragment_ URL is given.
As a bonus to making my site crawlable, this also makes it usable(ish) for those without JavaScript on.
> https://developers.google.com/webmasters/ajax-crawling/docs/specifica...
>
> -- J.

RyanZA

Dec 30, 2012, 3:36:03 PM
to google-we...@googlegroups.com
Check out this link, specifically point 3, on how to set up a servlet filter that works with _escaped_fragment_.
https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot

You now have two choices. In the link above, they use HtmlUnit's Java WebClient ( http://htmlunit.sourceforge.net/ ) to grab the HTML from your real page. The WebClient can execute JavaScript, and if your GWT page is fairly straightforward it should just work and return the right HTML to Googlebot with zero extra work beyond the web.xml config / filter on your end.

However, if WebClient can't get good data from your pages (you'll need to test using WebClient directly), you can inline some HTML for any pages that don't work, and you can make calls to your database/servlets to pull any dynamic data you might need. This output should be valid HTML, but it doesn't have to be pretty: just a <body>Lots of text here...</body> is enough for Googlebot to understand the content on your page.
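
As a rough illustration of the hand-written fallback, the sketch below builds a bare but valid HTML page from a title and some paragraphs of text, escaping anything that could break the markup. The PlainSnapshot class and its methods are hypothetical; where the text comes from (database, servlet call) is left out.

```java
import java.util.List;

/** Hypothetical fallback renderer for pages that WebClient cannot snapshot. */
final class PlainSnapshot {
    private PlainSnapshot() {}

    /** Escapes the characters that would otherwise be read as markup. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Renders an unstyled page: enough text for a crawler, nothing more. */
    static String render(String title, List<String> paragraphs) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html><html><head><title>")
            .append(escapeHtml(title))
            .append("</title></head><body>");
        for (String paragraph : paragraphs) {
            html.append("<p>").append(escapeHtml(paragraph)).append("</p>");
        }
        return html.append("</body></html>").toString();
    }
}
```

The filter would write the returned string to the response for any place where the WebClient output turned out to be unusable.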

On Thursday, December 27, 2012 1:36:17 PM UTC+2, Jan wrote:
Hi

I'd like to make my GWT app crawlable by the Google bot. I found this article (https://developers.google.com/webmasters/ajax-crawling/). It states there should be a servlet filter that serves a different view to the Google bot. But how can this work? If I use, for example, the activities and places pattern, then the page changes happen on the client side only and no servlet is involved, so a servlet filter would not work here.

Can someone give me an explanation? Or is there another good tutorial tailored to GWT on how to do this?

Thanks and best regards
Jan

Benjamin Possolo

Dec 31, 2012, 4:56:05 AM
to google-we...@googlegroups.com
Jan

You seem to be kind of trolling: a duplicate thread was created 5 days ago by yourself, and you clearly didn't bother to use the search feature, but I will answer anyway...

Just copy the filter I wrote: