I searched the user forum and looked at several posts on SEO (search
engine optimization) issues with GWT apps. Essentially, GWT-based
applications are more like web "apps" than traditional web "sites",
which are much friendlier to search engine crawlers and spiders.
Now, GWT is very productive for creating dynamic web applications. My
questions:
* How have people addressed the SEO issue with GWT apps?
* I have seen many people create very powerful web apps with GWT. How do
you provide access to SEO crawlers?
Joster
1. You got the SEO angle covered (as long as you serve up the view-
only version if the User-Agent contains 'bot' of some sort), and
2. You got non-compatible browsers mostly covered. Make your target
REALLY SMALL displays  because the browsers that will be getting your
view-only mode are mobile phones and stuff like netscape v1.0 on a
640x480 screen, or w3m/links/lynx. Even if a mobile phone could
somehow run your AJAX app (e.g., iPhone), you can't design an
interface to be well designed both for 400x300 and 1280x1024 screens,
it just can't be done.
As far as I know (but I don't work for Google and I'm not a lawyer),
showing a differently built/styled version of the exact same content
is not a violation of Google's search spidering rules. I believe an
analytics account can help you in that it informs you of any
spidering problems.
It's unfortunate that this isn't exactly easy.
This is an interesting concept. A follow-up question: how do I go
about creating a view-only version of my web app? And how do I serve
the view-only version to 'bot' user-agents?
Please advise.
Joster
e.g: The hook redirects to the view only version using
document.location =.
in the <noscript> tag, you add a link to the view-only version as a
failsafe, and finally:
When serving up the MyProject.html file, use a servlet (or PHP script,
or whatever you like) to inspect the User-Agent string, and depending
on it, serve up the GWT-activating MyProject.html file, or the view
only version.
The User-Agent parser should only redirect to 'view-only' mode for
obvious User-Agents, like anything containing 'BOT' in any
capitalization, links, lynx, w3m.
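A minimal sketch of the User-Agent check described above, written in JavaScript purely for illustration (the post suggests a servlet or PHP script; the same logic ports directly). The names `wantsViewOnly`, `pageFor` and `view-only.html` are assumptions, and the token list is built only from the examples given in the post: anything containing 'bot' in any capitalization, plus links, lynx and w3m.

```javascript
// Hypothetical token list, taken from the examples in the post; not exhaustive.
var VIEW_ONLY_TOKENS = ["bot", "links", "lynx", "w3m"];

// Hypothetical helper: true when the User-Agent looks like a crawler
// or a text-mode browser that should get the view-only page.
function wantsViewOnly(userAgent) {
  // A missing User-Agent is treated as a regular browser here (an assumption).
  if (!userAgent) return false;
  var ua = userAgent.toLowerCase(); // match 'BOT' in any capitalization
  for (var i = 0; i < VIEW_ONLY_TOKENS.length; i++) {
    if (ua.indexOf(VIEW_ONLY_TOKENS[i]) !== -1) return true;
  }
  return false;
}

// Pick which file to serve. "view-only.html" is a made-up name;
// MyProject.html is the GWT host page mentioned in the post.
function pageFor(userAgent) {
  return wantsViewOnly(userAgent) ? "view-only.html" : "MyProject.html";
}
```

Plain substring matching is deliberately crude: it only catches the obvious cases, which is what the post recommends, and everything else falls through to the GWT page.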
The read-only version is just something you cook up completely outside
of GWT. Use whatever people have been using to build websites before
AJAX and JS stuff became popular. This isn't as hard as it sounds, as
you only need to generate the data with a few links to other read-only
pages, no edit capability. Servlets, JSPs, PHP scripts, whatever
strikes your fancy.
http://google.about.com/od/searchengineoptimization/tp/badseo.htm
Any official word from the GWT developers? In the current age, where
the market is flooded with ad-driven web apps and web sites, I am sure
this aspect of GWT is very important.
Joster
Another option is to develop your site in a static way (no AJAX), and
to wrap existing divs in widgets (see the GWT widgets library) in order
to enhance them with GWT.
Another thing: do not use AJAX to display the content on the main
page or to do the navigation; use AJAX only to modify or update data.
You can test your code with utilities like a "spider simulator" to see
what will be seen by bots.
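As a rough illustration of what such a tool reports, the sketch below approximates what a crawler that does not execute JavaScript might extract from a page: it drops script and style blocks, strips the remaining tags, and collapses whitespace. The function name `textSeenByBot` is hypothetical, and real crawlers are considerably more sophisticated, so treat this only as a quick local sanity check.

```javascript
// Hypothetical helper: approximate the text a non-JavaScript crawler would see.
function textSeenByBot(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ") // scripts are never run
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, " ")   // styles carry no content
    .replace(/<[^>]+>/g, " ")                            // strip remaining tags
    .replace(/\s+/g, " ")                                // collapse whitespace
    .trim();
}
```

A host page whose body contains nothing but a script tag yields an empty string here, which is the problem being discussed: without static markup there is no text for a bot to index.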
On Jul 4, 9:36 am, "Ian Bambury" <ianbamb...@gmail.com> wrote:
> Yep, they banned BMW for doing this. They may not get around to you, but it
> is against the rules.
>
> My solution is to hold a lot of the text in hidden divs in the main html
> index page. You can organise it so it formats well on text-only,
> non-JavaScript browsers and is therefore also available for web crawlers. It
> also means that you can change text without a recompile, and it is
> accessibility-friendly. I find it easier to write in plain text rather than
> in strings in Java. Internationalisation works, and you can palm off writing
> the content to people who aren't Java programmers.
>
> Ian
>
Those articles from about.com are pretty superficial (most of them
are) and I would suggest you look for more detailed articles on the
subject.
One of the best articles that I've read on the subject is this
whitepaper from Backbase:
http://www.backbase.com/download/articles/DesigningRIAsForSearchEngineAccessibility.pdf
which goes over the concept and implementation details in some depth.
This author seems to think that GWT's hashed fragments in the URL for
history management are the solution for SEO, but I think not.
http://seoblog.intrapromote.com/2006/05/seo_considerati.html
Sanjiv
The about.com article is hardly canon. As long as your INTENT is in
the right place, and the content is exactly the same, there should be
no problem.
A follow-up question: has someone already implemented such a strategy
for a GWT app and been successful in getting their site/content
indexed by the Google search engine?
Joster
Which of these did you adopt? Could you please share what you
learned, any BKMs?
Joster
On Jul 5, 12:31 am, "Ian Bambury" <ianbamb...@gmail.com> wrote:
> If you do a search for
>
> "ian bambury" trek 2008
>
> you'll find a site that has done this and is indexed by Google and others
>
http://googlewebmastercentral.blogspot.com/2007/07/best-uses-of-flash.html
http://ventureskills.wordpress.com/2007/07/06/cloaking-is-ok-says-google/
Sanjiv
I am most interested in the "Secondary Site Strategy", as it best suits
our application's needs for SEO.
Has anyone successfully implemented the "Secondary Site Strategy" for
making Single Page Interface (SPI) search engine accessible?
Is it possible to come up with a generic library that is applicable
to any GWT app to help with this issue? I am sure this would benefit
the entire GWT community.
By any chance, is this something in the road-map of future GWT
releases?
Joster
On Jul 7, 7:24 am, "Ian Bambury" <ianbamb...@gmail.com> wrote:
> I didn't use any of those. Like I said in my previous post, I just stuck the
> text in a hidden div. I've said a bit more about it in a thread "Static
> text/HTML [Was: HTML templating vs Java Coding]"
> --
> Ian http://roughian.com
Thanks for the follow-up. This looks very interesting, and it is great
work indeed!
Do you detect that the user-agent is a search bot and direct it to the
text/HTML pages? And if it is not a search bot, do you direct it to your
main GWT app? Sorry, I was unable to understand your approach exactly;
could you please elaborate?
I am very interested in how you implemented this as well.
Joster
On Jul 24, 1:43 pm, "Ian Bambury" <ianbamb...@gmail.com> wrote:
> Hi Joster,
>
> A followup...
>
> I put up my replacement examples site 9 days ago, Google has found most of
> the new pages now, and thought you might be interested in some checks I've
> just made.
>
> The new site gets text from static html pages and puts it in the GWT app
> simply because a lot of text would slow down the initial load and I don't
> want casual visitors to give up :-)
>
> These other HTML pages are browsable by text-only, non-JavaScript browsers,
> and screen-readers. If you go there with a JavaScript-enabled browser you
> get redirected to the correct page in the GWT app. These pages are linked to
> from the main index page.
>
> Anyway, the reason for writing is that you asked at one point "has someone
> already implemented such a strategy for GWT app and has been successful in
> getting their site/content indexed by Google search engine?"
>
> These are the rankings for http://examples.roughian.com if you search
> for "gwt examples [widgetname]" (no quotes)
>
> Ian
>
> Search Text                      Ranking
> gwt examples FocusPanel                1
> gwt examples HTMLPanel                 1
> gwt examples VerticalPanel             1
> gwt examples FlowPanel                 1
> gwt examples DisclosurePanel           1
> gwt examples DeckPanel                 1
>
> gwt examples StackPanel                2
> gwt examples DockPanel                 2
> gwt examples CellPanel                 2
> gwt examples Grid                      2
>
> gwt examples AbsolutePanel             3
> gwt examples CheckBox                  3
>
> gwt examples FlexTable                 5
> gwt examples Composite                 5
> gwt examples ScrollPanel               5
> gwt examples DialogBox                 5
> gwt examples Hidden                    5
>
> gwt examples FileUpload                6
> gwt examples Frame                     6
> gwt examples HorizontalPanel           6
>
> gwt examples FormPanel                 7
> gwt examples TabPanel                  7
>
> gwt examples Button                   10
> gwt examples HTML                     14
>
> --
> Ian http://examples.roughian.com
That is a very ingenious way to solve this issue. I've been keeping my
large static blocks of HTML in hidden divs on the home page and
displaying them (in a different div) as needed. Your idea makes much
more sense, though, as it is easier to maintain and organize, and it
solves the SEO issue. Would you be willing to share your redirect.js
script? I'm sure for a JavaScript programmer it would be easy to
create it, but I'm not a JavaScript programmer.
Thanks.
Dennis
> ClientUI_DeckPanel.htm to http://www.examples.roughian.com/#ClientUI_DeckPanel
>
> ==========
>
> The effect, if you have managed to follow me this far (or even if you
> haven't), is that
>
>    - Non-JS things see the underlying set of web pages - i.e. the content
>    without the GWT widgets
>    - Bots collect the underlying set of web pages
>    - JS-enabled things going to http://www.examples.roughian.com get the
>    GWT site and see the content PLUS any GWT widgets
>    - JS-enabled things going to one of the underlying set of web pages
>    (from, say, a search engine) get the right page of the GWT site
>
> It probably sounds terribly complicated, but in practice once the framework
> is in place
>
>    - You write an HTML page with the non-GWT HTML content and slots for
>    GWT widgets (if any)
>    - You add a link to it in the home page
>    - You create a Java class to add any GWT widgets you want to the slots
>    - You stick an entry in the GWT menu
>
> You only need to speed things up if you have casual visitors
> It is only any good for speeding up loading, if you have great swathes of
> STATIC text
> It's only any good for search engines if the content is relevant
>
> But also, on the plus side,
>
>    - you don't have problems keeping two sets of pages in sync since they
>    are one and the same
>    - you could add content for non-JS visitors which isn't displayed to
>    GWT visitors to make the SEO better
>    - you are not detecting bots and serving different pages so you are in
>    the clear with the rules
>
> If you actually read this far and have any questions, let me know
> Ian
>
===========================================================================================
Each of the 'ghost' HTML pages looks something like this
===========================================================================================
 // Fragment of a page class: msgs, title, clear(), add(), buildPage(),
 // setPageNeedsCreating(), getHistoryToken() and urchinTracker() are
 // defined elsewhere and not shown here.
 String url;
 HTMLPanel panel;

 // Fetch this page's 'ghost' HTML file, showing a placeholder while it loads.
 public void create()
 {
  url = Utilities.downloadURL(this);
  try
  {
   clear();
   add(panel = new HTMLPanel(msgs.msgContactingServer()));
   RequestBuilder builder = new RequestBuilder(RequestBuilder.GET, url);
   builder.setHeader("Content-Type", "application/x-www-form-urlencoded");
   builder.sendRequest(null, callback);
  }
  catch (RequestException e)
  {
   Window.alert("Fetch of '" + url + "' - Failed with " + e.getMessage());
  }
 }

 RequestCallback callback = new RequestCallback()
 {
  public void onError(Request request, Throwable e)
  {
   Window.alert("Error: " + e.getMessage());
  }

  public void onResponseReceived(Request request, Response response)
  {
   // The page title is whatever sits between <title> and </title>.
   String titleArray[] = response.getText().split("<title>|</title>");
   if (titleArray.length < 2)
   {
    // No title found: treat the response as a failed fetch.
    clear();
    add(panel = new HTMLPanel(msgs.msgCannotContactServer()));
    setPageNeedsCreating(true);
    return;
   }
   title = titleArray[1];
   Window.setTitle(title);
   // Parse the response and walk the top-level elements to find id="page".
   // This assumes every ghost page contains such an element.
   HTML bodyhtml = new HTML(response.getText());
   Element bodycontents = bodyhtml.getElement();
   Element elem = DOM.getFirstChild(bodycontents);
   while (!DOM.getElementProperty(elem, "id").equals("page"))
    elem = DOM.getNextSibling(elem);
   clear();
   add(panel = new HTMLPanel(elem.toString()));
   buildPage();
   add(new HTML(msgs.pageFooter()));
   urchinTracker(getHistoryToken());
  }
 };
 
===========================================================================================
 Utilities.downloadURL() looks like this
===========================================================================================
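 public static String downloadURL(Object o)
 {
  String url = getBaseURL();
  // Use the simple class name of the page object as the file name,
  // e.g. ClientUI_DeckPanel -> ClientUI_DeckPanel.<locale>.htm
  String[] classpath = GWT.getTypeName(o).split("[.]");
  url += classpath[classpath.length - 1] + "." + msgs.infoLocale() + ".htm";
  return url;
 }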
===========================================================================================
 getBaseURL() looks like this (it's just a way of choosing a particular web server
 on my local machine so I can have a number of projects all working at the same time)
===========================================================================================
 public final static String getBaseURL()
 {
  String url = GWT.getModuleBaseURL();
  String bits[] = url.split("/");
  // In hosted mode (localhost:8888), fetch pages from a local web server instead.
  if (bits[2].equals("localhost:8888")) url = "http://localhost:1080";
  return url + "/";
 }
 
 
===========================================================================================
 redirect.js looks like this
===========================================================================================
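// Derive the history token and locale from a page name like Main_Home.en.htm
var filebits = location.pathname.split("/");
var filename = filebits[filebits.length - 1];
filebits = filename.split(".");
var token = filebits[0];
var locale = "_" + filebits[1];
// English is the default: index.htm rather than index_en.htm
// (comparison, not assignment, so other locales keep their suffix)
if (locale == "_en") locale = "";
var hostname = location.host;
var url = "http://" + hostname + "/index" + locale + ".htm" + location.search + "#" + token;
// Appending ?text to the page's URL suppresses the redirect
if (location.search != '?text') location.replace(url);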
===========================================================================================
I simplified what I do slightly: Main_Home.htm is really Main_Home.en.htm so I can add
translations of the text at some point, and all messages are held as
com.google.gwt.i18n.client.Messages, so you can strip that bit out if you don't need it.
Also, the last line means you can append ?text to the page's URL and you won't get redirected.
This is useful when developing (in Windows and IE, anyway) because you have to restart
hosted mode before changes to the page will appear; with ?text, you can see it in IE
straight away without getting redirected.
HTH, if you need any more info or explanation, drop me a line
Ian
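The mapping between a ghost page and its GWT history token, as described above, can be sketched as a pair of pure functions that are easy to test outside a browser. The names `ghostPageName` and `redirectTarget`, and the `loc` parameter (an object with `pathname`, `search` and `host`, shaped like `window.location`), are assumptions for illustration and not part of the original script. The sketch also assumes the `_en` check is meant as a comparison.

```javascript
// Hypothetical helper mirroring Utilities.downloadURL():
// fully qualified class name + locale -> ghost page file name.
function ghostPageName(className, locale) {
  var parts = className.split(".");
  return parts[parts.length - 1] + "." + locale + ".htm";
}

// Hypothetical helper mirroring redirect.js: returns the GWT URL for a
// ghost page, or null when ?text asks for the plain page to be shown.
function redirectTarget(loc) {
  if (loc.search === "?text") return null;
  var pathBits = loc.pathname.split("/");
  var nameBits = pathBits[pathBits.length - 1].split(".");
  var token = nameBits[0];            // e.g. "ClientUI_DeckPanel"
  var locale = "_" + nameBits[1];     // e.g. "_en"
  if (locale === "_en") locale = "";  // English uses index.htm
  return "http://" + loc.host + "/index" + locale + ".htm" + loc.search + "#" + token;
}
```

In a page, the result would be passed to `location.replace()` when it is not null, so the ghost page does not stay in the browser history.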
--------------------------------------------
Dennis