Discussions > Crawling, indexing, and ranking > Stemming of complex plurals
Messages 1 - 25 of 55
From: beckysharpe
Date: Fri, 2 May 2008 10:45:59 -0700 (PDT)
Local: Fri, May 2 2008 1:45 pm
Subject: Stemming of complex plurals
Hi All

Just for general interest: until recently (agreed, I may have been
wearing some kind of blindfold and not noticed!) I believed, from
tracking back searches, that stemming only covered simple pluralising,
verbalising, etc.  So swims, swimmers, swimming etc. reduce to "swim"
as a highlighted word if "swim" is the search term.  I didn't think
complex variations of the stem were recognised.

However, I've just seen the search term "cattery" being recognised (by
highlighting) as the singular of the complex plural "catteries".  Tbh,
I don't keep much of an eye on this site for trackback purposes, so
this could have been the case for a while - but interesting, I hope,
nonetheless ;-)  And what a huge development in linguistic
understanding by the googlebot.

Becky
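
For anyone who wants to play with the idea, here is a minimal sketch of suffix-stripping stemming plus highlighting. It is only an illustration: the rule list, `simple_stem` and `highlight` are made-up names for this example, and nothing here is claimed to be how Google actually stems or builds snippets.

```python
import re

# Toy suffix rules, tried in order: (suffix, replacement).
# These are assumptions for illustration, not a real stemmer's rule set.
RULES = [("ies", "y"), ("ers", ""), ("er", ""), ("ing", ""), ("s", "")]


def simple_stem(word: str) -> str:
    """Reduce a word to a crude stem by stripping one known suffix."""
    w = word.lower()
    for suffix, repl in RULES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            stem = w[: len(w) - len(suffix)]
            # Undo consonant doubling: "swimm" -> "swim".
            if (not repl and len(stem) >= 2 and stem[-1] == stem[-2]
                    and stem[-1] not in "aeiouls"):
                stem = stem[:-1]
            return stem + repl
    return w


def highlight(query: str, text: str) -> str:
    """Wrap every word in text that shares the query's stem in <b> tags."""
    target = simple_stem(query)

    def mark(match: re.Match) -> str:
        word = match.group(0)
        return f"<b>{word}</b>" if simple_stem(word) == target else word

    return re.sub(r"[A-Za-z]+", mark, text)


print(highlight("cattery", "Local catteries and kennels"))
```

With these toy rules, "swims", "swimmers" and "swimming" all reduce to "swim", and "catteries" reduces to "cattery", so a search for the singular would bold the plural in a snippet.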


From: Autocrat
Date: Fri, 2 May 2008 11:02:03 -0700 (PDT)
Local: Fri, May 2 2008 2:02 pm
Subject: Re: Stemming of complex plurals
Now you've done it.

Wait for the 'it's not possible' speech :D

Not exactly new... but I must admit... I'm not sure how long it's been
going on.
(I love the fact they show up in bold... makes my job that little bit
easier ;))


From: beckysharpe
Date: Fri, 2 May 2008 11:36:22 -0700 (PDT)
Local: Fri, May 2 2008 2:36 pm
Subject: Re: Stemming of complex plurals
Cheers, Auto - think I've just been super-boring/anal ;-)

From: Autocrat
Date: Fri, 2 May 2008 12:01:19 -0700 (PDT)
Local: Fri, May 2 2008 3:01 pm
Subject: Re: Stemming of complex plurals
Erm.... not going to say a word :D

:twiddles thumbs:
:looks skywards:

Bet we're not the only ones ;)


From: beckysharpe
Date: Fri, 2 May 2008 13:04:24 -0700 (PDT)
Local: Fri, May 2 2008 4:04 pm
Subject: Re: Stemming of complex plurals
I'm happy in my little world - and you're welcome to join me ;)) - or
not -( !

From: Autocrat
Date: Fri, 2 May 2008 15:05:39 -0700 (PDT)
Local: Fri, May 2 2008 6:05 pm
Subject: Re: Stemming of complex plurals
Happy to join you :D

From: Robbo
Date: Fri, 2 May 2008 18:35:21 -0700 (PDT)
Local: Fri, May 2 2008 9:35 pm
Subject: Re: Stemming of complex plurals
Becky

Further examples of what you are noting:

Search for: [ tesol accredit ] or [ tesol accrediting ] and you will
see [ accreditation ] highlighted.

Also, note that [ accredit ] does not appear on my site but
[ accredited ] and [ accreditation ] are used frequently.
Now if I google for:-
site:www.tesol-direct.com +accredit
it says Did you mean: site:www.tesol-direct.com +"accredited"

It seems that having FIRST detected that my site does NOT contain
[ accredit ] but has a closely related [ accredited ] it offers that
instead. It is not simply offering it because it is closely related;
it is offering it because the original term could not be found
(on that site).

If I repeat the same operation on a huge site, exact matches are found
and no such near-alternatives (Did you mean ...) are offered.
eg site:www.bbc.co.uk +accredit

So, what are the implications for improving SEO?

Robbo
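
The fallback behaviour described above can be sketched roughly as follows. This is an assumption-laden toy: `did_you_mean` is a hypothetical helper, the vocabulary lists are invented, and the standard library's `difflib.get_close_matches` merely stands in for whatever matching Google really uses.

```python
from difflib import get_close_matches


def did_you_mean(term, site_vocabulary):
    """Suggest a near variant only if the exact term is absent from the site."""
    vocab = {w.lower() for w in site_vocabulary}
    term = term.lower()
    if term in vocab:
        return None  # exact match found on the site, so nothing is offered
    # Otherwise fall back to the closest word the site does contain.
    matches = get_close_matches(term, sorted(vocab), n=1, cutoff=0.6)
    return matches[0] if matches else None


# Hypothetical small site: has the variants but not the bare verb.
small_site = ["accredited", "accreditation", "tesol", "course"]
# Hypothetical large site: contains the exact term.
large_site = ["accredit", "accredited", "news", "sport"]

print(did_you_mean("accredit", small_site))
print(did_you_mean("accredit", large_site))
```

On the small vocabulary the sketch offers "accredited"; on the large one it offers nothing, which mirrors the difference seen between the two site: searches.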


From: Gissit
Date: Sat, 3 May 2008 05:01:11 -0700 (PDT)
Local: Sat, May 3 2008 8:01 am
Subject: Re: Stemming of complex plurals
It goes a bit further than simple stemming. A UK search for "Smart Car
NOS" will show you a snippet with "Nitrous" highlighted in bold. The
word "NOS" is not used anywhere on the site or in anchor text that
links to the site. Google seems to be quite capable of ranking on
synonyms now. (NOS is a brand name for Nitrous systems).

Why would anyone be surprised at this? It makes perfect sense to me. In
the future maybe exact keywords and phrases in content will be less
important than the context of other words.


From: MrGamma
Date: Sat, 3 May 2008 05:14:03 -0700 (PDT)
Local: Sat, May 3 2008 8:14 am
Subject: Re: Stemming of complex plurals
I guess Google is discerning quality content by what variations of the
search term are present on a page... Maybe they will extract the
quality of content based on the unique meaning of the words when
compared against the search term?

Honestly... I'm not sure if Google uses links anymore to determine
relevancy... I think they just build a huge world wide word density
cloud and use that to abstract the depth of information for a search
term based on the frequency it appears within a community...

I mean... If I can within reason think that something like that was
actually possible... what's stopping a multi-billion dollar empire with
virtually unlimited resources from doing it... or at the very least
trying it... or more likely... doing something one hundred times more
accurate than anything I could dream of...


From: Phil Payne
Date: Sat, 3 May 2008 05:18:31 -0700 (PDT)
Local: Sat, May 3 2008 8:18 am
Subject: Re: Stemming of complex plurals

> It goes a bit further than simple stemming. A UK search for "Smart Car
> NOS" will show you a snippet with "Nitrous" highlighted in bold. The
> word "NOS" is not used anywhere on the site or in anchor text that
> links to the site.

Wrong.  Check out the cached copy:

These search terms have been highlighted:       smart   car
These terms only appear in links pointing to this page: nos

That's nothing new - I have several pages that do this.

And it's been occupying my mind a little for some time.

A link is traditionally regarded as a "vote" for the site, and is held
to increase its popularity.  Anchor text in such a link is a "vote"
for each of the keywords - is Google trying to improve the quality of
search results by actively responding to keywords that the author of a
particular page may simply have omitted?

It seems to me that anchor text keywords are only marginally less
important than <title> keywords, and as important if not more
important than any other keywords.
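
A rough sketch of the "anchor text as votes" idea, and of how a cached-copy style report could be produced. Everything here is assumed for illustration: `explain_match`, the page text and the anchor texts are invented, and this is not a description of Google's index.

```python
import re


def tokens(text):
    """Lower-cased set of alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def explain_match(query, page_text, inbound_anchor_texts):
    """Split query terms into those on the page and those only in link text."""
    on_page = tokens(page_text)
    in_links = set()
    for anchor in inbound_anchor_texts:
        in_links |= tokens(anchor)
    terms = re.findall(r"[a-z0-9]+", query.lower())
    return {
        "highlighted": [t for t in terms if t in on_page],
        "only_in_links": [t for t in terms
                          if t not in on_page and t in in_links],
    }


# Hypothetical page and a hypothetical inbound link's anchor text.
report = explain_match(
    "smart car nos",
    "Smart car nitrous kits and fitting guides",
    ["NOS for the Smart"],
)
print(report)
```

Here "smart" and "car" are reported as found on the page, while "nos" is reported as appearing only in links, even though the page author never wrote it.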


From: JohnMu (Google employee)
Date: Sat, 3 May 2008 05:19:10 -0700 (PDT)
Local: Sat, May 3 2008 8:19 am
Subject: Re: Stemming of complex plurals
Are you saying that we're doing a good job? :-) I'll pass it on to the
teams involved, it's nice to see that their work is not going
unnoticed!

One other element that plays a role in all of this is personalized
search. That can help with stemming, but it can also be really
interesting when it comes to disambiguation in general. "NOS" can be a
synonym for a kind of car enhancement technology, but it can also mean
a lot of other things: http://www.google.com/search?q=define%3Anos .
Words like "mercury" can be even more complicated:
http://www.google.com/search?q=define%3Amercury . If the search engine
knows that you're an astronomy-fan, maybe it will be able to help you
find the planet first.

How can we help you find things even better?

John


From: webado
Date: Sat, 3 May 2008 05:38:54 -0700 (PDT)
Local: Sat, May 3 2008 8:38 am
Subject: Re: Stemming of complex plurals

On May 3, 8:19 am, JohnMu wrote:

> How can we help you find things even better?

> John

You'll have to read the minds of webmasters and give them their site
regardless of how well or poorly they optimized it for the most
generic of search terms ;)

Oh, wait, I think you do this too, but they have to concentrate really
hard on that  LOL


From: beckysharpe
Date: Sat, 3 May 2008 06:16:45 -0700 (PDT)
Local: Sat, May 3 2008 9:16 am
Subject: Re: Stemming of complex plurals
Robbo

My first reaction is that if the bots are reading the intent of the
page as well as the content, then from an SEO viewpoint improving
content is the first priority, and bliss, we won't need to think about
creating content that uses the keyword in a multiplicity of
grammatical forms to cover all bases.

But that's for sites that are built and optimised at the same time,
where the designer has control of content.  It's making it
increasingly difficult to use off-page tactics to optimise a site; it
will be interesting to see what sort of strategies the SEO gurus come
up with to meet this challenge.

Becky


From: Gissit
Date: Sat, 3 May 2008 06:54:13 -0700 (PDT)
Local: Sat, May 3 2008 9:54 am
Subject: Re: Stemming of complex plurals
 > Wrong.  Check out the cached copy:

> These search terms have been highlighted:       smart   car
> These terms only appear in links pointing to this page: nos

Phil
Did you not read my post? There are NO LINKS WITH THIS IN THE ANCHOR
and the word Nitrous is bold in the snippet.

I have control of this site and I know when the subject was added and
I know when it was cached and the result appeared pretty soon after.

There is a link to the site from a nitrous forum (just the site name
in the anchor) and obviously they use the term NOS interchangeably
with Nitrous. Just maybe Google has worked out the context from other
sites that link in rather than actual anchor text.

G really is much smarter than people give it credit for. I have some
other hobby sites that rank extremely well just for their content as
they have only a couple of inbound links from forums.


From: Phil Payne
Date: Sat, 3 May 2008 07:04:11 -0700 (PDT)
Local: Sat, May 3 2008 10:04 am
Subject: Re: Stemming of complex plurals

> Did you not read my post? There are NO LINKS WITH THIS IN THE ANCHOR
> and the word Nitrous is bold in the snippet.

Check out the cached item for yourself:

http://66.102.9.104/search?q=cache:AzjYEZUY1zwJ:www.smartuki.com/+sma...

See "These terms only appear in links pointing to this page: nos"

That's from Google, not from me.  Whether or not you believe you do or
don't have nos in anchor text, someone somewhere obviously does.

So that page being found for 'smart car nos' is no surprise at all.
The highlighting of 'nitrous' is indeed interesting, but it's a
completely different issue since snippet building occurs during SERPs
assembly, not during searching.


From: webado
Date: Sat, 3 May 2008 07:05:34 -0700 (PDT)
Local: Sat, May 3 2008 10:05 am
Subject: Re: Stemming of complex plurals
Which site is that? I spot-checked a bunch which didn't have NOS in
title or description snippet either, but had it on the page itself or
in links pointing to the page.
I show 100 links per page so obviously I didn't test all of them.

From: Autocrat
Date: Sat, 3 May 2008 07:10:58 -0700 (PDT)
Local: Sat, May 3 2008 10:10 am
Subject: Re: Stemming of complex plurals
The 'appears in links' is kind of problematic.

I've got a couple of sites that apparently get listed for words due to
link text... yet I also know that no onsite links use that term... and
the sites in question have barely any inlinks... and looking through
with Yahoo... I do not see that term applied either.
(Not in the URL, not in the Text, not in the title attribute, not even
in text around the link).

Maybe... G has got clever and now uses that term as a 'blanket'?


From: Robbo
Date: Sat, 3 May 2008 08:01:50 -0700 (PDT)
Local: Sat, May 3 2008 11:01 am
Subject: Re: Stemming of complex plurals

> "These terms only appear in links pointing to this page: nos"

I don't think that such messages should be taken too literally.  I
believe that although inbound link text was the original and probably
still the main cause, this message is now used for other reasons too.

I interpret it to mean ~ "for reasons other than that keyword
appearing on the page itself".

From a logical and empirical point of view, it is of course generally
not possible to establish that there is no IBL with link text
=[keyword]. To prove that there is no such link text would involve
having access to the crawl results of EVERY page on the internet over
a long period of time.  The contrary is easy to prove: I only have to
find one page with that link text.

Robbo


From: Autocrat
Date: Sat, 3 May 2008 08:13:49 -0700 (PDT)
Local: Sat, May 3 2008 11:13 am
Subject: Re: Stemming of complex plurals
"...
From a logical and empirical point of view, it is of course generally
not possible to establish that there is no IBL with link text
=[keyword]. To prove that there is no such link text would involve
having access to the crawl results of EVERY page on the internet over
a long period of time.  The contrary is easy to prove: I only have to
find one page with that link text.
..."

Not always difficult.
As I said... I have a few sites with barely any links.
Yet they still show up for a word that does not appear anywhere that I
can find (G/Y/M show no sites linking that seem to have that word on
them anywhere).

"...
I interpret it to mean ~ "for reasons other than that keyword
appearing on the page itself".
..."
I agree wholeheartedly... I think it is a 'general' response, and
could have several factors...
Maybe someone at G forgot to add the rest of the code for that page ;D


From: Robbo
Date: Sat, 3 May 2008 08:30:24 -0700 (PDT)
Local: Sat, May 3 2008 11:30 am
Subject: Re: Stemming of complex plurals
There may be a link from an authoritative site about [NOS] and by
context the link is evaluated to be relevant to the target site even
if the link text itself does NOT contain [NOS] eg

" NOS is a term used blah  blah  blah  blah  blah  blah  blah  blah
blah  blah  blah  blah  blah  blah  blah  blah  blah  blah  blah
blah  blah  blah  blah  blah  blah  blah  blah  blah  blah  blah. For
further information, visit: www.anotherdomain.co.uk "

In that example, the link text is not [ NOS ] but the context and
focus of that paragraph could only be interpreted to mean that
[ www.anotherdomain.co.uk ] is a good and highly relevant place to
look for things related to [ NOS ] even if the topic/subject is called
by some other name on the target site.

I also think that if a highly authoritative site has an "excellent"
page on keyword1 and another on keyword2 and another on "keyword3",
search quality would be enhanced by returning pages from that site
EVEN IF all three keywords did not appear on the same page.  In fact,
it could be argued that in many cases by having a separate
(substantial) page devoted to each respective topic, there is a
likelihood that the site is more relevant to a search for
authoritative information in response to the original search term.

Autocrat, I take your point about evaluating the available evidence on
IBLs but the fact is that we have NO WAY of knowing with 100% certainty
that there is no link, anywhere on any of the billions of
pages on the internet, that points with the specified link text to any
given page on any given site.  None of the available tools is
completely authoritative.  Although Yahoo is very useful in exploring
links, we do not have access to complete information about what Google
sees, and it is what Google sees that is relevant here because we are
talking about Google indexing and reporting.

Robbo
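
The "context around the link" idea from the example paragraph can be sketched like this. It is purely hypothetical: `context_terms`, the window size and the sample paragraph are assumptions, and there is no claim that Google credits nearby words to a link target in this way.

```python
import re


def context_terms(paragraph, link, window=40):
    """Collect words within `window` words of each occurrence of `link`."""
    words = paragraph.split()
    positions = [i for i, w in enumerate(words) if link in w]
    terms = set()
    for i in positions:
        nearby = words[max(0, i - window): i + window + 1]
        for w in nearby:
            if link in w:
                continue  # skip the link itself
            terms |= set(re.findall(r"[a-z0-9]+", w.lower()))
    return terms


# Hypothetical linking paragraph: the link text is a bare URL, no keyword.
paragraph = ("NOS is a term used for nitrous systems. "
             "For further information, visit: www.anotherdomain.co.uk")
terms = context_terms(paragraph, "www.anotherdomain.co.uk")
print(sorted(terms))
```

In this toy run "nos" and "nitrous" end up associated with the target URL although neither appears in the link text, which is the effect being speculated about.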


From: Autocrat
Date: Sat, 3 May 2008 08:41:05 -0700 (PDT)
Local: Sat, May 3 2008 11:41 am
Subject: Re: Stemming of complex plurals
I make reference to Yahoo as it seems a lot more 'complete' than the
Google links (of which, half my sites apparently have no links
according to G ;)).

As to the rest... you are talking about 'site theming'?
Ranking a site as well as a page based upon content/topic/relevance
to specific terms?
If so, again, I agree.

In fact, I'd probably lay money on it.

I think it's also a major factor on the links.
I'm pretty certain that links from relevant sites give more weight...
regardless of PR of the site.
It's the link text and the topic of the page/site that contributes
too.
(I'm waiting for the day they say PR is useless ;))

- - - - - - - - - - - - -

So... we may have:
page content... with varieties of the focus word/phrase
site content... with varieties of the focus word/phrase
site links... with varieties of the focus word/phrase
external links... with varieties of the focus word/phrase in the link text
external links... with varieties of the focus word/phrase in the content around the link
external links... with varieties of the focus word/phrase throughout the external site

Sounds more than feasible to me :D
As I posted on another topic... it's
technically not even that difficult (though CG never bothered to
respond ;)).
The implementation and resource usage for it is probably a
nightmare... but the principle is sound.
It's the words with multiple uses that would be a pain... as that
would require filtering/looping over to search for variations/
alternatives to decide which usage is correct/applicable.
If that is being done - I feel very sorry for the G server admins :D
(You could probably cook on top of those servers)


From: Robbo
Date: Sat, 3 May 2008 08:55:07 -0700 (PDT)
Local: Sat, May 3 2008 11:55 am
Subject: Re: Stemming of complex plurals

Another aspect of computational linguistics that may have a bearing on
SEO in ways similar to what Becky called "stemming of complex plurals"
is that of "anaphora resolution" regarding the notion of keyword
density.

In computational linguistics, anaphora resolution is concerned with
disambiguation of (eg) pronouns like it, he, him, her, etc (informally
speaking).

So if I write:
"France is a beautiful country. That country is in western Europe and
it is controlled by farmers. The republic has huge wine lakes and
butter mountains but recently it has decided to free up some of the
bureaucracy of that nation."

With traditional keyword density calculations, there is only one
occurrence of [France] but if <hypothetically> Google adopted methods
of anaphora resolution from computational linguistics, we can see that
the above string in fact has six occurrences of [France] and
EQUIVALENTS (France, that country, it, the republic, it, that
nation).

Watch this space!

Robbo
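
To make the counting concrete, here is a small sketch that compares a plain keyword count with a count that includes referring expressions. Note the big assumption: the list of equivalents is written by hand for this one passage, so the sketch skips the hard part (deciding what each "it" refers to) that real anaphora resolution has to solve. `count_mentions` and `EQUIVALENTS` are made-up names.

```python
import re

PASSAGE = (
    "France is a beautiful country. That country is in western Europe and "
    "it is controlled by farmers. The republic has huge wine lakes and "
    "butter mountains but recently it has decided to free up some of the "
    "bureaucracy of that nation."
)

# Hand-written expressions assumed to refer to France in PASSAGE.
EQUIVALENTS = ["France", "that country", "the republic", "that nation", "it"]


def count_mentions(text, phrases):
    """Count whole-word, case-insensitive occurrences of any phrase."""
    # Longest phrases first so multi-word phrases win over shorter ones.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


plain = count_mentions(PASSAGE, ["France"])
resolved = count_mentions(PASSAGE, EQUIVALENTS)
print(plain, resolved)  # 1 6
```

Run on the passage as quoted, the plain count is one and the count with equivalents is six.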


From: Autocrat
Date: Sat, 3 May 2008 09:12:49 -0700 (PDT)
Local: Sat, May 3 2008 12:12 pm
Subject: Re: Stemming of complex plurals
That would be fantastic.
I doubt if it is currently happening though.... unless you know
otherwise?

The reasons I doubt it are:
1) Various languages have different syntax and grammatical
structures - would be a nightmare
2) Multiple shifts of focus word on a page/site would cause some
confusion (even if shifting within alternatives of the same word).
3) Half the people on the web cannot type (myself included) and thus
trying to guess what they are talking about would probably kill the
servers :D

That aside... can you imagine how nice it would be?
To actually write 'correctly' and not pretend your readers are
goldfish (short term memory).
Of course... spotting spam could get amusing.

See how often the word 'it' appears on a page ROFL

Still, I do believe there is a certain measure of 'density' used,
along with proximity and location (how often used, how near to each
occurrence, and whether at the start, middle or end of structures
etc.).
I have no idea if it is influential, but logic suggests it has some
usage (definitely to spot stuffing :D).


From: seosares
Date: Sat, 3 May 2008 09:44:33 -0700 (PDT)
Local: Sat, May 3 2008 12:44 pm
Subject: Re: Stemming of complex plurals
http://66.102.9.104/search?q=cache:AzjYEZUY1zwJ:www.smartuki.com/+goo...

From: Autocrat
Date: Sat, 3 May 2008 10:13:10 -0700 (PDT)
Local: Sat, May 3 2008 1:13 pm
Subject: Re: Stemming of complex plurals
Well, obviously I suck to hell... as I cannot find that term, nor a
reliable portion of it, at all.
So where is it coming from?
