Andy Beard  
From: Andy Beard
Date: Wed, 24 Sep 2008 09:46:59 -0700 (PDT)
Local: Wed, Sep 24 2008 12:46 pm
Subject: Tag Pages & Indexing
Many CMSs create tag pages, which can be looked on as pages
of related snippets from the site's content.
Then there are multiple sites that act as an alternative tag end-point,
or "tag space": if you don't use something internal, you can
point your links there using rel="tag" for visitor content discovery.

These are not quite search results: they are predefined by a content
author to be specifically related to the content.

Google seems to handle tag pages in a fairly random fashion.

Sometimes they will all appear on a site search such as
site:andybeard.eu /tag/

When they are indexed, Google sometimes retains a cached version of the
pages, but not always.

Sometimes the tag pages rank extremely well, in part due to the
typically well optimized title tag, but also due to the nature of the
related content on the pages created.

Sometimes Google even ranks tag pages that are 100% duplicated: when a
number of tags have not been used before, a blog can end up with
multiple tag pages that each contain just the same identical
snippet.

Many tag spaces seem to rank very well, e.g. technorati, Wordpress.com
tag pages, Wikipedia etc.

However that is not universal...

As an example, http://technorati.com/tag/sarah+palin doesn't seem to be
indexed, whereas http://technorati.com/tag/john+mccain and
http://technorati.com/tag/barack+obama are indexed.
Sarah Palin has been quite a popular news topic on Technorati for
weeks, and the link to the tag page has been appearing on their front
page almost constantly, thus that page is receiving a lot of "Google
Juice", certainly enough to be indexed, and maybe to be looked on as
important.

Over the last 12 months I have noticed some changes to the way Google
has treated tag pages, but they seem very much domain specific and
follow no real logic.
Wordpress.com tag pages, for instance, are quite sparse, yet still seem
to get indexed and rank well.
Technorati generally is well indexed.
Sites with more value in their navigational elements than
Wordpress.com, though ultimately still a collection of 3rd party
snippets, can get most or all of their tag pages deindexed, as has
happened to a few of the sites I monitor.

Tag pages can provide useful search results, and on a domain with
reasonable trust have a chance of ranking even if the original content
has been buried extremely deeply within a CMS navigation system.
As a search user, I am often happy to browse tag pages to find content
that otherwise might have slipped through the cracks, or not be
considered relevant, especially on technical topics.

How does Google decide which tag pages deserve to be indexed? It seems
to be almost random.

I would love to see rules applied evenly. It is painful to see 1M+
pageviews per month knocked off a site whose tag pages, whilst not
quite up to Technorati standards, are certainly more useful to humans
than those on many authority sites that still have their pages indexed.


 
bluegill01  
From: bluegill01
Date: Wed, 24 Sep 2008 10:16:14 -0700 (PDT)
Local: Wed, Sep 24 2008 1:16 pm
Subject: Re: Tag Pages & Indexing
On some pages Google sees tags as keyword spam. It all depends on how
you have them listed and linked. When in doubt use a link condom: add
rel="nofollow" to the tag links.

Andy Beard  
From: Andy Beard
Date: Thu, 25 Sep 2008 02:27:40 -0700 (PDT)
Local: Thurs, Sep 25 2008 5:27 am
Subject: Re: Tag Pages & Indexing
I gave an example: Sarah Palin vs John McCain vs Barack Obama, with all
3 tag pages linked from the home page of Technorati, as they have been
for some time. That means they are gaining tons of juice from an
authority domain.
The overall quality of the pages is the same, and the pages are
gaining tons of editorial links as bloggers use Technorati as "tag
space", though if they were smart they would nofollow the links, or
use internal tag space.

Sticking nofollow on the links, or using meta noindex on the tag pages,
would achieve effectively the same as the current situation, with
Google ignoring the pages, which isn't the preferred outcome. That
still means a significant drop in search traffic to pages containing
highly relevant results. I understand Google would rather list
source pages above tag pages composed of snippets, but in many cases
tag pages can provide fresher results.
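
To see which of these signals a given tag page actually sends, one option is to read the robots meta tag out of its HTML. The following is a minimal sketch using only the Python standard library; the sample markup and the helper name robots_directives are hypothetical, not taken from any site discussed here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # "noindex, follow" -> {"noindex", "follow"}
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical tag page markup, for illustration only.
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(robots_directives(sample))
```

A page returning "noindex" here is asking to be left out of the index regardless of how many links point at it, which is the self-inflicted version of the drop described above.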


 
JohnMu Google employee  
From: JohnMu
Date: Thu, 25 Sep 2008 07:58:26 -0700 (PDT)
Local: Thurs, Sep 25 2008 10:58 am
Subject: Re: Tag Pages & Indexing
Hi Andy
Now I understand what you mean :-)

I tried various site:-queries for the tag pages that you mentioned and
I noticed that all of them are indexed. Do you see it differently in
your results?

I'm not aware that we're treating this kind of page any different than
other pages. A tag page can be a good resource for users, at the same
time it can also be that it would make sense to send the users to an
article directly instead of having them take the "detour" through a
tag page (I imagine it depends on the actual query). In most cases, I
would assume that these tag pages are more of value to us in helping
us to find the content that is linked from them. If a new article
comes up and is listed there, we'll want to go grab that article as
fast as we can.

Looking at the tag pages you mentioned and the results from our index:
http://www.google.com/search?q=site:http://technorati.com/tag/sarah%2...
http://www.google.com/search?q=site:http://technorati.com/tag/john%2B...
http://www.google.com/search?q=site:http://technorati.com/tag/barack%...
.. I see one common issue that could be affecting a smaller site:
there are a bunch of variations, including punctuation and capitalization.
There's "/sarah+palin", "/Sarah+Palin," (comma at the end,
capitalized), "/Sarah+Palin+" (extra space at the end),
"/John+mccain", "/John+McCain,", "/John+McCain+", "/Barack+Obama+-+", etc.
The reason we keep these duplicates in the index is simple: they
returned different content when we crawled them. Since these topics
update fairly frequently, it's possible that crawling them at
different points in time (even just minutes/hours separated) will
result in vastly different content.
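
For a smaller site, one way to avoid this kind of duplication is to map every variant of a tag URL to a single canonical form (and redirect the variants to it). Below is a minimal sketch in Python, assuming a Technorati-style "/tag/word+word" scheme; the function name canonical_tag_url and the exact cleaning rules are illustrative assumptions, not how Technorati or Google handle these URLs.

```python
from urllib.parse import quote_plus, unquote_plus, urlsplit, urlunsplit


def canonical_tag_url(url: str) -> str:
    """Collapse case, stray punctuation and whitespace variants of a tag URL."""
    parts = urlsplit(url)
    prefix, _, slug = parts.path.rpartition("/")
    # "+" and "%20" both decode to a space; lowercase and split into words.
    words = unquote_plus(slug).lower().split()
    # Strip punctuation at word edges ("palin," -> "palin"); a token that
    # was only punctuation ("-") becomes empty and is dropped below.
    cleaned = [w.strip(",.;:-") for w in words]
    slug = "+".join(quote_plus(w) for w in cleaned if w)
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), f"{prefix}/{slug}", "", "")
    )


for variant in (
    "http://technorati.com/tag/Sarah+Palin,",
    "http://technorati.com/tag/Sarah+Palin+",
    "http://technorati.com/tag/Barack+Obama+-+",
):
    print(variant, "->", canonical_tag_url(variant))
```

With a mapping like this in place, the variants listed above would all resolve to one URL per tag, so a crawler has no reason to keep several near-identical copies.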

Another problem with tag pages (depending on how they are made) is
that they could be seen as pages that do not contain much original
content. Obviously this depends a lot on the implementation, and
looking at the Technorati pages, it appears that they work hard to
make sure that there is a good mix of information and quotes. If a tag
page is just linking to other sites with fairly large snippets /
quotes and no other unique content, then that might be a situation
where we would prefer to send the user directly to an article -- in
the worst case, it could be hard to tell that kind of tag page apart
from RSS-scraper sites.

So if you have tag pages that you feel are relevant to users, then I
would recommend making sure that your site in general is nicely
crawlable (no significant duplication) and that those pages provide
unique and compelling content in addition to the snippets that you are
providing. That's pretty much what we would recommend for web pages in
general :).

Hope it helps!
John


 
Andy Beard  
From: Andy Beard
Date: Fri, 26 Sep 2008 06:08:07 -0700 (PDT)
Local: Fri, Sep 26 2008 9:08 am
Subject: Re: Tag Pages & Indexing
John thanks for the feedback

You are actually making a mistake in the query by using the site:
operator.

If you remove the site: operator, the Sarah Palin URL is not returned,
whereas the John McCain and Barack Obama URLs are.

Here are some example searches

http://www.google.com/search?q=http%3A%2F%2Ftechnorati.com%2Ftag%2Fba...
(indexed Barack Obama)
http://www.google.com/search?q=http%3A%2F%2Ftechnorati.com%2Ftag%2Fjo...
(indexed John McCain)
http://www.google.com/search?q=http%3A%2F%2Ftechnorati.com%2Ftag%2Fsa...
(not indexed Sarah Palin)
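
For anyone wanting to reproduce the two kinds of query being compared, here is a minimal sketch of how such search links can be built in Python. The helper names are hypothetical, and the truncated links above are left exactly as posted; this only illustrates the percent-encoding involved, using the tag URLs given in full in the first message.

```python
from urllib.parse import quote

SEARCH = "http://www.google.com/search?q="


def url_query_link(page_url: str) -> str:
    """Search for the page URL itself as the query text (no site: operator)."""
    # safe="" percent-encodes ":" and "/" as well as "+".
    return SEARCH + quote(page_url, safe="")


def site_query_link(page_url: str) -> str:
    """Search with the site: operator restricted to the page URL."""
    # Keep ":" and "/" readable; "+" still becomes %2B.
    return SEARCH + quote("site:" + page_url, safe=":/")


tag_page = "http://technorati.com/tag/sarah+palin"
print(url_query_link(tag_page))
print(site_query_link(tag_page))
```

Note that the "+" in the tag URL has to be encoded as %2B in both forms, otherwise it would be read as a space in the query.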

The version of the URL I am searching for is the version used on the
front page of Technorati, so that version should be the one gaining
enough juice to at least get indexed in the primary index (there is
meant to be only one these days ;) ), but it is not.

If I do a site: query on the friend's site I am trying to resolve
problems with, or even on my own blog, all tag pages are returned. Site
queries are handled differently these days.

However, my friend's site has lost 95% of its search traffic to tag
pages within the last 2-3 months. My blog actually took at least one
hit on tag pages in December.
Blog tag pages can be fairly rough by default, so I accept I need to
work on them a little, though when they receive a lot of
juice and have lots of content they are a great reference. I even
have one of my tag pages linked from Wikipedia.
They also allow publishers to throw a slightly wider net.

Here is a good example:-

My tag page for Izea's Social Spark service is currently a little
barren as so far I have only written about the service once.
http://andybeard.eu/tag/social_spark

It does however rank well for Google searches for Social Spark, and
has a 27% bounce rate

If a searcher searched for SocialSpark, they would ultimately find the
primary article - I get more traffic to the tag page due to higher
ranking and possibly it is a better search term with correct spacing.

Examples of what tag indexing changes can do to traffic
http://andybeard.eu/tag-pages-andybeard.eu.jpeg  (image: on
Andybeard.eu)
That is a fairly hefty hit, though I am not too worried about it. At
the time I had full content showing on tag pages and I now have
snippets, which is not as useful for a visitor but less likely to be
classed as duplicate. I used to average maybe 50 visits per day to tag
pages, with a peak of 80+; it is now close to single figures.

Now for the big hit

http://andybeard.eu/tag-pages-friend.png (image: Friends site 95%
drop)

That is just segmenting on the tag pages as a landing page from Google
search

The tag pages are not as good as those of Technorati, who are
indexing more content and pulling content from more sources.
The tag pages are in my opinion (though everyone has one) better than
those from Wordpress.com, e.g. the Sarah Palin tag http://wordpress.com/tag/sarah-palin/
The Wordpress.com Sarah Palin tag page is indexed:
http://www.google.com/search?q=http%3A%2F%2Fwordpress.com%2Ftag%2Fsar...
There has been a small reduction in emphasis of the tag pages, and
there are some confusing factors, but no obvious reason why they would
be dropped.

Something seems to have changed in the way Google treats the indexing
of tag pages, though there is no official guidance, other than some
people assume they come under search results and shouldn't be indexed,
and many make the mistake of blocking them with robots.txt and create
100s or 1000s of dangling pages on their sites.
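
Whether a site has fallen into the robots.txt trap can be checked offline with Python's standard urllib.robotparser. This is a minimal sketch: the robots.txt rules and the example.com URLs are made up for illustration, and the helper name is_crawlable is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every tag page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable(ROBOTS_TXT, "http://example.com/tag/social_spark"))
print(is_crawlable(ROBOTS_TXT, "http://example.com/2008/09/some-post/"))
```

A blocked tag page can still be linked to from every post on the site, but a crawler cannot fetch it to follow the links it contains, which is what leaves it "dangling".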


 
JohnMu Google employee  
From: JohnMu
Date: Fri, 26 Sep 2008 14:17:35 -0700 (PDT)
Local: Fri, Sep 26 2008 5:17 pm
Subject: Re: Tag Pages & Indexing
Hi Andy
Thanks for your long reply. Let's see how far we get :) ...

I think searching for a URL like this is not always going to bring you
what you might expect; at any rate I wouldn't use it as a measure of
whether or not a URL is indexed. It might however give you information
regarding where the URL is mentioned (in this case I see our
discussions on this topic already dominate the results :-)).

> However, my friend's site has lost 95% of its search traffic to tag
> pages within the last 2-3 months. My blog actually took at least one
> hit on tag pages in December.

I think it would be worthwhile to look at these issues separately from
the tag pages.

> http://andybeard.eu/tag-pages-friend.png (image: Friends site 95%
> drop)
> That is just segmenting on the tag pages as a landing page from Google
> search

Again, I think it would be worthwhile to look at the page on its own.
I am not aware of any "tag page problem" -- as you mentioned, there
are a lot of really good pages like that.

> Something seems to have changed in the way Google treats the indexing
> of tag pages, though there is no official guidance, other than some
> people assume they come under search results and shouldn't be indexed,
> and many make the mistake of blocking them with robots.txt and create
> 100s or 1000s of dangling pages on their sites.

I think the assumption that they are all "search results" is generally
not correct and it really depends a lot on the pages themselves. In
cases like that, it's really impossible to say too much without
knowing the individual URLs -- and even then it's possible that it's
just a result of natural fluctuations in ranking, just as can happen
to any other kind of URL.

Feel free to start a thread (or have your friend start one) so that
the users & Googlers here can take a look. Perhaps there's something
that could be improved on your site or, who knows, perhaps there's
even something that could be improved on our side.

Cheers,
John


 
Andy Beard  
From: Andy Beard
Date: Sat, 27 Sep 2008 05:01:22 -0700 (PDT)
Local: Sat, Sep 27 2008 8:01 am
Subject: Re: Tag Pages & Indexing
The Sarah Palin tag on Technorati has actually started appearing today
within that search result, so it is no longer a useful example of the
problem. I have always found, though, that there is a close correlation
between that method of searching and whether a page can be returned
in a normal search (as opposed to a site search). It generally has
less noise than searching for a unique phrase, especially on pages
that themselves are syndicated content.

I am not currently at liberty to discuss the specific site in public
(why I have been using general examples), and it is probably too
prominent to discuss in public without causing unnecessary concern.

Also a specific example doesn't necessarily help in general guidance.
The site in question has some other problems which might complicate
any specific diagnosis anyway, but before giving recommendations for
change, some general guidance was useful.

One example of things which complicate matters...

We completely removed the XML sitemaps, which were incomplete, only
containing 100,000 of 6M+ pages, and specifically didn't contain tag
pages or most of the user generated content (it is a dynamic site, I
don't think a complete count of pages exists).

Within 1 week, site:domain.com now returns 900,000 more indexed pages.
Within 1 week, site:domain.com/* now returns 9,000 more pages in the
primary index (as opposed to the supplemental index that doesn't exist
any more). It is actually strange, because that figure seemed to have
remained capped in some way for months. This isn't a site I have been
monitoring constantly, thus I don't have daily figures (I don't work
as an SEO consultant). Also, that search query modifier might just
return random meaningless numbers.


 
JohnMu Google employee  
From: JohnMu
Date: Sat, 27 Sep 2008 15:26:05 -0700 (PDT)
Local: Sat, Sep 27 2008 6:26 pm
Subject: Re: Tag Pages & Indexing
Hi Andy
There are a lot of factors that go into crawling and indexing; URLs
found in XML Sitemap files are just one of the ways that we find URLs
that can be crawled. XML Sitemap files generally don't limit crawling
or indexing of other parts of a site at all. Lots of sites only list a
part of the actual URLs in Sitemap files (for various reasons) and we
still crawl and index the rest. I've never seen a change like this
because of XML Sitemap files, so I'd have to assume that it's more of
a coincidence. It would be interesting to double-check based on the
URL of the site, but I feel pretty certain that any change like that
is not related to Sitemap files.

I'm not really sure what site:domain.com/* returns, but it's probably
not what you are assuming :-).

Hope it helps!

John


 
Andy Beard  
From: Andy Beard
Date: Wed, 1 Oct 2008 02:13:34 -0700 (PDT)
Local: Wed, Oct 1 2008 5:13 am
Subject: Re: Tag Pages & Indexing
Wordpress.com just added nofollow on the links away from their tag
pages

http://www.seoibiza.com/blog/2008/09/30/pagerank-crunch-wordpresscom-...

I am currently assuming that this is an attempt to retain more juice
and not under specific direction to "clean up" giving juice to user
generated content - after all Technorati now give juice, or are doing
some very smart IP delivery.

In many ways it is only fair that WP.com users receive juice back -
they have no control over the links to tag pages on their blogs, or
whether to give juice to Wordpress.com as a whole.


 