Perhaps dupe content could be reported so that we could find the problematic URLs or parameters? A report of what the algo believes is the canonical URL would also help, so that if it is incorrect we can apply additional resources to fixing it on our side.
As sites grow, dealing with dupe content becomes a less trivial issue.
I would second this, and I was going to say something similar myself: transparency is important here, and it would be really good if this could be added to the Webmaster Tools reports. Trusting Google to "consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL" could easily breed complacency that the original problem doesn't need to be fixed, and could mean the number of inadvertent duplicates increases.
BTW - excellently detailed post that helps website owners understand the true situation, rather than having to rely upon SEO agencies. I would rather hear it direct from you guys... please keep these types of updates coming!
> Perhaps dupe content could be reported so that we could find the problematic URLs or parameters? A report of what the algo believes is the canonical URL would also help, so that if it is incorrect we can apply additional resources to fixing it on our side.
> As sites grow, dealing with dupe content becomes a less trivial issue.
> Nice to see you engaging the community via group.
1) whatever swedish-fish.jpg is, it's not being shown.
2) I'm encouraged to see Maile say, "When tracking visitor information, use 301 redirects to redirect URLs with parameters such as affiliateID, trackingID, etc. to the canonical version." I've been tempted to do this, since some of my affiliate links have ended up in the search results (I got some newspapers that carry a lot of PageRank to be affiliates, and their links are more powerful than the stores themselves), but I didn't want to appear to be doing anything too sneaky. I must admit, however, that as the stores have aged Google has listed the correct URLs more and more.
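For what it's worth, the redirect described in point 2 could look roughly like the sketch below. It is only an illustration: the parameter names (affiliateID, trackingID) come from the quoted advice, while the helper name `canonical_redirect` and the example.com URLs are made up here, and a real store would record the affiliate hit before redirecting.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking-only parameters, compared case-insensitively.
TRACKING_PARAMS = {"affiliateid", "trackingid"}


def canonical_redirect(url):
    """Return (status, location) for a requested URL.

    Hypothetical helper: answers 301 with the URL minus its tracking
    parameters, or 200 with the URL unchanged if it is already canonical.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if k.lower() not in TRACKING_PARAMS]
    if len(kept) == len(pairs):
        return 200, url  # nothing to strip, serve the page as usual
    canonical = urlunsplit(parts._replace(query=urlencode(kept)))
    return 301, canonical


if __name__ == "__main__":
    print(canonical_redirect("http://www.example.com/fish?affiliateID=42&color=red"))
```

Parameters that change what the page shows (color in the example) are kept, so only the tracking variants collapse onto one URL.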
> I would second this, and I was going to say something similar myself: transparency is important here, and it would be really good if this could be added to the Webmaster Tools reports. Trusting Google to "consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL" could easily breed complacency that the original problem doesn't need to be fixed, and could mean the number of inadvertent duplicates increases.
> BTW - excellently detailed post that helps website owners understand the true situation, rather than having to rely upon SEO agencies. I would rather hear it direct from you guys... please keep these types of updates coming!
> Regards,
> Paul
> On Sep 12, 11:47 am, Red Cardinal wrote:
> > Perhaps dupe content could be reported so that we could find the problematic URLs or parameters? A report of what the algo believes is the canonical URL would also help, so that if it is incorrect we can apply additional resources to fixing it on our side.
> > As sites grow, dealing with dupe content becomes a less trivial issue.
> > Nice to see you engaging the community via group.
Can you clarify what you wrote in the "Why should you care?" section:
"1. Having multiple URLs can dilute link popularity. For example, in the diagram above, rather than 50 links to your intended display URL, the 50 links may be divided three ways among the three distinct URLs."
Vs. what you said in the "How we help users and webmasters with duplicate content" section:
"3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL."
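To make the two quoted statements concrete, here is a small sketch of the arithmetic. It assumes three hypothetical URLs for one page and a roughly even split of the 50 links from the example (17, 17, and 16). The `cluster_key` rule, which simply drops the query string, is a stand-in for illustration and not a description of how Google actually builds its clusters.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical inbound link counts: 50 links split across three URLs.
inbound_links = {
    "http://www.example.com/story": 17,
    "http://www.example.com/story?xid=rss": 17,
    "http://www.example.com/story?xid=email": 16,
}


def cluster_key(url):
    """Assumed clustering rule: same host and path means same page."""
    parts = urlsplit(url)
    return parts.netloc + parts.path


def consolidate(links):
    """Sum the link counts of every URL that falls in the same cluster."""
    totals = defaultdict(int)
    for url, count in links.items():
        totals[cluster_key(url)] += count
    return dict(totals)


if __name__ == "__main__":
    print("strongest single URL:", max(inbound_links.values()))
    print("after consolidation:", consolidate(inbound_links))
```

Without consolidation the strongest single URL carries 17 links; if the cluster is consolidated completely, the representative URL carries all 50. Whether the consolidation really is complete is exactly the open question.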
I'd like to firstly thank Maile and the Google team for addressing this.
Here's my vote for Google producing a report that webmasters and site owners can use to identify URLs that G considers to be duplicates. I'd like to reiterate earlier remarks here: managing this on larger sites, and with unfamiliar CMS software where it is widely and often incorrectly handled, is a big problem, even for those with advanced webmaster skills.
Ordinary folks will find it even more complex, and we for one have been struggling with this problem for over 4 years, with a great deal of skilled input.
> Can you clarify what you wrote in the "Why should you care?" section:
> "1. Having multiple URLs can dilute link popularity. For example, in the diagram above, rather than 50 links to your intended display URL, the 50 links may be divided three ways among the three distinct URLs."
> Vs. what you said in the "How we help users and webmasters with duplicate content" section:
> "3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL."
> If multiple coded URLs are used for tracking (for example www.site.com/story, www.site.com/story?xid=rss, www.site.com/story?xid=..., etc.) is the link popularity of all the URLs completely consolidated into the URL that Google deems the "best URL"? Or is link popularity dilution still an issue?
The new tool that Yahoo! have to specify which URL parameters are irrelevant and should be stripped out looks extremely useful, and this is something we'd all love to see Google support in the future. It's pretty much exactly what we need to solve this issue in most cases.
Reports of duplicate content that was discovered, as another poster suggested, would also be extremely useful.
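Until such a report exists, a site owner can approximate one for their own pages. The sketch below is a minimal illustration, assuming you already have the page bodies keyed by URL (for example from your own crawl); the function name `duplicate_report` is hypothetical, and it only catches exact duplicates after whitespace normalization, which is far cruder than whatever a search engine does.

```python
import hashlib
from collections import defaultdict


def duplicate_report(pages):
    """Group URLs whose bodies are identical once whitespace is normalized.

    pages: dict mapping URL -> page body text (assumed to be fetched already).
    Returns a sorted list of URL groups, each group holding 2+ URLs.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return sorted(sorted(urls) for urls in groups.values() if len(urls) > 1)


if __name__ == "__main__":
    sample = {
        "/a": "Hello  world",
        "/a?sid=1": "Hello world",
        "/b": "Something else",
    }
    print(duplicate_report(sample))
```

Each group in the output points at a set of URLs or parameters worth fixing on the site itself.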
I agree. Why try to reinvent the wheel? Go with what Yahoo did and admit they did it first; it is a great idea, and you think so too.
If you want to be new and innovative, figure out a way to deal with XML feeds, and tell webmasters straight out whether or not they are considered duplicate content and how to deal with them (is blocking them in robots.txt smart or not?). While I'm on the topic, drop those suckers from the index OR at least make it clear that they are feeds. 99% of the people using your search haven't got a clue what they are when they hit a page like that...
> The new tool that Yahoo! have to specify which URL parameters are irrelevant and should be stripped out looks extremely useful, and this is something we'd all love to see Google support in the future. It's pretty much exactly what we need to solve this issue in most cases.
> Reports of duplicate content that was discovered, as another poster suggested, would also be extremely useful.
How about providing a way through the webmaster console to specify which URLs can be grouped, while at the same time authenticating ownership by confirming an online declaration of original content, similar to the method Wikipedia employs when you upload an image to it? With scraper sites, I appreciate that creation dates are not always reliable. Some say that if site owners were forced to register content, non-savvy site owners wouldn't know to do it; however, I believe word would get round pretty quickly, particularly if it was mentioned in the webmaster guidelines. I realise that a third party could take the content and register it instead; however, I believe that would happen less often than under the current system, where scrapers rely on their own site's authority and the huge number of links they create to the page, which many more non-savvy webmasters do not have a clue about. The flaw is that the scraper will pick on a new site or a recently published URL in the knowledge that it has little authority. In the real, non-virtual world, creators of original works are already aware of the need to register trademarks etc., so would it not be better to extend this time-tested practice to indexing on search engines rather than rely on a system which I fear many scrapers know how to cheat?
hi, i have a question. i have a website hosted on my school's web server that i transferred to a new web host. i also have my own domain name now. the content is identical on both, though (a blog published through Blogger).
i tried doing a "301" with ".htaccess" like what i read online, but it doesn't work. i tried putting it in every conceivable folder. i also saw different versions of what to put inside the .htaccess file, and tried every possible variation but it doesn't work.
my new web host has a cPanel where you can just click a button and make a link redirect to something else but my school's web server doesn't have that (at least not for students). so i can't use that feature to redirect my old link to my new link.
is there a way i can just write a letter to Google and ask them to refer to my new domain name when someone searches for my site (Misadventures in Taiwan) in the future?