Discussions > Crawling, indexing, and ranking > Webmaster Central video blog: Removing your content from Google
Riona MacNamara Google employee  
From: Riona MacNamara
Date: Wed, 9 Jan 2008 21:28:46 -0800 (PST)
Local: Thurs, Jan 10 2008 12:28 am
Subject: Webmaster Central video blog: Removing your content from Google
Got feedback on our latest video blog post? Post it here!

 
JLH  
From: JLH
Date: Wed, 9 Jan 2008 21:32:14 -0800 (PST)
Local: Thurs, Jan 10 2008 12:32 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
Here's the video:

http://googlewebmastercentral.blogspot.com/2008/01/remove-your-conten...

On Jan 9, 11:28 pm, Riona MacNamara wrote:


 
planetmike  
From: planetmike
Date: Thu, 10 Jan 2008 05:45:51 -0800 (PST)
Local: Thurs, Jan 10 2008 8:45 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
You didn't mention sitemap.xml and how it affects a page in the
Google index. If we don't list a page in our sitemap, will it still be
included in the Google index? It's very possible we'll have links to a
page that is not in our sitemap.

I am splitting my web site into two sites. I've moved the content, and
now I'm using .htaccess to rewrite URLs from PlanetMike.com to the new
URLs at MichaelClark.name. I'm returning HTTP status code 301 and
updating my sitemap.xml. What will happen when Googlebot next sees one
of my pages that has been moved? I assume the 301 will tell Googlebot
that the page isn't at PlanetMike.com any more, effectively removing
that page from Google, and at the same time tell Googlebot to add the
new page to its records for MichaelClark.name.
I've been watching both domains pretty closely in the Webmaster Tools
area, and it looks like everything is working smoothly.
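
For readers following along, the mapping such a rewrite rule performs can be sketched in Python. This is only an illustration: the actual .htaccess rules are not shown in this thread, the function name `redirect_target` is hypothetical, and the `/family/` prefix is an invented stand-in for whichever sections were moved.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOSTS = {"planetmike.com", "www.planetmike.com"}  # assumed host spellings
NEW_HOST = "michaelclark.name"                        # assumed: no "www." prefix
MOVED_PREFIXES = ("/family/",)                        # hypothetical moved section


def redirect_target(url):
    """Return (301, new_url) if the URL has moved, otherwise None."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so "PlanetMike.com" matches too.
    if parts.hostname not in OLD_HOSTS:
        return None
    if not parts.path.startswith(MOVED_PREFIXES):
        return None
    # Keep scheme, path and query; swap only the host; drop any fragment.
    location = urlunsplit((parts.scheme, NEW_HOST, parts.path, parts.query, ""))
    return 301, location
```

Pages outside the moved section return `None` here, meaning they stay on the old site and get no redirect.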

Thanks, Mike

On Jan 10, 12:32 am, JLH wrote:


 
Sebastian  
From: Sebastian
Date: Thu, 10 Jan 2008 06:25:37 -0800 (PST)
Local: Thurs, Jan 10 2008 9:25 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
@Matt

Thanks for the great explanations!

As for password-protected content, are you sure that you don't index
it based on third-party signals like ODP listings or strong inbound
links?

You totally forgot to mention the neat X-Robots-Tag that allows
outputting REP tags like "noindex" even for non-HTML resources like
PDFs or videos in the HTTP header. That's an invention Google can be
very proud of. :)
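
A minimal sketch of sending that header for non-HTML resources, written as WSGI middleware in Python. The header name `X-Robots-Tag` and the value `noindex` come from the post; the helper names and the list of file extensions are assumptions made for illustration.

```python
import posixpath

# Assumed set of extensions to keep out of the index; adjust to taste.
NOINDEX_EXTENSIONS = {".pdf", ".avi", ".mp4"}


def x_robots_headers(path):
    """Return the extra response headers for a request path."""
    extension = posixpath.splitext(path)[1].lower()
    if extension in NOINDEX_EXTENSIONS:
        return [("X-Robots-Tag", "noindex")]
    return []


def add_x_robots_tag(app):
    """Wrap a WSGI app so matching responses carry X-Robots-Tag."""
    def wrapped(environ, start_response):
        extra = x_robots_headers(environ.get("PATH_INFO", ""))

        def patched_start(status, headers, exc_info=None):
            # Append our header to whatever the wrapped app already set.
            return start_response(status, list(headers) + extra, exc_info)

        return app(environ, patched_start)
    return wrapped
```

Because the tag travels in the HTTP response rather than in markup, the same wrapper works for any content type the server can deliver.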

@Ian M, who in the comments asks for a "Noindex:" statement in
robots.txt

Actually, Google experiments with Noindex: in robots.txt, but that's
"improvable":
http://sebastians-pamphlets.com/standardization-of-rep-tags-as-robots...

@Google

Currently Google interprets Noindex: in robots.txt as (Disallow: +
Noindex:). I think that's completely wrong, because:

1. It's not compliant with the Robots Exclusion Standard.

2. It confuses Webmasters because "noindex" in robots.txt means
something completely different than "noindex" in meta tags or HTTP
headers.

3. Mixing crawler directives and indexer directives this way is a
plain weak point that will produce misunderstandings, resulting in
traffic losses for Webmasters and less compelling content available
to searchers. All indexer directives
(noindex, nofollow, noarchive, noodp, unavailable_after etc.) require
crawling when put elsewhere. I have done Webmaster support for ages and
I assure you that Webmasters will not get it. If nobody understands it
and adopts it, it's as useless as Yahoo's robots-nocontent class name
that only 500 sites on the whole Web make use of.

4. The REP's "noindex" tag has an implicit "follow" that Google
ignores in robots.txt for technical reasons (it's impossible to follow
links from uncrawled pages). When I put a robots meta tag with a
"noindex" value, then Google rightly follows my links, passes PageRank
and anchor text to those, and just doesn't list the URL on the SERPs.
When I do the same in robots.txt, Google behaves totally differently,
for no apparent reason. (Of course there's a reason but I want to keep
this statement simple.)
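
The difference described in points 3 and 4 can be restated as a toy model in Python. It only encodes the behavior as this post describes it (a robots.txt "Noindex:" read as Disallow plus Noindex); it is not a description of Google's actual pipeline, and the names `Outcome` and `evaluate` are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    crawled: bool         # was the page fetched at all?
    listed: bool          # can the URL appear on the SERPs?
    links_followed: bool  # are the page's outgoing links followed?


def evaluate(robots_txt_noindex, meta_tags=frozenset()):
    """Toy model of one URL under the two kinds of 'noindex'."""
    if robots_txt_noindex:
        # Treated as Disallow + Noindex: the page is never fetched, so its
        # meta tags are never seen and its links cannot be followed.
        return Outcome(crawled=False, listed=False, links_followed=False)
    # Indexer directives in the page are only visible after crawling.
    return Outcome(
        crawled=True,
        listed="noindex" not in meta_tags,
        links_followed="nofollow" not in meta_tags,
    )
```

In this model `evaluate(False, {"noindex"})` still follows links, while `evaluate(True)` does not, which is the mismatch the post complains about.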

Having said all that, I appreciate it very much that Google works on
evolving robots.txt. Kudos to Google! However, please don't assign
the semantics of crawler directives to established indexer directives;
that doesn't work out. I see the PageRank problem, and I think I know
a better procedure to solve it. If you're interested, please read my
"RFC" linked above. ;)

@all

Do not make use of experimental robots.txt directives unless you
really know what you are doing, and that includes monitoring Google's
experiment very closely. If you have the programming skills, it is
better to use X-Robots-Tags to steer indexing or deindexing of
your resources at the site level. X-Robots-Tags work with HTML content
as well as with all other content types.

@Riona

I hope you don't mind the cross posting. :)

Thanks for your time and have a nice day!
Sebastian

On Jan 10, 6:28 am, Riona MacNamara wrote:


 
Susan Moskwa Google employee  
From: Susan Moskwa
Date: Thu, 10 Jan 2008 06:42:22 -0800 (PST)
Local: Thurs, Jan 10 2008 9:42 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
Hey Mike--

It's entirely possible for pages not listed in a Sitemap to get
crawled and indexed; we use data from Sitemaps to supplement our usual
crawl and discovery procedures, but they're not the only way that we
find out about URLs to crawl.

It sounds like you're on the right track, though--301 redirecting each
page on the old site to the corresponding page on the new site is the
way to go. As we crawl the old pages we'll see the 301 redirects and
know that that content is now found on the corresponding URL on your
new site.

However, it looks like you haven't yet implemented the 301 redirects
on planetmike.com, right? (If so, you might want to double-check,
because I tried a couple URLs and didn't get redirected anywhere.)
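
One way to double-check a redirect by hand is to request an old URL without following redirects and look at the status code and Location header. The sketch below uses Python's standard `http.client`; the function names are hypothetical, and the host to compare against is whatever your new site is.

```python
import http.client
from urllib.parse import urlsplit


def head(url, timeout=10):
    """Send a HEAD request without following redirects.

    Returns (status, location), where location may be None.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_moved_to(status, location, new_host):
    """True if the response is a 301 whose Location is on new_host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).hostname == new_host.lower()
```

For example, `is_moved_to(*head(old_url), "michaelclark.name")` would report whether a given old URL already answers with a permanent redirect to the new domain. A 302 or a plain 200 both come back as `False`.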


 
planetmike  
From: planetmike
Date: Thu, 10 Jan 2008 18:57:14 -0800 (PST)
Local: Thurs, Jan 10 2008 9:57 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
Hi Susan,

On Jan 10, 9:42 am, Susan Moskwa wrote:

> It sounds like you're on the right track, though--301 redirecting each
> page on the old site to the corresponding page on the new site is the
> way to go. As we crawl the old pages we'll see the 301 redirects and
> know that that content is now found on the corresponding URL on your
> new site.

> However, it looks like you haven't yet implemented the 301 redirects
> on planetmike.com, right? (If so, you might want to double-check,
> because I tried a couple URLs and didn't get redirected anywhere.)

Thanks for looking. I'm slowly getting everything tweaked around.
There is a lot of cruft that has accumulated over 8 years. It's very
likely I've moved some files and directories and haven't got the
redirect working nicely yet. I should have gone a bit slower than I
did. Mike

 
Chris Estes  
From: Chris Estes
Date: Sat, 12 Jan 2008 13:50:35 -0800 (PST)
Local: Sat, Jan 12 2008 4:50 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
One of the best I have seen on the subject!  I am not sure I have ever
seen a video on it.  Matt Cutts "is the man" but he needs a
teleprompter.

On Jan 10, 12:28 am, Riona MacNamara wrote:


 
AjiNIMC  
From: AjiNIMC
Date: Sun, 13 Jan 2008 22:10:06 -0800 (PST)
Local: Mon, Jan 14 2008 1:10 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
Hey Matt,

Thanks for the video, but can you please explain a bit about removing
https pages using Webmaster Tools, as the webmaster URL removal console
starts with http? Is there a help file I can read for it?

Thanks, looking for an answer; we have been discussing it at WMW for a
few days without a proper answer.

Thanks,
AjiNIMC

On Jan 13, 2:50 am, Chris Estes wrote:


 
jhackett305  
From: jhackett305
Date: Fri, 25 Jan 2008 10:16:08 -0800 (PST)
Local: Fri, Jan 25 2008 1:16 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
Great, would love to see more videos on related topics.  Very easy to
listen to in the background while doing work.

 
Susan Moskwa Google employee  
From: Susan Moskwa
Date: Tue, 5 Feb 2008 15:23:37 -0800 (PST)
Local: Tues, Feb 5 2008 6:23 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
Hi AjiNIMC:

(For those following at home, we discussed the http vs. https issue a
bit on this blog post:
http://googlewebmastercentral.blogspot.com/2008/01/remove-your-conten...
)

Regarding finding your indexed https pages, you could look at your
logs or analytics data to see which URLs are getting referrals from
Google search results.
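
As a rough sketch of the log approach, the Python below counts which paths received visits whose referrer was a Google host. It assumes the common "combined" access-log format and assumes the lines come from the log of the https virtual host; both are assumptions about your server setup, and the function name is made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches: "GET /path HTTP/1.1" 200 1234 "referrer"
LINE_RE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referer>[^"]*)"'
)


def google_referred_paths(log_lines):
    """Count requested paths whose referrer host has a 'google' label."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not in the assumed log format
        host = urlsplit(match.group("referer")).hostname or ""
        # Crude check that also covers country domains like google.co.uk.
        if "google" in host.split("."):
            counts[match.group("path")] += 1
    return counts
```

Pages that have had no search traffic yet will not show up this way, so an empty result does not prove that nothing is indexed.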

Regarding the URL removal requests getting denied: if you request
removal of a URL that isn't indexed, the request should be marked
'Removed' since that URL isn't in our index ("removed" and "not
indexed" are basically synonymous in this case). If your requests are
getting denied, it's likely that the URL(s) in question don't meet the
criteria for removal. Take a close look at what you need to do to make
a URL eligible for removal:

http://www.google.com/support/webmasters/bin/answer.py?answer=59819


 
AjiNIMC  
From: AjiNIMC
Date: Tue, 5 Feb 2008 17:29:28 -0800 (PST)
Local: Tues, Feb 5 2008 8:29 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
>> Regarding finding your indexed https pages, you could look at your logs or analytics data to see which URLs are getting referrals from Google search results.

It may not be getting any traffic yet; I want to avoid the duplicate
pages to be on the safer side. Does Google have a command like
site:www.domain.com:443? If not, could Google develop one? That would
be really helpful.

>> Take a close look at what you need to do to make a URL eligible for removal:

Can you please add a section where we can check if a page satisfies
the conditions for removal or not? I think I have done my part by
adding it to the robots.txt
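
Whether a robots.txt file actually disallows a given URL can at least be checked locally. The sketch below uses Python's standard `urllib.robotparser`; note that its matching rules are the library's own and may differ in detail from Googlebot's, that it tests only the URL's path (not scheme or host), and that the helper name and sample rules are invented for the example.

```python
from urllib.robotparser import RobotFileParser


def is_disallowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
```

With the sample rules, `is_disallowed(SAMPLE_ROBOTS_TXT, "https://www.example.com/private/a.html")` is true and the same call for `/public/a.html` is false. The text passed in should be the robots.txt served on the same scheme and host as the URL being checked, since the http and https versions of a site each serve their own file.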

Thanks for the reply Susan.

Regards,
Aji aka AjiNIMC

On Feb 6, 4:23 am, Susan Moskwa wrote:


 
Susan Moskwa Google employee  
From: Susan Moskwa
Date: Wed, 6 Feb 2008 18:44:25 -0800 (PST)
Local: Wed, Feb 6 2008 9:44 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
Right now the https version of your site, including the robots.txt
file, doesn't seem to be allowing connections; if Googlebot can't
access your robots.txt file, it won't be able to see whether your
URL(s) have been disallowed in the file, which may have something to
do with your removal requests being denied.

Have you considered using methods other than the URL removal tool to
remove any https URLs from the index? The URL removal tool is good for
urgent requests, but you can accomplish the same thing in many other
ways, most of which would probably be easier to implement since you
don't know exactly what URLs you want to remove. You could 301
redirect https pages to their http version, add an X-Robots-Tag header
with "noindex", or a variety of other methods:

http://www.google.com/support/webmasters/bin/answer.py?answer=35301
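
The first of those options, redirecting https pages to their http version, might look like this as WSGI middleware in Python. It is a sketch under assumptions: it presumes the application is served through WSGI, that the server sets `wsgi.url_scheme` correctly, and that the http site uses the same host name and paths. The function name is hypothetical.

```python
from urllib.parse import quote


def redirect_https_to_http(app):
    """Answer every HTTPS request with a 301 to the same URL over HTTP."""
    def wrapped(environ, start_response):
        if environ.get("wsgi.url_scheme") != "https":
            return app(environ, start_response)  # plain HTTP: pass through
        host = environ.get("HTTP_HOST", environ.get("SERVER_NAME", ""))
        if host.endswith(":443"):
            host = host[:-4]  # drop the explicit HTTPS port
        location = "http://" + host + quote(environ.get("PATH_INFO", "") or "/")
        query = environ.get("QUERY_STRING")
        if query:
            location += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]
    return wrapped
```

This needs no list of the affected URLs, which fits the situation where you don't know exactly which https pages have been indexed.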


 
Rodrigo Garcia  
From: Rodrigo Garcia
Date: Wed, 16 Apr 2008 08:27:17 -0700 (PDT)
Local: Wed, Apr 16 2008 11:27 am
Subject: Re: Webmaster Central video blog: Removing your content from Google
Nice info, but I missed one thing...
Can the URL Removal Tool (URT) remove every single URL from a site?

I need to remove http://www.qa.fdc.org.br/* (all pages and directories
below the root...)

Can it be done through the URT?
The pages are already blocked by robots.txt and by IP filtering on
the web server...

Please help...

On Feb 23, 5:05 pm, ilhan wrote:


 
Susan Moskwa Google employee  
From: Susan Moskwa
Date: Wed, 16 Apr 2008 09:37:14 -0700 (PDT)
Local: Wed, Apr 16 2008 12:37 pm
Subject: Re: Webmaster Central video blog: Removing your content from Google
Yes, you can; just select the "Remove your entire site" option when
you're creating your removal request.

 