If it's preferred that webmasters report spam here while regular users use the spam report link, then possibly there should be, as much as I hate to say it, a spam category.
Adam -- What happens with spam reports? I've reported quite a number of sites (back when I thought it would make a difference) -- so far NOTHING has happened to them, they're still listed, etc. Do you have a big backlog? Do you even do manual removal / penalties? Or do these reports just get checked via statistics for the next algo update? (Can you tell it's frustrating to take the time to report spam and never see anything happening? It is VERY frustrating.)
Sorry you haven't seen action taken on the individual sites you've reported. It's interesting to note that while others have also said the same thing, I've read notes from still other people who've marveled at how quickly the sites they've reported have disappeared from our index.
On the whole, though, we really shy away from doing "hand-to-hand" combat.
We take the information we glean from the spam reports and use it to fine tune and (re)evaluate our algorithms. Just as we're constantly evaluating the quality of our search results in general (quantitatively), we're also using spam reports to assess the "spamminess" of certain areas and our index on the whole over time... and make adjustments from there.
We then ask... would algorithmic tweak [x] catch a particular set of spam? If not, why not? And how much better (again, quantitatively) does algorithmic tweak [x] make our index?
This raises the broader point, then, that while we update our algorithms constantly, we do not do so lightly or without comprehensive testing. Each adjustment is evaluated based upon how it improves upon our users' search experience. No adjustment is going to be 100% perfect but each does, in the aggregate, improve our search results quality.
Though we understand that it'd be more satisfying for some users to see immediate results taken against specific sites, we feel that our broader approach is more scalable and more productive in the long run.
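To make the "would algorithmic tweak [x] catch a particular set of spam?" question concrete, here is a minimal sketch of that kind of quantitative check. It is only an illustration, not Google's actual process: the function name `evaluate_tweak`, the labeled URL sets, and the example hostnames are all made up for this example.

```python
def evaluate_tweak(flagged, reported_spam, known_good):
    """Compare what a hypothetical tweak flags against hand-labeled URLs.

    Returns (share of reported spam caught, share of good sites wrongly
    flagged, set of reported spam the tweak missed).
    """
    flagged = set(flagged)
    reported_spam = set(reported_spam)
    known_good = set(known_good)

    caught = flagged & reported_spam
    false_positives = flagged & known_good

    recall = len(caught) / len(reported_spam) if reported_spam else 0.0
    fp_rate = len(false_positives) / len(known_good) if known_good else 0.0
    return recall, fp_rate, reported_spam - flagged


# Made-up labels standing in for spam reports and a set of known-good sites.
reported_spam = {"spam-a.example", "spam-b.example", "spam-c.example", "spam-d.example"}
known_good = {"good-a.example", "good-b.example", "good-c.example",
              "good-d.example", "good-e.example"}

# What an imaginary "tweak [x]" would flag.
flagged_by_tweak = {"spam-a.example", "spam-b.example", "spam-c.example", "good-a.example"}

recall, fp_rate, missed = evaluate_tweak(flagged_by_tweak, reported_spam, known_good)
print(f"caught {recall:.0%} of reported spam, "
      f"{fp_rate:.0%} of good sites wrongly flagged, missed: {sorted(missed)}")
```

The `missed` set is the "if not, why not?" part: those are the reports one would look at by hand before trying another tweak.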
I'd love to ask more questions on this because I find it fascinating how spam sites could be recognized automatically :-).
* What can we do to make spam-processing on your site better / more effective? Do you want us to report every spammy URL we find indexed? (I'm sure some here would do that if you asked nicely and handed out Google T-Shirts or more :-))
* Do you manually confirm the spamminess of a site before adding it to the spam corpus?
* When you add a site to the spam corpus for later tests, do you add the URL or the actual contents? (eg if a site changes after being marked as spam - do you test the changed site - as spam, though it isn't anymore - or the contents of the old site? or is that obvious? :-))
* What can we do if we feel that a manual nudge is required (ie *very spammy, very large scale*)? If the spam reports just land in the "list of URLs to be marked as bad" you might miss (or be too late) on something big...
* And the most interesting of them all: do you work together with the Adsense team? I imagine if you have the Adsense-ID from one site, you could probably knock out the whole setup. Wouldn't something like that be a nice add-on to the algo -- Adsense-ID blacklist lookup :-)
(egad, this is getting long)
* What do you aim for with algorithmic tweaks? Say you have 10000 pages from 100 sites: do you work on the # of sites or # of pages? (I'm guessing the algo is generally page-oriented, but a site-factor is probably there anyway) What percentage of "false positives" is acceptable?
* Tell us something about the spam penalty :-).
* What can an honest webmaster do to make sure he isn't hit by a "spam-penalty"? Say I add affiliate links to my site to pay for the hosting, I make $5/month on it, my visitors are mostly similar african beetle belly-lint collectors, my site has few inbound links because it's obscure. My son did the website in Frontpage (and you can tell) and told me to put some hidden text on it to make sure it ranks high for "african belly-lint, african beetles". I realize it's not good, but I don't have a clue about how to fix it. Will I be penalized / banned constantly / forever? Why? [if you can algorithmically recognize the hidden-text, why not just ignore it?]
I bet you're now wishing you never answered those questions here :-). I have lots of other questions up my sleeve, but I'll leave you with the easy ones first. Take your time, no problem.
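The pages-versus-sites question above matters because the same filter can look harmless counted one way and alarming counted the other. A small sketch, using the 10000 pages / 100 sites figure from the question; the filter, the URLs, and the function name `false_positive_rates` are hypothetical and say nothing about how Google actually counts.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def false_positive_rates(flagged_pages, clean_pages):
    """Return (page-level, site-level) false positive rates.

    clean_pages   -- URLs known to be legitimate
    flagged_pages -- URLs a hypothetical spam filter flagged
    A site counts as a false positive if any of its clean pages is flagged.
    """
    flagged = set(flagged_pages)
    pages_by_site = defaultdict(list)
    for url in clean_pages:
        pages_by_site[urlsplit(url).hostname].append(url)

    if not pages_by_site:
        return 0.0, 0.0

    bad_pages = sum(1 for url in clean_pages if url in flagged)
    bad_sites = sum(1 for pages in pages_by_site.values()
                    if any(url in flagged for url in pages))
    return bad_pages / len(clean_pages), bad_sites / len(pages_by_site)


# 100 legitimate sites with 100 pages each = 10000 pages.
clean_pages = [f"http://site{s}.example/page{p}"
               for s in range(100) for p in range(100)]

# An imaginary filter wrongly flags one page on each of 50 sites.
flagged = {f"http://site{s}.example/page0" for s in range(50)}

page_rate, site_rate = false_positive_rates(flagged, clean_pages)
print(f"page-level: {page_rate:.1%}, site-level: {site_rate:.0%}")
```

Half a percent of pages sounds acceptable; half of all sites does not, so "what percentage is acceptable" depends heavily on the unit being counted.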
All very pressing issues that are not explored enough in the posts I have seen.
Not to side track, but John mentioned notification of pending penalties in the sitemaps area. The new tools he mentioned last night re: spam just showed up on my console today; Google is moving faster than me :)
I'm sorry to dilute this thread, but I must correct myself.
Correction: I thought
> Not to side track but Adam mentioned notification of pending penalties in sitemaps and email notification to validated webmasters.
Is this up and running in sitemaps and I'm missing it?
Points from John I would like to hear more on are:
1. What can we do to make spam-processing on your site better / more
2. Do you work together with the Adsense team?
3. What can an honest webmaster do to make sure he isn't hit by a "spam-penalty"?
surf_doggie wrote:
> John, very well thought out post. Kudos.
> All very pressing issues that are not explored enough in the posts I have seen.
> Not to side track, but John mentioned notification of pending penalties in the sitemaps area. The new tools he mentioned last night re: spam just showed up on my console today; Google is moving faster than me :)
> Earl
Strange, I saw that "report spam" area in the SM login area more than a week ago, and today I can't find it! This area of which you speak Earl, is it for OTHER sites, or YOUR site that's under the Sitemaps area?
> I'd love to ask more questions on this because I find it fascinating how spam sites could be recognized automatically :-).
Your questions are indeed interesting, but I'm sure you'll understand that I'm not able to answer them as fully as you'd like. Though some might be surprised by this, we don't embrace secrecy in some areas just to be capricious; rather, we weigh the great benefits of transparency against similarly great (or, in this case, unpleasant) risks associated with giving undue insight to those who seek to disrupt the quality of our search results.
> What can we do to make spam-processing on your site better / more effective? Do you want us to report every spammy URL we find indexed?
> (I'm sure some here would do that if you asked nicely and handed out Google T-Shirts or more :-))
Heh heh... Google t-shirts, eh? I am giving away one on my own blog, but I'll see what I can do about getting a larger stash for the future :).
As for what you can do to help in this area; seriously, one of the greatest things would be to do what many of you are already doing: being thoughtful members of this community, helping address others' concerns here, and busting search engine myths.
Improving the quality of our index (and, I dare say, the Web on the whole) isn't just about busting spam. Of equal or greater importance, it's about empowering well-meaning Webmasters with CORRECT AND USEFUL information to help them make great content available online. The greater success we encourage and facilitate for this group, the less likely there are to be "gray hat" folks who dabble in destructive behaviors out of ignorance or frustration.
Reporting spam via our spam form is helpful, indeed, but I like to look at the greater picture. :)
> Do you manually confirm the spamminess of a site before adding it to the spam corpus?
As I recently noted in another thread, we do have a pretty complex set of checks and balances. Overall, quality issues are handled in a myriad of different ways... depending upon a lot of different factors.
> When you add a site to the spam corpus for later tests do you add the URL or the actual contents? (eg if a site changes after being marked as spam - do you test the changed site - as spam, though it isn't anymore - or the contents of the old site? or is that obvious? :-))
We like to work with as much information as we possibly can. One key takeaway: if a site previously engaged in spammy behavior but is now completely "clean," and the webmaster has filed a reinclusion request, we are unlikely to hold a grudge.
> What can we do if we feel that a manual nudge is required (ie *very spammy, very large scale*)? If the spam reports just land in the "list of URLs to be marked as bad" you might miss (or be too late) on something big...
While I hesitate to open the floodgates in this area, let me try this policy out for size at least for a bit: If you note a SIGNIFICANTLY LARGE or problematic network impacting our search results, feel free to post a note about that in this group, which folks here monitor daily.
An example of a problematic issue would be that a particular set of domains or subdomains is suddenly appearing for a query that is not wildly "longtail." In other words, you might guess that we're less *urgently* concerned about a single spam result showing up for "broken car door poetry" than a sudden influx of nasties for "music history."
> And the most interesting of them all: do you work together with the Adsense team? I imagine if you have the Adsense-ID from one site, you could probably knock out the whole setup. Wouldn't something like that be a nice add-on to the algo -- Adsense-ID blacklist lookup :-)
We are pretty constrained by various privacy restrictions as to what data can be shared with what teams and in what contexts. For instance, we don't use data from Analytics to penalize specific sites, even spammer sites.
> What do you aim for with algorithmic tweaks?
At the end of the day, we look for an increase in search-related user happiness. That's measured in a LOT of ways, quantitatively and exhaustively. We love charts, especially spam charts with nice lines trending downwards :)
> Tell us something about the spam penalty :-).
You don't want to make the Googlebot mad. Let's just leave it at that :-P
> What can an honest webmaster do to make sure he isn't hit by a "spam-penalty"? Say I add affiliate links to my site to pay for the hosting, I make $5/month on it, my visitors are mostly similar african beetle belly-lint collectors, my site has few inbound links because it's obscure. My son did the website in Frontpage (and you can tell) and told me to put some hidden text on it to make sure it ranks high for "african belly-lint, african beetles". I realize it's not good, but I don't have a clue about how to fix it. Will I be penalized / banned constantly / forever? Why? [if you can algorithmically recognize the hidden-text, why not just ignore it?]
The three best things Webmasters can do to garner Googlelove, userlove, and avoid the wrath of the Googlebot:
- Do the Right Thing. That's not always cut and dry, but in *most* instances, it's reasonably intuitive. Having a page that is 98.6% ads is not going to make users happy. Using some "optimization" software you bought for "Just $49.95... today only!" is probably not the wisest investment you could make ;) By the way, affiliate links or ads in and of themselves are hardly a big red flag. Lots of wonderful (and highly ranked) sites have 'em!
- Read, read, read! Matt Cutts's blog. Our Webmaster Guidelines. Two great places to start.
- Collect wisdom. Use analytics (ours or a package by another trustable company). Is everyone leaving your site quickly? Do you have a lot of 404s? Are your pages each taking 4187234 seconds to load? Does your Uncle Bob think your site is confusing and overwhelming or looking like a bad infomercial (without even Cher's good looks)? Bad signs.
And don't worry about traffic. We know that a million people aren't going to instantly flock to your son's site about African beetle belly-lint collectors. And that's okay.
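For the "do you have a lot of 404s?" check in the list above, a webmaster without an analytics package can get a rough answer from the server's own access log. A minimal sketch, assuming a log in the common Apache-style format; the sample lines, paths, and the function name `count_404s` are invented for illustration.

```python
import re
from collections import Counter

# Matches the request and status part of a common-log-format line, e.g.
#   "GET /page.html HTTP/1.0" 404 209
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404s(lines):
    """Count how often each requested path returned a 404."""
    missing = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            missing[match.group("path")] += 1
    return missing


# Made-up sample lines; in practice these would be read from the log file.
sample_log = [
    '192.0.2.1 - - [10/Oct/2006:13:55:36 -0700] "GET /beetles.html HTTP/1.0" 200 2326',
    '192.0.2.2 - - [10/Oct/2006:13:56:01 -0700] "GET /old-page.html HTTP/1.0" 404 209',
    '192.0.2.3 - - [10/Oct/2006:13:57:12 -0700] "GET /old-page.html HTTP/1.0" 404 209',
    '192.0.2.4 - - [10/Oct/2006:13:58:40 -0700] "GET /lint.html HTTP/1.0" 404 209',
]

missing = count_404s(sample_log)
for path, hits in missing.most_common():
    print(f"{hits:3d}  {path}")
```

Paths that show up repeatedly are usually broken internal links or pages that moved without a redirect, which are the ones worth fixing first.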
But hidden text? Well, if that's okay and not penalized, how about some cloaking? Just a little bit, it can't hurt. Everyone else is doing it, right? Therein lies one of the considerations: Unabashed and unpunished efforts to mislead search engine bots and users portend a slippery slope...
If your son added hidden text and the site got penalized... 'tis not the end of the world. One look at the Guidelines, and you see this under "Quality guidelines - specific guidelines":
* Avoid hidden text or hidden links.
Pretty clear. Hidden text removed, reinclusion request filed, a little patience, and voila, unpenalized :)
Additionally, we're really ramping up our efforts to e-mail folks like your son to *notify* them of the hidden text issue and give them a chance to fix it.
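For a webmaster who, like the one in the question, doesn't know where the hidden text is, a rough self-check is possible with nothing but the Python standard library. This is a heuristic sketch only: it looks for inline `display:none` / `visibility:hidden` styles, assumes reasonably well-nested markup, and will not catch same-color text or rules in external stylesheets. It is not a description of how Google detects hidden text, and the class name `HiddenTextFinder` is made up.

```python
import re
from html.parser import HTMLParser

# Inline styles that hide an element from visitors.
HIDING_STYLE = re.compile(r"display\s*:\s*none|visibility\s*:\s*hidden", re.I)

# Elements with no closing tag; they never contain text, so skip them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collect text that sits inside an element hidden via inline style."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: hidden (itself or via ancestor)?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        parent_hidden = self.stack[-1] if self.stack else False
        self.stack.append(parent_hidden or bool(HIDING_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


page = (
    "<html><body><h1>Beetles</h1>"
    '<div style="display:none">african belly-lint, african beetles</div>'
    "<p>Welcome <br> friends</p></body></html>"
)

finder = HiddenTextFinder()
finder.feed(page)
print(finder.hidden_text)
```

Anything it prints is a candidate for removal before filing the reinclusion request; an empty list does not prove the page is clean.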
Rick:
> Strange, I saw that "report spam" area in the SM login area more than a week ago, and today I can't find it! This area of which you speak Earl, is it for OTHER sites, or YOUR site that's under the Sitemaps area?
Should be under the Tools link on the righthand side. Let me know if you still don't see it!
That's much more than I expected to see here in the group, Adam :-). Thanks!
So, if spam-reporting isn't that important to you, how can we give you signals about search quality - I assume you need it on a datacenter-basis, right? Or am I thinking too simply again, and you prefer to use "subconscious signals" - paging in the results, clicking in the top entries and staying there, etc - rather than user-submitted quality values (like the post-rankings here)? Is it "just act natural and we'll notice"?
(time to pull out the ol' AOL database and run statistics again :-))
Adam, does Google take spam reports from all niches seriously? For example, if only the SERPs for viagra ("free viagra", "cheap viagra", "buy viagra", etc) returned spammy results, would Google refrain from making any changes to its algo because doing so may turn the SERPs in other niches (e.g. "online banking") upside down?
Log into Sitemaps. The first page will show all the sitemaps you have. In the upper right-hand side of the page, click on the blue +Tools link and you will get a drop-down; it's in there.