* Please submit spam reports to http://www.google.com/contact/spamreport.html.
* Or better yet -- if you have a Webmaster Tools account -- you can report spam directly to Google. On the Dashboard, click Tools, and then click Report spam in our index. Note that you don't need to have an XML sitemap in order to create an account, and you can typically create a Webmaster Tools account in under two minutes. Spam reports sent via Webmaster Tools carry more weight than reports via our unauthenticated (open) spam report page.
Can Google handle a 'Spam Report' from spammers? Why should it rely on 'smellers'? That's funny. I'll publish a link to my site on 750000 blogs tomorrow. I don't care. Learn math, and don't forget about your PhD. Have fun!
Spam reports are manually reviewed. Anyone who files numerous false spam reports would likely find their future reports, as well as any not yet reviewed, ignored.
> On Jul 1, 12:19 pm, Bambarbia Kirkudu! wrote:
> > Can Google handle 'Spam Report' from spammers? Why it should rely on 'smellers'? That's funny. I'll publish link to my site on 750000 blogs tomorrow. I don't care. Learn Math, and don't forget about your PhD. Have a fun!
> > On Jun 30, 11:16 pm, Bambarbia Kirkudu! wrote:
> > > Adam wrote:
> > > * Please submit spam reports to http://www.google.com/contact/spamreport.html.
> > > * Or better yet -- if you have a Webmaster Tools account -- you can report spam directly to Google. On the Dashboard, click Tools, and then click Report spam in our index. Note that you don't need to have an XML sitemap in order to create an account, and you can typically create a Webmaster Tools account in under two minutes. Spam reports sent via Webmaster Tools carry more weight than reports via our unauthenticated (open) spam report page.
> > > It smells bad, isn't it?
> > > +1
There is quite a lot of software around, and you can easily develop your own spam-buster too.
The truth is that all search engines have a constraint: the size of the HTML they fetch is limited, and in most cases a crawler "grabs" only the first 65536 bytes. "SPAM" usually sits below the 128 KB-256 KB mark of such long dead pages, so Google does not see it anyway. You can find millions of dead pages (hm...) that allow anyone to post without limitations, but Google can handle only the first 100 links at most, and index only the first 65 KB... more or less...
On Jul 1, 7:38 am, Data wrote:
> I have reported a spam site recently by this method. Just be patient, it doesn't happen overnight - to say the least!
> Regards
> Data
> The truth is that all search engines have a constraint: size of HTML is limited, it "grabs" in most cases first 65536 bytes only.
That may have been the truth at some point in history but it isn't now, at least for Google.
Do a search on the following string, minus the parentheses: ("Download the demo files" cass-hacks)
There will likely be only a single result, with a "repeat the search with the omitted results included" link displayed. Click the repeat search link and then take a look at the second result.
The phrase "Download the demo files" is located at the bottom of a 76k portion of that page with the entire page being around 91k.
As far as I have seen, there is no longer any limit as to how much search engines will parse and index.
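For anyone who wants to repeat this kind of check on their own page, here is a minimal Python sketch. It only measures where a phrase sits in the raw bytes of a page and compares that against a cap; the 65536-byte default is the figure claimed earlier in the thread, and the page below is synthetic filler built to match the rough 76k/91k sizes mentioned above, not the real page. The function names are made up for illustration.

```python
def phrase_offset(page: bytes, phrase: str) -> int:
    """Return the byte offset of the first occurrence of phrase, or -1."""
    return page.find(phrase.encode("utf-8"))


def within_limit(page: bytes, phrase: str, limit: int = 65536) -> bool:
    """True if the whole phrase lies inside the first `limit` bytes."""
    offset = phrase_offset(page, phrase)
    if offset == -1:
        return False
    return offset + len(phrase.encode("utf-8")) <= limit


# Synthetic stand-in for the page discussed above (assumption: plain filler
# bytes, about 76k before the phrase and about 91k in total).
PHRASE = "Download the demo files"
page = b"x" * 76_000 + PHRASE.encode("utf-8") + b"y" * 15_000

offset = phrase_offset(page, PHRASE)
inside_64k = within_limit(page, PHRASE)
print(f"page size: {len(page)} bytes, phrase at byte {offset}")
print(f"inside first 65536 bytes: {inside_64k}")
```

If a search engine returns the page for a phrase that sits past the cap, the cap being tested is evidently not the one in effect for that engine.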
> but Google can handle only first 100 links at most, and index only first 65Kb.
Google crawled them all just because it found links to them within the first 100 anchors on other pages. Can you check Webmaster Tools and confirm that the Links page shows 500 outgoing links from an index page pointing to different pages? Sure, it is very difficult: Google Links shows only the incoming links to a single page; it does not show 500 outgoing links from /my_site_index.html:
> I have some pages that are 150~250k and they get indexed. I have index's with over 500 links. Google crawled them all, just not all at once.
> On Jul 4, 1:08 pm, Bambarbia Kirkudu! wrote:
> > What is spam anyway?
> I had a page 300kb
> With 3000 links and Google indexed the page and crawled all the links!
> On Jul 5, 8:48 am, cass-hacks wrote:
> > > The truth is that all search engines have a constraint: size of HTML is limited, it "grabs" in most cases first 65536 bytes only.
> > That may have been the truth at some point in history but it isn't now, at least for Google.
> The phrase "Download the demo files" is located at the bottom of a 76k portion of that page with the entire page being around 91k.
My point is: size is limited. At least for HTML. It could have been 64 KB 10 years ago, 128 KB last year, and 256 KB now. But it is limited; it must be limited in order to avoid Denial-of-Service attacks from some websites. For example, the Nutch search engine has a default setting of 64 KB, which can be changed via configuration, and it has such a limit for a reason. It is used by Yahoo now, for some searches (Creative Commons).
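To make the effect of such a cap concrete, here is a small Python sketch. It is not Nutch code and says nothing about what Google actually does; `CONTENT_LIMIT`, `LinkCollector`, `links_seen` and `build_page` are names invented for this example, and the 5000-link page is generated on the spot. It just shows that a parser fed only the first 64 KB of a long page never sees the links further down.

```python
from html.parser import HTMLParser

# Hypothetical cap, taken from the 64 KB figure above.
CONTENT_LIMIT = 65536


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it is fed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_seen(html: bytes, limit=None):
    """Return the links a parser sees when given at most `limit` bytes."""
    data = html if limit is None else html[:limit]
    parser = LinkCollector()
    parser.feed(data.decode("utf-8", errors="ignore"))
    parser.close()
    return parser.links


def build_page(count: int) -> bytes:
    """Build a synthetic index page with `count` outgoing links."""
    anchors = "".join(
        f'<a href="/page{i}.html">page {i}</a>\n' for i in range(count)
    )
    return f"<html><body>\n{anchors}</body></html>".encode("utf-8")


page = build_page(5000)
all_links = links_seen(page)
capped_links = links_seen(page, CONTENT_LIMIT)
print(f"page size: {len(page)} bytes")
print(f"links in full page: {len(all_links)}")
print(f"links in first {CONTENT_LIMIT} bytes: {len(capped_links)}")
```

Raising `CONTENT_LIMIT` moves the cut-off further down the page, which is the same knob the configuration setting mentioned above is said to provide.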
> Google crawled them all just because it found links in top-100 anchors on other pages. Can you check Webmaster Tools and ensure that Links page contain 500 outgoing links from an index page pointing to different pages?
Actually you can. If it is up to date, which is questionable at best, the "Internal" Links section of Webmaster Tools will show the count of internal links that it knows about.
Just add up the total links you know to be on the site and the total internal links Google shows and compare.
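A rough way to get the "links you know to be on the site" half of that comparison is to count them from your own HTML. The sketch below is Python with made-up pages on example.com; the `reported_total` number is a placeholder you would copy by hand from the "Internal" Links section, since nothing here talks to Webmaster Tools.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class HrefParser(HTMLParser):
    """Collects raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(page_url: str, html: str):
    """Return absolute URLs of links that stay on the same host as page_url."""
    host = urlparse(page_url).netloc
    parser = HrefParser()
    parser.feed(html)
    parser.close()
    resolved = (urljoin(page_url, href) for href in parser.hrefs)
    return [url for url in resolved if urlparse(url).netloc == host]


# Made-up site content for illustration only.
site = {
    "http://example.com/": (
        '<a href="/a.html">A</a> <a href="b.html">B</a> '
        '<a href="http://other.example/">elsewhere</a>'
    ),
    "http://example.com/a.html": '<a href="/">home</a>',
}

known_total = sum(len(internal_links(url, html)) for url, html in site.items())
reported_total = 2  # placeholder: the figure read off the "Internal" Links page
missing = known_total - reported_total
print(f"known: {known_total}, reported: {reported_total}, missing: {missing}")
```

A gap between the two totals does not say which links were skipped or why, only that the tool is not reporting all of them yet.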
I'm curious though, where are you getting the information that search engines ignore anything after the first 100 links?
> > "This table provides a list of internal pages that LINK TO http://www.tokenizer.org/?q=APRO"
> That link is supposed to show what?
I simply copy-pasted text from Google Webmaster Tools. I can't prove anything; I can only guess (sometimes). I am a search engine developer (mostly with open source: Lucene, Nutch, ...).
This isn't exactly spam but I'm wondering if any of you know or have an idea what the motive is behind it.
Note: These are not the links that appear on hacked sites.
There seem to be what I call "crawl and run sites". They exist for a very short time and are nothing more than links to other sites, BUT none of the links are valid. All they seem to do is target domains with keywords in the links, pointing to pages that don't exist. If these sites were link farms, they'd link to valid URLs. If they were just trying to pick up visitors for AdSense, they'd last longer than it takes to get crawled. Google seems to be the target for most SEO scams, but I really can't think these types of sites would even make it into Google, so I am thinking it's some scheme used to fool other search engines; I'm just clueless as to which ones. It's rare they last long enough to make it into any engine's index. The engines where I have finally been able to find any of the crazy URLs appear to be some half-ass useless ones, usually in some country you didn't even know existed.
But the site: command cannot prove that Google counted 5000 outgoing links from a single page. 5000 pages were crawled, and the site: command shows that; but are we sure that the "pointers" to those pages come from a single page only?
> Google tools does not show the whole story.
> You have to do a site: command.
> On Jul 6, 11:27 pm, Bambarbia Kirkudu! wrote:
> > Google crawled them all just because it found links in top-100 anchors on other pages. Can you check Webmaster Tools and ensure that Links page contain 500 outgoing links from an index page pointing to different pages?
> > Sure, it is very difficult: Google Links shows only incoming links to a single page, it does not show 500 outgoing links from /my_site_index.html:
> but site: command can not prove that Google counted 5000 outgoing links from a single page. 5000 pages were crawled, and site: command shows it; are we sure that we have "pointers" to those pages from a single page only?
> On Jul 6, 10:59 am, ivb wrote:
> > Google tools does not show the whole story.
> > You have to do a site: command.