I wanted to give everyone a brief end-of-the-year update on the
blogroll problem. When we switched blogsearch to indexing the full
text of posts, we started seeing a lot more results where the only
matches for a query were from the blogroll or other parts of the page
that frame the actual post. (There's been a lot of discussion of the
problem. You can search for [google blogsearch] using Google
Blogsearch.)
We're in the midst of deploying a solution for this problem. The
basic approach is to analyze each blog to look for text and markup
that is common to all of the posts. Usually, these common elements
include the blogroll, any navigational elements, and other parts of
the page that aren't part of the post. This approach works well for a
lot of blogs, but we're continuing to improve the algorithm. The
search results should ignore matches that only come from these common
elements. The indexing change to implement it is deployed almost
everywhere now.
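A minimal sketch of the idea, not the actual Blogsearch implementation: it assumes each post page is available as plain text, treats any line that appears on every page of a blog as a common element, and counts a query match only if it occurs in the remaining post-specific text. The function names, the line-level granularity, and the sample pages are all hypothetical.

```python
from collections import Counter


def common_lines(pages, threshold=1.0):
    """Return lines that appear on at least `threshold` (a fraction) of the pages."""
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    needed = threshold * len(pages)
    return {line for line, n in counts.items() if n >= needed}


def post_text(page, boilerplate):
    """Drop the common lines, leaving only text specific to this post."""
    kept = []
    for line in page.splitlines():
        line = line.strip()
        if line and line not in boilerplate:
            kept.append(line)
    return "\n".join(kept)


def matches(query, page, boilerplate):
    """True only if the query occurs in the post-specific text."""
    return query.lower() in post_text(page, boilerplate).lower()


# Hypothetical blog: the title and blogroll repeat on every post page.
pages = [
    "My Blog\nBlogroll: Jane Doe\nToday I wrote about search.",
    "My Blog\nBlogroll: Jane Doe\nAn interview with Jane Doe.",
    "My Blog\nBlogroll: Jane Doe\nNotes on gardening.",
]
boilerplate = common_lines(pages)
hits = [matches("jane doe", page, boilerplate) for page in pages]
print(hits)
```

Only the second page counts as a hit, because the other two mention the name solely in the repeated blogroll line. Lowering `threshold` below 1.0 would tolerate blogs where a few pages lack the shared elements, at the risk of discarding text that legitimately recurs across posts.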
We expect users will continue to see some spurious results, but many
fewer than before. I tried a search for my own name, which does
appear in a few blogrolls, and all the results looked good. If you
are still seeing blogroll hits, the problem is most likely caused by
our failure to analyze a particular blog correctly. Feel free to
follow up with examples in private email or in this forum.
Jeremy Hylton
Google Blogsearch
Curious - around the same time of the initial report, I started
getting Google Alerts with blogroll links. If anything, it's become
*more* common and not less common lately. Does the change you write
about, Jeremy, impact Google Alerts?
If not, perhaps someone should take a look.
Thanks.
On Dec 19, 1:25 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> I wanted to give everyone a brief end-of-the-year update on the
> blogroll problem. When we switched blogsearch to indexing the full
> text of posts, we started seeing a lot more results where the only
> matches for a query were from the blogroll or other parts of the page
> that frame the actual post. (There's been a lot of discussion of the
> problem. You can search for [google blogsearch] using Google
> Blogsearch.)
> We're in the midst of deploying a solution for this problem. The
> basic approach is to analyze each blog to look for text and markup
> that is common to all of the posts. Usually, these common elements
> include the blogroll, any navigational elements, and other parts of
> the page that aren't part of the post. This approach works well for a
> lot of blogs, but we're continuing to improve the algorithm. The
> search results should ignore matches that only come from these common
> elements. The indexing change to implement it is deployed almost
> everywhere now.
> We expect users will continue to see some spurious results, but many
> fewer than before. I tried a search for my own name, which does
> appear in a few blogrolls, and all the results looked good. If you
> are still seeing blogroll hits, the problem is most likely caused by
> our failure to analyze a particular blog correctly. Feel free to
> follow up with examples in private email or in this forum.
It has become even more common. If Google Blog Search isn't finding
these blogroll hits, it is finding spam. In the last 3 days, I have
seen exactly ONE result which was not a result from the blogroll or a
SPLOG.
On Dec 26, 8:34 am, tamar <puntr...@gmail.com> wrote:
> Curious - around the same time of the initial report, I started
> getting Google Alerts with blogroll links. If anything, it's become
> *more* common and not less common lately. Does the change you write
> about, Jeremy, impact Google Alerts?
> If not, perhaps someone should take a look.
> Thanks.
On Dec 28, 11:10 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> Tamar,
> It has become even more common. If Google Blog Search isn't finding
> these blogroll hits, it is finding spam. In the last 3 days, I have
> seen exactly ONE result which was not a result from the blogroll or a
> SPLOG.
Can you tell me the specific queries that are showing bad results?
Also, is the problem specific to alerts or do you see them in regular
blogsearch results, too?
I could write out a lengthy explanation of the different searches I do
in Google Blog Search, but since this is all visual, I decided it
would be more efficient just to use screenshots.
I have tagged almost all of the results with what they are, either
Blogroll results or my personal favorite, Fake DVD Review SPLOGS. A
few that are either legit, or that I am unsure about, are left
mostly blank.
Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
or link:www.domain.com (where domain.com is my blog).
I don't check blogsearch results regularly, but I just performed a
search for the purposes of giving you as much information as possible
and saw a result that showed my blog on the sidebar navigation from 4
hours ago.
That said, I'm pretty certain that this isn't fully addressed. :(
On Dec 29 2008, 11:35 am, Jeremy Hylton <jhyl...@gmail.com> wrote:
> On Dec 28, 11:10 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > Tamar,
> > It has become even more common. If Google Blog Search isn't finding
> > these blogroll hits, it is finding spam. In the last 3 days, I have
> > seen exactly ONE result which was not a result from the blogroll or a
> > SPLOG.
> Can you tell me the specific queries that are showing bad results?
> Also, is the problem specific to alerts or do you see them in regular
> blogsearch results, too?
> Jeremy
On Jan 1, 9:54 pm, tamar <puntr...@gmail.com> wrote:
> Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
> or link:www.domain.com (where domain.com is my blog).
> I don't check blogsearch results regularly, but I just performed a
> search for the purposes of giving you as much information as possible
> and saw a result that showed my blog on the sidebar navigation from 4
> hours ago.
> That said, I'm pretty certain that this isn't fully addressed. :(
I agree that the problem isn't fully addressed :-(. I just did a
link: search for your blog. It returned 10 results ranging from 37
minutes old to several days old (Jan 1). There were two results that
obviously came from the blogroll, one from http://janefouts.com/ and
one from http://simplystated.realsimple.com/. We'll have to see why
we failed to detect those links as coming from the blogroll. There
are also a few results that came from Techcrunch posts that you
commented on. The comment has a link to your blog. I think those are
legitimate results, but I'd be interested to hear what users think.
So we're at 80% accuracy at this very moment. It's better than it
was, but there is obviously a lot of room for improvement.
Thanks Jeremy. As far as comments showing up in these searches,
you're right - that may be a little out of place, but I'm actually not
averse to seeing those in my queries/alerts emails. It's more of a
concern when I see links coming from random sidebars (repeatedly, like
simplystated.realsimple.com).
I appreciate that you're still looking into it!
On Jan 7, 12:58 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> On Jan 1, 9:54 pm, tamar <puntr...@gmail.com> wrote:
> > Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
> > or link:www.domain.com (where domain.com is my blog).
> > I don't check blogsearch results regularly, but I just performed a
> > search for the purposes of giving you as much information as possible
> > and saw a result that showed my blog on the sidebar navigation from 4
> > hours ago.
> > That said, I'm pretty certain that this isn't fully addressed. :(
> I agree that the problem isn't fully addressed :-(. I just did a
> link: search for your blog. It returned 10 results ranging from 37
> minutes old to several days old (Jan 1). There were two results that
> obviously came from the blogroll, one from http://janefouts.com/ and
> one from http://simplystated.realsimple.com/. We'll have to see why
> we failed to detect those links as coming from the blogroll. There
> are also a few results that came from Techcrunch posts that you
> commented on. The comment has a link to your blog. I think those are
> legitimate results, but I'd be interested to hear what users think.
> So we're at 80% accuracy at this very moment. It's better than it
> was, but obviously a lot of room for improvement.
> Jeremy
In my particular case, it's a little weird. Before Blogsearch started
to index blogroll links, everything was fine: when I searched using
the command link: mysite.com, it used to return 50+ backlinks.
Now, it only shows 2.
Why is that? Maybe some reset or something?
It looks like no progress has been made on this front AT ALL. The
Google Alert emails I receive are nothing but spam at this point.
Plus, I keep receiving the same emails again and again -- it's not
necessarily a "blogroll" issue but the same OLD content is being
treated by Google Blogsearch as new content. On one search query,
I've received the same result at least 10 times.
Jeremy and team, please don't forget about us.
We are having similar experiences, not just with blogroll references
but also with recent-post widgets and the like on the blogs. Any time
another post was mentioned with a link, we frequently saw a mostly
irrelevant page substituted for the relevant page in the index. It
makes for a worse user experience, but we've ended up removing our
blogrolls from the sidebars, removing "recent post" references from
the sidebars, altering the recent comment widget so it does not cite
posts by title, and changing "recent/next" post references at the top
of posts so that the links are generic references rather than post
titles. That seems to make the SERPs more appropriate, but it's
really not an ideal presentation. I hope this issue can be worked out.
On Jan 27, 8:22 am, tamar <puntr...@gmail.com> wrote:
> It looks like no progress has been made on this front AT ALL. The
> Google Alert emails I receive are spam and nothing but at this point.
> Plus, I keep receiving the same emails again and again -- it's not
> necessarily a "blogroll" issue but the same OLD content is being
> treated by Google Blogsearch as new content. On one search query,
> I've received the same result at least 10 times.
> Jeremy and team, please don't forget about us.
Yep, the problem remains. Either SPAM or Blogroll for 90% of
results. The SPAM is actually getting worse. It's funny to see
SPLOGS at the top of the relevancy rankings, or better yet, almost the
entire first page of relevancy rankings being SPLOGS.
On Jan 27, 10:22 am, tamar <puntr...@gmail.com> wrote:
> It looks like no progress has been made on this front AT ALL. The
> Google Alert emails I receive are spam and nothing but at this point.
> Plus, I keep receiving the same emails again and again -- it's not
> necessarily a "blogroll" issue but the same OLD content is being
> treated by Google Blogsearch as new content. On one search query,
> I've received the same result at least 10 times.
> Jeremy and team, please don't forget about us.
Is anything at ALL being done about this? I'm starting to consider
either:
1. flagging all Google Alerts sent to my Gmail inbox as spam (cuz uh,
they contain spammy results)
2. unsubscribing from Google Alerts -- since the results returned
aren't relevant and they certainly aren't fresh. (Come on, isn't
Google's mission to organize the world's information? This is clearly
disorganized and in a very bad way.)
Google: we've been pretty darn patient. This thread started in
December and referenced an even older incident. It's February now.
Is ANYONE paying attention to this? Please?
Thanks.
(p.s. a Google Alert email just prompted this post update. I don't
really post about this out of the blue.)
On Feb 2, 12:11 am, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> Yeah, same thing for me. It keeps reverting to these old results
> which are completely worthless.
> On Jan 31, 7:00 pm, tamar <puntr...@gmail.com> wrote:
> > Today, I got links from 2006 and 2007 in my link: query emails.
> > :(
> > On Jan 28, 6:25 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > Yep, the problem remains. Either SPAM or Blogroll for 90% of
> > > results. The SPAM is actually getting worse. It's funny to see
> > > SPLOGS at the top of the relevancy rankings, or better yet, almost the
> > > entire first page of relevancy rankings being SPLOGS.
Tamar,
Apologies for my tardy response. I'll be sure to give everyone an
update every week, even if we don't have much news to report.
As I mentioned, we made an initial attempt to fix the blogroll problem
in December. It fixed some fraction of the results that were coming
from blogrolls, but was inadequate in a number of ways. For some
blogs, the blog roll detection didn't pick anything up. For other
blogs, it detected some items in the blog roll, but not all of them. My
colleague Rick Klau was particularly unlucky. His blog appears in the
blog rolls of many legal blogs. I noticed that we often detect every
blog but his as a blogroll entry. We've been looking at a collection
of backlink queries (with the link: operator) and still see about 50%
of the results coming from blog rolls. So there is obviously a lot of
room for improvement.
We have been working on an improved blog roll detector. Our internal
tests look fairly promising, but there is a lot of variability in blog
markup that we need to handle. It's going to be a few more weeks
until we can start to deploy it. I'll see if I can provide a better
ETA next week.
I haven't been paying attention to the Google Alerts specifically.
The accuracy I mentioned earlier was for the regular search results.
I'll make sure we add some metrics that look at Alerts quality so that
we don't forget about it again. The basic solution is the same for
search results and for alerts, but maybe there's something more we can
do for alerts in the short term.
Jeremy
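For readers following the thread, here is a rough sketch of the general idea from the Dec 19 update: treat links that show up on every post page of a blog as part of the page template (blogroll, navigation) rather than part of any one post. This is only an illustration and not the detector Google built. The names `LinkCollector`, `links_on_page`, `template_links`, `post_only_links` and the `threshold` parameter are all made up for the example, and it only looks at link targets, not surrounding text or markup.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen on a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)


def links_on_page(html):
    """Return the set of link targets found in one page of HTML."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def template_links(post_pages, threshold=1.0):
    """Return links that appear on at least `threshold` of the post pages.

    `threshold` is a hypothetical knob: 1.0 means "on every page".
    """
    if not post_pages:
        return set()
    counts = Counter()
    for html in post_pages:
        # Each page contributes a link at most once.
        counts.update(links_on_page(html))
    needed = threshold * len(post_pages)
    return {link for link, n in counts.items() if n >= needed}


def post_only_links(html, common):
    """Links on one page that are not part of the shared template."""
    return links_on_page(html) - common


# Two toy post pages from the same (hypothetical) blog.
pages = [
    '<a href="http://a.example/">A</a><a href="http://roll.example/">roll</a>',
    '<a href="http://b.example/">B</a><a href="http://roll.example/">roll</a>',
]
common = template_links(pages)  # {"http://roll.example/"}
```

With something like this, a link: query would only count a page as a hit if the target shows up in `post_only_links(...)` for that page. Markup differs a lot from blog to blog, so a real detector would need far more than a set intersection.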
On Feb 6, 8:07 am, tamar <puntr...@gmail.com> wrote:
> Is anything at ALL being done about this? I'm starting to consider
> either:
> 1. flagging all Google Alerts sent to my Gmail inbox as spam (cuz uh,
> they contain spammy results)
> 2. unsubscribing from Google Alerts -- since the results returned
> aren't relevant and they certainly aren't fresh. (Come on, isn't
> Google's mission to organize the world's information? This is clearly
> disorganized and in a very bad way.)
> Google: we've been pretty darn patient. This thread started in
> December and referenced an even older incident. It's February now.
> Is ANYONE paying attention to this? Please?
> Thanks.
> (p.s. a Google Alert email just prompted this post update. I don't
> really post about this out of the blue.)
> On Feb 2, 12:11 am, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > Yeah, same thing for me. It keeps reverting to these old results
> > which are completely worthless.
> > On Jan 31, 7:00 pm, tamar <puntr...@gmail.com> wrote:
> > > Today, I got links from 2006 and 2007 in my link: query emails.
> > > :(
> > > On Jan 28, 6:25 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > > Yep, the problem remains. Either SPAM or Blogroll for 90% of
> > > > results. The SPAM is actually getting worse. It's funny to see
> > > > SPLOGS at the top of the relevancy rankings, or better yet, almost the
> > > > entire first page of relevancy rankings being SPLOGS.
> > > > On Jan 27, 10:22 am, tamar <puntr...@gmail.com> wrote:
> > > > > It looks like no progress has been made on this front AT ALL. The
> > > > > Google Alert emails I receive are spam and nothing but at this point.
> > > > > Plus, I keep receiving the same emails again and again -- it's not
> > > > > necessarily a "blogroll" issue but the same OLD content is being
> > > > > treated by Google Blogsearch as new content. On one search query,
> > > > > I've received the same result at least 10 times.
> > > > > Jeremy and team, please don't forget about us.
> > > > > On Jan 22, 9:39 am, tamar <puntr...@gmail.com> wrote:
> > > > > > Any update? It's been 3 weeks.
> > > > > > On Jan 7, 12:58 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
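> > > > > > > On Jan 1, 9:54 pm, tamar <puntr...@gmail.com> wrote:
> > > > > > > > Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
> > > > > > > > or link:www.domain.com (where domain.com is my blog).
> > > > > > > > I don't check blogsearch results regularly, but I just performed a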
> > > > > > > > search for the purposes of giving you as much information as possible
> > > > > > > > and saw a result that showed my blog on the sidebar navigation from 4
> > > > > > > > hours ago.
> > > > > > > > That said, I'm pretty certain that this isn't fully addressed. :(
> > > > > > > I agree that the problem isn't fully addressed :-(. I just did a
> > > > > > > link: search for your blog. It returned 10 results ranging from 37
> > > > > > > minutes old to several days old (Jan 1). There were two results that
> > > > > > > obviously came from the blogroll, one from http://janefouts.com/ and
> > > > > > > one from http://simplystated.realsimple.com/. We'll have to see why
> > > > > > > we failed to detect those links as coming from the blogroll. There
> > > > > > > are also a few results that came from Techcrunch posts that you
> > > > > > > commented on. The comment has a link to your blog. I think those are
> > > > > > > legitimate results, but I'd be interested to hear what users think.
> > > > > > > So we're at 80% accuracy at this very moment. It's better than it
> > > > > > > was, but obviously a lot of room for improvement.
> > > > > > > Jeremy
> > > > > > > > On Dec 29 2008, 11:35 am, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > On Dec 28, 11:10 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > > > > > > > > Tamar,
> > > > > > > > > > It has become even more common. If Google Blog Search isn't finding
> > > > > > > > > > these blogroll hits, it is finding spam. In the last 3 days, I have
> > > > > > > > > > seen exactly ONE result which was not a result from the blogroll or a
> > > > > > > > > > SPLOG.
> > > > > > > > > Can you tell me the specific queries that are showing bad results?
> > > > > > > > > Also, is the problem specific to alerts or do you see them in regular
> > > > > > > > > blogsearch results, too?
> > > > > > > > > Jeremy
On Feb 6, 6:03 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> Tamar,
> Apologies for my tardy response. I'll be sure to give everyone an
> update every week, even if we don't have much news to report.
> As I mentioned, we made an initial attempt to fix the blogroll problem
> in December. It fixed some fraction of the results that were coming
> from blogrolls, but was inadequate in a number of ways. For some
> blogs, the blog roll detection didn't pick anything up. For other
> blogs, it detected some items in the blog roll, but not all of them. My
> colleague Rick Klau was particularly unlucky. His blog appears in the
> blog rolls of many legal blogs. I noticed that we often detect every
> blog but his as a blogroll entry. We've been looking at a collection
> of backlink queries (with the link: operator) and still see about 50%
> of the results coming from blog rolls. So there is obviously a lot of
> room for improvement.
I wanted to clarify this point a little bit. The problem really is
worst for people with popular blogs. The average user is getting more
and better results as a consequence of the indexing changes that
introduced the blogroll problems. We return results from blogs
with partial content feeds. We index comments. We discover more
links. So a lot of our internal analysis shows that most queries do
better as a result of the changes. If there weren't some real
benefits to the indexing changes, we would have reverted to the old
version.
Jeremy
1. Lots of redundancy. For example, 25 separate Google Alerts have
arrived in my inbox since 12/18/08 from a single blog source citing
the SAME exact blog post (nothing new!)
2. Old posts from 2006/2007.
3. The blogroll issue
That said, the issue seems to not necessarily be limited to the
blogroll itself. The entire system is a mess. And while I say Google
Alerts, I'm able to reproduce the problems every time simply by going
to blogsearch.google.com, so I don't really think you need to focus
too much on Google Alerts. After all, it seems to be gathering data
from a system that isn't exactly returning relevant results.
Also, some of the data I actually receive is not tied to popular blogs
of mine at all. I understand the indexing problems; I'm not
requesting that you revert to the old system, but I still contend that
the new system gives me 95% noise and 5% reasonable results, which is
pretty poor.
Hopefully Google's deployment of the fixes will address the issue.
p.s. I'll be happy to send you the *really* awkward results I've
received that illustrate all of the above issues if you want them...unless,
of course, you already received them. ;)
On Feb 6, 10:12 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> On Feb 6, 6:03 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > Tamar,
> > Apologies for my tardy response. I'll be sure to give everyone an
> > update every week, even if we don't have much news to report.
> > As I mentioned, we made an initial attempt to fix the blogroll problem
> > in December. It fixed some fraction of the results that were coming
> > from blogrolls, but was inadequate in a number of ways. For some
> > blogs, the blog roll detection didn't pick anything up. For other
> > blogs, it detected some items in the blog roll, but not all of them. My
> > colleague Rick Klau was particularly unlucky. His blog appears in the
> > blog rolls of many legal blogs. I noticed that we often detect every
> > blog but his as a blogroll entry. We've been looking at a collection
> > of backlink queries (with the link: operator) and still see about 50%
> > of the results coming from blog rolls. So there is obviously a lot of
> > room for improvement.
> I wanted to clarify this point a little bit. The problem really is
> worst for people with popular blogs. The average user is getting more
> and better results as a consequence of the indexing changes that
> introduced the blogroll problems. We return results from blogs
> with partial content feeds. We index comments. We discover more
> links. So a lot of our internal analysis shows that most queries do
> better as a result of the changes. If there weren't some real
> benefits to the indexing changes, we would have reverted to the old
> version.
> Jeremy
It seems to have been better as of late until yesterday. All of a
sudden it reverted back to some old version and results from 2007 are
now coming up as the most relevant. As always, most of the recent
results have vanished if you search by date, with the majority from 2
weeks to 2 months ago.
On Feb 7, 10:00 pm, tamar <puntr...@gmail.com> wrote:
> 1. Lots of redundancy. For example, 25 separate Google Alerts have
> arrived in my inbox since 12/18/08 from a single blog source citing
> the SAME exact blog post (nothing new!)
> 2. Old posts from 2006/2007.
> 3. The blogroll issue
> That said, the issue seems to not necessarily be limited to the
> blogroll itself. The entire system is a mess. And while I say Google
> Alerts, I'm able to reproduce the problems every time simply by going
> to blogsearch.google.com, so I don't really think you need to focus
> too much on Google Alerts. After all, it seems to be gathering data
> from a system that isn't exactly returning relevant results.
> Also, some of the data I actually receive is not tied to popular blogs
> of mine at all. I understand the indexing problems; I'm not
> requesting that you revert to the old system, but I still contend that
> the new system gives me 95% noise and 5% reasonable results, which is
> pretty poor.
> Hopefully Google's deployment of the fixes will address the issue.
> p.s. I'll be happy to send you the *really* awkward results I've
> received that illustrate all above issues if you want them...unless,
> of course, you already received them. ;)
> > > > > > > > > On Jan 7, 12:58 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > > On Jan 1, 9:54 pm, tamar <puntr...@gmail.com> wrote:
> > > > > > > > > > > Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
> > > > > > > > > > > or link:www.domain.com (where domain.com is my blog).
> > > > > > > > > > > I don't check blogsearch results regularly, but I just performed a
> > > > > > > > > > > search for the purposes of giving you as much information as possible
> > > > > > > > > > > and saw a result that showed my blog on the sidebar navigation from 4
> > > > > > > > > > > hours ago.
> > > > > > > > > > > That said, I'm pretty certain that this isn't fully addressed. :(
> > > > > > > > > > I agree that the problem isn't fully addressed :-(. I just did a
> > > > > > > > > > link: search for your blog. It returned 10 results ranging from 37
> > > > > > > > > > minutes old to several days old (Jan 1). There were two results that
> > > > > > > > > > obviously came from the blogroll, one from http://janefouts.com/ and
> > > > > > > > > > one from http://simplystated.realsimple.com/. We'll have to see why
> > > > > > > > > > we failed to detect those links as coming from the blogroll. There
> > > > > > > > > > are also a few results that came from Techcrunch posts that you
> > > > > > > > > > commented on. The comment has a link to your blog. I think those are
> > > > > > > > > > legitimate results, but I'd be interested to hear what users think.
> > > > > > > > > > So we're at 80% accuracy at this very moment. It's better than it
> > > > > > > > > > was, but obviously a lot of room for improvement.
> > > > > > > > > > Jeremy
> > > > > > > > > > > On Dec 29 2008, 11:35 am, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > > > > On Dec 28, 11:10 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > > > > > > > > > > > Tamar,
> > > > > > > > > > > > > It has become even more common. If Google Blog Search isn't finding
> > > > > > > > > > > > > these blogroll hits, it is finding spam. In the last 3 days, I have
> > > > > > > > > > > > > seen exactly ONE result which was not a result from the blogroll or a
> > > > > > > > > > > > > SPLOG.
> > > > > > > > > > > > Can you tell me the specific queries that are showing bad results?
> > > > > > > > > > > > Also, is the problem specific to alerts or do you see them in regular
> > > > > > > > > > > > blogsearch results, too?
> > > > > > > > > > > > > > Curious - around the same time of the initial report, I started
> > > > > > > > > > > > > > getting Google Alerts with blogroll links. If anything, it's become
> > > > > > > > > > > > > > *more* common and not less common lately. Does the change you write
> > > > > > > > > > > > > > about, Jeremy, impact Google Alerts?
> > > > > > > > > > > > > > If not, perhaps someone should take a look.
> > > > > > > > > > > > > > Thanks.
> > > > > > > > > > > > > > On Dec 19, 1:25 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > > > > > > > I wanted to give everyone a brief end-of-the-year update on the
> > > > > > > > > > > > > > > blogroll problem. When we switched
This is just a brief status report. We've been continuing to
experiment with blogroll detectors. We're going to do some
user-visible experiments early next month, probably starting with link:
queries. I'll follow up here when the experiments are running.
Jeremy
> Unfortunately, we ran into some delays with these experiments and had
> to push back the schedule a couple of weeks.
> Jeremy
> On Feb 25, 11:22 am, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > This is just a brief status report. We've been continuing to
> > experiment with blogroll detectors. We're going to do some user-
> > visible experiments early next month, probably starting with link:
> > queries. I'll follow up here when the experiments are running.
> > Jeremy
> > On Feb 7, 11:00 pm, tamar <puntr...@gmail.com> wrote:
> > > Thanks for the update.
> > > A few things I noticed lately:
> > > 1. Lots of redundancy. For example, 25 separate Google Alerts have
> > > arrived in my inbox since 12/18/08 from a single blog source citing
> > > the SAME exact blog post (nothing new!)
> > > 2. Old posts from 2006/2007.
> > > 3. The blogroll issue
> > > That said, the issue seems to not necessarily be limited to the
> > > blogroll itself. The entire system is a mess. And while I say Google
> > > Alerts, I'm able to reproduce the problems every time simply by going
> > > to blogsearch.google.com, so I don't really think you need to focus
> > > too much on Google Alerts. After all, it seems to be gathering data
> > > from a system that isn't exactly returning relevant results.
> > > Also, some of the data I actually receive is not tied to popular blogs
> > > of mine at all. I understand the indexing problems; I'm not
> > > requesting that you revert to the old system, but I still contend that
> > > the new system gives me 95% noise and 5% reasonable results, which is
> > > pretty poor.
> > > Hopefully Google's deployment of the fixes will address the issue.
> > > p.s. I'll be happy to send you the *really* awkward results I've
> > > received that illustrate all above issues if you want them...unless,
> > > of course, you already received them. ;)
> > > On Feb 6, 10:12 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > On Feb 6, 6:03 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > Tamar,
> > > > > Apologies for my tardy response. I'll be sure to give everyone an
> > > > > update every week, even if we don't have much news to report.
> > > > > As I mentioned, we made an initial attempt to fix the blogroll problem
> > > > > in December. It fixed some fraction of the results that were coming
> > > > > from blogrolls, but was inadequate in a number of ways. For some
> > > > > blogs, the blog roll detection didn't pick anything up. For other
> > > > > blogs, it detected some items in the blog roll, but not all of them. My
> > > > > colleague Rick Klau was particularly unlucky. His blog appears in the
> > > > > blog rolls of many legal blogs. I noticed that we often detect every
> > > > > blog but his as a blogroll entry. We've been looking at a collection
> > > > > of backlink queries (with the link: operator) and still see about 50%
> > > > > of the results coming from blog rolls. So there is obviously a lot of
> > > > > room for improvement.
> > > > I wanted to clarify this point a little bit. The problem really is
> > > > worst for people with popular blogs. The average user is getting more
> > > > and better results as a consequence of the indexing changes that
> > > > introduced the blogroll problems. We're returning results from blogs
> > > > with partial content feeds. We're indexing comments. We discover more
> > > > links. So a lot of our internal analysis shows that most queries do
> > > > better as a result of the changes. If there weren't some real
> > > > benefits to the indexing changes, we would have reverted to the old
> > > > version.
> > > > Jeremy
> > > > > We have been working on an improved blog roll detector. Our internal
> > > > > tests look fairly promising, but there is a lot of variability in blog
> > > > > markup that we need to handle. It's going to be a few more weeks
> > > > > until we can start to deploy it. I'll see if I can provide a better
> > > > > ETA next week.
> > > > > I haven't been paying attention to the Google Alerts specifically.
> > > > > The accuracy I mentioned earlier was for the regular search results.
> > > > > I'll make sure we add some metrics that look at Alerts quality so that
> > > > > we don't forget about it again. The basic solution is the same for
> > > > > search results and for alerts, but maybe there's something more we can
> > > > > do for alerts in the short term.
> > > > > Jeremy
> > > > > On Feb 6, 8:07 am, tamar <puntr...@gmail.com> wrote:
> > > > > > Is anything at ALL being done about this? I'm starting to consider
> > > > > > either:
> > > > > > 1. flagging all Google Alerts sent to my Gmail inbox as spam (cuz uh,
> > > > > > they contain spammy results)
> > > > > > 2. unsubscribing from Google Alerts -- since the results returned
> > > > > > aren't relevant and they certainly aren't fresh. (Come on, isn't
> > > > > > Google's mission to organize the world's information? This is clearly
> > > > > > disorganized and in a very bad way.)
> > > > > > Google: we've been pretty darn patient. This thread started in
> > > > > > December and referenced an even older incident. It's February now.
> > > > > > Is ANYONE paying attention to this? Please?
> > > > > > Thanks.
> > > > > > (p.s. a Google Alert email just prompted this post update. I don't
> > > > > > really post about this out of the blue.)
> > > > > > On Feb 2, 12:11 am, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > > > > > Yeah, same thing for me. It keeps reverting to these old results
> > > > > > > which are completely worthless.
> > > > > > > On Jan 31, 7:00 pm, tamar <puntr...@gmail.com> wrote:
> > > > > > > > Today, I got links from 2006 and 2007 in my link: query emails.
> > > > > > > > :(
> > > > > > > > On Jan 28, 6:25 pm, Kyle_Texas <Reiko.Admi...@gmail.com> wrote:
> > > > > > > > > Yep, the problem remains. Either SPAM or Blogroll for 90% of
> > > > > > > > > results. The SPAM is actually getting worse. It's funny to see
> > > > > > > > > SPLOGS at the top of the relevancy rankings, or better yet, almost the
> > > > > > > > > entire first page of relevancy rankings being SPLOGS.
> > > > > > > > > On Jan 27, 10:22 am, tamar <puntr...@gmail.com> wrote:
> > > > > > > > > > It looks like no progress has been made on this front AT ALL. The
> > > > > > > > > > Google Alert emails I receive are spam and nothing but at this point.
> > > > > > > > > > Plus, I keep receiving the same emails again and again -- it's not
> > > > > > > > > > necessarily a "blogroll" issue but the same OLD content is being
> > > > > > > > > > treated by Google Blogsearch as new content. On one search query,
> > > > > > > > > > I've received the same result at least 10 times.
> > > > > > > > > > Jeremy and team, please don't forget about us.
> > > > > > > > > > On Jan 22, 9:39 am, tamar <puntr...@gmail.com> wrote:
> > > > > > > > > > > Any update? It's been 3 weeks.
> > > > > > > > > > > On Jan 7, 12:58 pm, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > > > > On Jan 1, 9:54 pm, tamar <puntr...@gmail.com> wrote:
> > > > > > > > > > > > > Jeremy, I'm doing searches for "tamar weinberg," my blog title name,
> > > > > > > > > > > > > > or link:www.domain.com (where domain.com is my blog).
> > > > > > > > > > > > > I don't check blogsearch results regularly, but I just performed a
> > > > > > > > > > > > > search for the purposes of giving you as much information as possible
> > > > > > > > > > > > > and saw a result that showed my blog on the sidebar navigation from 4
> > > > > > > > > > > > > hours ago.
> > > > > > > > > > > > > That said, I'm pretty certain that this isn't fully addressed. :(
> > > > > > > > > > > > I agree that the problem isn't fully addressed :-(. I just did a
> > > > > > > > > > > > link: search for your blog. It returned 10 results ranging from 37
> > > > > > > > > > > > minutes old to several days old (Jan 1). There were two results that
> > > > > > > > > > > > > obviously came from the blogroll, one from http://janefouts.com/ and
> > > > > > > > > > > > > one from http://simplystated.realsimple.com/. We'll have to see why
> > > > > > > > > > > > we failed to detect those links as coming from the blogroll. There
> > > > > > > > > > > > are also a few results that came from Techcrunch posts that you
> > > > > > > > > > > > commented on. The comment has a link to your blog. I think those are
> > > > > > > > > > > > > legitimate results, but I'd be interested to hear what users think.
> > > > > > > > > > > > > So we're at 80% accuracy at this very moment. It's better than it
> > > > > > > > > > > > > was, but there is obviously a lot of room for improvement.
> > > > > > > > > > > > Jeremy
> > > > > > > > > > > > > On Dec 29 2008, 11:35 am, Jeremy Hylton <jhyl...@gmail.com> wrote:
> > > > > > > > > > > > > > > It has become even more common. If Google Blog Search isn't finding
> > > > > > > > > > > > > > > these blogroll hits, it is finding spam. In the last 3 days, I have
> > > > > > > > > > > > > > > seen exactly ONE result which was not a result from the blogroll or a
> > > > > > > > > > > > > > > SPLOG.
> > > > > > > > > > > > > > Can you tell me the specific queries that are showing bad results?
> > > > > > > > > > > > > > Also, is the problem specific to alerts or do you see them in regular
> > > > > > > > > > > > > > blogsearch results, too?