Message sorting.

agent-user

Nov 1, 2007, 12:10:02 PM

Agent v2 ... but I'm not even sure if it's an Agent issue...

Can someone explain why I've recently seen a few sub-threads of
newly posted articles suddenly appear under old threads that have
entirely different subject titles? I can see no reason why they appear
like that and not as new threads or, better still, as sub-threads
under their original Subject. What causes this?

Is it something to do with bad article numbering? quoted MIDs?
Is it a newsserver problem?
Is my Agent sorting not configured right?

Ralph Fox

Nov 1, 2007, 2:07:04 PM


It is an Agent shortcoming, caused by a "hash collision" within Agent.

If you search for "hash collision" (as exact phrase) in the
Google Groups archives for this group, you will find plenty of
information about it.

http://groups.google.com/groups?as_epq=hash+collision&as_ugroup=alt.usenet.offline-reader.forte-agent

--
Cheers,
Ralph

Al Balmer

Nov 1, 2007, 3:08:47 PM

On Fri, 02 Nov 2007 07:07:04 +1300, Ralph Fox <-rf-nz-@-.invalid>
wrote:

>On Thu, 01 Nov 2007 16:10:02 +0000, in message <b17f66df2bc6702c...@localhost.127.0.0.1>,
>agent-user wrote:
>
>>
>> Agent v2 ... but I'm not even sure if it's an Agent issue...
>>
>> Can someone explain how it is that recently I've seen a few new
>> sub-threads of newly posted articles suddenly appear under old
>> threads which have entirely different subject titles. I can see no
>> reason why they appear like that and not as new threads or better
>> still, as sub-threads of their original Subject: What causes this?
>>
>> Is it something to do with bad article numbering? quoted MIDs?
>> Is it a newsserver problem?
>> Is my Agent sorting not configured right?
>
>
>It is an Agent shortcoming, caused by a "hash collision" within Agent.
>
>If you search for "hash collision" (as exact phrase) in the
>Google Groups archives for this group, you will find plenty of
>information about it.

As exact phrase, I get nothing, but search for "all words" gets good
information.
>
>http://groups.google.com/groups?as_epq=hash+collisions&as_ugroup=alt.usenet.offline-reader.forte-agent

I don't understand why the problem is still there. Hash lookups are
cut and dried. Collisions are expected (except for perfect hashes) and
compensated for in any of a number of standard ways.
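
As a rough illustration of one such standard way, here is a minimal Python sketch of separate chaining, where each bucket keeps the full key and lookups compare it. Nothing here reflects Agent's internals: `tiny_hash`, `ChainedTable`, the 8-bucket size and the example MIDs are all made up for the demonstration.

```python
# Minimal sketch of separate chaining: each bucket keeps the full keys,
# so two keys that land in the same bucket are still told apart.
# tiny_hash and the 8-bucket table are illustrative assumptions only.

def tiny_hash(key: str, buckets: int = 8) -> int:
    """Deliberately weak hash so that collisions are easy to provoke."""
    return sum(key.encode("utf-8")) % buckets


class ChainedTable:
    def __init__(self, buckets: int = 8) -> None:
        self.buckets = [[] for _ in range(buckets)]

    def put(self, key: str, value: str) -> None:
        chain = self.buckets[tiny_hash(key, len(self.buckets))]
        for i, (stored_key, _) in enumerate(chain):
            if stored_key == key:  # compare the full key, not just the hash
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def get(self, key: str):
        chain = self.buckets[tiny_hash(key, len(self.buckets))]
        for stored_key, value in chain:
            if stored_key == key:  # full-key comparison again
                return value
        return None


table = ChainedTable()
table.put("<ab@example.invalid>", "thread 1")
table.put("<ba@example.invalid>", "thread 2")  # same byte sum, same bucket
print(table.get("<ab@example.invalid>"), table.get("<ba@example.invalid>"))
```

The two made-up MIDs collide in `tiny_hash`, yet each lookup returns its own value because the stored key is checked. The cost is that the full key has to be stored somewhere.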

--
Al Balmer
Sun City, AZ

agent-user

Nov 1, 2007, 4:36:57 PM

On Fri, 02 Nov 2007 07:07:04 +1300 'Ralph Fox'
wrote this on alt.usenet.offline-reader.forte-agent:

Thanks Ralph and Al Balmer ... this extract from one of those
Google archive posts seems to explain and sum up that problem:

"Message-Ids are hashed [by Agent] to smaller values to facilitate
rapid threading. The problem is that the hash value chosen was too
small even for most text groups [ ... ]. So you occasionally see a
message in the wrong thread."


To be honest, I've never heard of that problem and I seem to see
it arise on several text groups where I have a local store of posts
in excess of 64,000 articles. I guess there's no solution to it?
Does Agent v4 remove the problem???


HOWEVER, in a way that still doesn't answer a part of my original
question (which I never actually asked ;-) ) which is what makes
one article appear in the same thread under the article that it is
a reply to? Clearly it isn't the MID.

Don Kirkman

Nov 1, 2007, 7:52:25 PM

It seems to me I heard somewhere that agent-user wrote in article
<a7c0901e2ec4d416...@localhost.127.0.0.1>:

>>http://groups.google.com/groups?as_epq=hash+collision&as_ugroup=alt.usenet.offline-reader.forte-agent

I don't think there's anything anyone but Forte can do about hash
collisions, but a possible answer to your question is that, using the
References header, which lists the previous MIDs, Agent can (usually)
attach a message to those that preceded it in the thread. This behavior
can be changed with the option at Tools | Options | Message List Pane |
[] "Start a new thread when a follow-up subject changes"; when this is
set, Agent will ignore the References header and thread on the new subject.
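
To make that concrete, here is a small Python sketch of References-based threading with an optional split on subject change. It is only a guess at the general idea, not Agent's actual logic: `Article`, `base_subject`, `find_parent` and the `split_on_subject_change` flag are hypothetical names, and the example MIDs are invented.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Article:
    mid: str
    subject: str
    references: list = field(default_factory=list)  # oldest ancestor first


def base_subject(subject: str) -> str:
    """Strip leading 'Re:' prefixes and normalise case and spacing."""
    return re.sub(r"^(\s*re:\s*)+", "", subject, flags=re.IGNORECASE).strip().lower()


def find_parent(article, by_mid, split_on_subject_change):
    """Return the parent article, or None if this article starts a thread."""
    # Walk the References list from the most recent ancestor backwards,
    # skipping ancestors that are not in the local store.
    for ref in reversed(article.references):
        parent = by_mid.get(ref)
        if parent is None:
            continue
        if split_on_subject_change and base_subject(parent.subject) != base_subject(article.subject):
            return None  # option set: a changed subject starts a new thread
        return parent
    return None


root = Article("<1@example.invalid>", "Message sorting")
reply = Article("<2@example.invalid>", "Re: Message sorting", ["<1@example.invalid>"])
renamed = Article(
    "<3@example.invalid>",
    "Hash collisions (was: Message sorting)",
    ["<1@example.invalid>", "<2@example.invalid>"],
)
by_mid = {a.mid: a for a in (root, reply, renamed)}
```

With the flag off, `renamed` stays under `reply`; with it on, `renamed` starts its own thread while `reply` still sits under `root`.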

Ralph Fox

Nov 2, 2007, 6:55:52 AM

On Thu, 01 Nov 2007 20:36:57 +0000, in message <a7c0901e2ec4d416...@localhost.127.0.0.1>,
agent-user wrote:

> >> Agent v2 ...

> To be honest, I've never heard of that problem and I seem to see
> it arise on several text groups where I have a local store of posts
> in excess of 64,000 articles. I guess there's no solution to it?


There are several things which can be done to reduce the problem,
but each of them has some side effect.

A. Turn on "Start a new thread when a follow-up subject changes"
at Options -> General Preferences -> Display -> Message List Pane

Side effect: if someone changes the subject _within_ a thread, it
will show as a separate thread.

B. Or, don't keep a lot of messages in a folder -- set a short retention
period and purge regularly.

See this message for a table of probabilities vs the number of
messages in a folder:
http://groups.google.com/groups?selm=eiusft4mchja40rpfhdf3019o59sgtooj9%404ax.com


C. Or, for the more advanced hacker -- apply the hash mod which is
described in these messages, to Agent:
http://groups.google.com/groups?selm=3370f7be.46129406%40J%2EE%2EH%2EA%2ED
http://groups.google.com/groups?selm=pelegi$3affe45a.9693145%40pelegi.dialin.t-online.de

Side effect: new headers downloaded after the mod won't thread with
old headers downloaded before the mod. To fix, export the folder to
a Unix message file and re-import. Headers without bodies can not be
fixed; they must be dumped and re-downloaded.
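
For a feel of why option B helps, the usual birthday approximation gives the chance that at least two of n MIDs share a hashcode. The sketch below assumes a uniformly distributed hash, and the 16-, 24- and 32-bit widths are hypothetical, since the width of Agent's hash is not stated here. It is not the table from the linked message.

```python
import math


def collision_probability(n_messages: int, hash_bits: int) -> float:
    """Birthday approximation: P(at least one shared hashcode among n MIDs).

    Assumes hashcodes are uniform over 2**hash_bits values.
    """
    space = 2 ** hash_bits
    # 1 - exp(-n(n-1) / (2 * space)), using expm1 for accuracy near zero
    return -math.expm1(-n_messages * (n_messages - 1) / (2 * space))


# Hypothetical hash widths and folder sizes, for comparison only.
for bits in (16, 24, 32):
    for n in (1_000, 10_000, 64_000):
        p = collision_probability(n, bits)
        print(f"{bits:2d}-bit hash, {n:6d} messages: {p:.4f}")
```

The probability grows roughly with the square of the folder size, which is why purging or splitting a large folder reduces the chance of a mis-threaded message far more than proportionally.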


> Does Agent v4 remove the problem???

No change.


> HOWEVER, in a way that still doesn't answer a part of my original
> question (which I never actually asked ;-) ) which is what makes
> one article appear in the same thread under the article that it is
> a reply to? Clearly it isn't the MID.

1. In principle, message 'A' should appear under message 'B' if the
MID of message 'B' appears in the references header of 'A'.

2. Agent does not compare the MIDs themselves. Agent compares
hashcodes. This is quicker, but occasionally produces
false successful matches.

To Agent, message 'A' will appear under message 'B' if the
hashcode of the MID of message 'B' appears in the list of
hashcodes of the MIDs from the references header of message 'A'.

3. Agent does not even store the actual references header
with bodiless headers. Agent only stores the corresponding
hashcodes.

Note that storing the references header would increase the
database size and (without a significant database redesign)
would reduce the maximum number of headers which could be stored.
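
Points 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not Agent's code: `mid_hash` uses the low 8 bits of CRC-32 purely so that a collision is easy to find, and the function names and MIDs are invented.

```python
import zlib

HASH_BITS = 8  # hypothetical and deliberately tiny, so a collision is easy to find


def mid_hash(mid: str) -> int:
    """Stand-in hashcode: low HASH_BITS bits of CRC-32 (not Agent's real hash)."""
    return zlib.crc32(mid.encode("ascii")) & ((1 << HASH_BITS) - 1)


def is_child_exact(references, candidate_parent_mid):
    """Point 1: compare the MIDs themselves."""
    return candidate_parent_mid in references


def is_child_hashed(reference_hashes, candidate_parent_mid):
    """Point 2: compare hashcodes only; quicker, but can match falsely."""
    return mid_hash(candidate_parent_mid) in reference_hashes


parent_mid = "<parent@example.invalid>"
reply_references = [parent_mid]
# Point 3: only the hashcodes of the References entries are kept.
reply_reference_hashes = [mid_hash(m) for m in reply_references]

# Search for an unrelated MID that happens to share the parent's hashcode.
unrelated_mid = next(
    mid
    for mid in (f"<{i}@elsewhere.invalid>" for i in range(100_000))
    if mid_hash(mid) == mid_hash(parent_mid)
)

print(is_child_exact(reply_references, unrelated_mid))
print(is_child_hashed(reply_reference_hashes, unrelated_mid))
```

The exact comparison rejects the unrelated article, while the hashcode comparison accepts it, which is how a reply can end up under a thread it has nothing to do with. Once only the hashcodes are stored, the false match can no longer be detected after the fact.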


--
Cheers,
Ralph

agent-user

Nov 2, 2007, 12:13:58 PM

On Fri, 02 Nov 2007 23:55:52 +1300 'Ralph Fox'
wrote this on alt.usenet.offline-reader.forte-agent:

>On Thu, 01 Nov 2007 20:36:57 +0000, in message <a7c0901e2ec4d416...@localhost.127.0.0.1>,
>agent-user wrote:
>
>> >> Agent v2 ...
>
>> To be honest, I've never heard of that problem and I seem to see
>> it arise on several text groups where I have a local store of posts
>> in excess of 64,000 articles. I guess there's no solution to it?


>There are several things which can be done to reduce the problem,
>but each of them has some side effect.
>
>A. Turn on "Start a new thread when a follow-up subject changes"
> at Options -> General Preferences -> Display -> Message List Pane
>
> Side effect: if someone changes the subject _within_ a thread, it
> will show as a separate thread.
>
>B. Or, don't keep a lot of messages in a folder -- set a short retention
> period and purge regularly.

No, I'm not really interested in either of those solutions. I'm
quite rigid about retaining the original threading and I like to
retain a good local store on some newsgroups.

> See this message for a table of probabilities vs the number of
> messages in a folder:
> http://groups.google.com/groups?selm=eiusft4mchja40rpfhdf3019o59sgtooj9%404ax.com

Thx, I need to read that some more...


>C. Or, for the more advanced hacker -- apply the hash mod which is
> described in these messages, to Agent:
> http://groups.google.com/groups?selm=3370f7be.46129406%40J%2EE%2EH%2EA%2ED
> http://groups.google.com/groups?selm=pelegi$3affe45a.9693145%40pelegi.dialin.t-online.de
>
> Side effect: new headers downloaded after the mod won't thread with
> old headers downloaded before the mod. To fix, export the folder to
> a Unix message file and re-import. Headers without bodies can not be
> fixed; they must be dumped and re-downloaded.

This looks interesting but I'll have to consider the implications,
since I also run Hamster.

>> Does Agent v4 remove the problem???
>
>No change.

:-(



>> HOWEVER, in a way that still doesn't answer a part of my original
>> question (which I never actually asked ;-) ) which is what makes
>> one article appear in the same thread under the article that it is
>> a reply to? Clearly it isn't the MID.
>
>1. In principle, message 'A' should appear under message 'B' if the
> MID of message 'B' appears in the references header of 'A'.
>
>2. Agent does not compare the MIDs themselves. Agent compares
> hashcodes. This is quicker, but occasionally produces
> false successful matches.
>
> To Agent, message 'A' will appear under message 'B' if the
> hashcode of the MID of message 'B' appears in the list of
> hashcodes of the MIDs from the references header of message 'A'.

Ah ok, so it is the hashed MID which Agent uses to thread,
which of course falters if there's a hash collision.

>3. Agent does not even store the actual references header
> with bodiless headers. Agent only stores the corresponding
> hashcodes.
>
> Note that storing the references header would increase the
> database size and (without a significant database redesign)
> would reduce the maximum number of headers which could be stored.

Indeed, there's a DB max size I believe.

Thanks for that info Ralph .. I have some more reading to do.

Al Balmer

Nov 2, 2007, 1:16:16 PM

On Fri, 02 Nov 2007 16:13:58 +0000, agent-user
<agent...@privacy.invalid.com> wrote:

>>B. Or, don't keep a lot of messages in a folder -- set a short retention
>> period and purge regularly.
>
>No, I'm not really interested in either of those solutions. I'm
>quite rigid about retaining the original threading and I like to
>retain a good local store on some newsgroups.

Have you considered keeping older articles in an archive folder? Just
sort by date (or whatever other criteria you like), highlight a few
hundred, and "move to folder."

I do that with some mailing lists which aren't archived elsewhere,
usually in folders marked with the year.

Nick Spalding

Nov 2, 2007, 2:06:39 PM

Al Balmer wrote, in <vkmmi3tu4eiputocv...@4ax.com>
on Fri, 02 Nov 2007 17:16:16 GMT:

I've gone a step further and moved stuff from the last millennium into a
separate instance!
--
Nick Spalding

Vista Home Premium, Intel Viiv dual core E6300 (1.86Ghz, 1066MHz FSB),
2GB RAM, 320GB NTFS HD, Video Nvidia GeForce 7900GS LCD 1280x1024x60Hz

agent-user

Nov 2, 2007, 9:25:36 PM

On Fri, 02 Nov 2007 17:16:16 GMT 'Al Balmer'
wrote this on alt.usenet.offline-reader.forte-agent:

IIUC you mean move the articles I want to keep from my newsgroup
folders into a single archive folder. Is that right? It's certainly a
possibility and has its uses, but I think I like Nick's idea better
of creating a second instance as an archive. Maybe one archive for
every 1-2 years. That way I could retain the group naming structure
in each archive which would make it quicker to find something.

agent-user

Nov 2, 2007, 9:26:16 PM

On Fri, 02 Nov 2007 18:06:39 +0000 'Nick Spalding'
wrote this on alt.usenet.offline-reader.forte-agent:

>Al Balmer wrote, in <vkmmi3tu4eiputocv...@4ax.com>
> on Fri, 02 Nov 2007 17:16:16 GMT:
>
>> On Fri, 02 Nov 2007 16:13:58 +0000, agent-user
>> <agent...@privacy.invalid.com> wrote:
>>
>> >>B. Or, don't keep a lot of messages in a folder -- set a short retention
>> >> period and purge regularly.
>> >
>> >No, I'm not really interested in either of those solutions. I'm
>> >quite rigid about retaining the original threading and I like to
>> >retain a good local store on some newsgroups.
>>
>> Have you considered keeping older articles in an archive folder? Just
>> sort by date (or whatever other criteria you like), highlight a few
>> hundred, and "move to folder."
>>
>> I do that with some mailing lists which aren't archived elsewhere,
>> usually in folders marked with the year.
>
>I've gone a step further and moved stuff from the last millennium into a
>separate instance!

Yes, thanks Nick that looks interesting.

agent-user

Nov 9, 2007, 4:48:41 PM

On Thu, 01 Nov 2007 16:52:25 -0700 'Don Kirkman'
wrote this on alt.usenet.offline-reader.forte-agent:

Thanks but the problem is that I like to retain threading even when
a subject title gets changed.

Don Kirkman

Nov 9, 2007, 7:34:32 PM

It seems to me I heard somewhere that agent-user wrote in article
<1ac1a3128f34a16d...@localhost.127.0.0.1>:

>
>On Thu, 01 Nov 2007 16:52:25 -0700 'Don Kirkman'
>wrote this on alt.usenet.offline-reader.forte-agent:

>>I don't think there's anything anyone but Forte can do about hash
>>collisions, but a possible answer to your question is that, using the
>>References header, which lists the previous MIDs, Agent can (usually)
>>attach a message to those that preceded it in the thread. This behavior
>>can be changed with the option at Tools | Options | Message List Pane |
>>[] "Start a new thread when a follow-up subject changes"; when this is
>>set, Agent will ignore the References header and thread on the new subject.

>Thanks but the problem is that I like to retain threading even when
>a subject title gets changed.

In that case, if you mean you want everything threaded in the original
thread, don't select "Start a new thread . . .." or deselect it if it's
already selected. If you want new subjects broken out separately they
can still be threaded by selecting that option.

agent-user

Nov 9, 2007, 8:31:46 PM

On Fri, 09 Nov 2007 16:34:32 -0800 'Don Kirkman'
wrote this on alt.usenet.offline-reader.forte-agent:

>It seems to me I heard somewhere that agent-user wrote in article
><1ac1a3128f34a16d...@localhost.127.0.0.1>:
>
>>
>>On Thu, 01 Nov 2007 16:52:25 -0700 'Don Kirkman'
>>wrote this on alt.usenet.offline-reader.forte-agent:
>
>>>I don't think there's anything anyone but Forte can do about hash
>>>collisions, but a possible answer to your question is that, using the
>>>References header, which lists the previous MIDs, Agent can (usually)
>>>attach a message to those that preceded it in the thread. This behavior
>>>can be changed with the option at Tools | Options | Message List Pane |
>>>[] "Start a new thread when a follow-up subject changes"; when this is
>>>set, Agent will ignore the References header and thread on the new subject.

>>Thanks but the problem is that I like to retain threading even when
>>a subject title gets changed.

>In that case, if you mean you want everything threaded in the original
>thread, don't select "Start a new thread . . .." or deselect it if it's
>already selected. If you want new subjects broken out separately they
>can still be threaded by selecting that option.

Yeah sure, what I meant is that when someone changes the subject,
I still want articles to continue in the original thread; that's how
I've got it configured and it works fine.

But then along comes a hash collision and all of a sudden a new
sub-thread appears with its articles attached to some unrelated thread.
It used to befuddle me until Ralph mentioned the hash problem.
