Usenet optimization idea for compact high-bandwidth environments


newsmaster

Dec 3, 2009, 6:15:27 PM

I would like to share an idea, and I'm very interested in any
recommendations, blocking issues, or historical or technical insight
you may have on this.

Introduction:
A key factor in serving binary news messages to a large crowd is having
enough disk I/O capacity for the random access pattern created by the
large number of articles served to a lot of people.

The chance of the same request being made twice within a short interval
is so low that memory caching systems (generally speaking) don't do
any noticeable offloading. The way most NNTP software stores articles
enables caching.

A few years back we operated a news setup where our dreaders did
disk-caching. The effectiveness of that was too low to implement it in
our new setup.

Current setup:
We have 4 SANs exporting LUNs to all hosts and use a SAN filesystem.
Readers can read files over FC from shared filesystems being written by
the feeders, and get the message location by doing whereis calls on the
feeders.
All readers are able to serve the full retention at up to 1.6 Gbit/sec
per node, and we only use 2 feeders.
If we need more output, we add disks, create an extra shared
filesystem, and sometimes add extra readers.

Idea:
Because we're mostly interested in output performance, and growing
data retention is not the primary objective, buying tons of fast disks
would make no sense if there are other ways (unless you're on a mission
to kill polar bears).

I would like to try to steer the user's interaction with the servers
while he/she is downloading a specific set of messages, without the need
for user input/selection. I would like to do this with a minor
server-side change and with clients supporting this client-server
communication (as an option), preferably by sharing this idea and
updating the RFCs.

* The client starts and connects to the primary NNTP server hostname.
* The client sends: list servers (a non-existing "list" extension, used
as an example)
* The server answers (server-side configurable data):
215 information follows
-0 server0.example.com
-4 server1.example.com
-8 server2.example.com
-12 server3.example.com
-16 server4.example.com
-20 server5.example.com
-24 server6.example.com
-
And finally
-48 serverx.example.com (everything older than that)
.
* Now the client loads an NZB, selects the group, and sends HEAD
requests for every message-id
* The client now knows the "injection-date" of each message
* The client calculates the article's age from its local time and the
"injection-date" (GMT)
* When the article is between 8 and 12 hours old, the client sends the
BODY request to server2.example.com
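
The client-side steps above could be sketched roughly as follows. This is
only an illustration under assumptions: the "list servers" command does not
exist, the function names parse_server_list and pick_server are made up, and
each "-N host" line is read as "articles at least N hours old go to host".

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_server_list(lines):
    """Parse '-<hours> <hostname>' lines into sorted (min_age_hours, host) pairs.

    `lines` is the body of the hypothetical 'list servers' reply, without
    the '215' status line. Placeholder lines such as '-' or '.' are skipped.
    """
    tiers = []
    for line in lines:
        offset, _, host = line.strip().partition(" ")
        if not host:
            continue  # '-', '.', or an empty line
        tiers.append((abs(int(offset)), host.split()[0]))
    tiers.sort()
    return tiers


def pick_server(tiers, injection_date, now=None):
    """Return the host whose age tier contains the article."""
    now = now or datetime.now(timezone.utc)
    age_hours = max((now - injection_date).total_seconds() / 3600.0, 0.0)
    thresholds = [hours for hours, _ in tiers]
    # Last tier whose minimum age is <= the article's age.
    index = bisect_right(thresholds, age_hours) - 1
    return tiers[max(index, 0)][1]


reply = [
    "-0 server0.example.com",
    "-4 server1.example.com",
    "-8 server2.example.com",
    "-12 server3.example.com",
    "-48 serverx.example.com (everything older than that)",
    ".",
]
tiers = parse_server_list(reply)

# The Injection-Date value would come from the HEAD response.
injected = parsedate_to_datetime("Thu, 03 Dec 2009 10:00:00 +0000")
now = injected + timedelta(hours=10)  # article is 10 hours old
print(pick_server(tiers, injected, now))
```

A real client would still have to open (and authenticate) a separate
connection to the chosen host, which is where the connection accounting
problem mentioned below comes in.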

Using a combination of memory and flash technology I would probably be
able to lower disk I/O by up to 50% while doubling capacity during peak
moments (wild guess).
One problem that needs a workaround is connection accounting.

Your help and ideas are very much appreciated.

With kind regards,
Sebastiaan Jansen

KPN
The Netherlands
Sebastiaanï¿½a pointï¿½Jansen at kpn.com

Sebastiaan Jansen

Dec 3, 2009, 7:16:46 PM

newsmaster wrote:

By the way, sorry about the account profile on my opening post.

Julien ÉLIE

Dec 5, 2009, 8:05:49 PM

> * Client sends: list servers (a non-existing "list" extension used for example)
> * Server answers: (server side configurable data)
> 215 information follows
> -0 server0.example.com
> -4 server1.example.com
> -8 server2.example.com
> -12 server3.example.com
> -16 server4.example.com
> -20 server5.example.com
> -24 server6.example.com
> -
> And finally
> -48 serverx.example.com (everything older then)
> .
> * Now the client will load a NZB and select the group and send HEAD requests for every message-id

Why should it select the group before sending HEAD requests by message-IDs?

> * the client now knows the "injection-date" of the message
> * The client calculates from localtime to "injection-date"(GMT)
> * When the article is between 8-12 hours old the client will send the BODY request to server2.example.com

That is to say that the client disconnects from the server and opens
a new connection to server2.example.com?

Will your servers keep exchanging articles between each other?
I mean that articles will keep being sent from server0 to server1, which
also sends to server2, etc. As soon as an article is 4 hours old,
it should be transferred from server0 to server1, etc.
Won't it cause problems?

--
Julien ÉLIE

"Love isn't all smiles and laughs for the moment;
but crying and fighting for what you believe is right
and will last forever."

Sebastiaan Jansen

Dec 7, 2009, 1:05:45 AM

Julien ÉLIE wrote:

>> * Client sends: list servers (a non-existing "list" extension used
>> for example)
>> * Server answers: (server side configurable data)
>> 215 information follows
>> -0 server0.example.com
>> -4 server1.example.com
>> -8 server2.example.com
>> -12 server3.example.com
>> -16 server4.example.com
>> -20 server5.example.com
>> -24 server6.example.com
>> -
>> And finally
>> -48 serverx.example.com (everything older then)
>> .
>> * Now the client will load a NZB and select the group and send HEAD
>> requests for every message-id
>
> Why should it select the group before sending HEAD requests by message-IDs?
>
>

Sorry, I don't know exactly how NNTP client-server communication works.
Thanks for the lesson.

>> * the client now knows the "injection-date" of the message
>> * The client calculates from localtime to "injection-date"(GMT)
>> * When the article is between 8-12 hours old the client will send the
>> BODY request to server2.example.com
>
> That is to say that the clients disconnects from the server and opens
> a new connection to server2.example.com?

Yes, I believe that's possible?

>
>
> Will your servers keep exchanging articles between each other?
> I mean that articles will keep being sent from server0 to server1, which
> also sends to server2, etc. As soon as an article is 4 hours old,
> it should be transferred from server0 to server1, etc.
> Won't it cause problems?
>

No, all servers will be able to serve all articles (the full retention).
Users abusing this need to be detected (by analysing logs, for example).

Curt Welch

Dec 8, 2009, 8:51:09 PM

newsmaster <newsm...@planet.nl> wrote:
> Hi
>
> I would like to share an idea, and I'm very interested if you have

> recommendations or blocking issues or have some historical or technical
> insight on this.
>
> Introduction:
> A main key in serving binary news messages to a large crowed is having
> enough disk io capacity for the randomness created by the large amount
> of articles served to a lot of people.

> The chance of having the same request done twice in a short interval is

> so low, that the memory caching systems (generally speaking) don't do
> any noticeable offloading.

That's often true.

> The way most nntp software stores articles is
> enabling caching.
>
> A few years back we operated a news setup where our dreaders did
> disk-caching. The effectiveness of that was too low to implement it in
> our new setup.
>
> Current setup:

> We have 4 SAN's exporting lun's to all host's and use a SAN-Filesystem.

> Readers can read files over FC from shared filesystems being written by

> the feeders, and get the message location by doing whereis call's on the

> feeders.
> All readers are able to serve the full retention up to 1,6 Gbit/sec per
> node, and we only use 2 feeders.
> If we need more output, then we add disks and create an extra shared
> filesystem and sometimes add extra readers.
>
> Idea:

> Because we're mostly interested in output performance, and growing extra

> data retention is not the primary objective, buying tons of fast disks

> if there are other ways would make no sense (except if you've a mission
> to kill polar bears).
>
> I would like to try to steer the user's interaction with the servers

> while he/she's downloading a specific set of messages, without the need
> for user input/selection.

Why?

> I would like to do this with a minor
> serverside change and with clients supporting this client-server
> communication (option). Preferably through sharing this and updating

> rfc's

Seems like a bad idea to me.

> * Client start's and connects to primary NNTP server hostname.
> * Client send's: list servers (a non-existing "list" extension used for

> example)
> * Server answers: (server side configurable data)
> 215 information follows
> -0 server0.example.com
> -4 server1.example.com
> -8 server2.example.com
> -12 server3.example.com
> -16 server4.example.com
> -20 server5.example.com
> -24 server6.example.com
> -
> And finally
> -48 serverx.example.com (everything older then)
> .
> * Now the client will load a NZB and select the group and send HEAD
> requests for every message-id

> * the client now knows the "injection-date" of the message
> * The client calculates from localtime to "injection-date"(GMT)

> * When the article is between 8-12 hours old the client will send the
> BODY request to server2.example.com
>
> Using a combination of memory and flash technology I probably would be
> able to lower disk IO up to 50% while doubling capacity during peak
> moments (wild guess).
> A problem that needs a workaround is connection accounting.
>
> Your help and ideas are very much appreciated.

You are just asking the user to do the work your servers should be doing.
Why on earth would you push that responsibility off to the client when it
means getting all the major newsreaders updated to support a very odd and
strange client side load distribution system?

> With kind regards,
> Sebastiaan Jansen
>
> KPN
> The Netherlands

> Sebastiaan�a point�Jansen at kpn.com

I don't really grasp what you are thinking here. You are suggesting the
client do server load balancing for you by distributing the user load
across the servers based on the age of the article they are fetching????

First off, age-based balancing is a very poor way to do load balancing for
Usenet because I/O is not distributed evenly by age. The bulk of the user
access is for newer articles. If you graph user bandwidth load vs. age of
articles accessed, you see the peak is for newer articles and fades with
age. In your example, it means that your server0 and server1 could end up
with 80% of the total load, so it wouldn't be balanced at all. To adjust
for that, your server would have to send out a different list to each
client, or it would have to skew the range per server where .5 day old
articles could be on server0, 1 day old on server1, 2 day on server2, 4 day
on server4, or something like that. You would have to manually adjust
those age limits to manually balance your load (or write some very smart
automatic load testing and balancing code).
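
As a rough illustration of what "skewing the range per server" involves,
the sketch below cuts an age histogram into ranges that each carry about
the same share of the load. The function name and the load numbers are
made up for the example; real figures would have to come from your own
server logs.

```python
def balanced_age_boundaries(load_by_hour, servers):
    """Return the starting age (in hours) of each server's range.

    load_by_hour[h] is the measured load for articles that are h hours old.
    Ranges are cut so each one carries roughly total/servers of the load.
    """
    total = sum(load_by_hour)
    target = total / servers
    boundaries = [0]
    running = 0.0
    for hour, load in enumerate(load_by_hour):
        running += load
        # Start the next range once the ranges so far have filled their share.
        if len(boundaries) < servers and running >= target * len(boundaries):
            boundaries.append(hour + 1)
    return boundaries


# Hypothetical load profile: heavy on fresh articles, fading with age.
load_by_hour = [40, 20, 10, 10, 5, 5, 5, 5]
print(balanced_age_boundaries(load_by_hour, servers=4))
```

With a profile like this the ranges get wider as articles get older, which
is the same shape as the .5 day / 1 day / 2 day / 4 day example above.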

You could do better just using DNS load balancing, and you can do that in
10 minutes without anyone having to change their newsreader, so I don't
grasp what you are thinking here.

Was the point of doing the age-based load balancing an attempt to improve
caching on the servers, since the same articles would get fetched from the
same servers in theory? If you think that will work, you need to implement
it on your side, instead of expecting 1000 different clients to be
modified just to implement your private version of load balancing to solve
your servers' load problem. Create a new layer of front end load balancers
in front of your dreaderd servers to perform the age-based balancing for
you. This layer of front end servers is mostly just routers with little
disk space, though they will need a way to find the age of articles quickly,
so you have to set up yet another database to make that happen. Good luck
with that. :) But at least that's better than expecting 20 different
newsreader authors to write that code for you in their clients!

Distributing load over servers is a problem that every big Usenet site has
been dealing with for decades now. Welcome to the game. :) There are many
approaches, but they are always implemented on the server side. There is no
advantage to having the client do your load balancing for you and plenty
of problems - mainly that it's hard to change how your system works if you
have to get the clients updated every time you want to make a change.

If your backend spool servers are hitting their bandwidth limit, and all
this idea was a way to try and solve it by making your front end servers do
more caching, sorry, it's not likely to work. You have to buy more and/or
faster backend disks. That's the only real solution. Memory caching of
articles doesn't tend to work for Usenet servers because there are too many
articles and the access is too random to help much. The memory cache would
have to be huge before it would do much good - and at the cost of that much
memory - it's far cheaper to buy more disks instead. Front end disk
caching of popular articles makes little sense because that just moves the
disk I/O load from your large array of back end disks, to some much smaller
number of front end server cache disks. If the bandwidth was too great for
all the back end disks working in parallel, how the hell is a far smaller
number of disks on the front end servers going to handle the same load?
Solid state drives might be a solution to make frontend disk caching work,
but I suspect that by the time you have added enough of these expensive
drives to do some good, you could have done it cheaper just by buying more
backend disks.

The good news is that you get longer retention for "free" when you buy more
disks to solve the disk I/O bandwidth problem! :)

Usenet is a disk I/O intensive database application. If you sell 10 Gbit
of load to your customer base, you shouldn't be surprised to find that you
need to build a disk server back end that can support a total of 10 Gbit of
I/O load. :) There are no easy ways around this.

--
Curt Welch http://CurtWelch.Com/
cu...@kcwc.com http://NewsReader.Com/

Sebastiaan Jansen

Dec 9, 2009, 6:09:21 PM

Curt Welch wrote:
> newsmaster <newsm...@planet.nl> wrote:
>> Hi
>>
>> I would like to share an idea, and I'm very interested if you have

>> recommendations or blocking issues or have some historical or technical
>> insight on this.
>>
>> Introduction:
>> A main key in serving binary news messages to a large crowed is having
>> enough disk io capacity for the randomness created by the large amount
>> of articles served to a lot of people.
>
>> The chance of having the same request done twice in a short interval is

>> so low, that the memory caching systems (generally speaking) don't do

>> any noticeable offloading.
>
> That's often true.
>
>> The way most nntp software stores articles is
>> enabling caching.
>>
>> A few years back we operated a news setup where our dreaders did
>> disk-caching. The effectiveness of that was too low to implement it in
>> our new setup.
>>
>> Current setup:

>> We have 4 SAN's exporting lun's to all host's and use a SAN-Filesystem.

>> Readers can read files over FC from shared filesystems being written by

>> the feeders, and get the message location by doing whereis call's on the

>> feeders.
>> All readers are able to serve the full retention up to 1,6 Gbit/sec per
>> node, and we only use 2 feeders.
>> If we need more output, then we add disks and create an extra shared
>> filesystem and sometimes add extra readers.
>>
>> Idea:

>> Because we're mostly interested in output performance, and growing extra

>> data retention is not the primary objective, buying tons of fast disks

>> if there are other ways would make no sense (except if you've a mission
>> to kill polar bears).
>>
>> I would like to try to steer the user's interaction with the servers

>> while he/she's downloading a specific set of messages, without the need
>> for user input/selection.
>
> Why?

Because I'm told my very expensive load balancers can't do that based on
the user's request data. And I currently believe that there is nobody
accessible in this universe who can write an affordable system for this
that will be able to perform under the stress of my users :-)

>
>> I would like to do this with a minor
>> serverside change and with clients supporting this client-server
>> communication (option). Preferably through sharing this and updating

>> rfc's

>
> Seems like a bad idea to me.

I'm trying to understand your reasons (please read on).

>
>> * Client start's and connects to primary NNTP server hostname.
>> * Client send's: list servers (a non-existing "list" extension used for

>> example)
>> * Server answers: (server side configurable data)
>> 215 information follows
>> -0 server0.example.com
>> -4 server1.example.com
>> -8 server2.example.com
>> -12 server3.example.com
>> -16 server4.example.com
>> -20 server5.example.com
>> -24 server6.example.com
>> -
>> And finally
>> -48 serverx.example.com (everything older then)
>> .
>> * Now the client will load a NZB and select the group and send HEAD
>> requests for every message-id

>> * the client now knows the "injection-date" of the message
>> * The client calculates from localtime to "injection-date"(GMT)

>> * When the article is between 8-12 hours old the client will send the
>> BODY request to server2.example.com
>>
>> Using a combination of memory and flash technology I probably would be
>> able to lower disk IO up to 50% while doubling capacity during peak
>> moments (wild guess).
>> A problem that needs a workaround is connection accounting.
>>
>> Your help and ideas are very much appreciated.
>
> You are just asking the user to do the work your servers should be doing.
> Why on earth would you push that responsibility off to the client when it
> means getting all the major newsreaders updated to support a very odd and
> strange client side load distribution system?
>

We have a redundant load balancer distributing load across the reader
boxes right now, and that will stay. I only want to help the client
decide where it can get the body in the most optimal way.

>> With kind regards,
>> Sebastiaan Jansen
>>
>> KPN
>> The Netherlands

>> Sebastiaanï¿½a pointï¿½Jansen at kpn.com

>
> I don't really grasp what you are thinking here. You are suggesting the
> client do server load balancing for you by distributing the user load
> across the servers based on the age of the article they are fetching????
>
> First off, age based balancing is a very poor way do do load balancing for
> Usenet because I/O is not distributed evenly by age. The bulk of the user
> access is for newer articles. If you graph user bandwidth load vs age of
> articles accessed, you see the peek is for newer articles and fades with
> age. In your example, it means that your server0 and server1 could end up
> with 80% of the total load so it wouldn't be balanced at all. To adjust
> for that, your server would have to send out a different list to each
> client, or it would have to skew the range per server where .5 day old
> articles could be on server0, 1 day old on server1, 2 day on server2, 4 day
> on server4, or something like that. You would have to manually adjust
> those age limits to manually balance your load (or write some very smart
> automatic load testing and balancing code).

server0 etc. will be n+1 load-balanced virtual server IP addresses on a
high-end network load balancer.
I'm actively using the usage pattern on older articles; those are already
on cheaper SATA disk spools.
Getting 80% of the load handled by as few servers (or server groups) as
possible, serving a large share from RAM cache, would be perfect. If 50%
of that 80% is served from RAM cache, it would instantly save me two racks
full of 10k disks; those disks only serve the hunger for I/O capacity on
the spools.

Don't worry about the capacity of the servers doing the caching. 10 Gbit
NICs, a lot of RAM, fast multithreaded CPUs and 8 Gbit FC to the spools
will do it.

>
> You could do better just using DNS load balancing and you can do that in 10
> minutes seconds without anyone having to change their newsreader so I don't
> grasp what you are thinking here.
>
> Was the point of doing the age based load balancing an attempt to improve
> caching on the servers since the same articles would get fetched from the
> same servers in theory? If you think that will work, you need to implement
> it on you side, instead of expecting a 1000 different clients to be
> modified just to implement your private version of load balancing to solve
> your servers load problem. Create a new layer of front end load balances
> in front of your dreaderd server to perform the age based balancing for
> you. This layer of front end servers are mostly just routers with little
> disk space though they will need a way to find the age of articles quickly
> so you have to set up yet another database to make that happen. Good luck
> with that. :) But at least that's better than expecting 20 different
> newsreader authors to write that code for you in their clients!

Don't take this the wrong way; I'm very thankful to everybody working on
Usenet.

Use the power of users.
Make users expect their software provider to implement this option.
Rebuild an open-source Usenet client to support this optimization, brand
the client, and give a user who uses "my" favored client 10x the normal
speed while downloading from their ISP's free Usenet server (they will
save some USP bucks by changing their NZB client).
If popular Usenet clients follow, the rest will probably follow too.

>
> Distributing load over servers is a problem that every big Usenet site has
> been dealing with for decades now. Welcome to the game. :) There are many
> approaches, but they are always implemented on the server side. There is no
> advantage to having the client do your load balancing for your and plenty
> of problems - mainly that it's hard to change how your system works if you
> have to get the clients updated every time you want to make a change.

I have no problems load balancing my traffic, even when the number of
connections triples.

>
> If your backend spool servers are hitting their bandwidth limit, and all
> this idea was a way to try and solve it by making your front end servers do
> more caching, sorry, it's not likely to work. You have to buy more and/or
> faster backend disks. That's the only real solution. Memory caching of
> articles doesn't tend to work for Usenet servers because there are too many
> articles and the access is too random to help much. The memory cache would
> have to be huge before it would do much good - and at the cost of that much
> memory - it's far cheaper to buy more disks instead. Front end disk
> caching of popular articles makes little sense because that just moves the
> disk I/O load from your large array of back end disks, to some much smaller
> number of front end server cache disks. If the bandwidth was too great for
> all the back end disks working in parallel, how the hell is a far smaller
> number of disks on the front end servers going to handle the same load?
> Solid State drives might be a solution to make frontend disk cashing work,
> but I suspect that by the time you have added enough of these expensive
> drives to do some good, you could have done it cheaper just by buying more
> backend disks.
>

I want to use RAM in my frontend servers. The list price of a "backend"
is 250,000 (I already have four, and the business case for numbers five
and six is on the printer), but if we can make Usenet more efficient,
that would benefit me and probably the rest of the planet.

> The good news is that you get longer retention for "free" when you buy more
> disks to solve the disk I/O bandwidth problem! :)
>

It would be a great idea if older retention were only maintained a
few times, instead of having 10 USPs in a country/region keep thousands of
disks spinning and doing almost nothing for the sake of having 300+ days
of binary retention.

Make an ISP/telco happy by caching 80% of the traffic and saving network
costs, and let the USP supply the added retention and save on traffic
spending.

Curt Welch

Dec 9, 2009, 7:21:02 PM

Sebastiaan Jansen <an...@fake.nl> wrote:
> Curt Welch wrote:
> > newsmaster <newsm...@planet.nl> wrote:
> >> Hi
> >>

> >> I would like to share an idea, and I'm very interested if you have

> >> recommendations or blocking issues or have some historical or
> >> technical insight on this.
> >>
> >> Introduction:
> >> A main key in serving binary news messages to a large crowed is having
> >> enough disk io capacity for the randomness created by the large amount
> >> of articles served to a lot of people.
> >
> >> The chance of having the same request done twice in a short interval

> >> is so low, that the memory caching systems (generally speaking) don't

> >> do any noticeable offloading.
> >
> > That's often true.
> >
> >> The way most nntp software stores articles is
> >> enabling caching.
> >>
> >> A few years back we operated a news setup where our dreaders did
> >> disk-caching. The effectiveness of that was too low to implement it in
> >> our new setup.
> >>
> >> Current setup:

> >> We have 4 SAN's exporting lun's to all host's and use a

> >> SAN-Filesystem. Readers can read files over FC from shared filesystems
> >> being written by the feeders, and get the message location by doing

> >> whereis call's on the feeders.

> >> All readers are able to serve the full retention up to 1,6 Gbit/sec
> >> per node, and we only use 2 feeders.
> >> If we need more output, then we add disks and create an extra shared
> >> filesystem and sometimes add extra readers.
> >>
> >> Idea:

> >> Because we're mostly interested in output performance, and growing

> >> extra data retention is not the primary objective, buying tons of fast

> >> disks if there are other ways would make no sense (except if you've a

> >> mission to kill polar bears).
> >>

> >> I would like to try to steer the user's interaction with the servers

> >> while he/she's downloading a specific set of messages, without the
> >> need for user input/selection.
> >
> > Why?
>
> Because I'm told my very expensive loadbalancers can't do that based on
> the users request data. And I currently believe that there is nobody
> accessable in this universe who can write a affordable system for this
> that will be able to perform under the stress of my users :-)

Writing custom load balancers for Usenet is fairly easy. After all, you are
asking the news client authors to do it for you, are you not? The only
difference if you did it yourself is that you would be running it locally
with fewer machines instead of the thousands of machines scattered across
the users' locations doing it for you.

And I still don't grasp why you think this load balancing scheme will even
help you if you already have high-end load balancers distributing user load
across your front end boxes to begin with.

So just do it like this on your end instead of expecting the end users to
update their software:

users -> [your load balancer] -> [new array of NNTP switches that perform
your age based balancing invisible to your users] -> [ your dreader servers
] ...

> >> I would like to do this with a minor
> >> serverside change and with clients supporting this client-server
> >> communication (option). Preferably through sharing this and updating

> >> rfc's

> >
> > Seems like a bad idea to me.
>
> I'm trying to understand your reasons (please read next)

Well let me be more clear.

1) There are too many clients and most will NEVER be updated to solve YOUR
problem. And most of the users won't bother to update to the
new version even if there is a new version that supports your load
balancing scheme. The conversion will be so slow, it will take YEARS to
get 50% of your users converted. How much will this scheme actually save
you when only half your users are using it after 5 years?

2) You are suggesting only 1 load balancing scheme in your load balance
protocol - age based. There are many others possible, like message-id
hashes, group hashes, article number hashes, group+article
number hashes, user name hashes, and random distribution with weighting.
And there are lots of different hash algorithms to choose from, such as
ones based on consistent hashing, which deal gracefully with the problem of
servers going off line without blowing your cache. What if other sites
want to do the same but use a completely different hash algorithm for the
load balancing? How many times will the protocol and all the clients have
to be rewritten to support 20 different load balancing schemes from 20
different companies?
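
For readers who haven't met consistent hashing, here is a minimal sketch
of a message-id hash ring of the kind mentioned above. The class name, the
server names and the choice of MD5 with 100 virtual points per server are
all assumptions for the example, not a description of any particular
site's setup.

```python
import bisect
import hashlib


class HashRing:
    """Map message-ids to servers using a consistent hash ring."""

    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` points on the ring to even out load.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def server_for(self, message_id):
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._keys, self._hash(message_id))
        return self._ring[index % len(self._ring)][1]


servers = ["cache0", "cache1", "cache2", "cache3"]  # hypothetical names
full = HashRing(servers)
degraded = HashRing(servers[:-1])  # cache3 has gone off line

ids = [f"<part{i}@example.com>" for i in range(1000)]
moved = sum(1 for m in ids if full.server_for(m) != degraded.server_for(m))
print(f"{moved} of {len(ids)} message-ids changed server")
```

Only the message-ids that were on the missing server move; everything else
keeps hitting the same cache, which is the "without blowing your cache"
property. A plain hash modulo the number of servers would reshuffle most
of them.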

In short, unless your scheme provides a huge benefit to the end user, they
won't switch clients, and it won't do you any good.

> >> * Client start's and connects to primary NNTP server hostname.
> >> * Client send's: list servers (a non-existing "list" extension used

> >> for example)
> >> * Server answers: (server side configurable data)
> >> 215 information follows
> >> -0 server0.example.com
> >> -4 server1.example.com
> >> -8 server2.example.com
> >> -12 server3.example.com
> >> -16 server4.example.com
> >> -20 server5.example.com
> >> -24 server6.example.com
> >> -
> >> And finally
> >> -48 serverx.example.com (everything older then)
> >> .
> >> * Now the client will load a NZB and select the group and send HEAD
> >> requests for every message-id

> >> * the client now knows the "injection-date" of the message
> >> * The client calculates from localtime to "injection-date"(GMT)

> >> * When the article is between 8-12 hours old the client will send the
> >> BODY request to server2.example.com
> >>
> >> Using a combination of memory and flash technology I probably would be
> >> able to lower disk IO up to 50% while doubling capacity during peak
> >> moments (wild guess).
> >> A problem that needs a workaround is connection accounting.
> >>
> >> Your help and ideas are very much appreciated.
> >
> > You are just asking the user to do the work your servers should be
> > doing. Why on earth would you push that responsibility off to the
> > client when it means getting all the major newsreaders updated to
> > support a very odd and strange client side load distribution system?
> >
>
> We have a redundant loadbalancer distribute load across readers boxes
> right now. And that will stay. I only want to help the client make the
> decision where it can get the body, the most optimal way.

So server0 and server1 etc that the users connect to are load balancer IPs?
What's the point of this??? You are confusing me big time.

Assuming all your users did switch to this system, why would it make your
server work better?

> >> With kind regards,
> >> Sebastiaan Jansen
> >>
> >> KPN
> >> The Netherlands

> >> Sebastiaan�a point�Jansen at kpn.com

> >
> > I don't really grasp what you are thinking here. You are suggesting
> > the client do server load balancing for you by distributing the user
> > load across the servers based on the age of the article they are
> > fetching????
> >
> > First off, age based balancing is a very poor way do do load balancing
> > for Usenet because I/O is not distributed evenly by age. The bulk of
> > the user access is for newer articles. If you graph user bandwidth
> > load vs age of articles accessed, you see the peek is for newer
> > articles and fades with age. In your example, it means that your
> > server0 and server1 could end up with 80% of the total load so it
> > wouldn't be balanced at all. To adjust for that, your server would
> > have to send out a different list to each client, or it would have to
> > skew the range per server where .5 day old articles could be on
> > server0, 1 day old on server1, 2 day on server2, 4 day on server4, or
> > something like that. You would have to manually adjust those age
> > limits to manually balance your load (or write some very smart
> > automatic load testing and balancing code).
>
> server0 etc will be n+1 load-balanced virtual server IP addresses on a
> high-end network load balancer.
> I'm actively using the usage pattern on older articles; those already are
> on cheaper SATA disk spools.
> Getting 80% of the load done on as few servers (or server groups) as
> possible, serving a large quantity from RAM cache, would be perfect.

So you are willing to buy enough ram to cache something like 20% of your
entire spool set???? And that's cheaper for you than buying disks???

Do you serve binaries or are you running a text only server???? The
problems of the two are very different.

> If 50% of that 80% is
> served from ram cache it would save me two racks full of 10k disks
> instantly, those disks only serve the hunger for IO capacity on the
> spools.

Yes, but to serve 50% of that 80% from cache you basically need enough ram
cache to equal 50% of that 80% of the space! Is that seriously cheaper for
you than buying a rack of disks?

> Don't worry about the capacity of those servers doing caching. 10Gbit
> NICs, a lot of RAM, fast multithreaded CPUs and 8Gbit FC to the spools
> will do it.
>
> >
> > You could do better just using DNS load balancing and you can do that
> > in 10 minutes without anyone having to change their newsreader
> > so I don't grasp what you are thinking here.
> >
> > Was the point of doing the age based load balancing an attempt to
> > improve caching on the servers since the same articles would get
> > fetched from the same servers in theory? If you think that will work,
> > you need to implement it on your side, instead of expecting a 1000
> > different clients to be modified just to implement your private version
> > of load balancing to solve your servers load problem. Create a new
> > layer of front end load balancers in front of your dreaderd servers to
> > perform the age based balancing for you. This layer of front end
> > servers are mostly just routers with little disk space though they will
> > need a way to find the age of articles quickly so you have to set up
> > yet another database to make that happen. Good luck with that. :) But
> > at least that's better than expecting 20 different newsreader authors
> > to write that code for you in their clients!
>
> Don't take this wrong, I'm very thankful for everybody working on
> Usenet.

Oh, don't get me wrong either. I don't believe you have any evil intent of
trying to get someone else to write code to solve your server problems so
you don't have to. I assume you believe this will make Usenet better for
everyone. I just don't grasp why you think that.

> Use the power of users.
> Make the users expect their software provider to implement this
> option. Rebuild an open-source Usenet client to support this
> optimisation, brand the client, and give users who use "my" favored
> client 10x the normal speed while downloading from their ISP's free Usenet
> server.

I really don't grasp what you are thinking here.
Most people who pay for Usenet are able to max out their bandwidth now.
It's not possible to give them 10x more than they already get.

All you can do, is throttle the users that don't use your favored client to
1/10th the speed they could be getting if you made your servers work
correctly.

Are you running a pay service or are you running a free server or
something? I'm just asking because I can't grasp what you are thinking and
why you think distributing user load over already load-balanced IPs by age
would help you or anyone else.

> (he will save some USP bucks by changing his NZB client)
> If popular usenet clients follow, the rest probably also follows.

Yes, if this technique saved you big money on your servers because you
didn't have to invest another 250K in hardware, you could pass that
savings on to the users in terms of an incentive discount one way or the
other. And if the incentive was large enough, they would likely switch.

But I don't grasp yet why you think this will save you money at all. What
problem exactly do you see this solving for you? How is it going to save
you a rack of disks?

> > Distributing load over servers is a problem that every big Usenet site
> > has been dealing with for decades now. Welcome to the game. :) There
> > are many approaches, but they are always implemented on the server
> > side. There is no advantage to having the client do your load balancing
> > for you and plenty of problems - mainly that it's hard to change how
> > your system works if you have to get the clients updated every time you
> > want to make a change.
>
> I've no problems loadbalancing my traffic, even when the number of
> connections triple.

Then what on earth is this scheme of yours going to save you and why?

> > If your backend spool servers are hitting their bandwidth limit, and
> > all this idea was a way to try and solve it by making your front end
> > servers do more caching, sorry, it's not likely to work. You have to
> > buy more and/or faster backend disks. That's the only real solution.
> > Memory caching of articles doesn't tend to work for Usenet servers
> > because there are too many articles and the access is too random to
> > help much. The memory cache would have to be huge before it would do
> > much good - and at the cost of that much memory - it's far cheaper to
> > buy more disks instead. Front end disk caching of popular articles
> > makes little sense because that just moves the disk I/O load from your
> > large array of back end disks, to some much smaller number of front end
> > server cache disks. If the bandwidth was too great for all the back
> > end disks working in parallel, how the hell is a far smaller number of
> > disks on the front end servers going to handle the same load? Solid
> > State drives might be a solution to make frontend disk caching work,
> > but I suspect that by the time you have added enough of these expensive
> > drives to do some good, you could have done it cheaper just by buying
> > more backend disks.
> >
> I want to use RAM in my frontend servers. The list price of a "backend"
> is 250.000 (I already got four, and the business case for number five
> and six is on the printer), but if we can make usenet more efficient
> that would favor me and probably the rest of the planet.

All of us that run large servers are _constantly_ looking for ways to change
our configuration to make the cost of the service we provide cheaper for us
so we can be more competitive (or just stay competitive) in the market.

If there were a change to the NNTP protocol that would allow all of us to
significantly reduce the cost of our service, I would be all for pushing
the idea of an NNTP protocol change to add some sort of load balancing role
on the client side. I don't see, however, how your suggested protocol
change is going to save you, or anyone, any significant money.

> > The good news is that you get longer retention for "free" when you buy
> > more disks to solve the disk I/O bandwidth problem! :)
> >
> It would be a great idea if older retention would only be maintained a
> few times instead of having 10 USP in a country/region have thousands of
> disks spinning doing almost nothing for the sake of having 300+ days of
> binary retention.

That's already happening in the industry. It's been happening for a decade
or so now as the distributed nature of Usenet slowly disappears and servers
continue to get larger and fewer. Maybe your part of the world is still a
bit behind the curve on Usenet?

> Make an ISP/telco happy by caching 80% of the traffic and save network
> costs and let the USP supply added retention, and save on traffic
> spending.

Ok, so you are suggesting that some of the servers in the list are located
and run by one company, such as an ISP, and the longer retention servers
are run by another company, like a USP located in another location? So
this gives the Usenet provider the ability to offer longer retention to
their clients by redirecting to a different server for those articles?

That is a totally different scheme than what I thought you were implying.
I thought you were saying that all the servers in the list would be YOUR
servers (or your load balancer which made no sense to me).

I guess you really need to explain _exactly_ why you think this is good for
you, your users, or for Usenet. I'm still a bit lost as to what you are
suggesting and why you think it's good.

Curt Welch

Dec 9, 2009, 7:34:06 PM

Sebastiaan Jansen <an...@fake.nl> wrote:
> Julien ÉLIE wrote:
> >> * Client sends: list servers (a non-existing "list" extension used

> >> * the client now knows the "injection-date" of the message
> >> * The client calculates from localtime to "injection-date" (GMT)

> >> * When the article is between 8-12 hours old the client will send the

HOURS???

Oh, I didn't see that in the first few reads. I assumed the numbers were
for days. With typical retentions reaching hundreds of days, why would you
be thinking to distribute load in 4 hour increments???

I really don't grasp what you are thinking this will solve.

> >> BODY request to server2.example.com
> >
> > That is to say that the clients disconnects from the server and opens
> > a new connection to server2.example.com?
>
> yes, I believe that's possible?

Sure that's very possible.

> > Will your servers keep exchanging articles between each other?
> > I mean that articles will keep being sent from server0 to server1,
> > which also sends to server2, etc. As soon as an article is 4 hours
> > old, it should be transferred from server0 to server1, etc.
> > Won't it cause problems?
> >
> no, all servers will be able to serve all (the full retention) articles.
> users abusing this need to be detected (by analysing logs for example)

So again, you are suggesting the users distribute their load across your
load balanced servers why exactly??? Why will this make your servers work
better and save you money that you can then pass along to the users to
offset the trouble you will cause them by forcing them to support a new
NNTP protocol extension?

Sebastiaan Jansen

Dec 10, 2009, 2:18:58 AM

Curt Welch wrote:

> Sebastiaan Jansen <an...@fake.nl> wrote:
>> Julien ÉLIE wrote:

>>>> * Client sends: list servers (a non-existing "list" extension used

>>>> for example)
>>>> * Server answers: (server side configurable data)
>>>> 215 information follows
>>>> -0 server0.example.com
>>>> -4 server1.example.com
>>>> -8 server2.example.com
>>>> -12 server3.example.com
>>>> -16 server4.example.com
>>>> -20 server5.example.com
>>>> -24 server6.example.com
>>>> -
>>>> And finally
>>>> -48 serverx.example.com (everything older than that)
>>>> .
>>>> * Now the client will load a NZB and select the group and send HEAD
>>>> requests for every message-id
>>> Why should it select the group before sending HEAD requests by
>>> message-IDs?
>>>
>>>
>> Sorry, I don't know exactly how NNTP client-server communication works.
>> Thanks for the lesson.
>>

>>>> * the client now knows the "injection-date" of the message
>>>> * The client calculates from localtime to "injection-date" (GMT)

>>>> * When the article is between 8-12 hours old the client will send the
>
> HOURS???
>
> Oh, I didn't see that in the first few reads. I assumed the numbers were
> for days. With typical retentions reaching hundreds of days, why would you
> be thinking to distribute load in 4 hour increments???
>
> I really don't grasp what you are thinking this will solve.
>
>
>
>
>>>> BODY request to server2.example.com
>>> That is to say that the clients disconnects from the server and opens
>>> a new connection to server2.example.com?
>> yes, I believe that's possible?
>
> Sure that's very possible.
>
>>> Will your servers keep exchanging articles between each other?
>>> I mean that articles will keep being sent from server0 to server1,
>>> which also sends to server2, etc. As soon as an article is 4 hours
>>> old, it should be transferred from server0 to server1, etc.
>>> Won't it cause problems?
>>>
>> no, all servers will be able to serve all (the full retention) articles.
>> users abusing this need to be detected (by analysing logs for example)
>
> So again, you are suggesting the users distribute their load across your
> load balanced servers why exactly??? Why will this make your servers work
> better and save you money that you can then pass along to the users to
> offset the trouble you will cause them by forcing them to support a new
> NNTP protocol extension?
>
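
The selection step quoted above can be sketched roughly as follows. This only illustrates the proposed (non-existing) extension: the thresholds and host names are the ones from the example reply, the entries elided there are not filled in, and `pick_server` is a hypothetical helper name. As an assumption of this sketch, an article between 24 and 48 hours old falls into the last bucket listed before 48.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

# (minimum age in hours, host) pairs taken from the example reply in the
# proposal. Entries elided in the original are deliberately not filled in.
AGE_TABLE = [
    (0, "server0.example.com"),
    (4, "server1.example.com"),
    (8, "server2.example.com"),
    (12, "server3.example.com"),
    (16, "server4.example.com"),
    (20, "server5.example.com"),
    (24, "server6.example.com"),
    (48, "serverx.example.com"),  # everything older than 48 hours
]


def pick_server(injection_date, now, table=AGE_TABLE):
    """Return the host whose age bucket contains the article (hypothetical helper)."""
    age_hours = (now - injection_date).total_seconds() / 3600
    age_hours = max(age_hours, 0)  # tolerate clock skew between client and server
    thresholds = [hours for hours, _ in table]
    # Index of the last threshold that is <= the article's age.
    index = bisect_right(thresholds, age_hours) - 1
    return table[index][1]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(pick_server(now - timedelta(hours=10), now))
```

Both timestamps are compared in UTC, since the Injection-Date header carries its own time zone and local time would skew the buckets.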

I'm hoping to get some win by lowering the cache pressure on the
frontends, but a real-life scenario would need to be tweaked for optimal
yield (the hours where the usage density is the highest). So I'm not
deciding how it should be configured yet.
RAM is rather cheap, and a good caching algorithm on the server would be
able to make something out of it. What the right balance between RAM
and hours of retention served should be, I can't guess.
You can get 72 GB of DDR3 server RAM for 2500 USD. I'm guessing my users
only actively read 1/3 of the feed, so 72 GB would hold a full hour. I
don't know what the memory->NIC output will be when I raise the
pressure to 4 hours per 72 GB config.
Maybe add CacheFS on top of my FC shared spools and use SSDs
combined with the 72 GB of RAM.
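
To put rough numbers on the estimate above, here is a small back-of-the-envelope sketch. It uses only the figures from this post (72 GB for 2500 USD, one third of the feed read, 72 GB holding one hour). The function names are made up for illustration, and it assumes the RAM needed grows linearly with the hours cached, which ignores any hit-rate effects.

```python
import math
from fractions import Fraction

RAM_GB_PER_CONFIG = 72           # figure from the post
USD_PER_CONFIG = 2500            # figure from the post
READ_FRACTION = Fraction(1, 3)   # guessed share of the feed that users read
HOURS_PER_CONFIG = 1             # the post says 72 GB holds one hour


def implied_feed_gb_per_hour():
    """Full feed volume implied by '72 GB holds one hour of the read third'."""
    return RAM_GB_PER_CONFIG / READ_FRACTION / HOURS_PER_CONFIG


def cache_size_and_cost(hours):
    """RAM (GB) and price (USD) to hold `hours` of the read share, assuming linear scaling."""
    configs = math.ceil(hours / HOURS_PER_CONFIG)
    return configs * RAM_GB_PER_CONFIG, configs * USD_PER_CONFIG


if __name__ == "__main__":
    print("implied feed:", implied_feed_gb_per_hour(), "GB/hour")
    for hours in (4, 24, 48):
        gb, usd = cache_size_and_cost(hours)
        print(f"{hours} h -> {gb} GB RAM, {usd} USD")
```

Under these assumptions, holding 48 hours would take 3456 GB of RAM at about 120,000 USD, spread over however many frontends share the cache.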

You have to look at it this way: users are being upgraded to
EuroDOCSIS 3.0, VDSL2 and fiber-to-the-home internet. We run a free,
uncapped binary news server with full retention of over 30 days, and have a
daily usage which I can't disclose because of politics, but it's significantly
large. In the next 2-3 years that bandwidth usage will double.
All IO caching mechanisms in the backend have failed, and all caching
mechanisms in the frontend have failed.

I believe it makes more sense to extend Usenet and set up and tweak frontend
caches than to write load balancers/switches that can do near-atomic age-
based redirecting under this load. Or is that a good/accepted solution?

Optimizing to serve 20 percent from cache would take out 2 disk
backends in the future.

Thank you.

Curt Welch

Dec 10, 2009, 10:11:18 AM

Sebastiaan Jansen <an...@fake.nl> wrote:
> Curt Welch wrote:

> > So again, you are suggesting the users distribute their load across

Ok, that's all clear now.

But again, you really can't get your users to solve your server problems
for you by pushing the code to distribute load from your infrastructure
onto the users' machines. Whatever they can do on their machine, you can
always do a lot better with your own hardware on your side. Does it get
expensive at times to handle extremely high volume loads? Of course. And
if you are offering free binary usenet, it will get very expensive if you
want to continue to provide good performance.

The only justification for having the users deal with load balancing is if
the multiple machines they are connecting to are located in different
locations so the network path from them to the machine is a factor. But if
all their connections are to your same infrastructure, there is NOTHING
they do that way, that you can't do for them on your end by developing
better technology on your end.

Most large USPs custom develop their own infrastructure. They don't just
try to make it all work with off the shelf technologies. Giganews as I
understand it even has some patents on some of their caching technologies
for load distribution in their infrastructure.

Building high performance distributed infrastructures is tricky business.
And when you rely on caching to be working in order to serve your load, it
gets even trickier. There are events that can cause your caches to
become useless, such as when one of the critical caching servers dies on you
and the load has to shift. If events happen like that in a peak load
period then suddenly your disk IO becomes saturated and backlogged because
the cache isn't effective, and then your news feeds to the servers also
start to backlog and things just go downhill from there and at times might
not be able to recover (depending on your infrastructure design) until the
user load dies down 6 hours later.

If you are running diablo, your front end dreaderd boxes _are_ acting as
NNTP switches carrying your entire load already. There's nothing "atomic
age" about it. CPUs and networks are fast these days and a single box can
push a very large load when it's only acting as an NNTP router.

So what you can do is build another set of front-end boxes in front of
your current front ends that act as NNTP routers to implement all the
request routing decisions needed to implement whatever load distribution
you want to try and use. These front end boxes need no disk space at all
- they can net boot off some other server. They don't need any extra RAM
because they aren't doing any caching - they are just CPUs pushing bits to
the right places. You need fewer of these than you currently have dreaderd
front ends because the dreaderds have a lot of disk load serving headers as
well as acting as article switches to the spools.

You then have total control over your load distribution. You can change it
hour by hour if the caching demands it. You can change it in an instant to
support putting new servers on line or taking servers off line
automatically when they fail etc.

You can do group based hashing to distribute header load across your
dreaderd boxes so a given dreaderd box is serving only a fixed subset of
the groups so header caching has a better chance of working saving you disk
IO on the header boxes.

You can add to this infrastructure another set of back-end article cache
boxes which are themselves nothing but article fetching and caching
machines and use your front end switches to hash the article fetches over
the array of those cache boxes - maybe using message-id hashing instead of
age since it's not easy to get the age of articles.
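
A minimal sketch of hashing fetches over an array of cache boxes might look like the following. It uses rendezvous (highest-random-weight) hashing, which is my choice for the illustration and not something named in this thread. I picked it because dropping a failed box only moves the keys that lived on that box. The box names and `pick_box` are hypothetical. The same function would work with a group name as the key for the header case.

```python
import hashlib


def _score(box, key):
    """Stable pseudo-random weight for a (box, key) pair."""
    digest = hashlib.sha1(f"{box}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_box(key, boxes):
    """Pick the cache box for a message-id (or group name) by rendezvous hashing."""
    if not boxes:
        raise ValueError("no cache boxes available")
    # Every front end computes the same winner without sharing any state.
    return max(boxes, key=lambda box: _score(box, key))


if __name__ == "__main__":
    boxes = ["cache0", "cache1", "cache2", "cache3"]  # hypothetical names
    print(pick_box("<part1of50.abc@example.invalid>", boxes))
```

Because the mapping depends only on the key and the current list of boxes, the front-end switches can add or remove a cache box by changing that list.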

To keep the cache useful you might write the article cache code in those
machines so that it only caches articles that are less than 48 hours old.
So once they get the article, they will know its age, and if it's old,
pass it along to the front end but don't put it in the cache. Whatever
works. When you have your own NNTP switching infrastructure in place
controlling load balancing, you can change it whenever you think a
different structure will work better. If you spend a year getting all your
users' newsreaders changed to support the infrastructure feature you _think_
will solve your problems, and then you find out you hit another wall and
need the users to load balance a different way to solve the next problem,
you have another long wait as you once again go out and beg the user
community to add more features to support your infrastructure problems.
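
The "only cache articles less than 48 hours old" rule described above could be sketched like this. It is an illustration under assumptions, not how any real cache box works: `fetch_from_spool` is a hypothetical callable returning the article body and its injection time as a Unix timestamp, and the least-recently-used eviction is my own addition.

```python
import time
from collections import OrderedDict

MAX_CACHE_AGE_SECONDS = 48 * 3600  # the 48-hour limit suggested in the post


class ArticleCache:
    """Article cache that refuses to store old articles (illustrative sketch)."""

    def __init__(self, fetch_from_spool, capacity=1000, now=time.time):
        self._fetch = fetch_from_spool   # hypothetical: message_id -> (body, injection_time)
        self._capacity = capacity        # maximum number of cached articles
        self._now = now                  # injectable clock, handy for testing
        self._cache = OrderedDict()      # message_id -> body, oldest use first

    def get(self, message_id):
        if message_id in self._cache:
            self._cache.move_to_end(message_id)  # mark as recently used
            return self._cache[message_id]
        body, injection_time = self._fetch(message_id)
        # The age is only known after the fetch. Old articles are passed
        # along to the front end but never stored.
        if self._now() - injection_time < MAX_CACHE_AGE_SECONDS:
            self._cache[message_id] = body
            if len(self._cache) > self._capacity:
                self._cache.popitem(last=False)  # evict least recently used
        return body
```

Since the policy lives in one place on the server side, changing the age limit or the eviction rule later needs no change in any newsreader.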

What's worse than having the user do the work that you should be doing on
your end, is the fact that every company uses a different infrastructure
and will have different load balancing requirements. So there is no single
obvious protocol change that could work for everyone. It's a fix that you
are suggesting we push to the end users to fix only _your_ load problems. It
would do us absolutely no good at all. We wouldn't use it even if all our
users were running clients that supported it because it's not a load
balance algorithm that would be useful to the way we structured our
infrastructure.

No, it just doesn't make any sense to ask the users to do the work
your hardware should be doing. All you are saving by asking them to go to
all that trouble, is the cost of a small array of diskless 1U NNTP switch
servers. A single NNTP switch can do 1 Gbit these days which means if you
have 10 Gbit of traffic you need 10 of these small servers on your end to
solve your load distribution problem and then you have complete control at
all times over the load distribution algorithm for your infrastructure.

Sebastiaan Jansen

Dec 11, 2009, 6:05:39 PM

Curt Welch wrote:

I would be able to route users to the best path if I needed to,
without this.

>
> Most large USPs custom develop their own infrastructure. They don't just
> try to make it all work with off the shelf technologies. Giganews as I
> understand it even has some patents on some of their caching technologies
> for load distribution in their infrastructure.
>
> Building high performance distributed infrastructures is tricky business.
> And when you rely on caching to be working in order to serve your load, it
> gets even trickier. There are events that can cause your caches to
> become useless, such as when one of the critical caching servers dies on you
> and the load has to shift. If events happen like that in a peak load
> period then suddenly your disk IO becomes saturated and backlogged because
> the cache isn't effective, and then your news feeds to the servers also
> start to backlog and things just go downhill from there and at times might
> not be able to recover (depending on your infrastructure design) until the
> user load dies down 6 hours later.
>

I think I'll be able to manage this

> If you are running diablo, your front end dreaderd boxes _are_ acting as
> NNTP switches carrying your entire load already. There's nothing "atomic
> age" about it. CPUs and networks are fast these days and a single box can
> push a very large load when it's only acting as an NNTP router.
>

I'm interested in this solution,

> So what you can do is build another set of front-end boxes in front of
> your current front ends that act as NNTP routers to implement all the
> request routing decisions needed to implement whatever load distribution
> you want to try and use. These front end boxes need no disk space at all
> - they can net boot off some other server. They don't need any extra RAM
> because they aren't doing any caching - they are just CPUs pushing bits to
> the right places. You need fewer of these than you currently have dreaderd
> front ends because the dreaderds have a lot of disk load serving headers as
> well as acting as article switches to the spools.
>
> You then have total control over your load distribution. You can change it
> hour by hour if the caching demands it. You can change it in an instant to
> support putting new servers on line or taking servers off line
> automatically when they fail etc.
>
> You can do group based hashing to distribute header load across your
> dreaderd boxes so a given dreaderd box is serving only a fixed subset of
> the groups so header caching has a better chance of working saving you disk
> IO on the header boxes.

Do you have references that can help me? Or maybe something similar I can
use as a basis for further research? Using this hash-based routing to
cache boxes / frontends would be great if it doesn't lower performance.
It needs to be able to filter 1 Gbit/sec of upstream request data, and
in near real time, I assume.

thx

Some things need a working initiative and time to make sense.

> No, it just doesn't make any sense to ask the users to do the work
> your hardware should be doing. All you are saving by asking them to go to
> all that trouble, is the cost of a small array of diskless 1U NNTP switch
> servers. A single NNTP switch can do 1 Gbit these days which means if you
> have 10 Gbit of traffic you need 10 of these small servers on your end to
> solve your load distribution problem and then you have complete control at
> all times over the load distribution algorithm for your infrastructure.
>

A lot of USPs/ISPs use DSR, I would think.

Sebastiaan Jansen

Dec 14, 2009, 5:34:08 AM

I'm trashing this idea, I wasn't aware of nntpswitch. Curt, thank you
for your help.

Sebastiaan Jansen wrote:
