We recently went through a pretty significant upgrade here, going from
SLES9.3 to SLES10.1 on our LNXI cluster (some older Supermicro-based
nodes, but mostly Dell nodes). For the most part things have gone
smoothly, but we are running into some scaling issues that we didn't
see on our test system or in our SLES9.3 environment. Several things
that we ran into the day of the upgrade:
- The default ARP cache size settings in SLES10.1 were way too low.
Prior to raising them on all of our nodes we were seeing ARP table
floods (to the point that our provisioning network was unusable). We
set the management server's thresholds higher still, based on planned
expansion coming shortly (even with the raised settings we still see
around 1400 ARP table entries on our management server, due to the
total number of nodes in our cluster and all of the various network
interfaces).
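For anyone who hits the same thing: the knobs involved are the
kernel's neighbor-table garbage collection thresholds. A sketch of
the kind of /etc/sysctl.conf entries we mean (the values below are
illustrative only, not necessarily right for your node count):

    # Raise the neighbor (ARP) table GC thresholds; the stock
    # defaults (128/512/1024) are far too low for a large cluster.
    # Values here are examples only; size them to your network.
    net.ipv4.neigh.default.gc_thresh1 = 2048
    net.ipv4.neigh.default.gc_thresh2 = 4096
    net.ipv4.neigh.default.gc_thresh3 = 8192

Loading them with "sysctl -p" applies the change without a reboot.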
Another big thing we ran into is that we had to force TSO offload off
on all of our Ethernet cards (we did this under SLES9.3 as well, but
changes to boot order, etc. meant we had to change where in our
configuration it was done).
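In case it is useful, forcing TSO off is a one-liner with ethtool
(eth0 below is just an example interface; we hook this into our
interface configuration rather than running it by hand):

    # Disable TCP segmentation offload on this NIC; confirm the
    # current offload settings afterwards with "ethtool -k eth0"
    ethtool -K eth0 tso off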
Right now we are still struggling with SLES10.1's default nscd
configuration. We use LDAP at our site, and following the upgrade,
with the same default nscd configuration we had under SLES9.3, we
were seeing a 0% cache hit rate, and load on the LDAP servers went up
4x (with no caching being done, every query was going directly to the
LDAP servers).
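If anyone wants to compare configurations, the relevant pieces are
the enable-cache and time-to-live lines in /etc/nscd.conf. A minimal
sketch of the sort of thing we are experimenting with (the TTL values
here are illustrative, not a recommendation):

    # Cache passwd and group lookups so repeated queries do not
    # all hit the LDAP servers; TTLs below are example values only.
    enable-cache            passwd  yes
    positive-time-to-live   passwd  600
    negative-time-to-live   passwd  20
    enable-cache            group   yes
    positive-time-to-live   group   3600
    negative-time-to-live   group   60

Running "nscd -g" prints the cache statistics, including the hit
rate for each database.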
We are just wondering if anyone else out there has gone through this
upgrade and seen similar (or different) problems. Maybe we can
exchange some information.