Hi,
thanks for your very informative replies!
Sorry for the long.. long.... long.. delay in responses.
After fighting with !"$%"§$§ shelves and controllers, I _finally_ am able to SEE the disks (although via an HPE P822e, not via the LSI 9300e I wanted to use for that).
A few hundred euros, some IOMs, an additional Dell shelf, some controllers and change later, I am ordering cables.. again.. because.. cables. Memo to self: SFF-8644 to SFF-8080 does suck.
At least I now see the disks and am in the process of re-formatting them to a correct logical block size (NetApp formats to 520-byte sectors, not 512 ;) )
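For anyone following along: the 520→512 reformat can be driven with sg_format from sg3_utils. A minimal dry-run sketch (the /dev/sg* names are placeholders - take the real ones from `lsscsi -g`); it only prints the commands for review, since the real run wipes the disks:

```shell
# Print the sg_format commands for review before running anything destructive.
# /dev/sg2 and /dev/sg3 are placeholders -- adjust to your `lsscsi -g` output.
for dev in /dev/sg2 /dev/sg3; do
    echo "sg_format --format --size=512 --fmtpinfo=0 $dev"
done
```

Drop the `echo` to actually run it. A low-level format on a 15k SAS disk can take a long time per disk, but the formats run fine in parallel across disks.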
I am not really that much into ZFS, I just assumed it was what is preferred by ESOS.
The "real" setup now is:
* 2 NetApp shelves at 6 Gbit (IOM3 to IOM6 works, sometimes ;) ) with 15k SAS disks, so .. 300 MB/s, or ~2.3 Gbit/s, per disk, ideally (I can dream!)
- bloody cables. At the moment I only have one from the HP P822 to the shelves, so further testing is required. At ~€50 apiece I did not really want to buy too many..
in the end this will be 4 cables - each server can talk to each shelf. Maybe. Hopefully.
-- card: currently an HPE P822e; should be the LSI 9300e
* 2 HPE DL360p Gen8
-- card: Brocade Quad-FC-Card (4*8GBit)
* FC cables (duh)
* Brocade 300 8GBit FC switches
From here, vanilla setup.
From my calculation:
disks: 300 MB/s ≈ 2.3 Gbit/s each (24 per shelf, 2 shelves)
servers: one quad-lane 6 Gbit SAS cable per shelf -> 24 Gbit/s per shelf per server, a total of 96 Gbit/s theoretical throughput
the FC cards do 4×8 = 32 Gbit/s per server, a total of 64 Gbit/s
If I run master/slave, I am at 48 Gbit/s to the shelves and 32 Gbit/s to the servers.
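Spelling out that back-of-the-envelope math with the numbers above (integer Gbit/s, all theoretical maxima):

```shell
# Theoretical aggregate bandwidth, using the figures from the post.
disk_mbs=300          # MB/s per 15k SAS disk
disks=$(( 24 * 2 ))   # 24 disks per shelf, 2 shelves
disk_total=$(( disk_mbs * 8 * disks / 1000 ))   # raw disk bandwidth

sas_per_link=$(( 4 * 6 ))                 # quad-lane 6 Gbit SAS cable = 24 Gbit/s
sas_total=$(( sas_per_link * 2 * 2 ))     # 2 shelves x 2 servers

fc_total=$(( 4 * 8 * 2 ))                 # 4x8 Gbit FC per server, 2 servers

echo "disks: ${disk_total} Gbit/s, SAS: ${sas_total} Gbit/s, FC: ${fc_total} Gbit/s"
# -> disks: 115 Gbit/s, SAS: 96 Gbit/s, FC: 64 Gbit/s
```

So the FC side (64 Gbit/s total, 32 in master/slave) is the first ceiling - and 8 Gbit FC uses 8b/10b encoding, so the usable payload is roughly 800 MB/s per port, a bit below the line rate.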
As I am not that deep into storage - would you say there is a chance of actually getting close to these numbers? If not, where is the most likely bottleneck?
When playing around, the lower part (DL360 -> Brocade -> server) did work; I have a Cisco booting off a local disk in one of my ESOS nodes ;)
I'll keep you posted. At the moment, I just wanna get drunk. Like, seriously drunk. Three months until I can see disks is.. not good for my mental state.
> In my setup, I have dual connections to the backend
> drives through and through. Each controller (server) has an HBA that
> sees every drive in every shelf irrespective of the other controller...
> In my situation I need only make sure the controllers don't step on each
> other's toes - something that LVM and pacemaker handle reliably.
My planned setup looks similar to yours on the hardware level ;)
A question came to my mind: when one shelf fails, any RAID level >1 I am aware of will crash and burn, as losing 50% of the disks is considered catastrophic. This adds a new single point of failure to the whole thing.. what do you do? Trust the shelf?
Also, when changing disks, does this really require some kind of reboot? I am confused there right now - when changing the block size, the OS got it right, but the HPE controller needed a kick (a reboot)..
>In this case, depending on your usage I would fully and absolutely recommend SSD caching, but those SSDs have to be visible to both servers
SSD caching: a 2 TB SAS SSD is ~€500 oO - that is for.. later. For the time being, I have some 2 TB SATA SSDs lying around.. I'll try them; they should max out 6 Gbit/s.
NVMe with DRBD sounds.. bad. Like, really bad. NVMe over Fabrics (FC, IP) may be better.
I still have the on-board P420i, though - maybe I can wire them to some shared-storage thingie... I need to meditate on that. Maybe a very small 1U dual-ported shelf - any ideas?
> ZFS negates the need for lvmlockd. You are no longer using that concept.
I don't see why - or I was unclear.
I want:
* 2 ESOS heads in active/active (this is what all the above is about). No idea yet which filesystem I'll use, or how I'll RAID things.
The storage presented to the machines, however, falls into 2 categories:
* VMware with VMFS: this is cluster-aware already
* Linux servers: to save me all that headache with kernel DLM and so on, lvmlockd seems like a way to prevent one server from accessing another's mounted LV.
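To make the lvmlockd idea concrete, here is a rough sketch of the commands involved (sanlock-based; vg_shared, lv_data and /dev/mapper/mpatha are made-up names - adjust to your multipath layout):

```shell
# On every host sharing the VG (assumes lvm2-lockd + sanlock installed
# and use_lvmlockd = 1 set in /etc/lvm/lvm.conf):
systemctl start sanlock lvmlockd

# Once, from one host: create the shared VG on the multipathed device.
vgcreate --shared vg_shared /dev/mapper/mpatha

# On each host: join the lockspace, then activate LVs *exclusively* (-aey)
# so a second server cannot activate/mount the same LV at the same time.
vgchange --lock-start vg_shared
lvchange -aey vg_shared/lv_data
```

The exclusive activation is the part that keeps one server off another's mounted LV; pacemaker would then just move the activation (and the mount) on failover.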