Disk Hot-Swap in BackBlaze Pod v4


alexs...@gmail.com

Feb 21, 2015, 3:27:18 PM
to opensto...@googlegroups.com
Hi!

Does the BackBlaze Pod v4 support hot-swapping of hard drives? Otherwise, if a single disk fails, the whole platform has to be powered off.

Alex

The O.G.

Feb 21, 2015, 3:45:03 PM
to opensto...@googlegroups.com

The hot swap capability is built-in to the SATA and SAS connection technology. The harder question is identifying the drive that failed.
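
One rough way to narrow that down on a Linux host (a sketch only; it assumes lsblk is available and the inventory path below is made up) is to save the drive serial numbers while everything is healthy, then diff the list after a failure:

    # Rough sketch only (Linux with lsblk assumed; the inventory path is a
    # made-up example).  Save a baseline of serial numbers while everything
    # is healthy, then re-run after a failure to see which serial disappeared.
    import json, os, subprocess

    INVENTORY = "/root/drive-serials.json"   # hypothetical location

    def current_serials():
        out = subprocess.check_output(
            ["lsblk", "-d", "-n", "-o", "NAME,SERIAL"], text=True)
        pairs = [line.split() for line in out.splitlines()]
        return {serial: name for name, serial in (p for p in pairs if len(p) == 2)}

    if not os.path.exists(INVENTORY):
        with open(INVENTORY, "w") as f:
            json.dump(current_serials(), f)
        print("baseline saved")
    else:
        with open(INVENTORY) as f:
            baseline = json.load(f)
        for serial in set(baseline) - set(current_serials()):
            print("missing/failed drive: serial", serial, "(was", baseline[serial] + ")")

Whichever serial has gone missing is the drive to go looking for with your physical labels.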


alexs...@gmail.com

Feb 21, 2015, 4:32:03 PM
to opensto...@googlegroups.com
You are talking about a theoretical possibility, not a practical implementation. If that were all there was to it, there would be no need for hot-swap controllers (http://wellic.com/cat/d748/2.html), and server drive cages would be very simple, which they are not.

My question is whether a disk can be removed from a BackBlaze Pod v4 without consequences for the disks or the platform.

jason andrade

Feb 21, 2015, 6:44:56 PM
to opensto...@googlegroups.com

G'day AlexSSS,

My understanding of how the BBOSP has evolved is that hot swap disks have never been considered to be a mandatory architectural/engineering design goal.

It's always possible to implement this with the right combination of backplane and storage controller, combined with the ability to identify your failed disk and of course your OS being able to deal with this.
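
For the 'OS being able to deal with this' part, at least on Linux the kernel's hot-plug support does most of the work already. A minimal sketch (run as root; the device name and host numbers are placeholders, nothing Backblaze-specific) of releasing a drive before the pull and rescanning afterwards:

    # Minimal sketch of the OS side on Linux (run as root; "sdb" is a
    # placeholder).  The kernel's SCSI/SATA hot-plug support does the real
    # work; this just asks it to release a device before you pull it, and to
    # rescan the buses afterwards.
    import glob

    def offline_disk(dev):
        with open("/sys/block/%s/device/delete" % dev, "w") as f:
            f.write("1")                     # detach the device before pulling it

    def rescan_all_hosts():
        for scan in glob.glob("/sys/class/scsi_host/host*/scan"):
            with open(scan, "w") as f:
                f.write("- - -")             # wildcard channel/target/lun rescan

    offline_disk("sdb")                      # the drive you are about to pull
    input("swap the drive, then press Enter ")
    rescan_all_hosts()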

However this also necessitates a physical design that allows you to open a running chassis and keep it all running with all the complexity that entails.

It's a lot simpler to do a design - and then implement that design at scale - where you don't care about shutting down a pod. In fact, BB even talk about how they batch this process for efficiency rather than doing the conventional enterprise 'hot swap onsite' or 'onsite technician replacement within 4 hours'.

So I think, to answer your question:

- It's possible - but not on the platform 'as is'
- You'd have to do further engineering yourself
- It would be better to put the engineering effort into your application such that you can take a pod down without a service impact

When wearing my 'enterprise' hat I love the idea of hot swap everything (or at least, power and disks being the two things most likely to fail). When wearing my 'research infrastructure' and 'cloud/scalable platform' hat, not so much.

Hope this helps.

regards,

-jason
-----
P: 0402 489 637 M: +61 402 489 637 E: jason....@gmail.com




alexs...@gmail.com

Feb 22, 2015, 7:26:05 PM
to opensto...@googlegroups.com
Hi!

It is difficult for me to come up with a project where the application does not care whether a disk is disconnected or not. That requires a cluster and a clustered file system (for example GlusterFS). But then a second BB pod is needed, then a third, and so on, so the cost of storage per GB grows several times over. On top of that you still need rack space, plus cluster infrastructure in the form of software, switches and so on.
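
As a back-of-the-envelope illustration of how the cost multiplies (every number below is a made-up placeholder, not a real price):

    # Back-of-the-envelope only; every number here is a made-up placeholder,
    # not real Backblaze pricing.
    pod_cost_usd = 10000.0    # hypothetical cost of one fully populated pod
    usable_tb    = 180.0      # hypothetical usable capacity of one pod
    replicas     = 3          # e.g. a GlusterFS "replica 3" volume across 3 pods

    per_gb_single  = pod_cost_usd / (usable_tb * 1000)
    per_gb_cluster = replicas * pod_cost_usd / (usable_tb * 1000)
    print("single pod: $%.4f per GB" % per_gb_single)
    print("replica %d: $%.4f per GB (rack space, switches etc. extra)"
          % (replicas, per_gb_cluster))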

Is there a free (open) design for a simple hot-swap backplane?

thanks !

jason andrade

Feb 22, 2015, 8:08:42 PM
to opensto...@googlegroups.com

G'day AlexSSS,

I think I understand that you're saying you absolutely have to have hot swap disks based on your application architecture (in this instance, a clustered filesystem).

The only response I can give you is that perhaps the open storage pod is not (currently) the answer to your problem?

Having had a bit of a look, there are a bunch of vendors and products that do ultra-dense storage in 4RU and do offer hot swap.

e.g. http://www.quantaqct.com/Product/Storage/JBODs/4U-c77c71c73c153

Perhaps one of them would be better for your app.

You might also want to look at http://45drives.com/


regards,

-jason
-----
P: 0402 489 637 M: +61 402 489 637 E: jason....@gmail.com



Ouroboros

Feb 22, 2015, 8:31:28 PM
to opensto...@googlegroups.com
Supermicro has been selling a 4U 36xHDD hotswap bay dual path SAS2 chassis (server or JBOD) for a while now.

http://www.supermicro.com.tw/products/chassis/4U/847/SC847E26-R1400LP.cfm

They are now selling a new 4U chassis (currently only the JBOD version though) that uses double disk hotswap carriers, doubling the previous chassis' capacity.

http://www.supermicro.com.tw/products/chassis/4U/847/SC847DE26-R2K02JBOD.cfm

When the server chassis version of the above comes out, it will likely hold 72xHDD in hot swap bays (and should be able to squeeze in 2x3.5 or 4x2.5 HDD in the fixed disk mezzanine area in the center).


Randy Olinger

Feb 23, 2015, 12:29:07 PM
to opensto...@googlegroups.com
We purchased one of the SuperMicro storage servers that has 72 x 4 TB drives.  We bought it as more of a research project after looking at the 45 drives system, which we rejected due to the low-end components and high cost.  I always figured the SM would be something of a let-down, but was pleasantly surprised once we got it up and running.  Performance is adequate (meaning it can do upwards of 500 MB/s sustained writes using ZFS and LUKS encryption).  It was easy to set up and seems reliable enough.  Here are the shortcomings:

1. The boot drives are not RAID, so we are vulnerable to a SPOF that can't easily be remediated.  Any system built with this server (like a backup storage farm) would need to be able to survive an extended outage while the system was recovered.  It would be nice if one could put in a pair of consumer SSDs as mirrored boot devices.
2. Only includes 1G Ethernet.  We are going to put in a 10G card and see if it works.  I think that will be fine.  One could also install Fibre Channel or InfiniBand if needed, but 3 PCI slots are already in use by the SATA cards.
3. There is no way to hot swap a single drive.  There are two drives on each sled, so to remove a dead drive you also have to remove a healthy drive.  That scares me, so our plan is to have enough hot spares to keep the system healthy for an extended time.  For instance, if we use 3 drives for ZFS RAIDZ3 parity, 7 drives as hot spares and 2 drives for boot and failover boot, that leaves 60 drives available for the ZFS pool, and you could probably run for over 5 years before you consumed your 7 hot spares (rough numbers are sketched below).  (In theory...  we all know that 7 drives could fail in a week...)
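
For what it's worth, the drive budget in point 3 works out roughly like this (a quick sketch, 4 TB drives as above):

    # Quick check of the drive budget from point 3 (72 x 4 TB drives as above).
    total_drives  = 72
    boot          = 2     # boot + failover boot
    hot_spares    = 7
    raidz3_parity = 3
    data_drives   = total_drives - boot - hot_spares - raidz3_parity
    print("data drives:", data_drives)            # 60
    print("raw data capacity:", data_drives * 4, "TB (parity/spares excluded)")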

So, all in all a good purchase and I think we may be buying more.  I'm considering using FreeNAS with a USB boot stick, absolutely the lowest cost per gig that I can think of.  Throw in 10G Ethernet and have a pretty decent NFS server for collecting backup data.  (and it can compress and replicate)

Another option is to use it with Lustre, but the lack of redundancy for the OSTs is a scary thought.  If one node goes down, the entire cluster goes down.

R.

alexs...@gmail.com

Feb 23, 2015, 3:35:11 PM
to opensto...@googlegroups.com
Thanks for your responses!

I would like to retrofit the BackBlaze Pod v4 platform with hot-swap support. As I understand it, the BackBlaze Pod v4 uses a solution based on SFF8087 -> 4xSFF8482 cables (http://www.servethehome.com/wp-content/uploads/2014/06/Backblaze-Storage-Pod-v4-Drive-Backplanes.jpg).
Are there any ready-made, low-cost backplane boards for 4-8 disks? Or is a schematic of a similar device available anywhere? I would try to solder one myself.

Alex

Jeffrey Parker

Feb 24, 2015, 2:26:04 AM
to opensto...@googlegroups.com
Hi,

I would just like to add that, as mentioned before, hot swap is part of the SATA and SAS standards, and I have put this to the test on everything from low-end desktops to high-end servers; every time hot swap has worked, even on a Windows desktop. The hardest part is identifying the drive, but if you identify the drives in advance and label them, I see no reason any problems would show up if you pull the pod out until you can open the top and pull out a drive, then put in a replacement. It might seem risky when you are using hardware not specifically labeled as supporting hot swap, but as I said, I have successfully done hot swap on just about every SATA and SAS platform I have worked on and never experienced a problem.
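
One way to do that labelling up front on Linux (a sketch only; it assumes lsblk is available, and on a SAS HBA the sysfs path won't contain an "ataN" component, so it falls back to the full path):

    # Sketch for pre-labelling on Linux: resolve each /dev/sdX back to the
    # controller port it sits on and print its serial, so the physical bay can
    # be labelled before anything fails.
    import glob, os, subprocess

    for blk in sorted(glob.glob("/sys/block/sd*")):
        dev    = os.path.basename(blk)
        path   = os.path.realpath(blk)       # e.g. .../ata3/host2/.../block/sda
        port   = next((p for p in path.split("/") if p.startswith("ata")), path)
        serial = subprocess.check_output(
            ["lsblk", "-d", "-n", "-o", "SERIAL", "/dev/" + dev], text=True).strip()
        print(dev, serial, port)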

Ouroboros

Feb 24, 2015, 7:19:30 PM
to opensto...@googlegroups.com
If the 72 disk server is the SC847DE16 chassis

http://www.supermicro.com.tw/products/chassis/4U/847/SC847DE16-R1K28LP.cfm?parts=SHOW

one important point is that the E16 designation means the SAS backplane is single-path only (dual path would be E26, but that still hasn't been released), which is a risk issue (and somewhat of a speed issue). Looking at the manual, the secondary path can be added with a second optional expander card on each backplane. You do run a bit of risk if the whole front backplane goes down (I think there are 2 backplanes physically, maybe 3 electrically/logically, 2 front + 1 rear).

Designing the vdev layout so a single vdev doesn't use 2 disks in the same sled is easy enough, provided there are 3 backplanes logically. If there are 2, you always face a risk. If there are 3, then it's a simple matter of counting by sets of 9, where each set is a 7-disk RAIDZ3 plus 2 hot spares. If a backplane goes down, that takes at worst 3 drives offline, which is survivable. This would give you 8 RAIDZ3 (4+3) vdevs (pretty fast) for 128 TB, with 16 global hot spares. You might be able to squeeze in two more 7-disk vdevs if the layout permits, if you are willing to have only 2 global hot spares.

Switching to an 8+3-plus-spares layout yields more usable disk space (fewer disks devoted to RAIDZ3 parity), but it drops your ZFS pool speed (pool speed is governed by the number of vdevs in the pool multiplied by the slowest single-disk speed of the slowest vdev), and it also violates the "no more than 3 disks on a given backplane" survivability rule (assuming 3 backplanes).
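
Putting rough numbers on that (4 TB drives assumed, as earlier in the thread; this is just arithmetic, not a recommendation):

    # Rough numbers for the "sets of 9" layout above (4 TB drives assumed).
    drives, per_set = 72, 9
    sets          = drives // per_set         # 8 sets
    data_per_vdev = 4                         # 7-disk RAIDZ3 = 4 data + 3 parity
    usable_tb     = sets * data_per_vdev * 4  # 8 * 4 * 4 TB = 128 TB
    spares        = sets * 2                  # 16 global hot spares
    print(sets, "x RAIDZ3 (4+3) vdevs:", usable_tb, "TB usable,", spares, "spares")

    # The 8+3 alternative: more usable space, but fewer vdevs (so less pool
    # speed) and more than 3 disks of a vdev can end up on one backplane.
    alt_vdevs  = drives // 11                 # 6 x 11-disk vdevs, 6 drives left
    alt_usable = alt_vdevs * 8 * 4            # 192 TB
    print(alt_vdevs, "x RAIDZ3 (8+3) vdevs:", alt_usable, "TB usable, 6 drives left over")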

Seems the fixed internal disks in the mezzanine are now limited to 2x2.5, but a boot mirror setup is doable (NexentaStor uses a ZFS mirror for the syspool boot disks by default; you might be able to do ZFS boot from mirrors on FreeBSD if you have a new enough GRUB). If the chassis is the two-PSU type (2000W), there is an optional thick 2x2.5 disk sled that goes in the third PSU hole.

As for the 1G Ethernet: if you had chosen an alternative Supermicro E-ATX motherboard with built-in 10G and LSI SAS (though that requires flashing the LSI SAS with IT-mode firmware to convert it into an HBA, which works best with ZFS), that would cover the bare minimum without add-on cards. Say the X10DRH-CT

http://www.supermicro.com.tw/products/motherboard/Xeon/C600/X10DRH-CT.cfm

Note the SATA RAID is that Intel fakeraid crap (be careful, Intel seems to have licensed LSI naming for use on fakeraid products, so the BIOS shows LSI naming), but using the AHCI SATA normally with ZFS should be easy enough. The LSI SAS would be split, 4 lanes to each path, daisy-chaining the front and rear backplanes.

