RAID 5 or 6? Best practices. Please advise.


Pilar de Teodoro

Nov 18, 2021, 9:20:22 AM
to gpdb-...@greenplum.org
Hi team! 

We are configuring a brand-new physical GP cluster, which will run version 6.18:
2 master nodes

DELL r940, 256 GB (ea)

Cores: 40

Sockets: 4

HDD: 2 x 480GB

Optimized: read intensive  (6Gbps)

SSD: 3 x 1.92TB

Optimized: combined use (6Gbps)

24000 IOPS (32KB)

4 Data nodes:
DELL r940, 512 GB (ea)

Cores: 40

Sockets: 4

HDD: 2 x 480GB

Optimized: read intensive  (6Gbps)

SSD: 12 x 4TB 

Optimized: mixed use (12Gbps)

48000 IOPS (32KB)



We have been reviewing https://gpdb.docs.pivotal.io/6-18/best_practices/ha.html but still have some questions:

Best Practices

  • Use a hardware RAID storage solution with 8 to 24 disks.
  • Use RAID 1, 5, or 6 so that the disk array can tolerate a failed disk.
  • Configure a hot spare in the disk array to allow rebuild to begin automatically when disk failure is detected.
  • Protect against failure of the entire disk array and degradation during rebuilds by mirroring the RAID volume.
  • Monitor disk utilization regularly and add additional space when needed. 
  • Monitor segment skew to ensure that data is distributed evenly and storage is consumed evenly at all segments.

The purpose of the cluster is to store very large tables that will not be modified, only queried intensively.
Our idea is to use RAID 5, but we would like to know whether it is the recommended first option.
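
For reference, the capacity side of the RAID 5 versus RAID 6 choice can be worked out with simple arithmetic. The sketch below is only an illustration: it assumes a single array built from the 12 x 4 TB SSDs of one data node, counts raw capacity only (no filesystem or controller overhead), and the function name `usable_tb` is made up for this example.

```python
def usable_tb(disks, disk_tb, raid_level, hot_spares=0):
    """Rough usable capacity of one RAID 5 or RAID 6 array, in TB.

    Assumption: RAID 5 gives up one disk's worth of space to parity and
    RAID 6 gives up two. Hot spares sit idle and hold no data.
    """
    parity_disks = {5: 1, 6: 2}
    if raid_level not in parity_disks:
        raise ValueError("only RAID 5 and RAID 6 are handled in this sketch")
    active = disks - hot_spares
    # RAID 5 needs at least 3 member disks, RAID 6 at least 4.
    if active < parity_disks[raid_level] + 2:
        raise ValueError("not enough disks for this RAID level")
    return (active - parity_disks[raid_level]) * disk_tb


# One data node from the post: 12 SSDs of 4 TB each.
for level in (5, 6):
    for spares in (0, 1):
        print(f"RAID {level}, {spares} hot spare(s): "
              f"{usable_tb(12, 4, level, hot_spares=spares)} TB")
```

With one hot spare this gives 40 TB for RAID 5 and 36 TB for RAID 6 per data node, so the capacity cost of the second parity disk is one disk's worth of space. The numbers say nothing about rebuild time or read performance, which also matter for the decision.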

For the segments, we are considering physical partitions on the nodes to improve performance: either one partition per segment, or just two partitions, one for primary data and the other for mirrors. What do you recommend? We understand that one partition per segment would be better. All segments will be mirrored, but since the disks are SSDs, maybe it is OK to have just one partition for the primary segments and one for the mirror segments?


Could you please give us some advice?
Thanks,

Pilar de Teodoro


Luis Filipe de Macedo

Nov 18, 2021, 9:35:23 AM
to pvtl-cont-pilar.deteodoro, gpdb-...@greenplum.org

Pilar,

 

In your case you should go with RAID 5. How many drives per server will you have? I think an r940 is overkill for a master node, but if you already have it, it’s more than fine.

 

If you go with 24 drives on the segment nodes, your controller will usually allow two arrays, which I would recommend.

 

You should mix primaries and mirrors on the same array. Let’s say we have 10 segments per server; I would do:

 

Array 1: 5 primaries, 5 mirrors

Array 2: 5 primaries, 5 mirrors
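
The split above can be sketched as a small round-robin assignment. This is only an illustration of the idea: the mount points `/data1` and `/data2` and the function `plan_layout` are hypothetical names, not Greenplum settings, and the sketch assumes each host carries as many mirrors as primaries.

```python
def plan_layout(segments_per_host, mount_points):
    """Spread primaries and mirrors evenly over the RAID arrays of one host.

    Each array (represented by a hypothetical mount point) receives a mix
    of primaries and mirrors rather than only one kind.
    """
    layout = {mp: {"primary": 0, "mirror": 0} for mp in mount_points}
    for i in range(segments_per_host):
        # Round-robin: segment i goes to array i modulo the array count.
        mp = mount_points[i % len(mount_points)]
        layout[mp]["primary"] += 1
        layout[mp]["mirror"] += 1
    return layout


# 10 segments per server over two arrays, as in the example above.
for mount_point, counts in plan_layout(10, ["/data1", "/data2"]).items():
    print(mount_point, counts)
```

For 10 segments and two arrays, each array ends up with 5 primaries and 5 mirrors, so both arrays see a similar share of the primary workload.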

 

Not sure I understand your “physical partitions” question. Can you clarify? Do you mean splitting the disk arrays?

 

Rgds,

 

Luis F R Macedo

Advisory Data Engineer & Business Development for Latam

VMware Tanzu Data

Call Me @ +55 11 98860 8596 (new)

Take care of the customers and the rest takes care of itself
