Greenplum memory management


John Glorioso

Oct 2, 2020, 4:45:33 PM
to Greenplum Users
We are having trouble getting Greenplum to use the memory available to the host instances. For example, we have queries generating 300GB of spill files while each host uses only 5% of available memory.
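
For reference, the spill numbers above come from the gp_toolkit workfile views — something along these lines, if I have the view names right:

-- per-query spill (workfile) usage across the cluster
SELECT * FROM gp_toolkit.gp_workfile_usage_per_query;
-- per-segment totals
SELECT * FROM gp_toolkit.gp_workfile_usage_per_segment;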

To start, I was going to revisit the memory calculator (https://greenplum.org/calc/) for our use case, but it seems it no longer works; there is no way to actually run the calculation. Did this change? I remember it used to work.

Our setup is in GCP. We have a master/standby pair and an eight-host cluster. Each host has two primary and two mirror segments. The instance type is n1-highmem-8 (8 vCPUs / 52GB memory).

Our gp_vmem_protect_limit is set high at 36630, and we have played with statement_mem from the default of 125MB up to 3GB, since we set max concurrency to 10 statements in our resource group.
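
For comparison, my reading of the sizing formula in the docs (for hosts with vm.overcommit_ratio set, so treat this as a rough sketch) gives a much lower number for this host shape, with 52GB RAM and ~31GB swap:

gp_vmem = ((SWAP + RAM) - (7.5GB + 0.05 * RAM)) / 1.7
        = ((31 + 52) - (7.5 + 0.05 * 52)) / 1.7
        ~ 42.9GB per host
gp_vmem_protect_limit = gp_vmem / max_acting_primary_segments
                      = 42.9GB / 4   (assuming 2 primaries plus up to 2 failed-over mirrors per host)
                      ~ 10.7GB, i.e. roughly 11000MB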

The resource group is set to a 90% memory limit. No matter what we try, we can max out CPU without issue, but memory sits almost completely idle and never gets above 10%, even when running 10 queries that all spill to disk.

I feel like we are missing something simple but so far I am unable to get Greenplum to take advantage of the memory on the server.

Please let me know if there is something else I should look at. I am working on a pre-production system, so I can make any changes necessary at this point.

Paul Hopper

Oct 5, 2020, 10:41:12 AM
to Greenplum Users, John Glorioso

This is interesting, as I appear to be having the same issue. I am running a simple single-node instance (one server with 8 data LUNs, so essentially 8 DB instances). It has over 350GB RAM. I followed the memory calc (several times), and I am lucky if the system uses 16GB even when busy.

Cheers

Paul

John Glorioso

Oct 12, 2020, 11:21:32 AM
to Greenplum Users, John Glorioso
Thank you for all the feedback. Here is a bit more:
  • We are running version 6.8
  • gp_resource_group_memory_limit is the default of 0.7. I would like to increase this, since the hosts are dedicated to Greenplum only. Is 90% too aggressive?
  • The output of gpdb.conf and other requested settings is at the bottom of this email.
  • Kernel settings are correct: vm.overcommit_memory is 2 and vm.overcommit_ratio is 95.
  • free -m shows nearly the correct totals (50GB physical / 31GB swap). The master shows 1GB used, 41GB free, and 8GB buffered, whereas the segments show 2.7GB used, 1.5GB free, and 46GB buffered. Is that normal? I just want to make sure that Greenplum is actually using all available memory on the segments.
  • statement_mem and max_statement_mem are set to 100MB and 10240MB respectively.
  • memory_spill_ratio is already set to 0 on all groups.
  • There are three groups: the install-provided admin_group and default_group, plus a third that I added and allocate 75% CPU. Its memory % and memory spill ratio % are both 0, so it should be using the global shared memory pool.
  • The concurrency on the third group is 15 queries.
Given that the total memory on each segment host is 84GB (52GB RAM + 32GB swap), and that our application controls those 15 slots and only 10 of them are memory intensive, I would like to essentially provide up to 20GB per segment for those 10 slots.
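
My back-of-the-envelope math for that, using the resource group formula from the docs (if I have it right), with RAM = 52GB, vm.overcommit_ratio = 95, swap = 32GB, gp_resource_group_memory_limit = 0.7 and 2 primaries per host:

per-segment resource group memory = ((52 * 0.95) + 32) * 0.7 / 2  ~ 28.5GB

So with memory_spill_ratio = 0, giving the 10 memory-intensive slots roughly 2GB of statement_mem each would ask for about 10 * 2GB = 20GB of that ~28.5GB per segment.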

Unfortunately, statement memory is not configurable at the group level, so I have it set to 100MB because we use the pg_default group for light queries that take quick snapshots (no joins) and almost never exceed 100MB of memory. Those happen frequently in our use case.

The queries that hit the third group are very memory intensive, and many of them want 100GB of memory, so I understand that we will always incur spill files there. What I would like is to give those queries 1GB of statement memory by default and use session overrides when I want to allocate more.

As far as I can tell, the only way to do this would be in application logic, setting statement_mem at the session level for all queries that come into the third group. Is that correct? Is there no way to default statement memory differently at the group level?
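
I have not tried it yet, but since Greenplum inherits per-role settings from Postgres, a per-role default might approximate a per-group default when a group maps one-to-one to a role (the role name below is hypothetical):

ALTER ROLE reporting_app SET statement_mem = '1GB';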

Requested Output of configurations:
[gpadmin@mdw ~]$ psql production <<EOF

show gp_resource_manager;

show gp_resource_group_memory_limit;

show gp_workfile_limit_files_per_query;

show gp_workfile_compress_algorithm;

show memory_spill_ratio;

EOF

 gp_resource_manager
---------------------
 group
(1 row)

 gp_resource_group_memory_limit
--------------------------------
 0.7
(1 row)

 gp_workfile_limit_files_per_query
-----------------------------------
 100000
(1 row)

ERROR:  unrecognized configuration parameter "gp_workfile_compress_algorithm"
 memory_spill_ratio
--------------------
 0
(1 row)

[gpadmin@mdw ~]$ more /etc/cgconfig.d/gpdb.conf
group gpdb {
     perm {
         task {
             uid = gpadmin;
             gid = gpadmin;
         }
         admin {
             uid = gpadmin;
             gid = gpadmin;
         }
     }
     cpu {
     }
     cpuacct {
     }
     cpuset {
     }
     memory {
     }
}

Brian H

Oct 20, 2020, 3:28:28 AM
to Greenplum Users, jglo...@staq.com
FYI - the https://greenplum.org/calc/ site is functioning again.

Bill Bailey

Nov 19, 2020, 4:41:15 AM
to Greenplum Users, jglo...@staq.com
Did you find a solution to your problem yet? We seem to be experiencing a similar problem.
We're running a 3-node cluster of GP 5.19 with 2 primary and 2 mirror segments on each host and 96GB RAM, but memory use is under 5GB.
We set all our configuration parameters and queue settings according to the guidelines.

Gurupreet Bhatia

Nov 19, 2020, 6:10:54 AM
to Bill Bailey, Greenplum Users, jglo...@staq.com
Hi All,

I was also following this mail chain, as I was facing the same issue.

Last week I updated some memory-related parameters, and I can now see in Command Center that memory consumption has started.

Below are the parameters I changed (please calculate values for your own environment and adjust accordingly); a quick sketch of applying them follows the list:
1) gp_vmem_protect_limit=32700
2) shared_buffers=750MB
3) statement_mem=750MB
4) max_statement_mem=3072MB
5) work_mem=128MB

Also check this parameter -> maintenance_work_mem (I haven't changed it yet).
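
For anyone applying these, a gpconfig sketch run from the master (values shown are the ones above — recalculate for your environment):

gpconfig -c gp_vmem_protect_limit -v 32700
gpconfig -c shared_buffers -v 750MB
gpconfig -c statement_mem -v 750MB
gpconfig -c max_statement_mem -v 3072MB
gpconfig -c work_mem -v 128MB
gpstop -u   # reload; note shared_buffers and gp_vmem_protect_limit need a full restart (gpstop -r)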

The actual issue was that memory was not being used in the cluster, and sessions/queries were creating huge spill files, which ultimately led to high I/O and high system CPU load.


Thanks
Gurupreet


John Glorioso

Nov 19, 2020, 10:08:56 AM
to Greenplum Users, bhatia.g...@gmail.com, Greenplum Users, John Glorioso, Bill Bailey
Our use case is specific: we have lots of low-memory ad hoc queries coming in, so we set statement_mem low (100MB) and max_statement_mem high (10GB) for the cluster.

This allows the small ad hoc queries to run freely with high concurrency in the default resource group. We then manually set statement_mem on a per-query basis for the larger ones that run in a different resource group.
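
For example, this is the sort of thing the application issues around a big statement (the 5GB value is just illustrative; anything up to max_statement_mem works):

SET statement_mem = '5GB';
-- run the large query in this session
RESET statement_mem;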

This is inconvenient, and I think it would be really useful to have statement_mem configurable on a per-resource-group basis, but the above approach is what we have adapted to and it seems to be working decently.

-John

Cole Sutera

Mar 25, 2021, 1:47:45 PM
to Greenplum Users, jglo...@staq.com, bhatia.g...@gmail.com, Greenplum Users, Bill Bailey
@Gurupreet,

Do you mind sharing information about the size of the nodes in your cluster? I