Do you have mirrors? If you do, that is a 2x difference.
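If you are not sure, a quick check on the coordinator (assuming the standard gp_segment_configuration catalog) would be something like:

select role, count(*)
from gp_segment_configuration
where content >= 0
group by role;

A row with role = 'm' means mirrors are configured, so the data directories on disk will be roughly double the primary data.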
I can't validate your query, but I think others here can. Sometimes queries written for PostgreSQL do not work properly on Greenplum because of differences between the products.
Rgds,
Luis Filipe R Macedo
Adv Data Engineering
VMware Data Solutions
Take care of the customers and the rest takes care of itself
From: muhammad ali <engmuha...@gmail.com>
Date: Saturday, 11 February 2023 16:39
To: Greenplum Users <gpdb-...@greenplum.org>
Subject: [gpdb-users] Greenplum disk size
Muhammed,
There are views under gp_toolkit that will help you get the proper sizes. Based on these views you might be able to see what is missing from your query.
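For example (view and column names as documented for gp_toolkit, so please double-check them on your Greenplum version):

select sodddatname, sodddatsize/1024/1024/1024 as size_gb
from gp_toolkit.gp_size_of_database;

select sosdnsp as schema_name,
       (sosdschematablesize + sosdschemaidxsize)/1024/1024/1024 as size_gb
from gp_toolkit.gp_size_of_schema_disk
order by 2 desc;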
Hello Luis,
I've tried the following views:
gp_size_of_schema_disk
gp_size_of_table_and_indexes_disk
gp_toolkit.gp_size_of_table_uncompressed
gp_size_of_database
=> The total size (summed per schema, or per table) is around 2-2.5 TB including indexes + toast.
=> gp_size_of_database gives 7 TB in its output.
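For what it's worth, a rough sketch that puts the gap in one row (same gp_toolkit views and column names as above; it may not account for toast or AO auxiliary files, so treat it as an approximation):

select (select sodddatsize
        from gp_toolkit.gp_size_of_database
        where sodddatname = current_database())/1024/1024/1024 as database_gb,
       sum(sosdschematablesize + sosdschemaidxsize)/1024/1024/1024 as schemas_gb
from gp_toolkit.gp_size_of_schema_disk;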
Hello Ashwin,
I just checked the sizes using ls -alh [^0-9]* on all segments:
pg_internal.init 136 KB
pg_filenode.map 8 KB
and some other files, which together make a total of 4-5 GB.
1: The pg_catalog schema is around 130 GB. Is it possible that after vacuuming the catalog tables we'll get correct stats? (A query to see which catalog tables are biggest is sketched after these questions.)
2: Should I ask the infra team to run a disk health check?
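For reference, a minimal sketch to see which catalog relations account for most of that 130 GB (standard pg_class/pg_namespace catalogs; pg_total_relation_size includes indexes and toast):

-- note: for catalog tables this reports the copy local to the node you run it on
select c.relname,
       pg_total_relation_size(c.oid)/1024/1024 as total_mb
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where n.nspname = 'pg_catalog'
order by 2 desc
limit 20;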
Hello Ashwin,
Thanks for the function. I ran it once after a manual checkpoint and, voila, it returned around 35329 records. The last_updated values are mostly around Jan 2022. I checked using this query:
select to_char(to_date(last_updated,'MON DD YYYY'),'YYYY-MM-DD'),
       sum(file_size)/1024/1024/1024 as file_size_gbs
from public.greenplum_check_orphan_files()
group by 1
order by 1;
When I summed the file_size column, it gave me around 4.8 TB, which is almost equal to the missing space.
select sum(file_size)/1024/1024/1024 as size_gb from public.greenplum_check_orphan_files();

        size_gb
-----------------------
 4800.8426322937011719
What is your advice on this? How can we remove these orphan files without crashing the GP server?
Hello Ashwin,
The total number of files inside each segment is around 16000.
I checked pg_class for the relfilenodes, using the following query:
select * from pg_class where relfilenode::text like '%1496%';
where most of the files have 1496 as a common value.
Those entries don't exist inside pg_class. What's the best way to handle this?
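One cross-check I am thinking of running before deleting anything, assuming the numeric prefix of the file name is the relfilenode, and using gp_dist_random('pg_class') to read pg_class from every primary segment (relfilenodes can differ per segment; 1496 is just the example value from above):

-- no rows back from any segment would suggest the file really is orphaned
select gp_segment_id, oid, relname, relfilenode
from gp_dist_random('pg_class')
where relfilenode = 1496;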