Hi folks,
As you may know, in September we launched Yezzey [0], an extension that allows transparent switching of table storage between the local filesystem and cloud storage, and back, in native format. A discussion on the topic is available here [1].
In fact, Yezzey is just one more step towards making GreenplumDB cloud-native: more manageable, easier to deploy and use. The previous step was point-in-time recovery with WAL-G [2]. In this post I want to talk about further steps forward.
To be fair, Greenplum is not a heavily optimized analytical database, and there is a lot of room for performance improvement. Many competing systems boast better execution engines, vectorization, specialized data handling, query compilation, advanced optimization techniques, and so on, but none of this matters much: Greenplum's MPP nature lets you simply throw more hardware at the problem. And given its unique combination of analytical capabilities and open-source licensing, GreenplumDB need not fear the results of a hundred benchmarks.
As our next steps, we are going to concentrate on the following topics:
1. Auto Scaling: Currently, cdbhash() uses the number of segments as a hash parameter, which leads to scaling issues with gp_expand on a large cluster. We would like to implement an access method, similar to AO, that materializes the table metadata and decouples hash ranges from particular segments. That would let us rebind a hash range to a segment instantly: offload the underlying file to S3 and declare it bound to the new segment. The file can even remain in the cache on the original segment until it is evicted.
As part of our autoscaling strategy, we will deprecate the heap access method for non-catalog tables. This will allow for zero-cost table addition and removal.
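The routing idea above can be sketched in a few lines. This is a toy model, not GPDB code: the bucket count, the BucketMap class, and all names are illustrative. The point is that tuples hash into a fixed number of buckets, and only the mutable bucket-to-segment binding changes during expansion, so no cluster-wide rehash is needed.

```python
# Toy sketch: route tuples through a fixed number of hash buckets and
# keep a mutable bucket -> segment binding, instead of hashing modulo
# the (changing) number of segments. All names are hypothetical.

NUM_BUCKETS = 1024  # fixed, independent of cluster size

class BucketMap:
    def __init__(self, num_segments):
        # initial round-robin binding of buckets to segments
        self.binding = {b: b % num_segments for b in range(NUM_BUCKETS)}

    def segment_for(self, key):
        bucket = hash(key) % NUM_BUCKETS
        return self.binding[bucket]

    def rebind(self, bucket, new_segment):
        # expansion is a metadata-only change: the bucket's files are
        # offloaded to S3 and declared bound to the new segment
        self.binding[bucket] = new_segment

m = BucketMap(num_segments=4)
bucket = hash("order_42") % NUM_BUCKETS
m.rebind(bucket, new_segment=7)        # e.g. gp_expand added segment 7
assert m.segment_for("order_42") == 7  # rerouted without a full rehash
```

Rebinding touches one dictionary entry regardless of table size, which is exactly why decoupling hash ranges from segments makes expansion cheap.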
2. Coordination service: a fully S3-backed service for sharing tables between clusters. It enables reading and writing from multiple clusters, eliminates the need for standby clusters, and opens up new data usage opportunities. Metadata is served by the coordination service even when the source cluster is unavailable, and the service also maintains writer locks during writes.
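A minimal sketch of the contract such a service might offer, under my own assumptions (the class and method names are invented for illustration): clusters publish table metadata, any reader can resolve it even when the source cluster is down, and at most one cluster holds the writer lock for a table at a time.

```python
# Hypothetical coordination-service contract; not actual Yezzey code.
import threading

class CoordinationService:
    def __init__(self):
        self._meta = {}          # table -> metadata (e.g. S3 paths)
        self._writer_locks = {}  # table -> owning cluster
        self._mu = threading.Lock()

    def publish(self, table, metadata):
        with self._mu:
            self._meta[table] = metadata

    def resolve(self, table):
        # served by the service itself, so reads keep working even if
        # the source cluster is unavailable
        with self._mu:
            return self._meta.get(table)

    def acquire_writer(self, table, cluster):
        with self._mu:
            owner = self._writer_locks.get(table)
            if owner is None or owner == cluster:
                self._writer_locks[table] = cluster
                return True
            return False  # another cluster is currently writing

    def release_writer(self, table, cluster):
        with self._mu:
            if self._writer_locks.get(table) == cluster:
                del self._writer_locks[table]

svc = CoordinationService()
svc.publish("sales", {"segments": ["s3://bucket/sales/seg0"]})
assert svc.acquire_writer("sales", "cluster-a")
assert not svc.acquire_writer("sales", "cluster-b")  # blocked while A writes
svc.release_writer("sales", "cluster-a")
assert svc.acquire_writer("sales", "cluster-b")
```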
3. Incrementally maintained materialized views (a.k.a. projections): One of the key strengths of GP is its ability to join large tables. However, distributed hash joins usually cause significant network utilization for data redistribution. To mitigate this, we can maintain materialized views with different distribution keys, which can make the Motion step unnecessary. To enable this feature, we will need to introduce a new relation type, the projection (a specific kind of materialized view), and teach the planner to consider it.
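The following toy model shows why a projection with a matching distribution key removes the Motion: when both join sides are hashed on the join key, matching rows land on the same segment, so every segment can join purely locally. Everything here (table names, segment count) is made up for illustration.

```python
# Toy model of co-located joins; not GPDB internals.
NUM_SEGMENTS = 4

def distribute(rows, key):
    """Place each row on a segment by hashing its distribution key."""
    segments = [[] for _ in range(NUM_SEGMENTS)]
    for row in rows:
        segments[hash(row[key]) % NUM_SEGMENTS].append(row)
    return segments

orders = [{"order_id": i, "customer_id": i % 7} for i in range(100)]
customers = [{"customer_id": c, "name": f"c{c}"} for c in range(7)]

# orders distributed by order_id: joining on customer_id would force a
# redistribution (Motion) of one side...
orders_by_id = distribute(orders, "order_id")

# ...but a projection of orders distributed by customer_id lets every
# segment join locally, with no inter-segment traffic:
orders_proj = distribute(orders, "customer_id")
customers_seg = distribute(customers, "customer_id")

local_join = [
    (o["order_id"], c["name"])
    for seg in range(NUM_SEGMENTS)
    for o in orders_proj[seg]
    for c in customers_seg[seg]
    if o["customer_id"] == c["customer_id"]
]
assert len(local_join) == 100  # every order matched without any Motion
```

The planner's job in this design would be to notice that the projection's distribution key matches the join key and substitute it for the base table.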
4. Caching: Currently, Yezzey relies fully on the object storage cache, but some S3 implementations charge customers per GET request. To reduce request counts, we need proper local caching in place.
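A minimal sketch of what "proper local caching" buys, assuming a simple LRU read-through cache in front of the object store (the class and the fake store below are illustrative, not Yezzey code): repeated reads of a hot object stop generating billable GETs.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Minimal local LRU cache in front of an object store."""
    def __init__(self, fetch, capacity):
        self.fetch = fetch     # would be a real S3 GET in practice
        self.capacity = capacity
        self.lru = OrderedDict()
        self.gets = 0          # GET requests actually sent downstream

    def read(self, key):
        if key in self.lru:
            self.lru.move_to_end(key)  # mark as recently used
            return self.lru[key]
        self.gets += 1
        value = self.fetch(key)
        self.lru[key] = value
        if len(self.lru) > self.capacity:
            self.lru.popitem(last=False)  # evict least recently used
        return value

store = {f"block{i}": bytes(8) for i in range(10)}
cache = ReadThroughCache(store.__getitem__, capacity=4)
for _ in range(100):
    cache.read("block0")  # hot block: one GET, 99 local hits
assert cache.gets == 1
```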
5. ANN integration for AI: pgvector with its HNSW index has taken over the world. We need to integrate approximate nearest neighbor search into GP as well.
6. QUIC Motion, or at least Motion compression; see [4].
7. Coordination-less mode: given the metadata table service from (2), we should be able to plan queries on each cluster node.
8. Backup, offload, and table-sharing storage: Currently, Yezzey stores data in separate object storage buckets. However, we could treat the backup bucket and the offload buckets in the same way, as homogeneous parts of a single storage system. This would allow us to convert a local table to a Yezzey table instantly.
9. Built-in time travel: SELECTing from a table at a point in the recent past should be no problem unless the visibility map has been heavily modified. Dropping a table should be no problem either: the table is stored in backup in exactly the same format, which makes a table drop equivalent to evicting the table from the cache.
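The visibility idea behind time travel can be sketched with a toy MVCC-style model (illustrative only, not GP's actual tuple headers): each row version records when it appeared and when it was deleted, so reading at a past point is just a different visibility check.

```python
# Toy MVCC-style model of "SELECT ... as of time t".
ROWS = [
    # (value, created_at, deleted_at or None)
    ("v1", 10, 20),    # overwritten at t=20
    ("v2", 20, None),  # current version
]

def select_as_of(rows, t):
    """Return the versions visible at logical time t."""
    return [v for (v, created, deleted) in rows
            if created <= t and (deleted is None or deleted > t)]

assert select_as_of(ROWS, 15) == ["v1"]  # the past version
assert select_as_of(ROWS, 25) == ["v2"]  # the present one
```

Under this model a dropped table is just a set of versions whose files still exist in backup, which is why a drop can behave like a cache eviction.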
No doubt, these steps are ambitious, and it is likely that we will only be able to implement a subset of them. Still, this is an ordered list of the things we want to achieve in Greenplum DB. As always, we commit to developing these projects in an open-source manner. Any discussion is greatly appreciated, and I sincerely hope that some of the resulting developments will make their way into the upstream codebase.
Best regards, Andrey Borodin.
[0] https://github.com/yezzey-gp/yezzey/blob/v1.8/notes/announce.md
[1] https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/ImJz6DlwT_A
[2] https://github.com/wal-g/wal-g/releases/tag/v2.0.0
[3] https://cloud.yandex.com
[4] https://github.com/greenplum-db/gpdb/pull/16045