pgs_distribution_metadata schema contains tables or views?

126 views
Skip to first unread message

shankar

unread,
Nov 11, 2015, 6:45:56 AM11/11/15
to pg_shard users
Hi, 

1) Please explain the schema structure of "pgs_distribution_metadata" after installing citusdb through the instructions given in below link.

After installation three views are created:

                      List of relations

          Schema           |      Name       | Type |  Owner   

---------------------------+-----------------+------+----------

 pgs_distribution_metadata | partition       | view | postgres

 pgs_distribution_metadata | shard           | view | postgres

 pgs_distribution_metadata | shard_placement | view | postgres

(3 rows)


This is not the case when i installed through cloud formation setup. If i create the setup through cloud formation script given in citusdb for distributed postgresql i see that these views are created as tables (not views) along with two sequence types. 

Am i missing some thing here? or this schema has only views?

2) In multi master setup, I have two masters. If I updated schema in one master node and wanted to sync the meta-data to other master instead of login to other master and do the same steps each and every time, do we have any script?

 

 

Onder Kalaci

unread,
Nov 11, 2015, 12:52:13 PM11/11/15
to pg_shard users
Hi, 


Am i missing some thing here? or this schema has only views?

No, you are not missing anything:) 

Before answering your question, let me give you some background. We have two different products, pg_shard and CitusDB, which can be considered as complementary to each other.

Both can handle multi node workloads. pg_shard can handle real-time INSERTs/UPDATEs on the cluster. CitusDB can handle multi-core parallel processing of queries on the cluster. Lastly, both products can be used with each other as well.

Both pg_shard and CitusDB have a very similar metadata structure. So, in pg_shard v.1.2.0, we implemented such that when CitusDB is present, pg_shard accesses CitusDB's metadata via views. If CitusDB is not present, pg_shard creates its own metadata tables. We understand that this sometimes creates confusion among our users, so the next releases of our products will integrate much seamlessly.

Is that helpful? 


2) In multi master setup, I have two masters. If I updated schema in one master node and wanted to sync the meta-data to other master instead of login to other master and do the same steps each and every time, do we have any script?


pg_shard currently does not support multi-master setup out-of-the box. If you are referring to this post, we already mentioned in the post that it is a proof of concept and not production ready.

We don't have a generic script for what you want. But, I think you can use pssh to run the same command on all the master odes. And, if you run "sync-metadata" script as its mentioned in the  post on all master nodes, you should be able to achieve what you want. But again, this is not tested, I just think that it should work:)

Onder

shankar ofg

unread,
Nov 12, 2015, 6:04:24 AM11/12/15
to Onder Kalaci, pg_shard users
Hi Onder,

Thank you for clarification.

May i know where is the citusdb schema stored. Is it in pg_catalog schema?

When i checked the three views in "pgs_distribution_metadata", which are created while installing citusdb, they are collecting data from tables which are stored in pg_catalog schema as given below.

Tables in pg_catalog:
1) pg_dist_partition
2) pg_dist_shard
3) pg_dist_shard_placement

So, when i distribute and shard the table using master_create_distributed_table() and master_create_worker_shards() functions, the meta-data will be stored in the tables of pg_catalog? Then views under "pgs_distribution_metadata" schema will be using these tables information for INSERT, UPDATE and DELETE operations?

If this is the case when we use citusdb to create another master node, it is better to take the base backup of existing master metadata database and restore to the new master. sync_metadata script we cant implement here? Please clarify.

Thanks
Shankar 

 

--
You received this message because you are subscribed to a topic in the Google Groups "pg_shard users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/pg_shard-users/-gcO-njeQEk/unsubscribe.
To unsubscribe from this group and all its topics, send an email to pg_shard-user...@googlegroups.com.
To post to this group, send email to pg_shar...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pg_shard-users/cbff481d-d0f5-44c6-b88b-a56642dfeb93%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Onder Kalaci

unread,
Nov 16, 2015, 5:28:55 PM11/16/15
to pg_shard users, on...@citusdata.com
Hi shankar,



When i checked the three views in "pgs_distribution_metadata", which are created while installing citusdb, they are collecting data from tables which are stored in pg_catalog schema as given below.

Tables in pg_catalog:
1) pg_dist_partition
2) pg_dist_shard
3) pg_dist_shard_placement

So, when i distribute and shard the table using master_create_distributed_table() and master_create_worker_shards() functions, the meta-data will be stored in the tables of pg_catalog? Then views under "pgs_distribution_metadata" schema will be using these tables information for INSERT, UPDATE and DELETE operations?

Yes, that's how pg_shard works when installed with CitusDB.
 

If this is the case when we use citusdb to create another master node, it is better to take the base backup of existing master metadata database and restore to the new master. sync_metadata script we cant implement here? Please clarify.

 
You cannot simply do that. The reason is that pg_shard store's Oid's of the distributed tables in the metadata (ie: logicalrelid field in pg_dist_partition).  So, when you take the backup and restore it on a node, the distributed tables would most probably have different Oids and so pg_shard wouldn't be able to identify that the table is distributed.

The sync_metadata script takes care of that. In other words, it maps table names with corresponding Oids on the different master nodes. 

Note that, even if you manage to move metadata to the other masters, you need to take care of the metadata changes in any of the master nodes. Basically, on any change in the metadata, you need to synchronize both masters' metadata again, which is not trivial to do in a production environment. 

Though we plan to implement that in the future, currently pg_shard (and CitusDB) does not support such a functionality out-of-the box. 

shankar ofg

unread,
Nov 18, 2015, 12:28:32 AM11/18/15
to pg_shard users

shankar

unread,
Dec 10, 2015, 5:22:25 AM12/10/15
to pg_shard users
Hi Onder,

I would like to install citusdb through below given steps in the page.
When i follow these steps, i can see that pgs_distribution_metadata schema has only views. 

When i follow the cloud formation setup i can see the tables in pgs_distribution_metadata schema.
Both cases were used citusdb 4.0 and its contrib (extentions) packages.

Thanks for explanation on this.

Please let me know if i want to have tables in pgs_distribution_metadata how can we install citusdb 4.0?










Onder Kalaci

unread,
Dec 10, 2015, 8:39:06 AM12/10/15
to pg_shard users
Hi shankar,

Could you also share the link to the cloud formation template you used?

shankar

unread,
Dec 15, 2015, 12:04:25 AM12/15/15
to pg_shard users

Onder Kalaci

unread,
Dec 15, 2015, 4:05:02 AM12/15/15
to pg_shard users
Hi shankar,

I am guessing that you've reached the template from this blog post, is it right? As we mentioned in the introduction of that post it is a proof-of-concept, that's why I do not know how it exactly operates. Also, it is not recommended for production.

But, if you use this downloads page, you'll be downloading this template which is the one that we officially support. And, if you use that, you'll see pgs_distribution_metadata contains views. Actually, if you check the code here you'll see that whenever citusdb is present, pg_shard uses views to access the metadata of distributed tables.

Btw, we are aware of the fact that having two different tables/views is confusing to our users. So, in our next release the integration of our products will be much seamlessly. 

Thanks,
Onder

shankar

unread,
Dec 15, 2015, 6:57:28 AM12/15/15
to pg_shard users
Thank you Onder. 

Yes i have gone through same blog post and it is proof-of-concept.
I will use the new template which you suggested.

If we install the postgresql+pg_shard (both compiled from source, not from citusdb) can we migrate/upgrade the cluster to citusdb5.0 ?
Because citusdb5.0 is going to release as a extension for postgresql5.0. We can install the citusdb5.0 extenstion as other normal extenstions (CREATE EXTENSION citusdb;)

The great thing is, this extension going to be opensource.

In this case, can we migrate/upgrade the cluster from postgresql+pg_shard to postgresql+citusdb5.0?
or is it better to use https://www.citusdata.com/resources/downloads to install citusdb4.0 for easy migration/upgrade to citusdb5.0?


Regards
Shankar

On Wednesday, November 11, 2015 at 5:15:56 PM UTC+5:30, shankar wrote:

Onder Kalaci

unread,
Dec 21, 2015, 7:00:20 AM12/21/15
to pg_shard users
Hi shankar,

I think it is better to use citusdb4.0 for easier migration.

Thanks,
Onder

shankar

unread,
Dec 22, 2015, 12:51:27 AM12/22/15
to pg_shard users
Hi Onder,

Thank you very much.

Regards
Shankar

On Wednesday, November 11, 2015 at 5:15:56 PM UTC+5:30, shankar wrote:
Reply all
Reply to author
Forward
0 new messages