Creating index on pg_shard distributed table

72 views
Skip to first unread message

shankar

unread,
Dec 3, 2015, 1:37:20 PM12/3/15
to pg_shard users
Hi,

I created distributed table "github_events" using the following link:

CREATE TABLE github_events
(
    event_id bigint,
    event_type text,
    event_public boolean,
    repo_id bigint,
    payload jsonb,
    repo jsonb,
    actor jsonb,
    org jsonb,
    created_at timestamp
);
SELECT master_create_distributed_table('github_events', 'repo_id');
SELECT master_create_worker_shards('github_events', 16, 2);

Then populated the table with data.

I created an index on actor field on master node. This index is not propagated to worker node sharded tables.

This index should be created on master node only? 
or this should propagate to worker nodes sharded tables also?
If this should update in worker nodes sharded tables how can we update all the distribured shards on all worker nodes?

Thanks
Shankar



Onder Kalaci

unread,
Dec 7, 2015, 8:40:02 AM12/7/15
to pg_shard users
Hi Shankar, 

Currently pg_shard only propagates the indexes which are created before the table is distributed. So, in your example if you had created the index before executing master_create_worker_shards(), then you'd have seen the indexes on the shards as well.

Also, I want to share that we are currently working on DDL command propagation feature on the distributed tables and hopefully we'll have that feature in the next release.

Thanks!

shankar

unread,
Dec 9, 2015, 5:53:40 AM12/9/15
to pg_shard users
Thanks Onder.

I modified the script as given in the page https://gist.github.com/marcocitus/601b49c343bc910e97bd  to create the index instead of altering the table on all the shards in worker nodes.
Then created index on master node. 

Is this the right way of doing? or do we need to update any other metadata tables on master node?


Regards 
Shankar

shankar

unread,
Dec 9, 2015, 5:58:31 AM12/9/15
to pg_shard users
Hi Onder,

I am creating index after table distributed across worker nodes. 

Onder Kalaci

unread,
Dec 10, 2015, 8:13:50 AM12/10/15
to pg_shard users
Hi shankar,

Since the shards on the worker nodes are regular postgres table, you can simply check whether you were able to create the index on the shards or not. So, if you modified the scrip and able to create the indexes on the shards and on the master table, then its fine. 

But, the problem with that approach is that it is not resilient to any kind of failures. For instance, if one of the nodes is down while you run the script, then you're in trouble. So, if you manually create the indexes (or alter the table) with scripts, you need to ensure that you successfully executed them on all shards. As I mentioned earlier, we are going to implement that change with failure handling on the next release. 

Does that make sense?

Onder

shankar

unread,
Dec 15, 2015, 12:02:29 AM12/15/15
to pg_shard users

Thats great Onder.

Thanks
Sankar

On Friday, December 4, 2015 at 12:07:20 AM UTC+5:30, shankar wrote:
Reply all
Reply to author
Forward
0 new messages