Hi,
Just clarifying my own question and understanding further: even if I enable tag-aware sharding, or keep replicas in a different DC as permanently hidden or non-voting members, it is the network access aspect that bothers me.
1. For diagram 1, if my applications are in Asia or the UK, they would still need to write to the primary, which is on the US east coast. So the primary would have to be reachable over a public network (write global). Even if tunnelling / port forwarding can ensure that the writes reach the primary, I would still want to know what names I need to give for the replica set members in order to form the replica set.
2. Same for diagram 2: I understand that it can be achieved via tag-aware sharding. However:
a) How will I form a replica set by listing all the members, including those in a remote DC?
b) How will my mongos and config servers, which have to be in each DC, know about and reach the replica set members of each shard? Best practice is to run mongos on every application server machine, or at least locally in that geography.
c) If a customer travels from the US to the UK and the primary of his shard is in the US, he won't be able to write any data to the US, e.g. he can't change his credentials, and he can't make reservations for places in the UK or back home in the US while he is in a different DC.
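To make 1 and 2(a) concrete, here is a rough sketch (in Python, just building a plain dict) of the replica set configuration document I imagine I would end up with. The hostnames, the set name `shard0` and the `dc` tag are placeholders I made up for illustration. The open question is exactly what should go in `host` (private name, public name, or some DNS name) so that members and mongos in other DCs can resolve and reach it.

```python
def build_replica_set_config(set_name, members):
    """Build a replica set configuration document as a plain dict.

    members: list of (host, dc, role) tuples, where role is one of
    "voting", "hidden" or "non_voting". All values are illustrative.
    """
    config_members = []
    for index, (host, dc, role) in enumerate(members):
        member = {"_id": index, "host": host, "tags": {"dc": dc}}
        if role == "hidden":
            # Hidden members must have priority 0, so they never become primary.
            member.update({"priority": 0, "hidden": True})
        elif role == "non_voting":
            # Non-voting members also need priority 0.
            member.update({"priority": 0, "votes": 0})
        elif role != "voting":
            raise ValueError(f"unknown role: {role}")
        config_members.append(member)
    return {"_id": set_name, "members": config_members}


# Hypothetical hostnames: which kind of name belongs here is my question.
config = build_replica_set_config(
    "shard0",
    [
        ("mongo-us-1.example.internal:27017", "us-east", "voting"),
        ("mongo-us-2.example.internal:27017", "us-east", "voting"),
        ("mongo-us-3.example.internal:27017", "us-east", "voting"),
        ("mongo-uk-1.example.internal:27017", "uk", "hidden"),
        ("mongo-asia-1.example.internal:27017", "asia", "non_voting"),
    ],
)
```

As far as I understand, a document of this shape is what gets passed to `rs.initiate()` in the shell, and every `host` entry has to be resolvable and reachable from every other member and from the clients.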
The only straightforward way I can think of is to give a public EIP (Elastic IP) address to every node in every shard and replica set, so that mongos or the application tier can route each request to the appropriate location. Each config server would need a public EIP address as well.
If I grow to hundreds or thousands of MongoDB instances, it doesn't seem right to need that many EIPs.
The other straightforward way is to put everything, i.e. all mongod, mongos and application servers, in one DC / geography and into one large VPC. But then I am not really taking advantage of multiple DCs and high availability. And how will I decide whether requests from mobile users in Singapore should go to US-east-1a or US-east-1b? Where do I draw the boundary between a continent / geographical region and a country?
I am not quite sure how large installations like Foursquare, which have grown beyond the boundaries of a single VPC and a single region, solve this problem.