Hostname and Address Semantics in GPDB

Jim Doty

Dec 16, 2019, 3:52:57 PM
to Greenplum Developers, Tyler Ramer, Scott Kahler
**Summary**

There are some inconsistencies in how different parts of the product use the
`hostname` vs the `address` columns in `gp_segment_configuration`. We’d like to
propose an expected use of each column, based on an in-use configuration with
separate networks and hostnames for external traffic and internal, interconnect
traffic.

We would like to solicit GPDB-dev feedback on the proposed design, as well as
ask for help ensuring GPDB tooling is in compliance with the design, if it is
regarded as sound.


**Design Overview**

Let’s consider the following setup of a node of the cluster:

```
                   +-----------+
                   |           |    SDW1-X1
sdw1.local.net     |        +-----+ 10.10.1.101
   172.29.100.1 +----+         |           Interconnect
                   |           |             Network
   Data Center     |           |    SDW1-x2
    Network        |        +-----+ 10.10.2.101
                   |           |
                   +-----------+
```

and the following `/etc/hosts` file:

```
172.29.100.1 sdw1.local.net sdw1-pa
# ...
10.10.1.101 sdw1-x1 sdw1
10.10.2.101 sdw1-X2
```

This setup has the following conventions:

+ The FQDN resolves to the IP that is routable on the data center network.
+ The FQDN has a short name, `sdw1-pa`, for convenience.
+ The interconnect interfaces are resolvable with the short names above, but
  only inside the cluster; they are not routable from outside the cluster.
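
As a rough illustration of these conventions, the sketch below parses the sample `/etc/hosts` entries and reports which network each name lands on. The `10.10.0.0/16` interconnect subnet and the helper names `parse_hosts` and `on_interconnect` are assumptions made for this example; the proposal only shows the two interconnect addresses, not the subnet layout.

```python
import ipaddress

# Sample hosts-file text copied from the example above.
HOSTS_TEXT = """\
172.29.100.1 sdw1.local.net sdw1-pa
# ...
10.10.1.101 sdw1-x1 sdw1
10.10.2.101 sdw1-X2
"""

# Assumption: the interconnect lives in 10.10.0.0/16. The thread only shows
# the addresses 10.10.1.101 and 10.10.2.101, not the actual subnet layout.
INTERCONNECT_NET = ipaddress.ip_network("10.10.0.0/16")


def parse_hosts(text):
    """Map each name (lower-cased) in hosts-file text to its IP address."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Host names are case-insensitive; the first entry wins.
            table.setdefault(name.lower(), ip)
    return table


def on_interconnect(name, table):
    """True if the name resolves into the assumed interconnect subnet."""
    return ipaddress.ip_address(table[name.lower()]) in INTERCONNECT_NET


hosts = parse_hosts(HOSTS_TEXT)
for name in ("sdw1.local.net", "sdw1-pa", "sdw1", "sdw1-x1", "sdw1-x2"):
    network = "interconnect" if on_interconnect(name, hosts) else "data center"
    print(f"{name:15} {hosts[name]:13} {network}")
```

Note that the generic short name `sdw1` is an alias on an interconnect address here, while `sdw1-pa` is an alias of the FQDN on the data center network.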

This design keeps traffic into and out of the cluster from consuming the
bandwidth available to operations inside the cluster. For example, moving a
database backup out of the cluster should not slow queries down through network
congestion, because queries use the interconnect exclusively.

We find that there are three possible addresses to consider:

**Segment address(es)**: The “address” column of `gp_segment_configuration`

Could be used as the bound listen addresses of segment servers for security
(not implemented). Any “run per segment” feature should utilize the
interconnect, as each segment might be bound to a specific NIC for optimization
(for example, NUMA locality).

Examples:

+ All database interconnect traffic
+ All replication traffic (gprecoverseg)

**Generic hostname**: The “hostname” column of `gp_segment_configuration`

Can be used for “run once” operations where using the segment address(es) would
cause an undesired “run more than once” per node. Should be a one-to-one
mapping of hostname/ip to node.

Examples:

+ Any “run once” tool, such as `gppkg`
+ GPCC for reporting host statistics
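
To make the “run once” versus “run per segment” distinction concrete, here is a minimal sketch over a few hypothetical rows shaped like a subset of `gp_segment_configuration`. The `Segment` tuple, the `sdw2` rows, and both helper functions are illustrative assumptions, not existing GPDB code.

```python
from collections import namedtuple

# Hypothetical rows: only the columns relevant to this discussion.
Segment = namedtuple("Segment", "content hostname address")

SEGMENTS = [
    Segment(0, "sdw1", "sdw1-x1"),
    Segment(1, "sdw1", "sdw1-x2"),
    Segment(2, "sdw2", "sdw2-x1"),
    Segment(3, "sdw2", "sdw2-x2"),
]


def run_once_hosts(segments):
    """Return each distinct hostname once, in first-seen order."""
    return list(dict.fromkeys(seg.hostname for seg in segments))


def per_segment_addresses(segments):
    """Return one interconnect address per segment, duplicates included."""
    return [seg.address for seg in segments]


print(run_once_hosts(SEGMENTS))         # one entry per node
print(per_segment_addresses(SEGMENTS))  # one entry per segment
```

De-duplicating on `address` instead would visit `sdw1` twice, which is exactly the “run more than once” per node behavior that the hostname column is meant to prevent.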

**Fully Qualified Domain Name (FQDN)**: The canonical hostname, used by
services outside the cluster to direct traffic to the cluster

Not in `gp_segment_configuration`. This is useful for services like `gpcopy`,
which might route to the cluster from another server. It might also be used as
the bound listen address for the postgres server for security (implemented at
least for the master server).

Examples:

+ `gpcopy`
+ `gptransfer`
+ `gpfdist`
+ Routing for external tables
+ UI for the database, perhaps GPCC
+ `pgadmin`

Generic and FQDN hostnames could both be routable from inside the cluster. The
important distinction is that external traffic can only be routed via the
external FQDN/devices, while internal traffic may be directed over the
interconnect devices by using the generic hostname, instead of being routed
outside of the cluster via FQDN hostnames and devices.

We should call out that this proposal leaves it up to the sysadmin as to
whether the output of the `hostname` command is the generic hostname or the
FQDN. We would like to find a way to not couple our tooling to that aesthetic
choice.

**Proposed behavior**

Any traffic benefitting from or requiring cluster-specific high network
throughput should pull the IP or hostname from the `gp_segment_configuration`
“address” column in order to optimize network use and avoid congestion of the
PA network.

Any tools which require a “run once per server” behavior should utilize the
“hostname” column to de-duplicate servers - it will be up to the GPDB designer
to determine if this “hostname” is externally routable or still uses the
interconnect.

External utilities must not assume that the “interconnect” network is
addressable from outside the cluster.
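
The three rules above could be sketched as a single lookup. This is only an illustration of the proposed semantics: the `Traffic` enum, the `target_for` helper, and the dictionary row are hypothetical names, and the FQDN has to be passed in separately because it is not stored in `gp_segment_configuration`.

```python
from enum import Enum


class Traffic(Enum):
    PER_SEGMENT = "per_segment"  # interconnect and replication traffic
    RUN_ONCE = "run_once"        # once-per-server tools such as gppkg
    EXTERNAL = "external"        # anything arriving from outside the cluster


def target_for(kind, segment, fqdn=None):
    """Pick the name a tool should connect to under the proposed rules."""
    if kind is Traffic.PER_SEGMENT:
        return segment["address"]
    if kind is Traffic.RUN_ONCE:
        return segment["hostname"]
    if kind is Traffic.EXTERNAL:
        # External callers must not assume the interconnect is reachable.
        if fqdn is None:
            raise ValueError("the FQDN is not in the catalog; supply it")
        return fqdn
    raise ValueError(f"unknown traffic kind: {kind!r}")


row = {"hostname": "sdw1", "address": "sdw1-x1"}  # hypothetical catalog row
print(target_for(Traffic.PER_SEGMENT, row))
print(target_for(Traffic.RUN_ONCE, row))
print(target_for(Traffic.EXTERNAL, row, fqdn="sdw1.local.net"))
```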

That said, based on the above proposed design, we know of at least two tools
that don’t follow these rules:

1. gprecoverseg routes recovery traffic over the “hostname” network,
not the interconnect network.

https://github.com/greenplum-db/gpdb/issues/9060

2. gpinitsystem creates a hostfile incorrectly from a provided config file.

https://github.com/greenplum-db/gpdb/issues/9132

Cheers,
Jim & Tyler

--
Jim Doty | R&D Greenplum Building Blocks Team | jd...@pivotal.io
Tyler Ramer | R&D Greenplum Building Blocks Team | tra...@pivotal.io

Jim Doty

Dec 16, 2019, 8:10:05 PM
to Greenplum Developers, Tyler Ramer, Scott Kahler
> Let’s consider the following setup of a node of the cluster:
>
> ```
>                    +-----------+
>                    |           |    SDW1-X1
> sdw1.local.net     |        +-----+ 10.10.1.101
>    172.29.100.1 +----+         |           Interconnect
>                    |           |             Network
>    Data Center     |           |    SDW1-x2
>     Network        |        +-----+ 10.10.2.101
>                    |           |
>                    +-----------+
> ```

This diagram renders better in a monospaced font.

                   +-----------+
                   |           |    SDW1-X1
sdw1.local.net     |        +-----+ 10.10.1.101
   172.29.100.1 +----+         |           Interconnect
                   |           |             Network
   Data Center     |           |    SDW1-x2
    Network        |        +-----+ 10.10.2.101
                   |           |
                   +-----------+

Tammie Panar

Dec 18, 2019, 4:46:19 PM
to Greenplum Developers, tra...@pivotal.io, ska...@pivotal.io
Hey all, we've run into issues with the private vs. public expectation of the hostname field, especially as it relates to usage between core tools (gpstart/gpstop/gpstatus) and gpcopy. We run our clusters in a private network configuration, but punched a hole through the network to route the internal VLANs together so that gpcopy can work. We also need to secure/isolate each cluster as much as possible from the wider network, so the postgres.sql files for each segment instance have been modified to only accept connections from the private network (and we're currently testing IPsec for data-in-transit encryption, with a Vormetric data-at-rest deployment ahead as well). We also normally only do local backups when we have to unload data for an upgrade (i.e., 4.x -> 5.x); otherwise our backups are multiple production copies with pre-load backups in another system.

All that being said: why not have dedicated fields in gp_segment_configuration for both internal and external addresses, and have tools default to the most appropriate one for their use case, with an override flag available on the command line? Without having the master/standby marshal all data from the segment nodes, I don't think there is going to be a one-size-fits-all solution for every installation.
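
A rough sketch of what that could look like from a tool's point of view follows. Everything named here is hypothetical: the `internal_address` and `external_address` fields, the `--network` flag, and the per-tool defaults are assumptions for illustration and do not exist in GPDB today.

```python
import argparse

# Hypothetical per-tool defaults: recovery stays on the interconnect,
# cross-cluster copies use the externally routable name.
TOOL_DEFAULTS = {"gprecoverseg": "internal", "gpcopy": "external"}


def build_parser(tool):
    """Build a parser whose --network default depends on the tool."""
    parser = argparse.ArgumentParser(prog=tool)
    parser.add_argument(
        "--network",
        choices=["internal", "external"],
        default=TOOL_DEFAULTS[tool],
        help="which address field to use (hypothetical override flag)",
    )
    return parser


def pick_address(row, network):
    """Select the dedicated field that matches the requested network."""
    return row[f"{network}_address"]


# Hypothetical catalog row with one dedicated field per network.
row = {"internal_address": "sdw1-x1", "external_address": "sdw1.local.net"}

default_args = build_parser("gpcopy").parse_args([])
override_args = build_parser("gpcopy").parse_args(["--network", "internal"])
print(pick_address(row, default_args.network))
print(pick_address(row, override_args.network))
```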

Regards,
-Tammie

PS. For reference, my two main cases dealing with gp_segment_configuration issues: 217701, 215295

Hao Wu

Feb 17, 2020, 5:53:44 AM
to Greenplum Developers, tra...@pivotal.io, ska...@pivotal.io
Nice job, Jim.

I agree with your idea. We should clarify the semantics of address and hostname in gp_segment_configuration; otherwise, misusing them will cause bugs.

Besides address and hostname in gp_segment_configuration, there is also confusion between them and the listen_addresses value in postgresql.conf.
A similar issue exists between the port values in gp_segment_configuration and postgresql.conf.

> + The FQDN resolves to the IP that is routable on the data center network.
> + The FQDN has a short name, `sdw1-pa`, for convenience
> + The interconnect interfaces are resolvable with the short names above, but
>  only inside of the cluster - they are not routable from outside the cluster

The short name for an interconnect interface should always resolve to the same address (IPv4/IPv6).
We should also bring more people into this discussion and dispatch work to the other teams that
maintain tools/code using address/hostname from gp_segment_configuration.


