HA for authoring node(s)?

bitsofinfo

Dec 6, 2019, 4:22:02 PM
to CrafterCMS
Hi, 
In looking at the docs it appears that for a production setup, you'd have a single authoring server/pod/node etc, with many delivery nodes.

Can one run multiple authoring servers/nodes for HA?

thanks

Sumer Jabri

Dec 6, 2019, 4:27:41 PM
to CrafterCMS

bitsofinfo

Dec 6, 2019, 4:32:02 PM
to CrafterCMS
Thanks, but what about when deployed in orchestrated environments like k8s where it might be part of a deployment w/ N replicas, where IPs are not pre-determined? How does it do auto discovery of peers?

Sumer Jabri

Dec 6, 2019, 4:49:44 PM
to CrafterCMS
Studio announces itself to its siblings via the database (a DB record per Studio instance). This is a multi-master setup, and all Studios become aware of each other via the DB and then sync up.

--sumer

bitsofinfo

Dec 6, 2019, 4:54:44 PM
to CrafterCMS
Thanks, so what value, if any, do I provide for:

studio.clustering.node.registration.localAddress

Sumer Jabri

Dec 6, 2019, 4:56:54 PM
to CrafterCMS
The local IP address accessible to its siblings. Bear in mind that the Studio siblings will need to reach into each other's git repos to sync up. This is usually done over SSH and there is a trust established between the nodes. See the guide I sent you earlier for more detail.

--sumer

bitsofinfo

Dec 6, 2019, 4:59:03 PM
to CrafterCMS
So when in an orchestrator like swarm or k8s I don't know the IPs ahead of time as replicas come and go all day. What value do I put for this localAddress?

avasquez

Dec 6, 2019, 5:11:54 PM
to craft...@googlegroups.com
In the case of Kubernetes, you use a StatefulSet, where the replicas have a stable network address. Then you pass the address to the pod as an environment variable. For example, assuming the headless service of your StatefulSet is named `authoring-service`, passing the address as an environment variable would look like this:

          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: CLUSTER_NODE_ADDRESS
              value: "$(POD_NAME).authoring-service.default.svc.cluster.local"


After that, you just need to make sure the studio-config-override.yaml is reading from the environment variable, like this:

# Cluster member registration, this registers *this* server into the pool
# Cluster node registration data, remember to uncomment the next line
studio.clustering.node.registration:
#  this server's local address (reachable to other cluster members)
  localAddress: ${env:CLUSTER_NODE_ADDRESS}
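
For context, here is a minimal sketch of a headless Service and StatefulSet that would produce the stable DNS name used above. Only `authoring-service` and the two environment variables come from this thread; the `authoring` name and label, the `default` namespace, the replica count, the image placeholder, and port 8080 are assumptions for illustration.

```yaml
# Headless Service: clusterIP None gives each pod its own DNS record
apiVersion: v1
kind: Service
metadata:
  name: authoring-service
spec:
  clusterIP: None
  selector:
    app: authoring              # hypothetical label
  ports:
    - name: http
      port: 8080                # assumed port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: authoring               # pods are named authoring-0, authoring-1, ...
spec:
  serviceName: authoring-service   # must match the headless Service name
  replicas: 2                   # assumed replica count
  selector:
    matchLabels:
      app: authoring
  template:
    metadata:
      labels:
        app: authoring
    spec:
      containers:
        - name: authoring
          image: your-registry/authoring:tag   # placeholder, not a real image
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: CLUSTER_NODE_ADDRESS
              value: "$(POD_NAME).authoring-service.default.svc.cluster.local"
```

Under these assumptions, the first replica would register itself as `authoring-0.authoring-service.default.svc.cluster.local`, and that name stays the same when the pod is rescheduled.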

bitsofinfo

Dec 6, 2019, 5:21:22 PM
to CrafterCMS
OK, I can see how that would work, but are StatefulSets required when deploying to k8s? I got the impression Crafter was intended to be stateless. Couldn't one still deploy it as a Deployment and get the IP via the mechanism below, then reference CRAFTER_POD_IP in the override YAML?

  env:
    - name: CRAFTER_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP

Do you guys have a helm chart or operator that would just take care of all of this?

bitsofinfo

Dec 6, 2019, 5:23:04 PM
to CrafterCMS
To clarify, I'd be interested in running it w/ N replicas, where the total number is unknown and I can scale dynamically w/ HPAs etc. I don't feel that instance 1 vs. instance N has any priority or relevance, so I don't get why a StatefulSet would be necessary.

Sumer Jabri

Dec 6, 2019, 5:35:39 PM
to CrafterCMS
Delivery (Engine) is stateless and runs w/ N replicas. That serves the actual sites/apps.

Authoring (Studio) needs:
- Stable addresses for clustering
- Stable local storage for Git (think speed since Git is disk chatty)

Therefore Authoring is Stateful at this time. (Note: Studio supports N sites.)

If you wanted to create a SaaS offering, you could either scale out a Studio cluster and spin up sites within it or, if you wanted every client to get their own Studio, create a cluster per client.

The Delivery tier is simply unaware of Authoring (Studio). You can have Delivery render sites/apps from many different Studios.

It's not entirely clear what your ultimate goal is, but I hope the above helps answer your questions.

--sumer

bitsofinfo

Dec 6, 2019, 5:40:36 PM
to CrafterCMS
In a clustered "authoring" setup, do all replicas access the same persistent volume for the working repo copy, or does each have its own PV and distribute changes via the origin? If they share a PV, how do you handle filesystem locking issues across replicas?

Sumer Jabri

Dec 6, 2019, 6:02:45 PM
to CrafterCMS
They don't share storage. Each has its own, hence the syncing between Studios.
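
On Kubernetes, per-replica storage of this kind is what a StatefulSet's `volumeClaimTemplates` provides: each replica gets its own PersistentVolumeClaim. A minimal sketch of the relevant fragment follows; the container name, claim name, mount path, and size are all assumptions for illustration, not values from the Crafter docs.

```yaml
# Fragment of a StatefulSet spec (not a complete manifest)
spec:
  template:
    spec:
      containers:
        - name: authoring                    # hypothetical container name
          volumeMounts:
            - name: data
              mountPath: /opt/crafter/data   # assumed data directory
  volumeClaimTemplates:
    - metadata:
        name: data                           # one claim per replica: data-<pod-name>
      spec:
        accessModes: ["ReadWriteOnce"]       # each volume is mounted by one pod only
        resources:
          requests:
            storage: 10Gi                    # assumed size
```

Since every replica has its own volume, there is no cross-replica filesystem locking to deal with; the Git repos are kept in step by the Studio-to-Studio sync.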

--sumer

bitsofinfo

Dec 6, 2019, 6:04:46 PM
to CrafterCMS
thanks for clarifying

So why are StatefulSets needed at all? Why not just use a regular Deployment and inject the IP via an env var populated from status.podIP in the spec?

bitsofinfo

Dec 6, 2019, 6:05:35 PM
to CrafterCMS
i.e., do they do a fresh clone on bootup, or expect a pre-cloned copy to already exist?

Sumer Jabri

Dec 7, 2019, 5:53:04 PM
to CrafterCMS
No, Studio will auto-sync.

bitsofinfo

Dec 8, 2019, 2:40:20 PM
to CrafterCMS
So if I boot up a single clean authoring node, where does it get its initial repo? I assume it clones it fresh?

If I have a pair of clustered authoring nodes, how does a node decide between doing an initial clone and copying the repo over SSH from a peer?

Sumer Jabri

Dec 9, 2019, 9:54:14 AM
to CrafterCMS
When sites are created and managed, they get synced.

--sumer

bitsofinfo

Dec 9, 2019, 10:00:05 AM
to CrafterCMS
Doesn't really address the question, but thanks

Sumer Jabri

Dec 9, 2019, 6:33:06 PM
to CrafterCMS
There is one Git repo per site. That repo is born on the Studio that creates the site, and the other Studios sync up with that original parent.

Since it's multi-master, any Studio can create, and the rest of the cluster will clone and sync.

--sumer