Globus as part of CI infrastructure


Joshua Brown

Mar 30, 2022, 3:25:18 PM
to Discuss
Hi all,

As we have built our application on top of Globus, I am interested in understanding the best means of incorporating it into our CI infrastructure, and wanted to ask the community and the Globus team what the best approach would be.

For our scenario we would ideally be able to stand up and register a managed Globus Connect Server version 5.4 and two endpoints. Ideally we could set these up and tear them down as needed, without clicking through a bunch of browser links as part of the server and endpoint configuration, so the steps could be automated - which I'm not quite sure is possible yet.

There are a few issues we are running into that make this a bit painful, or where I could use some clarification:

1. The browser login requests, i.e. is there a way to do this through the API?

globus-connect-server login localhost

Please authenticate with Globus here:

------------------------------------

https://auth.globus.org/v2/oauth2/authorize?client_id=blahblahblah&prompt=login

------------------------------------
Enter the resulting Authorization Code here:

2. Can I reuse the client IDs and secrets? It would be nice if we could just delete a VM with the Globus endpoints and Globus Connect Servers on it and recreate them using the client ID and secrets we had before. This would also delete the deployment-key.json, but I'm not sure it is needed if everything is being recreated anyway - or is this a problem?

Best

Jason Alt

Apr 1, 2022, 10:40:50 AM
to Joshua Brown, Discuss
> standup and register a managed globus connect server version 5.4 and two endpoints

Do you need an endpoint with 2 collections perhaps? Or does this mean you need an endpoint with 2 nodes?

Before I get into your questions, I'd like to understand your use case better. Are you trying to launch new GCS endpoints on demand, owned by you (or some service account), preconfigured with collections, for use by some other targeted individuals or groups? Or is this the same endpoint(s) redeployed for recovery (or stateless nodes, or hardware migrations)?

> 1. The browser login requests, i.e. is there a way to do this through the API?

No, at least not the way you may be thinking. The /authorize call performs the OAuth2 step in which you consent to allow the globus-connect-server CLI on that node to act on your behalf while interacting with the new endpoint. The consent step requires human interaction; hence the need to open a browser. If the consent has already been performed, then perhaps it could be automated (chicken and egg, I know). That's why I wonder what your specific use case is.
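
If it helps to see what that consent step looks like programmatically, here is a minimal sketch of the same OAuth2 authorization code pattern using the Globus Python SDK. This is only an illustration; the client ID and scope below are placeholders, not what the globus-connect-server CLI actually uses internally.

#!/usr/bin/env python3
import globus_sdk

# Placeholder values for illustration; not the GCS CLI's own client or scope.
NATIVE_CLIENT_ID = "YOUR_NATIVE_APP_CLIENT_ID"
REQUESTED_SCOPE = "urn:globus:auth:scope:transfer.api.globus.org:all"

client = globus_sdk.NativeAppAuthClient(NATIVE_CLIENT_ID)
client.oauth2_start_flow(requested_scopes=REQUESTED_SCOPE)

# A human has to open this URL and consent; there is no way around that step.
print("Please authenticate with Globus here:")
print(client.oauth2_get_authorize_url())

auth_code = input("Enter the resulting Authorization Code here: ").strip()
tokens = client.oauth2_exchange_code_for_tokens(auth_code)
print(tokens.by_resource_server)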

I'm curious as to what triggers the deployment of these new endpoints (job submission? new hardware?), and if that interaction can be used to manually generate the token.

> 2. Can I reuse the client IDs and secrets?

No. Once the client ID is used, it becomes the owner of certain resources throughout the Globus platform. Even once the endpoint and client ID are deleted, its ID is still maintained for internal purposes. Solvable? Maybe, but there is a more critical concern. The client ID is unique and trusted to identify a specific client (or in this case, a specific endpoint). Users can't rely on the ID as a reliable identifier if the endpoint can change from time to time (similarly to how I'm trusting that joshbr...@gmail.com has not changed and remains a valid identifier). Again, solvable? Maybe, but I don't think that's the real issue for this use case. I'd prefer that client IDs are easily and automatically generated.

> It would be nice if we could just delete a VM with the Globus endpoints and Globus Connect Servers on it and recreate them using the client ID and secrets we had before. This would also delete the deployment-key.json, but I'm not sure it is needed if everything is being recreated anyway - or is this a problem?

Just to be clear (for future readers too), deleting a VM (or node) containing GCS does nothing to release the resources associated with GCS within the Globus platform (the Globus AWS services). You'll want to run 'endpoint cleanup' so at the very least the Transfer collections aren't found by your users when they search for endpoints.

Jason

Joshua Brown

Apr 22, 2022, 1:50:16 PM
to Discuss, jaso...@globus.org, Discuss, Joshua Brown

Sorry for the delayed response.


>  Do you need an endpoint with 2 collections perhaps? Or does this mean you need an endpoint with 2 nodes?

I need two separate endpoints, each on a separate Globus server (sorry I wasn't clear), so I can test a transfer between them. Our application wraps the Globus transfer functionality, and we need to make sure that a payload appears as expected before and after the transfer - from the point of view of our application.


>  Before I get into your questions, I'd like to understand your use case better. Are you trying to launch new GCS endpoints on demand, owned by you (or some service account), preconfigured with collections, for use by some other targeted individuals or groups?

That was a loaded question. For the purpose of testing our application, we would be launching GCS endpoints with a service account. My questions are all within the context of how to stand up two separate endpoints automatically within the CI. What I would like is to start with a fresh install each time and tear everything down afterwards, rather than having machines sitting idle with Globus on them when the CI is not in use. From the information I have, that may not be possible with Globus; I'm guessing another option is to manually set up two Globus endpoints and reuse them, without the creation and teardown.


>  Or is this the same endpoint(s) redeployed for recovery (or stateless nodes, or hardware migrations)?

I think I have already partly answered this above. The answer is no. Our continuous integration pipelines use virtual machines, so hardware migrations are not a problem because the CI is typically not running non-stop and is ideally ephemeral. In testing, I don't really want to preserve state after the testing is completed.


> No, at least not the way you may be thinking. The /authorize call performs the OAuth2 step in which you consent to allow the globus-connect-server CLI on that node to act on your behalf while interacting with the new endpoint. The consent step requires human interaction; hence the need to open a browser. If the consent has already been performed, then perhaps it could be automated (chicken and egg, I know). That's why I wonder what your specific use case is.

Well, it would be fine if I only had to manually authenticate once and then use the same information to stand up the Globus Servers. Is this possible? 

> I'm curious as to what triggers the deployment of these new endpoints (job submission? new hardware?), and if that interaction can be used to manually generate the token.

Exactly: a push to a GitHub repository triggers the CI pipeline, and that's when we want to stand up the servers.


> No. Once the client ID is used, it becomes the owner of certain resources throughout the Globus platform. Even once the endpoint and client ID are deleted, its ID is still maintained for internal purposes. Solvable? Maybe, but there is a more critical concern. The client ID is unique and trusted to identify a specific client (or in this case, a specific endpoint). Users can't rely on the ID as a reliable identifier if the endpoint can change from time to time (similarly to how I'm trusting that joshbr...@gmail.com has not changed and remains a valid identifier). Again, solvable? Maybe, but I don't think that's the real issue for this use case. I'd prefer that client IDs are easily and automatically generated.

I'm not picky about how it should be done; I just want a solution.


> Just to be clear (for future readers too), deleting a VM (or node) containing GCS does nothing to release the resources associated with GCS within the Globus platform (the Globus AWS services). You'll want to run 'endpoint cleanup' so at the very least the Transfer collections aren't found by your users when they search for endpoints.

Noted.

Jason Alt

Apr 27, 2022, 12:32:06 PM
to Joshua Brown, Discuss
From the sounds of it, you'll want to manually set up two endpoints and keep the client ID/secret/deployment key to redeploy on demand. The need to obtain a client ID/secret and perform the OAuth consent flows prevents you from deploying unique endpoints automagically on demand (all things we want to resolve). If you add an Auth client as an admin role on the endpoint, you can also modify the endpoint configuration on the fly from within CI without human interaction. That should be useful if you want to customize collections for specific CI events, but it'll require a bit of Python coding on your part.

Jason

Joshua Brown

May 19, 2022, 10:06:03 PM
to Discuss, jaso...@globus.org, Discuss, Joshua Brown
Hi Jason, 

 > If you add an Auth client as an admin role on the endpoint, you can also modify the endpoint configuration on-the-fly from within CI without human interaction. That should be useful if you want to customize collections for specific CI events, but it'll require a bit of python coding on your part.

Thanks for all of your useful comments. Can you walk me through the steps required to do this? Are you suggesting that I use the Globus SDK and register a Globus application that has admin rights?

Joshua Brown

May 19, 2022, 10:20:44 PM
to Discuss, Joshua Brown, jaso...@globus.org, Discuss

And to clarify further: I know how to register an application, but I'm not quite sure how to go about granting the admin scopes so it has the authorization to automate the setup steps for a Globus Connect Server.

Jason Alt

May 21, 2022, 12:55:25 PM
to Joshua Brown, Discuss
You could likely manually create a storage gateway and mapped collection, grant the CI client data access to the mapped collection (via allowed domains and identity mapping), then at runtime launch an instance of the endpoint and have CI jobs access predefined paths within the collection (based on job id perhaps). That's probably the simplest solution if all you need is a client-accessible mapped collection. If you have multiple, known client IDs, you just need to adjust the identity mapping. There's an explanation of how to do that in section 3 of https://docs.globus.org/globus-connect-server/v5.4/use-client-credentials/.

If you want the client to be able to configure the endpoint/gateway/mapped collections, you'll need to set the client as an administrator on the endpoint. In this example, the CLIENT_ID_USERNAME is the app client ID; this is not the endpoint ID.

$ globus-connect-server endpoint role create administrator $CLIENT_ID_USERNAME
Role ID: ef8a7108-d917-11ec-b37e-fdd01edbf245
$ globus-connect-server endpoint role list
Role ID                              | Role          | Principal                                                  
------------------------------------ | ------------- | ------------------------------------------------------------
62dd115a-10c9-11ec-a018-811dd7c5dbfa | administrator | jaso...@globus.org                                        
ef8a7108-d917-11ec-b37e-fdd01edbf245 | administrator | 4d6e9126-f428-4dd9...@clients.auth.globus.org
fc9ab067-5ce3-4815-bfed-59c6770b3ad3 | owner         | jaso...@globus.org          


In this example script, the client ID gets the 'manage_collections' scope which allows it to interact with the GCS Manager API and then creates a POSIX storage gateway and mapped collection.

#!/usr/bin/env python3

import globus_sdk
 
# Substitute your values here:
ENDPOINT_ID = "ENDPOINT_ID"
GCS_MANAGER_FQDN = "GCS_MANAGER_FQDN"
CLIENT_ID = "YOUR_APP_CLIENT_ID"
CLIENT_ID_USERNAME = CLIENT_ID + "@clients.auth.globus.org"
CLIENT_SECRET = "YOUR_APP_CLIENT_SECRET"
 
#
# We need an access token with the 'manage_collections' scope in order
# to interact with the GCS Manager API.
#

# The authorizer manages our access token for the scopes we request
authorizer = globus_sdk.ClientCredentialsAuthorizer(
    # The ConfidentialAppAuthClient authenticates us to Globus Auth
    globus_sdk.ConfidentialAppAuthClient(
        CLIENT_ID,
        CLIENT_SECRET
    ),
    f"urn:globus:auth:scope:{ENDPOINT_ID}:manage_collections"
)

# The access token is stored in authorizer.access_token
access_token = authorizer.access_token

#
# We'll need a GCS Client
# https://globus-sdk-python.readthedocs.io/en/stable/services/gcs.html
#
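# Note: environment='sandbox' below reflects the test environment this example
# was run in; most deployments should omit the environment argument so the
# client talks to production Globus.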
gcs_client = globus_sdk.GCSClient(GCS_MANAGER_FQDN, environment='sandbox', authorizer=authorizer)

#
# Create a storage gateway. The SDK GCSClient doesn't currently have a member function for
# creating a storage gateway, so we'll make the POST call according to the GCS API docs.
# https://docs.globus.org/globus-connect-server/v5.4/api/openapi_Storage_Gateways/#postStorageGateway
#
gateway_doc = {
    'DATA_TYPE': 'storage_gateway#1.1.0',
    'display_name': 'My Unique Storage Gateway Display Name',
    # POSIX Connector ID
    'connector_id': '145812c8-decc-41f1-83cf-bb2a85a2a70b',
    # Set whichever domain you want to allow data access on the mapped collection. In this case,
    # the client will be able to access the mapped collection.
    'allowed_domains': ['clients.auth.globus.org'],
    # We only have a single domain so we aren't required to supply an identity_mapping, however,
    # I want to make sure this is the only client that maps _and_ I want to be able to map to a
    # more useful local username than the CLIENT_ID.
    'identity_mappings': [{
        'DATA_TYPE': 'expression_identity_mapping#1.0.0',
        'mappings': [{
            'source': '{username}',
            'match': CLIENT_ID_USERNAME,
            'output': 'ci_client',
        }]
    }],
    'policies': {'DATA_TYPE': 'posix_storage_policies#1.0.0'}
}

# Returns globus_sdk.response.GlobusHTTPResponse
resp = gcs_client.post('/storage_gateways', data=gateway_doc)
gateway_id = resp.data['data'][0]['id']

#
# Create a mapped collection on the storage gateway. This is supported by the SDK.
# https://globus-sdk-python.readthedocs.io/en/stable/services/gcs.html#globus_sdk.GCSClient.create_collection
# Returns UnpackingGCSResponse
# Collections doc reference: https://docs.globus.org/globus-connect-server/v5.4/api/schemas/Mapped_Collection_schema/
collection_doc = {
    'DATA_TYPE': 'collection#1.5.0',
    'collection_type': 'mapped',
    'display_name': 'My Client-Created Mapped Collection Display Name',
    'storage_gateway_id': gateway_id,
    'public': True,
    'collection_base_path': '/',
}

resp = gcs_client.create_collection(collection_doc)
collection_id = resp.data['id']

That created the gateway and mapped collection and set my client username as the owner on the collection along with the administrator role on the collection:

$ globus-connect-server storage-gateway list
Display Name                           | ID                                   | Connector | High Assurance | MFA  
-------------------------------------- | ------------------------------------ | --------- | -------------- | -----
My Unique Storage Gateway Display Name | 8d038f24-2e10-4f52-9308-58a9d068e944 | POSIX     | False          | False
$ globus-connect-server storage-gateway show 8d038f24-2e10-4f52-9308-58a9d068e944
Display Name:                My Unique Storage Gateway Display Name
ID:                          8d038f24-2e10-4f52-9308-58a9d068e944
Connector:                   POSIX
High Assurance:              False
Authentication Timeout:      15840
Multi-factor Authentication: False
Allowed Domains:             ['clients.auth.globus.org']
(venv) [centos@(gcs dev 1) client_admin]$ globus-connect-server collection list
ID                                   | Display Name                                     | Owner                                                        | Collection Type | Storage Gateway ID                  
------------------------------------ | ------------------------------------------------ | ------------------------------------------------------------ | --------------- | ------------------------------------
c458e931-3b73-4798-9729-43f1a4de3870 | My Client-Created Mapped Collection Display Name | 4d6e9126-f428-4dd9...@clients.auth.globus.org | mapped          | 8d038f24-2e10-4f52-9308-58a9d068e944
$ globus-connect-server collection show c458e931-3b73-4798-9729-43f1a4de3870
Display Name:                My Client-Created Mapped Collection Display Name
Owner:                       4d6e9126-f428-4dd9...@clients.auth.globus.org
ID:                          c458e931-3b73-4798-9729-43f1a4de3870
Collection Type:             mapped
Storage Gateway ID:          8d038f24-2e10-4f52-9308-58a9d068e944
Connector:                   POSIX
Allow Guest Collections:     False
Disable Anonymous Writes:    False
High Assurance:              False
Authentication Timeout:      15840
Multi-factor Authentication: False
Manager URL:                 https://1008a.8540.sandbox2.zones.dnsteam.globuscs.info
TLSFTP URL:                  tlsftp://m-fe434a.1008a.8540.sandbox2.zones.dnsteam.globuscs.info:443
Force Encryption:            False
Public:                      True
Contact E-mail:              jaso...@globus.org
$ globus-connect-server collection role list c458e931-3b73-4798-9729-43f1a4de3870
Role ID                              | Collection ID                        | Role          | Principal                                                  
------------------------------------ | ------------------------------------ | ------------- | ------------------------------------------------------------
9653a0c0-d924-11ec-b37e-fdd01edbf245 | c458e931-3b73-4798-9729-43f1a4de3870 | administrator | 4d6e9126-f428-4dd9...@clients.auth.globus.org

From that example, hopefully you can construct other GCS API calls to configure the endpoint as needed using the SDK (https://globus-sdk-python.readthedocs.io/en/stable/) and API (https://docs.globus.org/globus-connect-server/v5.4/api/#api_reference) references.

Jason

Joshua Brown

May 23, 2022, 10:57:34 AM
to Discuss, jaso...@globus.org, Joshua Brown
Thank you for the very helpful response!

I am most interested in the second scenario, for which you provided very instructive feedback. I wanted to clarify a few points:

If the node goes down and I lose everything except the deployment key from the endpoint create command and the node configuration file (created with the --export-node flag), what do I need to get everything else back up and running? And what steps still require manual interaction? I do not understand what configuration is stored in the Globus cloud and which items need to be rerun. The only thing I understand for certain is that the Globus cloud does not contain the information in the deployment key. For instance, if I lose the node, do I need to recreate the collections and get new UUIDs, or are the old ones still valid? The same question applies to the storage gateways. And if I do need to recreate the storage gateways and collections, do I need to delete the old ones that might still be registered in the Globus cloud?

Possible steps from my understanding:
  a. Reinstall the Globus Connect Server (can be automated)
  b. Recreate the endpoint with the deployment key (does this have to be manual?)
  c. Recreate the node using the --import-node flag with the node configuration file (automated with the CLI - I do not see how to do this with the API or Globus SDK)
  d. Rerun the Python script to create the collections and storage gateway (can be automated; do I need to delete the old collections and gateway?)


Jason Alt

May 23, 2022, 5:26:28 PM
to Joshua Brown, Discuss
Everything except the client id, client secret, deployment key and node configuration file is stored in the Globus AWS services (encrypted). The only thing you need to do to get back up and running is:

# globus-connect-server node setup --import-node <node_config> --deployment-key <deployment-key> --client-id <client_id>

`node setup` pulls down the latest configuration for the endpoint including gateways, collections, roles, sharing policies, etc. At that point, the node should be fully operational with every defined collection; no need to recreate anything. 
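
If you want to drive that from CI, it's just a matter of wrapping the command; a rough sketch (the file locations here are placeholders for wherever your CI stores those secrets):

#!/usr/bin/env python3
import subprocess

# Placeholder paths; substitute wherever your CI materializes these files.
NODE_CONFIG = "/secrets/node-config.json"
DEPLOYMENT_KEY = "/secrets/deployment-key.json"
CLIENT_ID = "YOUR_APP_CLIENT_ID"

# Redeploys the existing endpoint onto a fresh node (run as root on that node).
subprocess.run(
    [
        "globus-connect-server", "node", "setup",
        "--import-node", NODE_CONFIG,
        "--deployment-key", DEPLOYMENT_KEY,
        "--client-id", CLIENT_ID,
    ],
    check=True,
)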

Joshua Brown

Jul 25, 2023, 1:29:08 PM
to Discuss, jaso...@globus.org, Discuss, Joshua Brown
I have a question: where does the connector_id come from, and is it always the same for a POSIX connector? I see a different UUID listed elsewhere in the documentation for the POSIX connector.

'connector_id': '145812c8-decc-41f1-83cf-bb2a85a2a70b',

Jason Alt

Jul 25, 2023, 1:39:17 PM
to Joshua Brown, Discuss
The connector_id is generated when the connector is first developed and hardcoded into its module. It is immutable. The current connector_ids are:

activescale: 7251f6c8-93c9-11eb-95ba-12704e0d6a4d
azure_blob: 9436da0c-a444-11eb-af93-12704e0d6a4d
blackpearl: 7e3f3f5e-350c-4717-891a-2f451c24b0d4
box: 7c100eae-40fe-11e9-95a3-9cb6d0d9fd63
ceph: 1b6374b0-f6a4-4cf7-a26f-f262d9c6ca72
google_cloud_storage: 56366b96-ac98-11e9-abac-9cb6d0d9fd63
google_drive: 976cf0cf-78c3-4aab-82d2-7c16adbcc281
hpss: fb656a17-0f69-4e59-95ff-d0a62ca7bdf5
irods: e47b6920-ff57-11ea-8aaa-000c297ab3c2
onedrive: 28ef55da-1f97-11eb-bdfd-12704e0d6a4d
posix: 145812c8-decc-41f1-83cf-bb2a85a2a70b
posix_staging: 052be037-7dda-4d20-b163-3077314dc3e6
s3: 7643e831-5f6c-4b47-a07f-8ee90f401d23

Jason
 

Joshua Brown

Jul 25, 2023, 1:48:15 PM
to Discuss, jaso...@globus.org, Discuss, Joshua Brown
Thanks!

Joshua Brown

Aug 14, 2023, 4:35:47 PM
to Discuss, Joshua Brown, jaso...@globus.org, Discuss
A few clarifying points. I keep hitting a chicken and egg problem. What I want to do:

1. Create an endpoint deployment key and configuration file on a machine other than the one where they will be used.

Why? So I can use the deployment key and other configuration files to spin up containers without having to launch a container in a Kubernetes cluster and log in, which is awkward.

I am unsure at this point whether the globus-connect-server setup command must be run on the same machine where a node must also be deployed. I am hoping not.

2. Add a client ID with the administrator role to automate the remaining steps. I have been trying to do this using the

globus-connect-server endpoint role create administrator "$CLIENT_UUID"@clients.auth.globus.org 

command but I am hitting an error:

Error contacting dec6a0.75bc.data.globus.org
Error resolving dec6a0.75bc.data.globus.org
This may be because the endpoint is deleted, it is not deployed on any
nodes, or your DNS resolver is misconfigured.

This seems to be telling me that I need to have a node up and running before I can make a client an administrator, but that defeats the purpose of creating a client administrator. Adding the client as an administrator via the UI in the same project does not appear to have any effect as far as I can tell. Any assistance in understanding what I am doing wrong would be helpful.

Jason Alt

Aug 14, 2023, 5:18:15 PM
to Joshua Brown, Discuss
> I am unsure at this point whether the globus-connect-server setup command must be run on the same machine where a node must also be deployed. I am hoping not.

Assuming you mean 'globus-connect-server endpoint setup', then you are correct; you can move the deployment key to another, unrelated node and run `node setup`.

You have to have a node set up to add an administrator. However, you can run `globus-connect-server endpoint setup --owner "$CLIENT_UUID"@clients.auth.globus.org ...`, which will assign the client as the owner/admin of the endpoint before node deployment. I think that will achieve what you are looking for.

As of GCS 5.4.61, make sure "$CLIENT_UUID"@clients.auth.globus.org is an admin on the Auth project where you want to register the endpoint credentials.
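
Roughly, that one-time (still interactive) setup could look like the sketch below; the display name, organization, and contact email are placeholders, and you should check `globus-connect-server endpoint setup --help` on your version for the exact options.

#!/usr/bin/env python3
import subprocess

CLIENT_UUID = "YOUR_APP_CLIENT_UUID"

# One-time endpoint setup that assigns the client as the endpoint owner/admin.
# This step still prompts for a browser login in current GCS releases.
subprocess.run(
    [
        "globus-connect-server", "endpoint", "setup",
        "CI Test Endpoint",                      # placeholder display name
        "--organization", "My Org",              # placeholder
        "--contact-email", "admin@example.org",  # placeholder
        "--owner", f"{CLIENT_UUID}@clients.auth.globus.org",
    ],
    check=True,
)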

Jason

Joshua Brown

Aug 14, 2023, 7:45:51 PM
to Jason Alt, Discuss
Great, that was exactly what I was looking for!

Follow-up question: is there a way to create a project and client via the REST API or Python SDK?

Joshua Brown

Aug 15, 2023, 7:08:08 AM
to Discuss, Joshua Brown, Discuss, jaso...@globus.org
I'll answer my own question: it looks like there is a way to create projects using the globus_sdk (https://globus-sdk-python.readthedocs.io/en/stable/services/auth.html), though I'm not sure there is a way to actually create a client using the SDK.
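
For reference, here is roughly what I have in mind, based on my reading of those docs. The create_project method name comes from the SDK reference; the token handling and response layout below are my assumptions, so double-check them against the docs.

#!/usr/bin/env python3
import globus_sdk

# Assumes you already have a user access token carrying the Auth
# 'manage_projects' scope; obtaining it is a separate (interactive) step.
MANAGE_PROJECTS_TOKEN = "USER_TOKEN_WITH_MANAGE_PROJECTS_SCOPE"

ac = globus_sdk.AuthClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(MANAGE_PROJECTS_TOKEN)
)

resp = ac.create_project(
    "CI Test Project",                  # placeholder display name
    contact_email="admin@example.org",  # placeholder
)
# Response layout is my reading of the Auth API docs; verify before relying on it.
project_id = resp["project"]["id"]
print(project_id)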

Joshua Brown

Aug 15, 2023, 7:15:44 AM
to Discuss, Joshua Brown, Discuss, jaso...@globus.org
Great, it looks like there is also a way to create a client programmatically. https://docs.globus.org/api/auth/reference/#create_client
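
Since the SDK doesn't have a dedicated helper for that yet, I might just hit the documented endpoint through the SDK's generic request support; a rough, untested sketch (the path is from that reference page, but the body fields are only my reading of it, so verify against the docs):

#!/usr/bin/env python3
import globus_sdk

MANAGE_PROJECTS_TOKEN = "USER_TOKEN_WITH_MANAGE_PROJECTS_SCOPE"
PROJECT_ID = "YOUR_PROJECT_ID"

ac = globus_sdk.AuthClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(MANAGE_PROJECTS_TOKEN)
)

# POST to the create_client endpoint described in the Auth API reference.
# Field names here are assumptions based on that page.
resp = ac.post(
    "/v2/api/clients",
    data={
        "client": {
            "name": "CI GCS client",
            "project": PROJECT_ID,
            "public_client": False,
        }
    },
)
print(resp["client"]["id"])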

Jason Alt

Aug 15, 2023, 9:36:29 AM
to Joshua Brown, Discuss
You can script that pretty easily or you can look into https://docs.globus.org/cli/reference/api_auth/ to make things a tad bit simpler.

Jason

Stephen Rosen

Aug 15, 2023, 5:36:52 PM
to Jason Alt, Joshua Brown, Discuss
There have been some other conversations with our team about this topic, so I'd like to make sure we share some info back to the listhost and elaborate on some details.

Using the `globus api auth` commands is probably the easiest approach for quick scripting because the Globus CLI has built-in error handling for the session and policy errors which you can encounter via these APIs. You can write your own Python scripts with the SDK as well, but you'll have to do that error handling yourself.

(we actually have made some minor tweaks to the latest version of that script, which will be published to the docs soon)

There are also several methods which are still missing from the SDK. We have some like create_project already available, and are working on filling in the gaps.
Expect future SDK releases to provide more of the missing methods like `create_client`.

Cheers,
-Stephen

Joshua Brown

Sep 6, 2023, 12:34:33 AM
to Discuss, sir...@globus.org, Joshua Brown, Discuss, jaso...@globus.org

Follow-up questions: I have been able to create the project and client successfully. However, I'm having a hard time understanding how to go about using the Python SDK and the Globus API to set up a Globus Connect Server endpoint and node. For instance, the Python SDK has methods such as:

tc.create_endpoint(ep_data)
tc.add_endpoint_server(endpoint_id, server_data)

However, I am questioning whether these commands are still relevant to GCSv5, and if they are, I think I am missing something, because I cannot figure out which project they are being created in, or even whether a GCS Manager domain name is being assigned when I call add_endpoint_server. It looks like the CLI

globus-connect-server endpoint setup

command and the 

globus-connect-server node setup

commands are doing a lot behind the scenes, and the API is too granular for me to figure out all the steps needed to reproduce the behavior. Sure, I could simply call the CLI explicitly, but I would like to keep all the logic for setting up a GCSv5 instance in a single Python script.

If this is not possible, then I'll have to fall back to a less elegant solution and use the subprocess module to call globus-connect-server with the CLIENT_ID and CLIENT_SECRET that I created in the project.

What would be the recommended way to move forward here?

Joshua Brown

Sep 6, 2023, 9:17:03 AM
to Discuss, Joshua Brown, sir...@globus.org, Discuss, jaso...@globus.org
It doesn't look like using subprocess with 'globus-connect-server endpoint setup' will work; even with GCS_CLIENT_ID and GCS_CLIENT_SECRET set, I'm still required to log in via a browser link.

Jason Alt

Sep 6, 2023, 9:51:07 AM
to Joshua Brown, Discuss, sir...@globus.org

> Follow-up questions: I have been able to create the project and client successfully. However, I'm having a hard time understanding how to go about using the Python SDK and the Globus API to set up a Globus Connect Server endpoint and node. For instance, the Python SDK has methods such as:
>
> tc.create_endpoint(ep_data)
> tc.add_endpoint_server(endpoint_id, server_data)

This is v4 only. The only supported way to correctly create endpoints and nodes is to use the GCS CLI.
 
> It doesn't look like using subprocess with 'globus-connect-server endpoint setup' will work; even with GCS_CLIENT_ID and GCS_CLIENT_SECRET set, I'm still required to log in via a browser link.

Support for fully non-interactive endpoint setup and cleanup is coming out in the next release (or two, currently in review). 

Jason

Joshua Brown

Oct 3, 2023, 10:02:46 AM
to Discuss, jaso...@globus.org, Discuss, sir...@globus.org, Joshua Brown
Any updates on the endpoint automation by chance?

Jason Alt

Oct 3, 2023, 10:14:10 AM
to Joshua Brown, Discuss, sir...@globus.org
This will be included in the next GCS release, which I expect will be out no later than Oct 18.

Jason

Joshua Brown

Oct 18, 2023, 3:45:50 PM
to Discuss, jaso...@globus.org, Discuss, sir...@globus.org, Joshua Brown
I assume this will not happen today, or did I miss a previous announcement? I didn't see anything in the other releases on this issue, unless I simply missed it.

Jason Alt

Oct 18, 2023, 3:46:41 PM
to Joshua Brown, Discuss, sir...@globus.org
It's coming, most likely later today, possibly tomorrow morning.

Joshua Brown

Feb 19, 2024, 4:38:27 PM
to Discuss, jaso...@globus.org, Discuss, sir...@globus.org, Joshua Brown
Follow-up question. I'm trying to get a GCS instance running in a Docker Compose dev environment, and there are some challenges to this. My questions relate to how networking in GCS works. I'm making a few assumptions, which may not be accurate.

1. Port 443 is used for control messages from the Globus Cloud service for orchestration.
2. Ports 50000-51000 are for data transfers between GCS instances.

Assume I want two GCS instances running. This cannot be done as is because of the binding requirements on port 443. My question is: how much work would it be to allow GCS instances to work with only egress on port 443, rather than binding to the port exclusively? Essentially, polling the Globus cloud on 443 instead of having the Globus service trigger the GCS instance. I only want to do this in development.

Joshua Brown

Mar 18, 2024, 8:48:50 AM
to Discuss, Joshua Brown, jaso...@globus.org, Discuss, sir...@globus.org

I am running into an additional problem when trying to automate standing up a Globus Connect Server. Currently, we make use of guest collections and anonymous access as part of our default configuration, which requires a managed endpoint. The steps I have been able to achieve thus far are:

1. Generate the deployment key by running globus-connect-server endpoint setup. This requires interactive user input as it triggers the authorization code OAuth flow, which is fine because it is a one-time occurrence. The following steps, however, need to be automated in a non-interactive manner, which is all done using a confidential client with a secret.
2. Next, I can run my GCS container by passing in the deployment key; this also works fine.

3. The next step is to set up the storage gateways, collections, etc.

This is where I hit a snag. Using a confidential client, it doesn't look like I can make my endpoint a managed endpoint; there does not seem to be a way to give a confidential client the scope to do this, or if there is, I was unable to find out how. If there is, that would solve my problem. The alternative would be to set the subscription in step 1, where I have used my user credentials as part of the initial manual OAuth2 authentication. However, it looks like running globus-connect-server endpoint set-subscription-id fails at this step because no nodes are running, since the first node is only set up after the first container is launched.

globus-connect-server endpoint update --subscription-id "blahblah"

Error contacting blahblah.data.globus.org
Error resolving blahblah.data.globus.org
This may be because the endpoint is deleted, it is not deployed on any
nodes, or your DNS resolver is misconfigured.


Jason Alt

Mar 19, 2024, 11:30:22 AM
to Joshua Brown, Discuss, sir...@globus.org
We wrote up a guide a while back on how to perform automated deployments: https://docs.globus.org/globus-connect-server/v5.4/automated-deployment/.

The key points are to:

1. run `endpoint setup` using client credentials in order to avoid the interactive session. You can change endpoint ownership later if desired but it is not required.
2. add your client credentials to your new subscription group so that the client can set the endpoint(s) as managed

If you run globus-connect-server using client credentials as described at https://docs.globus.org/globus-connect-server/v5.4/reference/#description, it'll take care of the scope/consent/token handling for you, which should make it quite trivial.
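
A rough sketch of what that can look like in a CI job; the environment variable names are from that reference page, the display name/organization values are placeholders, and the automated-deployment guide lists any additional flags your GCS version may want:

#!/usr/bin/env python3
import os
import subprocess

# Client credentials picked up by the GCS CLI (names per the reference docs).
env = dict(
    os.environ,
    GCS_CLI_CLIENT_ID="YOUR_APP_CLIENT_ID",
    GCS_CLI_CLIENT_SECRET="YOUR_APP_CLIENT_SECRET",
)

# Non-interactive endpoint setup owned by the client.
subprocess.run(
    [
        "globus-connect-server", "endpoint", "setup",
        "CI Test Endpoint",                      # placeholder display name
        "--organization", "My Org",              # placeholder
        "--contact-email", "admin@example.org",  # placeholder
    ],
    check=True,
    env=env,
)

# Once the client is in the subscription group, it can mark the endpoint managed.
subprocess.run(
    [
        "globus-connect-server", "endpoint", "update",
        "--subscription-id", "YOUR_SUBSCRIPTION_ID",
    ],
    check=True,
    env=env,
)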

Jason

Joshua Brown

Mar 28, 2024, 4:12:51 PM
to Discuss, Joshua Brown, jaso...@globus.org, Discuss, sir...@globus.org
A follow-up on this item. The configuration I am interested in, where only port 443 egress is allowed, does not let me do what I would like. I know the configuration I mentioned is not possible, having read https://docs.globus.org/globus-connect-server/v5.4/gcsv54-restricted-firewall-policy/consequences-of-restricting-gcsv54-firewall-policy/#impact_of_restricting_port_443_inbound:

"
The Globus service must be able to establish a control channel connection to the GridFTP service on your endpoint. These control channel connections will be initiated by Globus service hosts in the 54.237.254.192/29 CIDR block and - by default - must be able to connect to local port 443 on the system hosting your endpoint. If the Globus service cannot establish this control channel connection, then your endpoint cannot function."

The problem is that if I want to set up a developer environment on my laptop, for instance using Docker Compose, I cannot easily do this because of the required firewall exceptions. Even outside of a developer machine, as part of a CI/CD system where I want to stand up ephemeral machines, I don't want to have to deal with firewall and network configuration if I can avoid it.

In the current situation, I still need to allow ingress traffic for the control messages or for the GCS Manager configuration.

[Attachment: CurrentSituation.png]

The ideal configuration would be to simply poll the Globus service for incoming requests and establish connections that way.

[Attachment: IdealSituation.png]
I realize that there are probably good reasons things are set up the way they are. However, I know that for many organizations allowing ingress communication can be quite problematic. 