Managing workflows as first class resources

Simon Opper

Feb 9, 2022, 9:51:04 PM
to topbrai...@googlegroups.com, David Habgood, Rob Atkinson, Marcus Jowsey, Jamie Feiss
Hi TQ crew

I've posted a few comments and questions lately about saved layouts, queries and user data in general. The aim is to be able to manage the deployment of these resources across large user bases, multiple dev/test/prod instances and the corresponding explorer instances.

I'd like to follow up on this user data theme with a further suggestion for your future roadmap: EDG admin tools to manage workflow data as first-class resources.

As with layouts and searches, being able to manage workflows as decoupled resources would, in my view, make EDG much more scalable.

A few use cases and thoughts:

  1. ability to back up the workflow data for a graph independently of the content graph
  2. ability to delete a urn content graph, redeploy it after refactoring, and reinstate the workflow data associated with it
  3. ability to move the workflow data associated with one graph, e.g. urn:x-evn-master:example_datagraph, to a new graph, e.g. urn:x-evn-master:example_datagraph_with_same_content, which might be a clone or a dev/test/prod version on another server used for testing or updates
  4. versioning and history of workflows. This is a bit of an edge case that follows by extension. I haven't needed it myself, but I could see it being useful where full provenance is required.
Items 1 to 3 would get constant use from our team in our larger projects.
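
To make use case 3 concrete, here is a minimal sketch of the retargeting step, assuming the workflow data has already been exported as N-Triples text and that the master graph appears in it as a complete IRI term. The function name `retarget_graph` and the `urn:ex:` IRIs are hypothetical and only for illustration; this is not an EDG API, and the real export may reference the graph differently.

```python
import re


def retarget_graph(ntriples: str, old_graph: str, new_graph: str) -> str:
    """Replace whole-IRI occurrences of old_graph with new_graph."""
    # Match the IRI only when it is the complete term between angle brackets,
    # so longer IRIs that merely start with old_graph are left untouched.
    pattern = re.compile("<" + re.escape(old_graph) + ">")
    return pattern.sub(lambda _m: "<" + new_graph + ">", ntriples)


OLD = "urn:x-evn-master:example_datagraph"
NEW = "urn:x-evn-master:example_datagraph_with_same_content"

# Hypothetical export: the urn:ex: subjects and predicates are invented.
export = (
    "<urn:ex:workflow1> <urn:ex:appliesTo> <urn:x-evn-master:example_datagraph> .\n"
    "<urn:ex:workflow2> <urn:ex:appliesTo> <urn:x-evn-master:example_datagraph_v2> .\n"
)

retargeted = retarget_graph(export, OLD, NEW)
print(retargeted)
```

Matching on the bracketed IRI rather than the bare string matters here because the old graph name is a prefix of the new one. A plain text substitution like this does not parse the data, so a string literal that happens to contain the bracketed IRI would be rewritten as well.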

So far we've developed workarounds that use RDFx via Python to extract all workflow data, which gives us some degree of independent backup. However, we haven't yet attempted to solve the issue of redeploying workflow data back into EDG graphs.

We have wondered what your roadmap for Git integration is, and whether workflow data could be included in Git integrations to support some of the above goals.

A side note here: it would be great to have direct functionality to merge and commit the changes made in an EDG urn graph back into the local (disk) or database system files. These service functions exist and we exploit them, but why isn't this supported in EDG directly?

We have built this functionality for certain manifest deployment loaders (see pic below), but an out-of-the-box capability to sync EDG graphs to and from local content (in addition to the EDG file edit capability) would be fabulous.

image.png



Many thanks in advance 



Simon Opper 

Chief Data Scientist

Connected Knowledge 


M 0447 641 837

E simon...@surroundaustralia.com 

A Level 9, Nishi Building, 2 Phillip Law Street; NewActon Canberra 2601

surroundaustralia.com 

Copyrights:

SURROUND Australia Pty Ltd is the copyright owner of all original content and attachments. All rights reserved. 

Confidentiality Notice:

The contents of this email are confidential to the email addressee, and may also be privileged. If you are not the addressee, you may not copy, forward, disclose, or otherwise use it, or any part of it or its attachments, in any form. If you have received this email in error, please reply to the sender.


Holger Knublauch

Feb 22, 2022, 8:13:14 PM
to topbrai...@googlegroups.com, David Habgood, Rob Atkinson, Marcus Jowsey, Jamie Feiss
Hi Simon,

On 10 Feb 2022, at 1:50 pm, Simon Opper <simon...@surroundaustralia.com> wrote:

A few use cases and thoughts:

  1. ability to back up the workflow data for a graph independently of the content graph
  2. ability to delete a urn content graph, redeploy it after refactoring, and reinstate the workflow data associated with it
  3. ability to move the workflow data associated with one graph, e.g. urn:x-evn-master:example_datagraph, to a new graph, e.g. urn:x-evn-master:example_datagraph_with_same_content, which might be a clone or a dev/test/prod version on another server used for testing or updates
  4. versioning and history of workflows. This is a bit of an edge case that follows by extension. I haven't needed it myself, but I could see it being useful where full provenance is required.
Items 1 to 3 would get constant use from our team in our larger projects.

The workflow data is (like almost all our state-related info) stored in RDF triples. In this case they reside in .tch graphs, just like saved queries and layouts. The data model behind workflows is relatively simple - basically a collection of teamwork:Change objects that point to added or deleted triples. This means that you can use the usual infrastructure (e.g. SPARQL queries or ADS scripts) to query, manipulate and restore workflows.
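
As an illustration of that data model, here is a minimal Python sketch that treats a workflow as an ordered list of changes, each holding a set of added and a set of deleted triples, and replays them over a base graph. The `Change` class, its field names and the example triples are hypothetical stand-ins; in EDG the changes are RDF resources in the .tch graph, and their actual property names are not shown here.

```python
from dataclasses import dataclass

# A triple is modelled as a plain (subject, predicate, object) tuple of strings.
Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Change:
    """Hypothetical stand-in for one change: triples added and triples deleted."""
    added: frozenset = frozenset()
    deleted: frozenset = frozenset()


def apply_changes(base, changes):
    """Replay changes in order over a copy of the base triples."""
    result = set(base)
    for change in changes:
        result -= change.deleted  # remove the triples this change deleted
        result |= change.added    # then add the triples this change added
    return result


base = {("ex:Canberra", "rdfs:label", "Canbera")}
workflow = [
    Change(
        added=frozenset({("ex:Canberra", "rdfs:label", "Canberra")}),
        deleted=frozenset({("ex:Canberra", "rdfs:label", "Canbera")}),
    ),
]

working_copy = apply_changes(base, workflow)
print(working_copy)
```

Because the base set is copied before the changes are replayed, the content graph stays untouched, which mirrors the idea that a workflow can be backed up or restored separately from the content it applies to.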

Are you essentially asking for better automation and user interface for such tasks?


So far we've developed workarounds that use RDFx via Python to extract all workflow data, which gives us some degree of independent backup. However, we haven't yet attempted to solve the issue of redeploying workflow data back into EDG graphs.

We have wondered what your roadmap for Git integration is, and whether workflow data could be included in Git integrations to support some of the above goals.

A side note here: it would be great to have direct functionality to merge and commit the changes made in an EDG urn graph back into the local (disk) or database system files. These service functions exist and we exploit them, but why isn't this supported in EDG directly?

We have built this functionality for certain manifest deployment loaders (see pic below), but an out-of-the-box capability to sync EDG graphs to and from local content (in addition to the EDG file edit capability) would be fabulous.

I guess these features resemble what we are doing in Files mode and in our relatively new Git integration. You can invoke the services behind those, e.g. to save snapshots to files. Having said this, there are obvious size limitations beyond which reliable file output becomes impractical.

Holger

