Mapping Azure Resources to EDG Ontologies

Tim Smith

Aug 8, 2022, 4:43:49 PM
to TopBraid Suite Users
I am going to model some of my Azure environments using EDG.  The first task is to determine what classes are required to represent the Azure resources.  Below is a small list of available Azure resources.  There are hundreds more.

The hardest part of this effort will be to sort through the EDG ontologies to find either the class that best semantically represents the Azure resource or which class to extend to capture Azure specific details.

Given how common cloud resources are, has there been any work done to map these environments to EDG, to facilitate using EDG to govern cloud environments? I'm not looking for an import-from-the-cloud capability (although that would be fantastic!), just a mapping from resources to EDG ontology classes.  I'm sure I'm not the first person wanting to do this.



Azure Resource
AKS (Kubernetes)
AKS (Kubernetes) Pod
Analysis Services
App Insights
App Service Plan
Automation Account
Azure Active Directory
Cognitive Services
Container Instance
Container Registry
Cosmos DB
Event Grid Topic
Event Hub
Function App
Key Vault
Load Balancer
Log Analytics Workspace
Logic App
Machine Learning Service
MySQL Server
Network Security Group
Notification Hub
PostgreSQL Database
Power BI
Power BI Gateway
Private DNS
Private Endpoint
Private Link Service
QnA Maker
Redis Cache
Resource Group
Service Principal
SQL Data Warehouse
SQL Database
SQL Database Managed Instance
Storage Account
Stream Analytics
Synapse Analytics
Virtual Network
Web App
Web App PaaS Blueprint

Fan Li

Aug 8, 2022, 5:34:01 PM
to TopBraid Suite Users
Hi Tim, I have similar needs to model AWS resources. Additionally I would like to model Office365 services (e.g. SharePoint lists / document libraries, MS Teams). 

Ralph TQ [Gmail]

Aug 8, 2022, 5:51:19 PM

Currently EDG has some coverage of ETL and Infrastructure/Cloud resources. The intent has been to treat ETL blocks as “black boxes”: that is, not to model the details of what metadata repositories are already modeling, but to depict how these building blocks play in the larger ecosystem.

Here is an Azure example, a pipeline:

edg:AzureDataBricksPipeline
  a edg:AssetClass ;
  edg:acronym "ADBPL" ;
  rdfs:comment "An 'Azure Databricks Pipeline' is a pipeline that is based on Databricks running on Microsoft Azure." ;
  rdfs:label "Azure Databricks Pipeline" ;
  rdfs:subClassOf edg:ETLpipeline ;

This is defined in the ETL schema (as stated by the rdfs:isDefinedBy statement).

Diving deeper into other ETL and Infrastructure components can be done. I would recommend extending EDG’s classes for functional components. There are 3 schemas to look into:

  1. SCHEMA_EDG-technical-assets-v1.0.ttl
  2. SCHEMA_EDG-technical-assets-ETL-v1.0.ttl
  3. SCHEMA_EDG-technical-assets-Infrastructure-v1.0.ttl

Tim Smith

Aug 9, 2022, 12:03:23 PM
to TopBraid Suite Users
Hi Ralph,

Thank you for pointing me to those schemas.

Of course, this is the exact issue I was hoping to avoid.  For example, in the three graphs you mentioned, there are 261 definitions, 107 (41%) of which do not have any comments explaining their intended meaning or use (see query below).  Of the 261 definitions, roughly 120 are classes.

Now my task is to understand the meaning and intended use of every one of these classes and to determine which Azure (or GCP or AWS) resource aligns most closely, so that I can use or extend the correct class.  Otherwise I won't be able to use EDG functionality such as lineage.  If I didn't care about integrating with the EDG ontologies, it would be easier to just make my own ontologies to support Azure, GCP and AWS.  I can export a JSON file from Azure that effectively contains instance data from which I can derive an ontology, which also means I can auto-populate EDG with the Azure structure.
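
To make the "derive an ontology from the JSON export" idea concrete, here is a minimal Python sketch. It rests on several assumptions: the export is a JSON array of resource objects with a "type" field (the shape produced by `az resource list`), the `az:` namespace is a made-up placeholder, the `edg:` namespace IRI should be checked against your installation, and the parent class passed in is only a stand-in, since choosing the right EDG parent is exactly the open question. The class stub follows the pattern of the Databricks example earlier in the thread.

```python
import json
import re

# Hypothetical namespace for the derived Azure classes; not part of EDG.
AZ_NS = "http://example.org/azure#"
# Assumed EDG namespace; verify against the schemas in your installation.
EDG_NS = "http://edg.topbraid.solutions/model/"


def local_name(resource_type):
    """Turn 'Microsoft.Storage/storageAccounts' into a safe local name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", resource_type).strip("_")


def derive_classes(export_json, parent_by_type, default_parent):
    """Emit one Turtle class stub per distinct resource type in the export.

    parent_by_type maps an Azure type string to the EDG class to extend;
    types missing from the map fall back to default_parent.
    """
    resources = json.loads(export_json)
    types = sorted({r["type"] for r in resources if "type" in r})
    out = [
        f"@prefix az: <{AZ_NS}> .",
        f"@prefix edg: <{EDG_NS}> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for t in types:
        parent = parent_by_type.get(t, default_parent)
        out += [
            f"az:{local_name(t)}",
            "    a edg:AssetClass ;",
            f"    rdfs:label {json.dumps(t)} ;",  # JSON quoting doubles as Turtle quoting here
            f"    rdfs:subClassOf {parent} .",
            "",
        ]
    return "\n".join(out)


# Made-up sample data in the assumed export shape.
SAMPLE = json.dumps([
    {"name": "stor1", "type": "Microsoft.Storage/storageAccounts"},
    {"name": "stor2", "type": "Microsoft.Storage/storageAccounts"},
    {"name": "kv1", "type": "Microsoft.KeyVault/vaults"},
])

if __name__ == "__main__":
    # "edg:System" is only a placeholder parent, not a recommended mapping.
    print(derive_classes(SAMPLE, {}, "edg:System"))
```

The type-to-parent map is kept as a separate argument on purpose: the generated stubs are mechanical, while the mapping is the part that still has to be worked out by hand, once.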

This type of mapping needs to be done only once.  Maybe this is an opportunity for TQ to create another asset collection type and sell it as an add-on?

As a side note, the ETL and Infrastructure schemas you shared are not accessible from the Includes tab.  I had to open them using the Files asset collection to get them to show up in the EDG UI.  (This also entailed unlocking the workspace).  Maybe there is another way that I missed.

Also, I noticed a number of "empty property shapes" on edg:AzureDataBricksPipeline, coming from a superclass (see attached file).  Walking up the tree shows one undefined property shape at edg:System.  The label on this class does not match the URI: rdfs:label = "Software System".  Five more can be found at edg:EnterpriseEnabler.  Additionally, edg:AssetClass contains four undefined property shapes.  This is only what I found in a cursory look.  A query (or maybe a SHACL rule?) would be a better way to find all of the "hanging" shapes.  Of course, it may be that the missing shapes are defined in graphs that are not imported into these graphs.
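
A minimal sketch of such a check, written in plain Python rather than SPARQL or SHACL. It assumes the relevant graphs have been dumped to (subject, predicate, object) tuples of prefixed names, and it treats a shape as "hanging" when it is referenced through sh:property but never appears as the subject of any triple. The `ex:` names are hypothetical sample data, not EDG identifiers.

```python
SH_PROPERTY = "sh:property"


def hanging_property_shapes(triples):
    """Return sorted (owner, shape) pairs where the shape is referenced via
    sh:property but has no triples of its own in the given data."""
    triples = list(triples)
    subjects = {s for s, _, _ in triples}
    return sorted(
        (s, o) for s, p, o in triples
        if p == SH_PROPERTY and o not in subjects
    )


# Hypothetical sample: one defined property shape and one undefined one.
TRIPLES = [
    ("ex:ClassA", "sh:property", "ex:ClassA-name"),
    ("ex:ClassA", "sh:property", "ex:ClassA-owner"),
    ("ex:ClassA-name", "sh:path", "ex:name"),
]

if __name__ == "__main__":
    for owner, shape in hanging_property_shapes(TRIPLES):
        print(f"{owner} references undefined property shape {shape}")
```

Running the same check once on the three schema graphs alone and once with their imports included would help tell shapes that are truly missing from shapes that are merely defined elsewhere.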


# Count definitions with and without an rdfs:comment in the three schema graphs
SELECT ?commentflag (COUNT(?commentflag) AS ?numcomments)
WHERE {
  ?s rdfs:isDefinedBy ?defininggraph .
  ?s rdf:type ?stype .
  OPTIONAL { ?s rdfs:comment ?scomment . }
  FILTER (?defininggraph IN (<>, <>, <>))
  # Ignore SHACL property shapes and property groups
  FILTER (?stype NOT IN (sh:PropertyShape, sh:PropertyGroup))
  BIND (BOUND(?scomment) AS ?commentflag)
}
GROUP BY ?commentflag
Azure Databricks.JPG