Question on Enterprise Data Governance (EDG) application's Database Tables Capturing user-id

59 views
Skip to first unread message

Sanjeev Devireddy

unread,
Jul 3, 2019, 5:18:12 AM7/3/19
to TopBraid Suite Users
Hi,
     In our organization, we are asked to generate a report from the TopBraid Enterprise Data Governance (EDG) application's database, with the Tables/Columns that capture the user-ids. After looking at the database tables, our understanding is that there are total four tables in the database and are Nodes2, Prefixes2, Quads2 & Triples. Among these 4 tables, only the table named Nodes2 contains/captures the user-ids under the column named lex but not other Tables/Columns. Could someone please help us to verify that whether our understanding is correct or not?


Thanks,
Sanjeev

Holger Knublauch

unread,
Jul 3, 2019, 6:47:50 PM7/3/19
to topbrai...@googlegroups.com

Hi Sanjeev,

I assume this is about TopBraid's relational database storage, aka SDB. This stores RDF triples, and RDF triples can only store literals (e.g. strings) in the object position. So if SDB stores literals in a column called lex then this is indeed the only place where user IDs are stored. However, note that users can sometimes be stored as part of URIs, of the form urn:x-tb-users:XY where XY is the user ID, and those may be used in either subject or object position, and therefore be stored in corresponding RDB database columns.

I am not sure if this answers your question.

Holger

--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/4499566c-2e47-402a-8cc0-8afd614c1634%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sanjeev Devireddy

unread,
Jul 4, 2019, 5:43:10 AM7/4/19
to TopBraid Suite Users
Hi Holer,
     Thanks for your response. Your post answers some of my questions. Please check my other comments/queries below.

Hi Sanjeev,

 

I assume this is about TopBraid's relational database storage, aka SDB.

     
        Yes, this is about our TopBraid EDG application instance's relational database storage and it is MySQL.
 

This stores RDF triples, and RDF triples can only store literals (e.g. strings) in the object position. So if SDB stores literals in a column called lex then this is indeed the only place where user IDs are stored.

      
      So, my understanding is that if all the literals that are at the Object position of Triples are stored in the column called lex then the user IDs are also stored in that same column.
 

However, note that users can sometimes be stored as part of URIs, of the form urn:x-tb-users:XY where XY is the user ID, and those may be used in either subject or object position, and therefore be stored in corresponding RDB database columns.

      
     Does it mean to say that if an user ID (stored as part of URIs, of the form urn:x-tb-users:XY where XY is the user ID) is at the Subject position of a Triple then that user ID is stored in a different RDB database column other than the column named lex, which contains literals that are at the Object position of Triples?

 

I am not sure if this answers your question.

Holger




Thanks,
Sanjeev

Holger Knublauch

unread,
Jul 4, 2019, 7:39:04 PM7/4/19
to topbrai...@googlegroups.com

On 4/07/2019 19:43, Sanjeev Devireddy wrote:

Hi Holger,
     Thanks for your response. Your post answers some of my questions. Please check my other comments/queries below.

Hi Sanjeev,

 

I assume this is about TopBraid's relational database storage, aka SDB.

     
        Yes, this is about our TopBraid EDG application instance's relational database storage and it is MySQL.
 

This stores RDF triples, and RDF triples can only store literals (e.g. strings) in the object position. So if SDB stores literals in a column called lex then this is indeed the only place where user IDs are stored.

      
      So, my understanding is that if all the literals that are at the Object position of Triples are stored in the column called lex then the user IDs are also stored in that same column.
Yes.

 

However, note that users can sometimes be stored as part of URIs, of the form urn:x-tb-users:XY where XY is the user ID, and those may be used in either subject or object position, and therefore be stored in corresponding RDB database columns.

      
     Does it mean to say that if an user ID (stored as part of URIs, of the form urn:x-tb-users:XY where XY is the user ID) is at the Subject position of a Triple then that user ID is stored in a different RDB database column other than the column named lex, which contains literals that are at the Object position of Triples?

There are only few triples of that form, that have the user URI in the subject position. They would be triples that attach some (meta) data to the user resources themselves. But yes, if these triples occur then they would be in a different column, for the subject position of triples.

Holger



 

I am not sure if this answers your question.

Holger




Thanks,
Sanjeev
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.

Sanjeev Devireddy

unread,
Jul 5, 2019, 2:36:59 AM7/5/19
to TopBraid Suite Users
Hi Holger,
   When I look at some rows in the four tables (Nodes2, Prefixes2, Quads2 & Triples) of our TopBraid EDG RDB database then I don't see examples of user IDs/URIs stored in a column other than column named lex (in the table Nodes2), which contains Object position of Triples. So, I want to check that if the subject position of triples exists in an EDG RDB database then which Table/Column stores them? Please check the attachments to have a glance at our TopBraid EDG RDB database Tables/Columns.



Thanks,
Sanjeev
Nodes2.jpg
Prefixes2.bmp
Quads2.bmp
Triples.bmp

Holger Knublauch

unread,
Jul 5, 2019, 2:58:39 AM7/5/19
to topbrai...@googlegroups.com

From an RDF perspective, these user references will typically only appear in graphs ending with .tch - those that capture the metadata about an asset collection, including the permissions and change history. Look for triples with sioc:has_creator as predicate, for example. These are in the object position only.

The subject position would be used by the graph urn:x-evn-user-data, e.g. the triple


<urn:x-tb-users:Administrator>
  <http://topbraid.org/teamwork#recent_collections> <urn:x-evn-master:xy_graph> ;
.

Holger

--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.

Sanjeev Devireddy

unread,
Jul 5, 2019, 5:23:28 AM7/5/19
to TopBraid Suite Users
Hi Holger,
    Thanks for your response. As my actual question was not answered so assuming that I wasn't much clear in the previous post so here I am elaborating the same question again. Please check my comments below.
   
On Friday, July 5, 2019 at 12:28:39 PM UTC+5:30, Holger Knublauch wrote:
 

From an RDF perspective, these user references will typically only appear in graphs ending with .tch - those that capture the metadata about an asset collection, including the permissions and change history. Look for triples with sioc:has_creator as predicate, for example. These are in the object position only.


Yes, when the teamGraph is queried (please check the below SPARQL) then it gives the metadata including the permissions and change history. From that SPARQL results (please check the attachment), we can see the User IDs/URIs can be at Subject and/or Object positions of Triples.

SELECT ?s ?p ?o
    WHERE {
        BIND (<urn:x-evn-master:geo> AS ?projectGraph)
        rdf:nil <http://topbraid.org/teamwork#graphsUnderTeamControl> ( ?projectGraph ?teamGraph ) .
        GRAPH ?teamGraph {
            ?s ?p ?o
    }
    }



 

The subject position would be used by the graph urn:x-evn-user-data, e.g. the triple


<urn:x-tb-users:Administrator>
  <http://topbraid.org/teamwork#recent_collections> <urn:x-evn-master:xy_graph> ;
.

From previous posts it is understood that Object position of Triples are stored in the column called lex(in the table Nodes2).  Now, my question is that as you also said (in one of your previous posts) that if an User URI is at the Subject position of a Triple then that User URI is stored in a different RDB database column other than the column named lex so, what could be the TopBraid EDG RDB Table/Column that stores the User URI that is at the Subject position of a Triple?
 
Please check the below two examples:

1) User URI at Object position of a Triple:
<urn:x-evn-master:geo.tch>    <http://rdfs.org/sioc/ns#has_creator>    <urn:x-tb-users:Administrator>
In the above example, since the User URI is at Object position so, it is stored in the column called lex (in the table Nodes2).


2) User URI at the Subject position of a Triple:
<urn:x-change:2019-07-05T08-29-53.421ZAdministrator-2db2948e-2f1b-46b3-852a-cbbe5c08aa56>    rdfs:comment    Create Continent with URI http://topquadrant.com/ns/examples/geography#CONTINENT-TestContinent
Here the question is that in which TopBraid EDG RDB Table/Column the above Subject(User URI) is stored if it is not the column called lex (in the table Nodes2)?

Thanks,
Sanjeev
Subject_Object_UserURI.bmp

Holger Knublauch

unread,
Jul 10, 2019, 7:19:45 PM7/10/19
to topbrai...@googlegroups.com

Sorry for the delayed response; I was away for a couple of days.

On 5/07/2019 19:23, Sanjeev Devireddy wrote:
Hi Holger,
    Thanks for your response. As my actual question was not answered so assuming that I wasn't much clear in the previous post so here I am elaborating the same question again. Please check my comments below.
   
On Friday, July 5, 2019 at 12:28:39 PM UTC+5:30, Holger Knublauch wrote:
 

From an RDF perspective, these user references will typically only appear in graphs ending with .tch - those that capture the metadata about an asset collection, including the permissions and change history. Look for triples with sioc:has_creator as predicate, for example. These are in the object position only.


Yes, when the teamGraph is queried (please check the below SPARQL) then it gives the metadata including the permissions and change history. From that SPARQL results (please check the attachment), we can see the User IDs/URIs can be at Subject and/or Object positions of Triples.

SELECT ?s ?p ?o
    WHERE {
        BIND (<urn:x-evn-master:geo> AS ?projectGraph)
        rdf:nil <http://topbraid.org/teamwork#graphsUnderTeamControl> ( ?projectGraph ?teamGraph ) .
        GRAPH ?teamGraph {
            ?s ?p ?o
    }
    }

BTW you could replace the line starting with rdf:nil with BIND (teamwork:teamGraph(?projectGraph) AS ?teamGraph) .

And yes, correct, the user IDs are also sometimes encoded in the URIs of teamwork:Change subjects.



 

The subject position would be used by the graph urn:x-evn-user-data, e.g. the triple


<urn:x-tb-users:Administrator>
  <http://topbraid.org/teamwork#recent_collections> <urn:x-evn-master:xy_graph> ;
.

From previous posts it is understood that Object position of Triples are stored in the column called lex(in the table Nodes2).  Now, my question is that as you also said (in one of your previous posts) that if an User URI is at the Subject position of a Triple then that User URI is stored in a different RDB database column other than the column named lex so, what could be the TopBraid EDG RDB Table/Column that stores the User URI that is at the Subject position of a Triple?
 
Please check the below two examples:

1) User URI at Object position of a Triple:
<urn:x-evn-master:geo.tch>    <http://rdfs.org/sioc/ns#has_creator>    <urn:x-tb-users:Administrator>
In the above example, since the User URI is at Object position so, it is stored in the column called lex (in the table Nodes2).


2) User URI at the Subject position of a Triple:
<urn:x-change:2019-07-05T08-29-53.421ZAdministrator-2db2948e-2f1b-46b3-852a-cbbe5c08aa56>    rdfs:comment    Create Continent with URI http://topquadrant.com/ns/examples/geography#CONTINENT-TestContinent
Here the question is that in which TopBraid EDG RDB Table/Column the above Subject(User URI) is stored if it is not the column called lex (in the table Nodes2)?

I will need to ask my colleague on which RDB table is involved, and will get back to you.

Holger



Thanks,
Sanjeev
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.

Holger Knublauch

unread,
Jul 11, 2019, 7:45:29 PM7/11/19
to topbrai...@googlegroups.com

Sanjeev,

here is what I learned: The nodes table contains all nodes in the dataset. A node can be any type; blank, URI, literal. The nodes table primary key is the calculated hash for the node.  The quads table then contains the foreign keys to the nodes table.  So there is no way to know the position of a node within the statement without joining the 2 tables.

HTH
Holger
Message has been deleted

Sanjeev Devireddy

unread,
Jul 12, 2019, 4:54:15 AM7/12/19
to TopBraid Suite Users
Hi Holger,

      Thanks for confirming that the Nodes table contains all Nodes (literal / URI) in the data set. So, my understanding is that User-IDs (in the form of literals/URIs/encoded in the URIs of teamwork:Change subjects) are also stored in the Nodes table. If yes then this answers my initial/actual question on Table/Column capturing the User-IDs in the TopBraid EDG relational database.

    In our TopBraid EDG relational database, Nodes2 table has 5 columns and they are hash, lex, lang, datatype & type. Out of these 5 columns, I see that only the column named LEX contains the literals/URIs of Taxonomies/Ontologies. So, can we say that in TopBraid EDG relational database the User-IDs (in the form of literals/URIs/encoded in the URIs of teamwork:Change subjects) are stored only in the Table Nodes2 under the column LEX?


Thanks,
Sanjeev

Holger Knublauch

unread,
Jul 12, 2019, 6:57:07 PM7/12/19
to topbrai...@googlegroups.com
The table Nodes2 stores all nodes, both literals and URI's.  The lex column contains the human readable form for the node.  The Nodes2 table does not provide any indication as to the position of the node within any particular statement.  In fact nodes are never duplicated within the Nodes2 table.  The same node is always represented by the same row in the database regardless of the nodes position within any given triple, or even graph for that matter.

The Nodes2 table is indexed on the hash column.  This hash value is used as a foreign key in the Quads2 table.  It is in this table where the subject, predicate, and object of a triple are defined.  These respective columns contain the hash values for the nodes in the triple.

It should be pointed out that although there is nothing proprietary about this scheme, any use of it for interfacing or ETL purposes is discouraged and not supported.

Holger
--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.

Sanjeev Devireddy

unread,
Jul 13, 2019, 12:42:11 AM7/13/19
to TopBraid Suite Users
Hi Holger,

    Thanks for detailed explanation. Now there is clear understanding on the table Nodes2 and it's relation with the table Quads2.
  
     In our case, the reason behind looking at TopBraid EDG relational database Tables/Columns is that to make sure that our TopBraid EDG application instance is in compliance with European Union (EU) General Data Protection Regulation (GDPR) and to be able to fulfill the Data Subject's(user's) request(s) for Erasure/Rectification of their personal data(in our case User-IDs) stored in EDG application database. Our EDG application instance uses Lightweight Directory Access Protocol (LDAP) for authentication &authorization and the user-id for the users is their email-address. While submitting the reports to our organization's internal GDPR team, as we are always supposed to refer to the relational database directly in terms of Tables/Columns so, we are working towards reading/replacing User-IDs directly from the TopBraid EDG relational database using Structured Query Language (SQL).


Thanks,
Sanjeev
Reply all
Reply to author
Forward
0 new messages