property graph standardization

Joshua Shinavier

Feb 25, 2019, 7:03:18 AM

to gremli...@googlegroups.com

O users of Gremlin,

As you may know, there is a W3C workshop coming up in March which deals with standardization of the property graph data model. This is not the first time such a thing has been attempted, but there is far more momentum behind this new effort, and it seems likely that some sort of consensus recommendation will come together in the months following the workshop.

I know there are diverse points of view, on this list, about standardization. Do property graphs even need a standard? Would a standard be used by the community if it existed, or does a formal specification of the data model run contrary to the pragmatic, code-first vibe of the graph database community? I thought it would be interesting to sound off here on these questions, as it is ultimately the developer community which will determine whether such a standard gains traction. What follows is my personal take.

First of all, I am going to claim unequivocally that yes, the graph community would benefit from a standard. Why? Because the property graph ecosystem has long since outgrown its toddler phase in which the main selling point of the data model was developer friendliness (vis-a-vis powerful but complex frameworks like RDF and OWL). Simplicity and friendliness are still as important as ever, but on that front, the property graph data model has already proven itself, and become very popular in the process. Meanwhile, the graph ecosystem has become far more sophisticated than it was circa 2009. SemWeb-like things such as schema languages, mapping languages, rules, metadata vocabularies, etc. naturally crop up in spite of the lack of standards because demanding real-world applications require increasing expressive power, type safety, guarantees of computability, etc. The fact that we don't have standards just means that we don't have clean and consistent ways of interconnecting these solutions when they do arise. See my slide show, "Evolution of the graph schema", for a brief overview of graph schema frameworks in particular.

I could go on about why it is useful to build bridges between data models, and about the rich history of the adapter pattern in TinkerPop. However, I would rather just make a couple of points about what an effective property graph standard might look like.

Above all, it should be recognized that there is not one property graph data model, but many. Look at Gremlin's Graph.Features. The data types supported for ids and properties vary widely. Some property graph implementations support higher-level constructs like meta-properties, while others don't even support edge properties. List, map, and array types, multi-properties vs. simple properties -- there is a great deal of variation in what we consider to be a "property graph", and Graph.Features paints this variation in broad strokes. This is why we might at first recoil at the idea of a "standard"; it sounds like a straitjacket which will prevent individual property graph back-ends from doing their own thing.

So let's not talk about that. Property graphs are not RDF, and there will never be a single, rigid PG specification we can all conform to. However, there are ways of formally describing the whole family of data models which do preserve nuance. Enter category theory.

Category theory has been a kind of footnote to a footnote to property graphs since the early days of TinkerPop. Marko used to bring it up, and there is even a page on the old Blueprints wiki about graph types and transformations. More recently, there has also been some early discussion on the Semantic Web side about applying category theory to RDF. I think we should start to combine and amplify these threads.

Basically, category theory has some important things going for it when it comes to graphs. First, the phrasing of category theory is rather similar to graph theory, which ought to appeal to this community: instead of nodes and edges, one speaks of objects and morphisms, as well as higher-level constructs like functors and natural transformations. These are already more familiar to you than you might think. For another thing, CT is often communicated using easily understandable diagrams, making it a good lingua franca for aligning graph concepts at a high level of abstraction. Btw. the paper Marko shared a few days ago on stream ring theory uses similar commuting diagrams in the context of algebra. Finally, category theory is very friendly toward automated reasoning.

The following images are from another talk I gave last month at Data Day Texas. Without my telling you anything more about CT, see if you can understand them.

High-level types:

Edges:

Vertex properties:

Edge properties:

Vertex meta-properties:

Meta-edges: edge to vertex:

Meta-edges: vertex to edge:

Hyper-edge:

RDF statement:

Get it? Can you see how closely related edge labels are to property keys, and how slight are the differences between vanilla edges and weird things like meta-edges and hyper-edges? They don't look so weird when drawn like this. In most cases, we are just permuting labels within the same basic diagram.

tl;dr if you like dots and arrows, you might like category theory, and you might not find the idea of a property graph standard -- or a common language for describing and interconnecting graph data models -- too daunting.

What are your thoughts? If there were a common API for property graph schemas and transformations, what would you want in it?

Best,

Josh

Marko Rodriguez

Feb 25, 2019, 12:04:06 PM

to gremli...@googlegroups.com

Hi Josh,

Wow. That was an awesome write up. I really appreciated the included graphics and respective discussion.

Before reading your email, my thoughts on PG standardization were this:

1. No one will agree on vertex labels: one, many, or is that just a vertex property?

2. No one will agree on vertex properties: maps, multi-maps, or RDF-style literals?

3. No one will agree on graph schema representations. String/long, pg:type, etc.?! (crazy hole)

4. Query language… another endless rat's nest. Fluent, SQL-like, language-hosted, String parsing, …?!

After reading your email I see that at least 1 and 2 are “solvable” from a categorical perspective of mapping between representations. However, how would that happen in practice? Yes, Neo4j can simulate multi/meta properties (as we do in gremlin-neo4j/), but what does that look like from a standardization perspective? What is the ‘standard’? What I glean from your writeup is that there is “no standard”, just categories and mappings between them. I believe you are saying: "choose your category and map between them as needed." If so, I like that, but then what about the query language aspect? How does it become category agnostic?

Good stuff,

Marko.

http://markorodriguez.com

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/CAPc0Ouu7%3Dukat%3DwgcA-4cG4GryO3VOU_MeO25AbPsTG5_-sjjA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Joshua Shinavier

Feb 25, 2019, 4:53:09 PM

to gremli...@googlegroups.com

Hey Marko,

Thanks for the feedback and enthusiasm. IMO, the starting point for a standard has been staring us in the face: Graph.Features, or rather Graph.Features++. This has already been vetted to a certain extent by the community. Although many of the features deal with data reading and writing, transactions, etc. there is a subset that essentially defines a graph data model. What are the primitive data types of the model? Which basic structures / element kinds are supported? What sort of set-theoretical constraints exist among the types?

If you peek into other property graph schema frameworks -- like Neo4j's or JanusGraph's -- you find additional vocabulary for existence constraints, more fine-grained notions of cardinality and multiplicity, enumerations of labels and keys, and slightly different sets of primitive data types. All of this vocabulary is potential raw material for a standard -- i.e. a language for describing property graph data models. However, I also think it is important to choose a solid theoretical foundation rather than just picking and choosing among commonly used terms. I was excited to learn that you had been thinking about algebras, as I see the potential foundation in algebraic data types, and of course category theory.

For a start, data models will be much more portable if we can standardize on a set of core data types, and it should not be hard to come up with a reasonable consensus on these types. At Uber, we have just called the basic types Boolean, String, Integer, Float, etc. and provided parameters for numeric precision and signedness. Define a reasonable set of primitive types, then think about bindings for XML Schema, interface description languages, and individual programming languages.
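To make the idea of a small core type set with parameters for precision and signedness concrete, here is a minimal Python sketch. The names Kind, PrimitiveType, bits and signed are illustrative assumptions; this is not Uber's schema language and not a TinkerPop API.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    BOOLEAN = "Boolean"
    STRING = "String"
    INTEGER = "Integer"
    FLOAT = "Float"


@dataclass(frozen=True)
class PrimitiveType:
    """A core data type; bits and signed are only checked for integers here."""
    kind: Kind
    bits: int = 0          # hypothetical precision parameter (0 = not applicable)
    signed: bool = True    # hypothetical signedness parameter

    def accepts(self, value) -> bool:
        """Check whether a Python value fits this type."""
        if self.kind is Kind.BOOLEAN:
            return isinstance(value, bool)
        if self.kind is Kind.STRING:
            return isinstance(value, str)
        if self.kind is Kind.FLOAT:
            return isinstance(value, float)
        # INTEGER: exclude bool (a subclass of int), then check the range
        if isinstance(value, bool) or not isinstance(value, int):
            return False
        if self.signed:
            low, high = -(2 ** (self.bits - 1)), 2 ** (self.bits - 1) - 1
        else:
            low, high = 0, 2 ** self.bits - 1
        return low <= value <= high


INT32 = PrimitiveType(Kind.INTEGER, bits=32, signed=True)
UINT8 = PrimitiveType(Kind.INTEGER, bits=8, signed=False)
```

Bindings to XML Schema or an IDL would then amount to a lookup from (kind, bits, signed) to the target language's type name.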

As I suggested in my previous email, we can also think of vertex labels, edge labels, and property keys as types. Vertex labels are atomic, and IMO every vertex should have exactly one, just as every primitive value has exactly one data type. You can think of unlabeled vertices as having a designated "null" label. Some graph back-ends may not support the null label, and this should be explicit. On the other hand, edge labels and property keys are not atomic; each one has projections to two other types. In terms of algebra, an edge label or property key contains a product, or ordered pair, of two types. Btw. it is a short step from there to hyper-relationships, which are tuples, i.e. the product of any number of other types. Likewise for relational schemas.
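One way to read "labels and keys as types" is sketched below in Python. All class and field names are assumptions made for illustration: a vertex label is atomic, while an edge label or property key carries projections to two other types, and a hyper-relationship label carries any number of them.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class VertexLabel:
    """Atomic type: no projections to other types."""
    name: str


@dataclass(frozen=True)
class PrimitiveType:
    name: str


@dataclass(frozen=True)
class EdgeLabel:
    """Product of two types: projections to an out-type and an in-type."""
    name: str
    out_type: VertexLabel
    in_type: VertexLabel


@dataclass(frozen=True)
class PropertyKey:
    """Product of an element type and a value type."""
    name: str
    element_type: Union[VertexLabel, EdgeLabel]
    value_type: PrimitiveType


@dataclass(frozen=True)
class HyperLabel:
    """Product of any number of types (a tuple type)."""
    name: str
    components: Tuple[VertexLabel, ...]


NULL = VertexLabel("null")  # designated label for "unlabeled" vertices
person = VertexLabel("Person")
knows = EdgeLabel("knows", out_type=person, in_type=person)
since = PropertyKey("since", element_type=knows, value_type=PrimitiveType("Integer"))
```

Here since is an edge property key because its element type is an edge label; swapping in a vertex label gives a vertex property key with the same shape.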

It is also tempting to include sum types in the schema language. For example, the sum of the unit type with any other type is an "optional". The sum of any two types is an "either/or", which enables pattern matching on types. Pattern matching is currently not typical in property graph databases, but there are some really interesting possibilities here, and it enables crossover between property graphs and languages like Protobuf, Thrift, and Avro (which are the bread and butter of data exchange in many places). Among other things, this makes it possible to expose bigger chunks of enterprise data as a graph.
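A small Python sketch of the two sum types mentioned above, with hypothetical names Unit, Left and Right. An "optional" is modelled as the sum of the unit type with another type, and describe does a simple pattern match on the variant.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Unit:
    """The unit type: exactly one value, carrying no information."""


@dataclass(frozen=True)
class Left(Generic[A]):
    value: A


@dataclass(frozen=True)
class Right(Generic[B]):
    value: B


# Either[A, B] is the sum of A and B; an optional A is Either[Unit, A].
Either = Union[Left[A], Right[B]]


def describe(x) -> str:
    """Pattern match on the variant of a sum-typed value."""
    if isinstance(x, Left):
        if isinstance(x.value, Unit):
            return "absent"
        return f"left: {x.value}"
    if isinstance(x, Right):
        return f"right: {x.value}"
    raise TypeError(f"not a sum value: {x!r}")
```

A Protobuf oneof, a Thrift union or an Avro union would each map onto a tagged value of this kind, which is what makes the crossover plausible.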

With respect to query languages, there are some concepts from functional programming which are a great fit for data models based on algebraic data types, and the good news here is that Gremlin already embodies some of them. For example, parametric polymorphism makes queries "category agnostic" -- a mapping, or in category theoretical terms, a functor, literally cannot depend on the identities of the objects it deals with, and must preserve the compositional structure of the arrows between them.

While the gremlin-scala wiki points out that Gremlin traversals are not actually monads, in many cases you can think of them that way. A monad is a mapping that consumes a thing, like a vertex of a particular label, and produces a container full of other things -- like an iterator of vertices of another label -- together with a rule for binding steps together so as to feed the output of one step into the input of the next. Some steps are functionally pure, whereas others carry state. It would be really interesting to define a purely functional subset of Gremlin, although in my opinion the case for standardization is not as strong for graph query languages as for data models.
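The "rule for binding steps together" can be sketched in a few lines of Python. This is only an analogy under stated assumptions: bind, out_knows and the KNOWS adjacency data are hypothetical, and nothing here is Gremlin's actual implementation.

```python
from typing import Callable, Dict, Iterable, Iterator, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# A step consumes one thing and produces a container of other things.
Step = Callable[[A], Iterable[B]]


def bind(inputs: Iterable[A], step: Step) -> Iterator[B]:
    """Feed every output of the previous step into the next step."""
    for item in inputs:
        yield from step(item)


# Toy adjacency data (hypothetical): vertex -> vertices it "knows"
KNOWS: Dict[str, List[str]] = {
    "marko": ["josh", "vadas"],
    "josh": ["peter"],
    "vadas": [],
    "peter": [],
}


def out_knows(v: str) -> List[str]:
    """A pure step: it depends only on its input vertex."""
    return KNOWS.get(v, [])


# Two chained steps, roughly like chaining two out('knows') steps in Gremlin
friends_of_friends = list(bind(bind(["marko"], out_knows), out_knows))
```

Because bind never inspects the items it passes along, it works unchanged for any element type, which is the parametric polymorphism point made above.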

Josh


pieter martin

Feb 26, 2019, 2:29:09 PM

to gremli...@googlegroups.com

Hi,

I have also previously thought about what a graph standard would/should look like. However, I thought about it more from a UML modeling perspective, or rather from the OMG's MOF meta-meta-modeling perspective. Basically, create a meta model for property graphs using the MOF.

Afraid I have no experience in the mathematical discussions being had so can't really say what's the benefit to using meta models as opposed to Category theory as a foundation.

I am glad you mentioned hyper graphs/relations. It is a construct that I often miss. Using an intermediary vertex/node to fake the hyper relationship is unsatisfactory.

Another matter I'd like to raise is creating some place/space for what in UML is "association ends". I find it more expressive to design, think, talk, code and work with models using "association ends" than a directional edge label.

E.g.

Hand -----finger----> Finger

From the perspective of the Finger you have to navigate "finger" to get to hand. Not very expressive.

With association ends

Hand[hand]---------[finger]Finger

Now from Finger you navigate "hand" to get to Hand, and from Hand you navigate "finger" to get to Finger.

E.g.

Marko -----knows-----> Josh.

Josh in, reverse, otherside (knows) Marko.

With association ends

Marko[knownBy]-----------[knows]Josh

Marko knows Josh

Josh knownBy Marko.

Not sure if there will be any appetite for this, but I reckon association ends are semantically equivalent to a directional edge, yet are more readable and convey information better.
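The association-end idea from the examples above can be sketched in Python as follows. Association and navigate are hypothetical names for illustration; each end of the edge gets its own name, and navigation picks an end name rather than a direction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Association:
    """An edge whose two ends are each named, instead of one edge label."""
    source: str
    source_end: str   # name used to reach `source` from `target`
    target: str
    target_end: str   # name used to reach `target` from `source`


def navigate(assocs: List[Association], start: str, end_name: str) -> List[str]:
    """Follow the end called `end_name` away from `start`, in either direction."""
    found = []
    for a in assocs:
        if a.source == start and a.target_end == end_name:
            found.append(a.target)
        if a.target == start and a.source_end == end_name:
            found.append(a.source)
    return found


# Marko[knownBy]-----------[knows]Josh
assocs = [Association("Marko", "knownBy", "Josh", "knows")]
```

navigate(assocs, "Marko", "knows") yields Josh and navigate(assocs, "Josh", "knownBy") yields Marko, so neither side has to talk about an "in" or "out" direction.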

You also mention cardinality. This is another feature I miss. Currently all edges/relationships/associations in TinkerPop are many-to-many relationships with Bag semantics. Again inspired by UML, I'd like to see multiplicity, order and uniqueness. Order and uniqueness refer to whether the vertices at the other end of an edge are a List/Set/OrderedSet/Bag. A graph data model is greatly enhanced by having multiplicity, order and uniqueness.
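As a sketch of how multiplicity, order and uniqueness could be declared per edge end, consider the following Python. EndConstraint and its fields are assumptions for illustration, loosely following UML's lower/upper bounds and ordered/unique flags; nothing like this exists in TinkerPop today.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EndConstraint:
    """UML-style constraint on the vertices at one end of an edge label."""
    lower: int = 0
    upper: Optional[int] = None   # None means unbounded ("*")
    ordered: bool = False
    unique: bool = False

    def collection_kind(self) -> str:
        """Map the (ordered, unique) flags to a collection kind."""
        if self.ordered and self.unique:
            return "OrderedSet"
        if self.ordered:
            return "List"
        if self.unique:
            return "Set"
        return "Bag"

    def validate(self, neighbours: List[str]) -> bool:
        """Check multiplicity bounds and, if required, uniqueness."""
        n = len(neighbours)
        if n < self.lower or (self.upper is not None and n > self.upper):
            return False
        if self.unique and len(set(neighbours)) != n:
            return False
        return True


BAG_MANY = EndConstraint()  # the current default: many-to-many, Bag semantics
FINGERS = EndConstraint(lower=0, upper=5, ordered=True, unique=True)
```

The default constructor reproduces the many-to-many Bag behaviour described above, so a back-end that ignores the constraints would still be describable in the same vocabulary.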

Thanks

Pieter

Fred Eisele

Feb 27, 2019, 1:13:53 PM

to Gremlin-users

Why did you leave out edge connecting two edges?

Is it because it is not familiar to property graph aficionados?

Such a construct is critical to properly model functors.

A functor being (roughly) a set of morphisms between morphisms and morphisms between objects where path composition is preserved.

Functors being fundamental in category theory.

e.g. functors are used in categories as schema.
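To illustrate what "path composition is preserved" asks of a schema mapping, here is a minimal Python sketch. It assumes schemas are plain graphs with no path equations (free categories), and the names Graph, path_endpoints and is_functor are hypothetical. Each source edge is sent to a path in the target, and the check is that endpoints line up.

```python
from typing import Dict, List, Tuple

# A schema as a graph: edge name -> (source object, target object)
Graph = Dict[str, Tuple[str, str]]


def path_endpoints(graph: Graph, start: str, path: List[str]) -> Tuple[str, str]:
    """Return (source, target) of a path of composable edges starting at `start`."""
    current = start
    for edge in path:
        src, dst = graph[edge]
        if src != current:
            raise ValueError(f"edge {edge} does not start at {current}")
        current = dst
    return start, current


def is_functor(src: Graph, dst: Graph,
               on_objects: Dict[str, str],
               on_edges: Dict[str, List[str]]) -> bool:
    """Check that each edge a -> b maps to a path F(a) -> F(b).

    Extending the mapping to paths by concatenation then preserves
    composition by construction.
    """
    for edge, (a, b) in src.items():
        try:
            _, end = path_endpoints(dst, on_objects[a], on_edges[edge])
        except (KeyError, ValueError):
            return False
        if end != on_objects[b]:
            return False
    return True


# Example schemas (hypothetical): a direct edge mapped to a two-step path
simple = {"employer": ("Person", "Company")}
detailed = {"holds": ("Person", "Job"), "at": ("Job", "Company")}
objects = {"Person": "Person", "Company": "Company"}
```

With these schemas, mapping employer to the path holds followed by at passes the check, while mapping it to at alone fails because that edge does not start at Person.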

Joshua Shinavier

Feb 28, 2019, 9:38:53 AM

to gremli...@googlegroups.com

Hi Pieter,

Using OMG's MOF as an intermediate language (category, if you will) is an interesting idea. This might make sense if there are other data models with existing mappings to/from MOF which we would like to bring into TinkerPop-space.

W.r.t. your "association ends", how much experience do you have with RDF and OWL? It is actually quite common to have a pair of RDF properties (e.g. hasParent vs. hasChild) which represent inverse relationships. OWL even includes some vocabulary to declare that two properties are mutual inverses.

Note that in labeled hypergraphs, this becomes a little more complicated; relationships can include more than two "roles", i.e. they can connect more than two entities. In a tuple like (Zeus, Hera, Hephaestus), typed as relationship (father, mother, child), you can speak of the sub-relationship between Hera and Hephaestus as a projection (Hera, Hephaestus) / (mother, child). Now you essentially have an undirected edge with a label at each end and a label in the middle. You could use some combination of those labels for your "association ends".
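The projection described above is easy to sketch in Python; the function name project is a hypothetical choice. A role-labeled tuple is reduced to the roles of interest.

```python
from typing import Dict, Tuple


def project(roles: Tuple[str, ...], values: Tuple[str, ...],
            keep: Tuple[str, ...]) -> Dict[str, str]:
    """Project a role-labeled tuple onto a subset of its roles."""
    if len(roles) != len(values):
        raise ValueError("roles and values must have the same length")
    by_role = dict(zip(roles, values))
    return {role: by_role[role] for role in keep}


family = (("father", "mother", "child"), ("Zeus", "Hera", "Hephaestus"))
mother_child = project(*family, keep=("mother", "child"))
```

Here mother_child pairs "mother" with Hera and "child" with Hephaestus, which are exactly the two end labels available for an association end.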

Back to OWL, though, there is also vocabulary to indicate that a property is "functional" (e.g. hasMother would be functional if we assume a person can't have more than one mother) and/or "inverse functional" (e.g. the inverse property motherOf would be inverse functional under the same assumptions).
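The two OWL notions can be checked over a set of (subject, object) pairs with a few lines of Python. The function names and the sample data are assumptions for illustration; OWL itself expresses these as declarations on the property, not as runtime checks.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]  # (subject, object)


def is_functional(pairs: Iterable[Pair]) -> bool:
    """True if no subject is related to more than one distinct object."""
    seen = {}
    for subject, obj in pairs:
        if seen.setdefault(subject, obj) != obj:
            return False
    return True


def is_inverse_functional(pairs: Iterable[Pair]) -> bool:
    """True if no object is related to more than one distinct subject."""
    return is_functional((obj, subject) for subject, obj in pairs)


# Illustrative data: hasMother, and its inverse motherOf
has_mother = [("Hephaestus", "Hera"), ("Ares", "Hera")]
mother_of = [(o, s) for s, o in has_mother]
```

On this data hasMother is functional but not inverse functional, and motherOf is the other way round, matching the description above.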

I agree that support for lists, as well as algebraic data types in general (i.e. tuples, and unions) would be a powerful next step for TinkerPop's core data model.

Josh


Joshua Shinavier

Feb 28, 2019, 9:48:19 AM

to gremli...@googlegroups.com

Hi Fred,

I left out edge-edge edges in the interest of time. Likewise for what you might call index properties, which map from a primitive data type to a vertex or edge label. However, a data model that supports vertex-edge and edge-vertex edges would presumably support edge-edge edges, as well.

Functors are indeed key to mappings between data models / between schemas. Or rather, I think it would be fruitful to begin conceptualizing and talking about transformations in this way. I think category theory is a good bandwagon for us in the graph community to jump on -- not only because it is easy to grok once you have been thinking in terms of dots and arrows for a while, but also because it is a powerful alternative to set theory which is gaining a lot of traction lately... i.e. it will facilitate building bridges between property graphs and external data models by virtue of the fact that in some cases, we may find the bridge already half-built from the other end.


Fred Eisele

Feb 28, 2019, 11:49:01 AM

to Gremlin-users

I am hoping to be able to use TinkerPop as a backing store for several tools.

The ability to make these advanced edges is critical.

Here are some of the tools:

- https://github.com/CategoricalData/fql : for performing category/schema synchronization

- https://webgme.org/ : for custom languages

- SysML tools like: https://sparxsystems.com/resources/repositories/index.html

The current TinkerPop works better than RDBMS in each of these situations.

But, it is still cumbersome.

As Edge and Vertex already inherit from Element it seems like fusing Vertex and Edge is not without merit.

pieter martin

Feb 28, 2019, 2:35:21 PM

to gremli...@googlegroups.com

Hi,

Afraid I am not particularly familiar with OWL apart from the odd scan.

That said, I'll give the OWL primer you linked to below a read and familiarize myself with it. From what I know, OWL/RDF and the structural parts of UML (Class Diagrams) are far more similar than they are different.

Regarding hypergraphs,

I think we should distinguish between an edge that connects 2 or more vertices (N-ary associations in UML), and an edge that itself is also a vertex and can have an edge to other vertices. AssociationClass in UML.

An edge can link 2 or more vertices like the ternary edge between (Zeus, Hera, Hephaestus)

and an edge itself can also be a vertex and have an edge to some other vertex.

A typical example is,

Company<---Job--->Person

           worksAt---->Location

Job is the edge between Company and Person and Job has a "worksAt" edge to a Location.
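A minimal Python sketch of this second notion, where an edge can itself be an endpoint of another edge. The class names and the sample values (Acme, Alice, Cape Town) are hypothetical, and this is not how TinkerPop's Element hierarchy currently works.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Vertex:
    label: str
    name: str


@dataclass(frozen=True)
class Edge:
    """A generalized edge: each endpoint may be a vertex or another edge."""
    label: str
    ends: Tuple["Element", ...]


Element = Union[Vertex, Edge]

acme = Vertex("Company", "Acme")
alice = Vertex("Person", "Alice")
office = Vertex("Location", "Cape Town")

job = Edge("Job", (acme, alice))            # edge between Company and Person
works_at = Edge("worksAt", (job, office))   # edge whose first end is an edge
```

Because ends is a tuple of arbitrary length, the same Edge class also covers the n-ary case, so both notions of hypergraph fit one structure.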

Regarding sub-relationships (projections). I don't think that should be a structural construct of the meta model but perhaps rather a runtime view, projection, named query.

If we at all decide to go with hypergraphs in whatever format or description, the part that I have not considered is how the gremlin would look to address the new navigational possibilities. Have you?

The requirement of multiplicities and support for list/sets/orderedsets/bags does not really affect gremlin much as far as I can tell.

Cheers

Pieter

Joshua Shinavier

Mar 4, 2019, 9:04:59 AM

to gremli...@googlegroups.com

Hi Fred, Pieter. Please don't mind the delay; I am on vacation now but thought I would chime in with a Zoom link for the workshop, which kicked off earlier this morning:

https://uber.zoom.us/j/373501651

There is some interesting discussion going on right now about how the various communities which have come to use graph DBs and graph queries can start to come together around standards. The full program can be seen here:

https://www.w3.org/Data/events/data-ws-2019/schedule.html

The Zoom is listen-only (though you can chat questions to those in the room if you have any), so please mute yourself if you join.

Fred, I agree it would be worthwhile to generalize Element and Property somewhat, and Pieter I agree there are two notions of hypergraph in common use: the usual mathematical notion of a hypergraph in which edges may join multiple vertices, and the edges-reified-as-vertices notion. Many "hypergraph" databases have supported both attributes in parallel, e.g. Hypernode, GROOVY, HypergraphDB, and GRAKN, also adding some kind of label on vertices/edges and on the "roles" which relate an edge to the vertices it joins. If TinkerPop were to support these constructs, I would suggest just speaking of "graphs" with generalized edges.

Josh


Joshua Shinavier

Mar 5, 2019, 4:26:51 AM

to gremli...@googlegroups.com

Minutes from the first day of the workshop are here:

https://docs.google.com/document/d/1yveIHGpxn49Ke90xX1DSuAK7CTGLgCVnaZN1q9k9H28/edit#heading=h.aw727h4rx0gy

The Zoom recording is here (content starts at 4 hours, 15 minutes):

https://uber.zoom.us/recording/play/GJvK_j5nozeHWpQP3a2bir1BCD_CVQfhChCPXy7Pauf51nWN02MABCptKXMJeez2?autoplay=true

This was an interesting mix of perspectives, with strong opinions on the one hand that formalizing mappings is the hardest and most important part of building bridges between data models, and on the other hand that creating efficient implementations is the hard part. I tend to agree that we have some work to do on the property graph side to define and internalize a more formal and complete data model. I think that formal data model should be a cornerstone of TinkerPop 4.

Lots of good work and ideas on display, including the use of GraphQL SDL for defining property graph schemas, defining a property graph schema as a property graph, Cypher for Apache Spark, embedding of Gremlin queries in Cypher, and an impassioned argument against 3-value logic (i.e. weak NULL semantics) in graph query languages. See the minutes for more details.

Josh

pieter martin

Mar 5, 2019, 9:10:34 AM

to gremli...@googlegroups.com

Thanks for the info. Interesting discussions.

Joshua Shinavier

Mar 6, 2019, 5:57:05 AM

to gremli...@googlegroups.com

Pieter, glad you have been able to listen in. It has been very much a fly-on-the-wall form of participation due to the A/V setup, but our notes in the meeting minutes have been heard, and I had done my best to make sure that the TinkerPop ecosystem was represented in the Property Graph Schema Working Group leading up to the workshop. In the session summary for Graph Models and Schema (minutes here), Juan Sequeda repeated my comment that "historically, property graphs were somewhat of a reaction to the complexity of RDF. A complex standard will not be accepted by the developer community". The "keep it simple" mantra was repeated over and over during that session. There was possibly a little lack of clarity about what it means to be simple, and whether simplicity and expressivity are at odds when it comes to a property graph standard.

Harsh Thakkar and Dmitry Novikov were in attendance to present on SPARQL-Gremlin and Cypher for Gremlin, respectively. See the minutes here. Olaf Hartig's work on RDF* and SPARQL* received quite a bit of attention, with many suggesting that these should be promoted to W3C recommendations. Olaf also elaborated on his proposal to specify property graph schemas using GraphQL SDL. This struck a chord with me, as the key concerns in this space are very similar to the concerns of aligning property graphs with data models based on record syntax and algebraic data types, such as Protobuf, Thrift, and Avro (something we do at Uber, and which I discussed in the talk I linked). I don't know whether GraphQL SDL should be the definitive language for specifying property graph schemas, but a well-defined mapping between a GraphQL SDL extension and an abstract property graph data model certainly makes sense.

Some key questions from the session on graph data interchange concerned the relative importance of mappings vs. actual formats, and the actual choice of a format. GraphSON3? JSON-LD? JSON Graph? Minutes from that session are here.

My personal takeaway from the workshop is that there is a real sense of urgency about moving forward with a set of standards for property graph data model(s), query language(s), serialization format(s), and mappings to other data models, and that while there are a number of reasonable proposals on the table, there is no clear consensus in any of these areas. As I said, there is a risk of premature optimization. At the same time, we in the graph DB community are accountable for the absence of some of these necessary things, including a formal data model. I raised my hand to help fill that gap. At no point during the workshop was the idea of a categorical formalism for property graphs floated, other than in my links, but I expect to continue advancing that approach through the working group.

IMO, we should not discount the importance of early design work on TinkerPop 4 for informing some of these efforts around standards. I think it is fair to say that property graphs and associated tech were elevated and sustained by TinkerPop, and this did not go unrecognized in the workshop. It will be interesting to see how the developer / standards / biz dynamic evolves going forward.

See the workshop summary here.


Ryan Wisnesky

Mar 8, 2019, 4:27:26 PM

to Gremlin-users

Hi All,

Fred Eisele pointed me to this thread; I am not a Gremlin user, but since the topic of category theory and graph databases has come up, I wanted to speak up and mention that others and I at MIT, NIST, and other places have mounted a direct attack on many of the problems mentioned here, such as formalizing database schemas as categories and schema mappings as functors, as well as tackling the hard (often undecidable) problems encountered in solving them at scale and/or in full generality. We have an open-source tool and many published papers and talks hosted here:

http://categoricaldata.net/fql.html

http://github.com/CategoricalData/fql

Fred has been using our software for schema evolution, and others have applied it to various ETL and data integration tasks - see link above. Because graphs and categories are so closely related, there could very well be results useful to your efforts contained in this research program. I am happy to chat further if there's any interest!

Ryan Wisnesky

http://wisnesky.net

Joshua Shinavier

Mar 18, 2019, 3:04:18 PM

to gremli...@googlegroups.com

Hi Ryan,

Just wanted to respond here to say that yes, CQL and the conceptual model described in the links are definitely relevant to these discussions of categorical formalisms for property graphs, RDF, and knowledge graphs. There is a lot of material here to dig into, but I have taken a look at "Ologs" and the recent paper on CQL.

Even the simple fact that you use commuting diagrams to represent assertions, constraints and queries is powerful IMO. As the authors say toward the end of the ologs paper ("Implementing ologs in the real world"), much is possible once you have good categorical representations for knowledge. The problem is that there is usually an input bottleneck around that criterion of "good", particularly when we are talking about knowledge graphs constructed by people. Look at the long history of semantic wikis, for example. Coming up with good ways to allow users to express semantically rich data, easily and fluidly, tends to be hard, though an approachable idiom of diagrams, dots, and arrows may help.

Josh


Ryan Wisnesky

Mar 18, 2019, 4:29:50 PM

to gremli...@googlegroups.com

Thank you for your kind words, Josh. We’re happy to help - in particular, if there are math questions that can help you all get to where you want to be, or experiments with CQL, we’re happy to give them some thought. Our research program focused primarily on understanding the connection between CQL/ologs/category theory and SQL/relational algebra/functional programming, and to a lesser extent, RDF. The connections with graph theory were not as thoroughly explored by our group, but seem relevant; just to give an example, mathematicians have developed over 30 different visual yet completely formal diagrammatic calculi for ‘string diagrams’ alone: https://www.mscs.dal.ca/~selinger/papers/graphical.pdf . Different kinds of graphs (multi-edge or not; allow splitting or not; allow choice or not; allow crossing wires, etc.) can be interpreted as diagrams in different categories (having products; having diagonals; having co-products, commutativity, etc.). I can’t say for sure if that particular line of work will be useful, but I wanted to mention it specifically as a survey paper connecting category theory with graphs, written from a non-data-migration perspective (the “applied category theory” field is far broader than just CQL).


Joshua Shinavier

Mar 18, 2019, 7:21:15 PM

to gremli...@googlegroups.com

Thanks, Ryan. The survey paper you linked is pretty rich with diagrammatic goodness that could potentially be applied to TinkerPop. For example, just look how much more readable diagrams 1.5 and 1.6 are than the tensor notation in 1.3. The same comparisons can be made for graph queries. E.g. in the Penrose notation (or the cleaned-up version in the survey paper; the original freehand diagrams in Penrose's paper are messier and more interesting):

See also Marko's stream notation. I think you should be able to tell at a glance what the query does, even if you have not encountered the notation before.

Josh
