Algebraic Property Graphs


Joshua Shinavier

Sep 11, 2019, 10:35:51 AM
to gremli...@googlegroups.com
Users of Gremlin,

I am happy to give you a first look at a paper that has been several months in the making: Algebraic Property Graphs. This has been a great collaboration between Ryan Wisnesky (of Conexus AI) and me, with early contributions from Jan Hidders (active in the Property Graph Schema Working Group).

The paper addresses the long-standing need for a formally specified property graph data model (of which RDF* is another, contrasting example). It is somewhat like a bigger, better Graph.Features, and is originally based on my work on integrating graph and non-graph schemas at Uber, where we rely heavily on loose systems of algebraic datatypes, including Avro, Thrift, and Protocol Buffers. Algebraic property graphs allow the majority of our data to be treated as graph data, without the need for manual mappings.

In my Global Graph Summit talk last January, I motivated formalizing the data model in terms of category theory, but it took Ryan's deep expertise to pull this off. See my earlier messages to gremlin-users (esp. here and here) for context. The result is a framework which is conceptually simple, as property graphs should be, but mathematically rigorous and with straightforward connections to past work on database theory and data integration.

Since Ryan and I started writing the paper, mm-ADT has also burst into glorious existence (see Marko's talk at ApacheCon today), and delves deep into process in addition to structure. We will be working to bring the best of both worlds to TinkerPop 4.

Josh


Note: this paper has just gotten through Uber's lengthy internal review process. A more permanent arXiv preprint of the paper should become available later today; we will tweet the link and/or post it here.

Josh Perryman

Sep 12, 2019, 10:28:11 PM
to Gremlin-users
Josh- 

Thank you for emailing this to the group. It looks like you and Ryan have provided the rigor necessary for the algebra. I have some projects coming up where I expect to be making use of this as I build out transformations between graphs and other systems.

Cheers, 

-Josh

Joshua Shinavier

Sep 13, 2019, 9:30:02 AM
to gremli...@googlegroups.com
Hi Josh,

Good to hear, thanks. Watch this space for a new, Scala-based API (ETA late October / early November). If you are in a hurry, there is also the Haskell-based tooling we use at Uber, which comes with an associated YAML format for schemas; I am currently refactoring it to bring it into more direct alignment with what we describe in the paper.

Btw, the arXiv preprint is available now: [1909.04881] Algebraic Property Graphs (https://arxiv.org/abs/1909.04881).

Best regards,

Josh



Fred Eisele

Sep 13, 2019, 9:44:32 AM
to Gremlin-users
Is the source tree public?
I, like Josh, need interop between schema and databases.

Joshua Shinavier

Sep 13, 2019, 10:53:31 AM
to gremli...@googlegroups.com
Well, now that I have mentioned it, I had better make it open source. I will start that process. However, if you need RDBMS interoperability, schematool (the Haskell utility) might not be immediately useful to you, because the target is an Uber format called UQL. Also note: this was my first foray into Haskell, and the code is sorta gnarly. Literally: my error-wrapping monad is called Sorta. E.g.
-- | Convert a logical schema to or from a Protobuf AST
protobufSchemaMapping :: ProtoConfig -> SchemaVocabulary -> Mapping Schema ProtoSchema Sorta
protobufSchemaMapping config vocab = Mapping toProto fromProto
  where
    toProto schema = ProtoSchema <$> path <*> dotProto <*> comment <*> content
      where [...]
    fromProto (ProtoSchema path proto comment content) = schema
      where [...]
Most of the mappings use the symmetric currying, or apple-orange-banana pattern I described in my Graph Day talk. In the example above, you give the function a configuration object and some vocabulary metadata, and it gives you back a mapping. The mapping either takes a (logical / APG) schema and "sorta" gives you a Protobuf schema, or it takes a Protobuf schema and sorta gives you a logical schema. The Sorta means that the mapping might not give you an exact equivalent of the schema you passed in (particularly if there are errors or incompatibilities), but it will give you the next best thing, together with some disclaimers about information that could not be mapped. I imagine I will carry this pattern forward into the Scala implementation unless there is a more standard monad I can use (which there probably is).
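To give a rough idea of the shape of the pattern, here is a stripped-down, hypothetical sketch of the two types involved; the real Sorta and Mapping in schematool carry more structure (errors as well as warnings, among other things), so treat this as an illustration rather than the actual API:

-- A toy version of the error-wrapping monad: a value plus accumulated
-- disclaimers about anything that could not be mapped faithfully.
data Sorta a = Sorta a [String]

instance Functor Sorta where
  fmap f (Sorta a ws) = Sorta (f a) ws

instance Applicative Sorta where
  pure a = Sorta a []
  Sorta f ws <*> Sorta a ws' = Sorta (f a) (ws ++ ws')

instance Monad Sorta where
  Sorta a ws >>= f = let Sorta b ws' = f a in Sorta b (ws ++ ws')

-- A bidirectional mapping: one function for each direction, both running in
-- some monad m, e.g. Mapping Schema ProtoSchema Sorta.
data Mapping a b m = Mapping { mapTo :: a -> m b, mapFrom :: b -> m a }

Written this way, Sorta is basically a Writer-style monad over a list of warnings, which is part of why I suspect there is a more standard monad waiting to replace it.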

Josh



Ryan Wisnesky

Sep 13, 2019, 1:40:29 PM
to gremli...@googlegroups.com
In support of the paper, I implemented:

1) the algorithms in Java, as part of CQL - https://github.com/CategoricalData/CQL/tree/master/src/main/java/catdata/apg

2) all of the examples in the paper in CQL - https://www.categoricaldata.net/help/APG.html

3) the algorithms in Coq, along with the proofs from the paper - https://github.com/CategoricalData/ApgCoq

Ryan

Josh Perryman

Sep 13, 2019, 3:14:58 PM
to Gremlin-users
You're going to have a Scala version? It's not even my birthday. 

I'm just starting to think about the use cases for which I might need a tool like this, so a couple of months is no problem. But if the Scala code is looking for early evaluators, or contributors, let's just say that I might know a guy. 

-Josh

Joshua Shinavier

Sep 14, 2019, 12:53:56 PM
to gremli...@googlegroups.com
Good to know, Josh! And I think you're not the only one who would contribute to this, so let me sketch out the framework first and see how everyone feels about it. Then we can start building the mappings.

I need to think harder about how to integrate the Scala-based API with Ryan's work, as well. At a superficial level, this should be straightforward; his ApgSchema can be just another target for mappings. Given a native APG Schema specified in YAML (or another source-of-truth format like Thrift with annotations), map it to an ApgSchema, and then you are in CQL world, where you can do more sophisticated transformations. From here, you can map transformed ApgSchemas back to native Schemas. However, it would be great to bring the two concepts together, so that there is no difference between "mappings" and "transformations". The main point in favor of "mappings" is that you can make full use of functional pattern matching to implement them in idiomatic Haskell or Scala code, whereas Ryan's API is Java-based.
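Just to sketch the shape of that round trip (with placeholder types, and Maybe standing in for Sorta so that it stays self-contained), it is ordinary Kleisli composition of the individual steps:

import Control.Monad ((>=>))

-- Placeholder types; the real ones would come from schematool and CQL.
data Schema    = Schema     -- a native APG schema, e.g. parsed from YAML
data ApgSchema = ApgSchema  -- the CQL-side representation

toCql :: Schema -> Maybe ApgSchema              -- map the source of truth into CQL
toCql = undefined

transformInCql :: ApgSchema -> Maybe ApgSchema  -- the sophisticated transformations happen here
transformInCql = undefined

fromCql :: ApgSchema -> Maybe Schema            -- map the result back to a native schema
fromCql = undefined

roundTrip :: Schema -> Maybe Schema
roundTrip = toCql >=> transformInCql >=> fromCql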

Josh





Ryan Wisnesky

Sep 14, 2019, 4:19:25 PM
to gremli...@googlegroups.com
CQL has come up a few times in the context of using APG in practice, so here's an explicit description of the connection, which is also described in the paper, mostly in the data migration section.

CQL, which ironically used to be called AQL (for algebraic query language), is a language for manipulating algebraic databases in general - any database that can be presented as a set-valued functor. The idea of storing data as set-valued functors goes at least as far back as the early 1990s.

APGs are algebraic databases in the above sense, and you can do things like project from an APG onto an algebraic database that is not an APG. So, implementing APG as part of CQL, which happens to be written in (purely functional) Java, was a natural choice. But you can use APG inside of CQL without touching anything besides APG. The paper's examples are built into the CQL IDE as the "APG" example. You can also simply invoke the straightforward API for the Java code.

For the moment, the CQL IDE / its Java code is the only place where certain algorithms from the paper, such as merging APGs, actually compute. The Coq development encodes sets as types, and so although you can prove properties about the algorithms this way, you can't in general enumerate the sets so computed.

APG is lightweight - its implementation is quite similar to implementing the simply typed lambda calculus with products and sums, and various derived constructions, such as type inference and normalization. It would be great to see it implemented everywhere.
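To make the lambda calculus analogy concrete, here is a sketch (my own paraphrase, not the CQL Java or the Coq development) of the kind of type grammar involved, together with one toy normalization step:

-- APG-style types, schematically: primitives plus finite products and sums.
data Ty
  = Prim String   -- base types such as string, integer, boolean
  | One           -- unit, the empty product
  | Zero          -- void, the empty sum
  | Prod Ty Ty
  | Sum Ty Ty
  deriving (Eq, Show)

-- One toy normalization step: drop redundant unit factors (A x 1 ~ A).
simplify :: Ty -> Ty
simplify (Prod a One) = simplify a
simplify (Prod One b) = simplify b
simplify (Prod a b)   = Prod (simplify a) (simplify b)
simplify (Sum a b)    = Sum (simplify a) (simplify b)
simplify t            = t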

Joshua Shinavier

Sep 15, 2019, 10:05:58 AM
to gremli...@googlegroups.com
I agree with all of that, and I think CQL is the right tool for operations on APGs. However, the mappings I am referring to are not operations on APGs; they are operations on finer-grained things like parts of schemas, and you need to be able to build coarser-grained mappings out of finer-grained ones, as I pointed out in that talk. For example, consider a mapping that takes APG sum types to and from Protobuf oneof's. At this level, you need to deal with a variety of APIs and language-specific quirks. As a concrete case, behold the following beauty of a function, which maps logical types (what we call "labels" in the paper, together with other metadata like descriptions and annotations) to and from Thrift type definitions.
typeMapping :: ThriftConfig -> SchemaVocabulary -> SchemaName -> Mapping Type (Maybe (TA.Type TMP.SourcePos)) Sorta
typeMapping options vocab sname = Mapping toThrift fromThrift
  where
    toThrift atype = if isBasicTypedef atype
        -- if this is a redundant alias for a primitive type, and does not have a js.type annotation, omit it
        then pure Nothing
        else byDataType dt
      where
        Type dt name (CommonInfo _ _ sameAs _ _ _) _ = atype
        jsType = jsSameAs sameAs
        thriftName = mapTo typeNameMapping name
        anns = pure [] -- sameAsAnnotations $ webSameAs sameAs
        doc = fmap (Just . T.pack) $ mapTo typeInfoMapping $ typeInfo atype
        byDataType dt = case dt of
          ZeroType -> byDataType (SumType [])
          UnitType -> byDataType (ProductType [])
          PrimitiveType dt ->
              fmap (Just . TA.TypedefType) $ TA.Typedef <$> target <*> thriftName <*> anns <*> doc <*> pure srcPos
            where
              target = pure $ datatypeToThrift dt jsType
          ProductType fields -> forSumOrProduct fields False
          SumType cases -> forSumOrProduct cases True
          -- note: irreversible; needs an annotation
          OptionalType dt -> byDataType dt
          EnumType values -> fmap (Just . TA.EnumType) enum
            where
              enum = TA.Enum <$> thriftName <*> enumDefs <*> anns <*> doc <*> pure srcPos
              enumDefs = catSortas $ L.zipWith (\id v -> emap (v, Just id)) [0..] (rpcEnumValues name values)
              emap = mapTo enumValueMapping
          ListType _ -> pure Nothing
          SetType _ -> pure Nothing
          MapType _ _ -> pure Nothing
          -- note: ignoring ExceptionKind
        forSumOrProduct fields union = fmap Just included
          where
            included =
              fmap TA.StructType $ TA.Struct <$> dt <*> thriftName <*> thriftFields <*> anns <*> doc <*> pure srcPos
            thriftFields = catSortas $ fmap map $ L.zipWith withId fields $ propertyIndexes fields
            withId (Field tr nm rb (FieldInfo common _ def card) owner) id
              = Field tr nm rb (FieldInfo common (Just id) def card) owner
            map = mapTo (fieldMapping union options vocab sname (directTypePointer atype))
            dt = pure $ if union then TA.UnionKind else TA.StructKind
    fromThrift Nothing = fromThrift $
      Just $ TA.TypedefType (TA.Typedef (TA.DefinedType (T.pack "UndefinedType") srcPos) (T.pack "UndefinedTypedef") [] Nothing srcPos)
    fromThrift (Just t) = byType t
      where
        toTypeInfo :: Maybe T.Text -> Sorta CommonInfo
        toTypeInfo doc = mapFrom typeInfoMapping $ Y.maybe "" T.unpack doc
        byType :: (TA.Type TMP.SourcePos) -> Sorta Type
        byType t = case t of
          TA.EnumType (TA.Enum name values anns doc src) ->
              Type <$> dt <*> nm <*> toTypeInfo doc <*> pure sname
            where
              dt :: Sorta DataType
              dt = fmap EnumType $ catSortas $ fmap toValue values
              nm :: Sorta String
              nm = mapFrom typeNameMapping name
              toValue v = fmap fst $ mapFrom enumValueMapping v
          TA.TypedefType (TA.Typedef tref name anns doc src) ->
              Type <$> dt <*> nm <*> toTypeInfo doc <*> pure sname
            where
              -- We can safely unwrap the Kinda here, as it is a local typeref
              dt :: Sorta DataType
              dt = do
                s <- fromThriftTypeReference vocab sname tref False
                dt <- kindaSorta $ toDataType s
                return dt
              nm = mapFrom typeNameMapping name
              toDataType (NamedType t) = fmap dataType $ typePointerResolver t "type from Thrift" vocab
              toDataType dt = pure dt
          TA.StructType (TA.Struct thriftKind thriftName thriftFields anns doc src) -> do
              nm <- mapFrom typeNameMapping thriftName
              let typePointer = localTypePointer sname nm
              let mapper x = mapFrom (fieldMapping union options vocab sname typePointer) x
              fields <- catSortas $ fmap mapper thriftFields
              let dt = if union
                    then (if L.null fields then ZeroType else SumType fields)
                    else (if L.null fields then UnitType else ProductType fields)
              info <- toTypeInfo doc
              return $ Type dt nm info sname
            where
              union = case thriftKind of
                TA.UnionKind -> True
                TA.StructKind -> False
                TA.ExceptionKind -> False -- TODO: don't make this into a struct
          TA.SenumType (TA.Senum name values anns doc src) ->
            unsup logicalFormat "SenumType"

As I said, these mappings can be a little gnarly, but I believe they would be even gnarlier in plain old Java. They make heavy use of pattern matching over Haskell sum and product types. For example, the above matches over DataType, which has constructors for primitive types, product and sum types, etc.; in the other direction, it matches over the Language.Thrift type called Type, which has constructors for enums, structs, etc.

To my mind, it makes the most sense to write fine-grained mapping code like this as close to the native API as possible, but at the same time, a mapping between two languages, like APG and Thrift, is also a morphism between data models in the sense of APG-in-CQL, modulo the incompatibilities that exist between every pair of languages. Is there a way to bring the pragmatic mappings closer to the CQL notion? If not, then supporting the CQL APG API as another target language will work well enough.

Josh





Ryan Wisnesky

Sep 15, 2019, 2:34:48 PM
to gremli...@googlegroups.com
Oh, right: what to do about translating from existing data models into and out of APG. If a data model is algebraic, we can probably express these translations using data migration and schema mappings - the rationale for implementing in CQL to begin with. If it isn't, or isn't known to be, then custom coding is required, be it Haskell or 4x as much Java.

APG is so lightweight that something like a JSON interchange format may even make more sense than tying things together at the API level, at least for metadata such as schemas and schema mappings.

Joshua Shinavier

Sep 15, 2019, 10:03:07 PM
to gremli...@googlegroups.com
As far as interchange formats go, this might be a good time to take a look at the YAML format for schemas. I am in the process of migrating Uber's standardized schemas to a new and improved YAML syntax, so it might be a good chance to incorporate any feedback from gremlin-users. This is the format I would like to use in connection with the Scala-based API, as well.

You can see a couple of snippets of the legacy YAML format we use in the Risk knowledge graph here. An example of the format we currently use for standardized schemas at Uber is here. Continuing with that example, let me illustrate the new format.

What we call a schema -- G(\sigma) -- in the APG paper is normally specified in a collection of YAML files. The contents of these files are perhaps confusingly called "schemas", whereas the collection is called a "schema vocabulary". Maybe we should call the files "schema modules" or "namespaces", while the collection is a "schema". Anyway, each file begins with some header lines like these:
name: position
description: "A schema for GPS sensor data"
status: Production
There are validation rules associated with different status values, e.g. a Production schema cannot include a Development schema, Development and above requires a certain level of documentation, etc.
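As a hypothetical illustration of the first of those rules (the real checks in schematool are more involved, and the names differ), the include constraint amounts to a comparison on an ordered status:

-- Sketch only: an ordered status, and a check that a module never includes a
-- module of lower status, so a Production schema cannot include a Development one.
data Status = Development | Production   -- plus whatever levels sit in between
  deriving (Eq, Ord, Show)

data Module = Module
  { moduleName   :: String
  , moduleStatus :: Status
  , includes     :: [Module]
  }

includeErrors :: Module -> [String]
includeErrors m =
  [ moduleName m ++ " (" ++ show (moduleStatus m) ++ ") may not include "
      ++ moduleName dep ++ " (" ++ show (moduleStatus dep) ++ ")"
  | dep <- includes m
  , moduleStatus dep < moduleStatus m
  ]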

Schemas/modules/namespaces may depend upon each other, e.g.
includes:
  - ../basic/datatypes
  - ../time/time
  - geometry
Then come the label definitions. The definitions key used to be called types, but I felt that "types" was a little overloaded, given that we also speak of primitive datatypes and anonymous constructions like lists as types. Instead, each label definition has a type, e.g.
- name: UnixTimeMillis
  description: "A timestamp in milliseconds since the Unix epoch"
  type:
    integer:
      precisionBits: 64
  sameAs:
    - id: Date
      context: JavaScript
  seeAlso:
    - https://en.wikipedia.org/wiki/Unix_time
The above is an alias for a primitive 64-bit integer data type, i.e. G(\sigma) of the label time:UnixTimeMillis is a 64-bit integer. The integer and float datatypes are parameterized in this way, integer also having a signed parameter which defaults to true. The other supported primitive types are binary, boolean, and string, which are not parameterized. TinkerPop4 may support a slightly different set of primitive types, but probably not radically different.
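If it helps to see the primitive layer spelled out as a datatype, it is roughly the following (a sketch only; the names in schematool differ):

-- The primitive datatypes described above: integer and float are parameterized,
-- while binary, boolean, and string are not. Constructor names are illustrative.
data Primitive
  = IntegerType Int Bool   -- precisionBits, signed (signed defaults to True)
  | FloatType Int          -- precisionBits
  | BinaryType
  | BooleanType
  | StringType
  deriving (Eq, Show)

-- UnixTimeMillis above is just a label bound to:
unixTimeMillis :: Primitive
unixTimeMillis = IntegerType 64 True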

For contrast, the following defines a label bound to a record (product) type:
- name: TimeEvent
  description: "A measurement of time by a device, such as an accelerometer or GPS unit"
  type:
    record:
      fields:
        - name: epochMillis
          description: "Absolute time of the event in milliseconds"
          type: UnixTimeMillis
          index: 1
        - name: nanosSinceBoot
          description: "Elapsed time in nanoseconds since the measuring device became active"
          type:
            optional: Long
          index: 2
Whereas in the APG paper, we leave field names outside of the formalism, here every field does need a name, as well as other metadata like description and index, depending on validation constraints. Anonymous fields would be possible using different constraints. Also note the syntactic sugar optional in the above; this is a union of the unit datatype with an alias called Long.
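In other words, the desugaring is just this (schematically; the real DataType constructors carry named fields and other metadata):

-- Schematic only: "optional: T" stands for a sum of unit with T.
data DataType
  = UnitType
  | NamedType String     -- a reference to a label such as Long
  | SumType [DataType]
  -- product, enumeration, list, map, etc. omitted
  deriving (Eq, Show)

desugarOptional :: DataType -> DataType
desugarOptional t = SumType [UnitType, t]

-- e.g. the nanosSinceBoot field's "optional: Long" becomes:
nanosSinceBootType :: DataType
nanosSinceBootType = desugarOptional (NamedType "Long")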

Here is a label bound to a more explicit union (sum) type:
- name: Owner
  description: "The identity of an owner of a dataset, by email address or user name"
  type:
    union:
      cases:
        - name: userName
          description: "The user name of the owner"
          type: UserName
        - name: emailAddress
          description: "The email address of the owner"
          type: EmailAddress
Enumerations are a special kind of sum type, but I am not sure they should be special going forward. E.g.
- name: TemporalUnit
  description: "A standard duration, which provides the scale factor for a time extent, or the granularity or precision for a time position"
  sameAs:
    - id: http://www.w3.org/2006/time#TemporalUnit
      context: Web
  type:
    enumeration:
      values:
        - YEAR
        - MONTH
        - WEEK
        - DAY
        - HOUR
        - MINUTE
        - SECOND
        - MILLISECOND
        - MICROSECOND
        - NANOSECOND
...with mandatory SCREAMING_SNAKE_CASE for enumerated values, which makes them easy to align with Thrift and Protobuf. However, the same definition could just as well be expressed with union syntax, e.g.
- name: TemporalUnit
  description: "A standard duration, which provides the scale factor for a time extent, or the granularity or precision for a time position"
  sameAs:
    - id: http://www.w3.org/2006/time#TemporalUnit
      context: Web
  type:
    union:
      cases:
        - name: year
        - name: month
        - name: week
        - name: day
        - name: hour
        - name: minute
        - name: second
        - name: millisecond
        - name: microsecond
        - name: nanosecond
Note: the sameAs annotation is used loosely; it really means something like "related to". Finally, a slightly more contrived example including a couple of vertex labels and an edge label:
- name: TripId
  description: "An identifier for a trip"
  type:
    named: UUID

- name: Trip
  description: "An actual or potential trip"
  sameAs:
    - id: https://schema.org/Trip
      context: Web
  id: TripId
  [...]

- name: vehicle
  description: "The relationship between a trip and the vehicle which was used for the trip"
  type:
    record:
      - name: from
        description: "The trip"
        type: Trip
      - name: to
        description: "The vehicle which was used for the trip"
        type: Vehicle
We don't explicitly define edge labels like this any more except in our legacy schemas, but the newer syntax allows it. Note that Trip does not have a type; the default is unit, denoted by 1 in the paper, which is the type of all vertex labels. Also note that we give a name to the id type which uniquely identifies a Trip or a Vehicle, respectively; this makes the id types into first-class citizens for re-use in Thrift, Protobuf, Avro, etc. The schema declares that to every Trip, there is a unique TripId.
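Informally, and glossing over ids and the other metadata, the example above assigns types to labels roughly as follows (a sketch in the spirit of the paper, not actual tooling code):

-- Each label is assigned a type: vertex labels get unit, edge labels get a
-- record with "from" and "to" fields, and id labels alias other types.
data LabelType
  = UnitType                           -- the type 1 of vertex labels
  | NamedType String                   -- a reference to another label
  | RecordType [(String, LabelType)]
  deriving (Eq, Show)

exampleSchema :: [(String, LabelType)]
exampleSchema =
  [ ("TripId",  NamedType "UUID")
  , ("Trip",    UnitType)    -- a vertex label; its id type is TripId
  , ("Vehicle", UnitType)    -- assumed to be defined analogously elsewhere
  , ("vehicle", RecordType [("from", NamedType "Trip"), ("to", NamedType "Vehicle")])
  ]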

If you have any thoughts on what you would like to see in a TinkerPop schema language, feel free to chime in.

Josh






Ryan Wisnesky

Sep 15, 2019, 10:20:16 PM
to gremli...@googlegroups.com
As far as interchange formats for algebraic datatypes go, there's also:

https://typedefs.com/about/

"Typedefs is similar to protocol-buffers, thrift and many more.

However, it is based on a theory of algebraic data types and aims to be a more natural fit for users of proof assistants and purely functional, strongly typed programming languages."

It's written in Idris with back ends for Haskell and a few others. I haven't used it myself.

Joshua Shinavier

Sep 15, 2019, 11:25:12 PM
to gremli...@googlegroups.com
My impression of typedefs is that it has been developed as an alternative to Protobuf, Thrift, etc. with a cleaner, more self-consistent type system. I am all for that, and I hope it catches on; I would much rather deal with typedefs than the currently dominant languages for enterprise schemas, though it will take years at best before a new language achieves the same level of adoption.

Typedefs might be a reasonable source-of-truth language for APG if it is extensible enough. Thrift and Protobuf can also be used as sources of truth, but this requires domain-specific annotations. The advantage of a custom YAML format is that we can use whatever syntax makes the most sense for APG, natively, without tacking things on in comments or annotations.

GraphQL SDL has also been suggested as the basis for a property graph schema language, and it may be a reasonably good fit for APG schemas in particular. Again, I would continue to use the YAML, but would be interested in supporting a mapping to such a GraphQL format specifically for the sake of GraphQL support.

Josh



Ryan Wisnesky

Sep 15, 2019, 11:40:21 PM
to gremli...@googlegroups.com
It may also be worth thinking about various surface syntaxes for how APG schema mappings should be expressed. In the paper, we use algebraic datatypes, but with open terms (e.g., 'fst', 'snd', 'case'). Many algebraic data serialization languages, such as typedefs, only support closed terms (i.e., values).
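Schematically (this is not the paper's exact term language), the difference looks like this: closed terms are just values, while expressing a schema mapping needs open terms with variables and eliminators.

-- Closed terms: plain values of product/sum types. This is roughly all that a
-- typical serialization format needs to express.
data Value
  = VUnit
  | VPair Value Value
  | VInl Value
  | VInr Value

-- Open terms: variables plus introduction and elimination forms, which is
-- what schema mappings are written in.
data Term
  = Var String
  | TUnit
  | Pair Term Term
  | Fst Term                                  -- project out of a product
  | Snd Term
  | Inl Term                                  -- inject into a sum
  | Inr Term
  | Case Term (String, Term) (String, Term)   -- eliminate a sum by cases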
