Dynamic Schema Generation

Peter Hunsberger

Feb 7, 2017, 4:13:10 PM
to sangria-graphql
Thought I'd start a new topic for this since this is a more general discussion than the previous one I had piled on to.

I've got some basics working, but I find myself having to implement a fair amount of code and I keep wondering whether I'm missing something. The basic goal is to hook Sangria to DataStax Enterprise (DSE) Graph (Titan), interpreting the DSE Graph schema and creating a Sangria schema from that. I'm mapping the graph vertices to objects and using the vertex properties as the fields. Directed edges map parent-child relationships. The plan is to use edge properties as metadata for query modification, restriction, filtering, etc.

So far I have been building an AST from multiple passes over the DSE Graph schema and then parsing the AST as the Sangria schema.  As I'm starting to build out more of the capability it seems I'm basically re-implementing the AST builder from scratch.  As a result I'm second guessing whether this is the best way to proceed.  The decision on whether to use the AST implementation or an Introspection builder was somewhat arbitrary but my schema representation from DSE Graph has no relationship to anything that the Introspection builder would normally handle so I figured the AST approach made more sense.

Any thoughts, or suggestions on how to do this would be greatly appreciated!

Oleg Ilyenko

Feb 7, 2017, 4:47:59 PM
to sangria-graphql
Hi Peter,

Thanks a lot for the detailed description! You mentioned that you are building the GraphQL schema from an AST. Have you considered a simpler solution, like constructing the schema with plain `ObjectType`s? I mean something like this ("pseudo" Scala code):

import scala.collection.concurrent.TrieMap
import sangria.schema._

val allTypes = TrieMap.empty[VertexId, ObjectType[MyContext, DBObject]]

val vertices: Seq[Vertex] = loadVerticesMetaInfoFromDB(...)
val edges: Map[VertexId, Seq[VertexId]] = loadVerticesMetaInfoFromDB(...)

vertices.foreach { vertex =>
  val vertexFields = vertex.properties.map(prop =>
    Field(prop.name, figureOutType(prop), resolve = c => resolvePropWithDBObject(prop, c.value)))

  // please note that I use function here because vertices can have recursive relations,
  // or some vertices may not be present in `allTypes` yet
  val allFields = () => {
    // flatMap drops edge targets that have no matching vertex
    val relations: Seq[Vertex] = edges(vertex.id).flatMap(otherVertexId => vertices.find(_.id == otherVertexId))

    val relationFields =
      relations.map { otherVertex =>
        Field(otherVertex.name + "RelationField", allTypes(otherVertex.id),
          resolve = c => resolveRelationWithDBObject(otherVertex, c.value))
      }

    (vertexFields ++ relationFields).toList
  }

  val graphqlObject = ObjectType(vertex.name, vertex.description, fieldsFn = allFields)

  allTypes.put(vertex.id, graphqlObject)
}

val schema = Schema(allTypes(VertexId("SomeEntryPointType")))

Hope this code demonstrates the concept. Because it's an object graph, we need to construct it lazily so that recursive references and not-yet-defined types are properly initialized. That's why I used `allTypes` map and a function to initialize the fields (and not a normal list of fields).

Cheers,
Oleg

Peter Hunsberger

Feb 9, 2017, 10:03:43 AM
to sangria-graphql
Thought I had posted a reply thanking you for this but apparently not.  Once more thanks, I think something along these lines in the examples might help a lot.... 

In any case, I've got this coded up, but I've run into another problem with this approach that I had forgotten about: how do you code up the resolve? I apparently need to have something of the form:

Field[Ctx, Val]

in scope, but where do I find the definition of Ctx? I haven't been able to dig up the import that would let the compiler resolve it....  Do I need to create my own and if so of what form?

I'm still new enough to Scala that I may be missing something obvious so a little more help would be appreciated!


Peter Hunsberger

Feb 9, 2017, 1:14:40 PM
to sangria-graphql
I have solved my problem by making sure the type parameters for Field matched up with the type parameters for ObjectType.

A complete working example in the documentation would be wonderful.  There's a lot of very subtle stuff going on here.  It makes sense once you figure it out, but you've got to wade through the Sangria source code a lot to figure out exactly what is needed where!

Oleg Ilyenko

Feb 9, 2017, 5:11:03 PM
to sangria-graphql
Glad that you figured it out! Regarding your previous question about `Ctx`. `Ctx` represents a user context. This means that it's all up to you to define what type it should have. It is nothing more than a convenience that allows you to access a particular object in all of the fields (this object can provide useful stuff like data store access, user auth information, etc.). If you don't need it in your schema, I would suggest to just use `Unit` for it.
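
As a minimal sketch of that idea (the `AppContext` class, its `greeting` member, and the `Query` type are hypothetical names made up for illustration, not taken from this thread): the same context type appears as the first type parameter of `fields`, `ObjectType`, and `Schema`, and each `resolve` function reads it through `c.ctx`.

```scala
import sangria.schema._

object UserContextSketch {
  // Hypothetical user context: whatever the resolvers need (DB session, auth info, ...).
  final case class AppContext(greeting: String)

  // Ctx = AppContext, Val = Unit. The type parameters of the fields
  // have to match the type parameters of the enclosing ObjectType.
  val QueryType: ObjectType[AppContext, Unit] = ObjectType(
    "Query",
    fields[AppContext, Unit](
      // c is a Context[AppContext, Unit]; c.ctx is the user context.
      Field("greeting", StringType, resolve = c => c.ctx.greeting)
    )
  )

  val schema: Schema[AppContext, Unit] = Schema(QueryType)
}
```

At execution time, a value of that same type would then be passed as the `userContext` argument of `Executor.execute`.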

I agree about the documentation. I try to write documentation along the way, especially for new features. Recently (before v1.0 launch) I put extra effort to update the "Getting Started" and included a complete walk-through/tutorial:


Though I must admit that it's quite challenging for me to identify areas where I need to go deeper or explain the context. So I would appreciate if you could share specific things that need more explanation (maybe as an issue on the website project: https://github.com/sangria-graphql/sangria-website). Of course, PRs are also highly appreciated :)

Peter Hunsberger

Feb 14, 2017, 5:16:34 PM
to sangria-graphql
I've got the basic schema building working and am now looking for advice on how best to implement the data resolution. I'm just not seeing what the resolve methods look like and what should be happening inside the relationship retrieval. For example, if I call the relationship resolve like:

     resolve = ctx => GraphData.getRelationFromGraph( graph, vertex, ctx.value)     // Can I pass in a 3rd value here or does the Vertex need to become part of the context?

then what does the function definition actually look like and what is the type it's returning?

def getRelationFromGraph( graph: DseGraph, vert: GVertex, anyVal: AnyVal): ???  = {

     // What's needed here?
 }

Somewhat related: I assume that, since for any ObjectType all of my fields will come from a single Vertex in my real graph, field resolution is done via some form of DeferredResolver? Since this is generic across all data fields, does that mean this should be part of some class implementing DeferredResolver, or is the Projector what I'm looking for?

Oleg Ilyenko

Feb 14, 2017, 6:22:49 PM
to sangria-graphql
Hi Peter,

Great that you are making progress! You are right: a deferred resolver can be very useful here and it will greatly improve the `Vertex` loading from the DB (generally, it will ensure that for every GraphQL query nesting level only a single request is executed against the DB). You can also use the Fetch API for this. I will demonstrate it in an example below.

I would suggest to continue with my previous example and define 2 resolve functions which I already used but haven't defined yet.
To make everything a bit more explicit and free of hidden assumptions, I first will define the basic setup that I will use for this example. Let me know whether it is very different from the way DSE models the graph and metadata about it:

// basic setup: the vertex itself

// just to make example a bit more clear, I created these simple types for IDs
case class VertexId(id: String)
case class VertexTypeId(id: String)

case class Vertex(
  id: VertexId,
  vertexTypeId: VertexTypeId,
  properties: Map[String, Any],
  outgoingEdgesToOtherVertices: Map[VertexTypeId, VertexId])

// basic setup: the meta-information about it (the connection between these is through `vertexTypeId`)

case class VertexMetaInfo(id: VertexTypeId, propertiesMetaInfo: List[PropertyMetaInfo])

object PropertyType extends Enumeration {
  val CassandraInteger, CassandraString = Value
}

case class PropertyMetaInfo(name: String, propertyType: PropertyType.Value)

Now that we have a basic setup, let's define the resolve functions. I also would like to point out the assumption that Cassandra loads all data as a `Vertex`. This would also mean that all of the `ObjectType`s that you have defined for your schema are of type `ObjectType[DseContext, Vertex]`.

import scala.concurrent.Future
import sangria.execution.Executor
import sangria.execution.deferred.{DeferredResolver, Fetcher}
import sangria.schema.Context

// The resolve function that you will use for normal properties

def resolvePropWithDBObject(propertyMetaInfo: PropertyMetaInfo, c: Context[DseContext, Vertex]) = {
  val vertexProperties = c.value.properties
  val propertyValue = vertexProperties(propertyMetaInfo.name)

  // here you may or may not need to do this kind of transformations
  // in the most simple scenario, you can just return `propertyValue`, I guess
  propertyMetaInfo.propertyType match {
    case PropertyType.CassandraString =>
      transformCassandraStringToScalaString(propertyValue)

    case PropertyType.CassandraInteger =>
      transformCassandraIntegerToScalaInt(propertyValue)
  }
}

// The resolve function that you will use for edges between vertices

val vertexFetcher = Fetcher.caching[DseContext, Vertex, VertexId](
  (ctx: DseContext, vertexIds: Seq[VertexId]) => {
    // one batched load for all of the requested IDs
    val loadedVertices: Future[Seq[Vertex]] = ctx.loadVerticesByIds(vertexIds)

    loadedVertices
  })

def resolveRelationWithDBObject(otherVertex: VertexMetaInfo, c: Context[DseContext, Vertex]) = {
  // assumes the current vertex has an outgoing edge to a vertex of this type
  val otherVertexId = c.value.outgoingEdgesToOtherVertices(otherVertex.id)

  vertexFetcher.defer(otherVertexId)
}

// at some point when you are about to execute the query,
// you also need to provide a deferred resolver

Executor.execute(schema, query,
  deferredResolver = DeferredResolver.fetchers(vertexFetcher))

I made the code and all types very explicit just to make it clear which types are involved. In the actual application you can probably rely on type inference more :)

Let me know whether this example is helpful.

Cheers,
Oleg

Peter Hunsberger

Feb 15, 2017, 10:43:16 AM
to sangria-graphql
Thank you for that. This is a side project and I don't get a ton of time to devote to it, so it will take me a while to get through all of this, but I will let you know where I go with it.

Peter Hunsberger

Feb 15, 2017, 3:42:23 PM
to sangria-graphql
Turns out that the DSE Graph model is close enough to what you have that I think I can make it work. Each Vertex does in fact have an Id (of type String). However, it does not directly have a relationship with any other vertexes. Instead there can be any number of edges. Each Edge is directed and essentially has a "from Id" and a "to Id" (In and Out in their docs). There is a query that, given a list of Ids, will find all the edges that have one of them as an in Id, and I can make that query return the associated vertexes. So I think I can use that and pick up the vertexes needed in the deferred resolver. However, when I code this up I get the error:

Can't find suitable `HasId` type-class instance for type `Vertex`. If you have defined it already, please consider defining an implicit instance `HasId[Vertex]`

for the definition of the Fetcher which is:

val  vertexFetcher = Fetcher.caching[DseContext, Vertex, VertexId]( .... )

There is a getId method on the Vertex. Do I need to define a wrapper class? I can't see what the implicit would be or where I would define it.

Oleg Ilyenko

Feb 15, 2017, 7:19:50 PM
to sangria-graphql
Glad that it conceptually fits in DSE model! I would suggest to start simple, and just do something like this:

val  vertexFetcher = Fetcher.caching[DseContext, Vertex, VertexId]( .... )(HasId(_.id))

You can also do it like this:

implicit val vertexHasId = HasId[Vertex, VertexId](_.id)

val vertexFetcher = Fetcher.caching[DseContext, Vertex, VertexId]( .... )

Peter Hunsberger

Feb 16, 2017, 10:38:34 AM
to sangria-graphql
Thank you,  I had tried at least a dozen variants of your second suggestion, I think including the one you provide, but I must have other things messed up at the time I was trying them.  Works now.

This takes me full circle on the issue of handling the Sangria Context. From what I can see, I have a set of dependencies on the context that start from what the Executor.execute method is expecting. The Schema you pass in there apparently has to conform to sangria.schema.Schema[Any,Any].

However, that Schema is what is used in the resolve step, and there I think I need access to my database session and metadata.

I'm guessing this is yet another place where implicits might solve the issue?

Here's the (slightly cleaned up, DseContext also expands) error I'm getting for the Executor.execute call:

type mismatch;
 found   : models.DseSchema
    (which expands to)  sangria.schema.Schema[models.DseContext,models.GBase]
 required: sangria.schema.Schema[Any,Any]
Note: models.DseContext <: Any, but class Schema is invariant in type Ctx.
You may wish to define Ctx as +Ctx instead. (SLS 4.5)
Note: models.GBase <: Any, but class Schema is invariant in type Val.
You may wish to define Val as +Val instead. (SLS 4.5)

Oleg Ilyenko

Feb 18, 2017, 9:20:02 AM
to sangria-graphql
It's hard for me to suggest something without seeing the actual code. The `Schema[models.DseContext,models.GBase]` signature looks reasonable to me, so some types got mixed up somewhere along the way, I guess. I can only recommend double-checking the schema definition and all related `ObjectType`s, making sure that the `Ctx` and `Val` types fit together (and are not unified to `Any`).

Peter Hunsberger

Feb 20, 2017, 9:58:30 AM
to sangria-graphql
Hi Oleg,

Basically, the problem shows up as soon as you attempt to define a Context for Executor.execute that is not Context[Any,Any].

If I pass in:

Executor.execute( schema, queryAst, GraphData.session, operationName = operation,
              variables = variables getOrElse Json.obj(),
              deferredResolver = DeferredResolver.fetchers( GraphData.vertexFetcher ),
              maxQueryDepth = Some(10)).map(Ok(_))

where schema resolves to:

 sangria.schema.Schema[scala.collection.concurrent.TrieMap[String,models.GBase],Any]

I get the error:

required: sangria.schema.Schema[Object,Any]
Note: scala.collection.concurrent.TrieMap[String,models.GBase] <: Object, but class Schema is invariant in type Ctx.
You may wish to define Ctx as +Ctx instead. (SLS 4.5)

If I understand this error, the definition of Context within the Sangria code needs to be +Ctx so that I can pass in a sub-class of Object and not just a plain Object?

I can work around this by not  putting any types on any of the methods that are working with the Context and keeping it Context[Any,Any] but that of course means I get no type checking....

Oleg Ilyenko

Feb 20, 2017, 10:34:02 AM
to sangria-graphql
I would suggest you double-check `GraphData.session`. It looks like its type is `Object`? The third argument to `Executor.execute` (`GraphData.session` in your case) should be of the same type that the schema defines as its context type (the first type argument, `scala.collection.concurrent.TrieMap[String,models.GBase]` in your case).

Peter Hunsberger

Feb 20, 2017, 6:16:41 PM
to sangria-graphql
So for query execution the user context becomes the context on the resolve instead of what is associated with the Schema? If so, and if they both don't have to match up, it likely helps me for where I need to get to.

I was attempting to pass in my Type definition over the DataStax session, which is a Java Interface (not an Object), but this was essentially a place holder as I got everything else to work. At Schema construction time having the session hanging around is a convenience but I can get it other ways or maybe wrap it.

At execution time I really want to have the original DataStax metadata hanging around, and this is currently in a TrieMap in an Object. I'll have to think about getting some Class definitions built that will work end to end...

Peter Hunsberger

Feb 21, 2017, 11:26:00 AM
to sangria-graphql
So answering one of my own questions: It doesn't appear you can have a different context at schema construction time and at execution time. The resolve step gets a context of whatever context is used at the time it is created. The resolve step determines the context at execution time. So unless you can use an implicit to somehow define the context of the resolve step, I think you always have the same context?

Either way, I can live with a single context, so I built a case class that manages the metadata and set up everything to use it everywhere. As soon as I try to modify the signatures on the Executor.execute method I run into problems again. It seems I've now moved the problem one step up the hierarchy; I now get the following error:

type mismatch;
 found   : models.DseSchema
    (which expands to)  sangria.schema.Schema[models.SchemaMetadata,Any]
 required: sangria.schema.Schema[Serializable,Any]
Note: models.SchemaMetadata <: Serializable, but class Schema is invariant in type Ctx.
You may wish to define Ctx as +Ctx instead. (SLS 4.5)

This is for:

            Executor.execute( schema, queryAst, SchemaMetadata, operationName = operation,
              variables = variables getOrElse Json.obj(),
              deferredResolver = DeferredResolver.fetchers( GraphData.vertexFetcher ),
              maxQueryDepth = Some(10)).map(Ok(_))
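
One possible reading of these errors, offered as an assumption because the thread does not show the definitions involved: `Ctx` is inferred from both the schema argument and the user-context argument, and when the two disagree the compiler reports a common supertype (`Any`, `Object`, `Serializable`), which an invariant `Schema` cannot accept. The sketch below reproduces the shape of the problem without Sangria; `MiniSchema`, `execute`, and the `keyspace` field are hypothetical stand-ins. If `SchemaMetadata` is a case class, passing the bare name passes its companion object rather than an instance.

```scala
object InvarianceSketch {
  // Stand-in for sangria.schema.Schema: invariant in Ctx.
  final class MiniSchema[Ctx](val name: String)

  // Stand-in for Executor.execute: schema and user context share one Ctx.
  def execute[Ctx](schema: MiniSchema[Ctx], userContext: Ctx): String =
    s"${schema.name} executed with $userContext"

  // Hypothetical context class; the keyspace field is invented for this example.
  final case class SchemaMetadata(keyspace: String)

  val schema = new MiniSchema[SchemaMetadata]("dse")

  // Does not compile: the bare name refers to the companion object, which is
  // not a SchemaMetadata, so no single Ctx fits both arguments.
  // execute(schema, SchemaMetadata)

  // Compiles: the second argument is an instance of the schema's context type.
  val result: String = execute(schema, SchemaMetadata("demo"))
}
```

Under that assumption, the fix would be on the call site (pass an instance of the context type) rather than in the variance of `Schema`.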

