Cognitive Foundry 4.0.0 released

Justin Basilico

Mar 25, 2017, 1:47:44 AM

to Cognitive Foundry

We just released version 4.0.0 of the Cognitive Foundry. This version contains many enhancements and new algorithms, and it adds a Graph package and a new matrix package implementation. You can download it directly here and read the change log. The binaries are also available from Maven Central through dependency management tools like Ivy and Maven.

Thanks,

Justin

Stephen Mallette

Mar 25, 2017, 5:27:24 AM

to cognitiv...@googlegroups.com

Nice to see a new release. The Graph package was a neat surprise. I'm already wondering if there might be any interesting integrations to be considered with Apache TinkerPop (http://tinkerpop.apache.org/). Now that the Graph package has been introduced, is there any future roadmap that could be shared with respect to it?

--
You received this message because you are subscribed to the Google Groups "Cognitive Foundry" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cognitive-foundry+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jeremy Wendt

Mar 28, 2017, 9:15:28 PM

to Cognitive Foundry

Thanks for your interest in the graphs package. That's work we've been driving here at Sandia, and we felt it made sense to share it more broadly. I've not looked at TinkerPop before, so I can't comment on how much sense it would make to integrate with it. What thoughts do you have on integrating with TinkerPop? What would it take?

I think your post had two questions, which I'll answer below. Please let me know if you meant something different:

Why a graphs package?

We'd been building atop a different Java graphs package, but found that when we wanted to scale to even large-ish graphs (a few million nodes), we would run out of memory very quickly. Therefore, a few years ago, we built a graph POJO that scaled to millions of nodes with no problems. At that point, we needed to implement a community detection algorithm. Then, as research on various other codes progressed, we had to add various other things. Over time, we realized it made sense to share this broadly, and the Foundry seemed a logical place.
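The thread does not say how the Foundry's graph class stores its data. As a rough sketch of why a plain-array layout can hold millions of nodes where an object-per-node design may run out of memory, the hypothetical `CompactGraph` below (a name invented here, not a Foundry class) uses a compressed sparse row layout: two `int` arrays costing about 4 bytes per edge plus 4 bytes per node.

```java
import java.util.Arrays;

/** Hypothetical compact directed graph: nodes are ints 0..n-1, edges live in two int arrays. */
final class CompactGraph {
    private final int[] offsets;  // neighbors of v sit at targets[offsets[v] .. offsets[v + 1])
    private final int[] targets;  // all neighbor lists, concatenated

    /** Builds the graph from parallel arrays of edge sources and destinations. */
    CompactGraph(int numNodes, int[] src, int[] dst) {
        if (src.length != dst.length) {
            throw new IllegalArgumentException("src and dst must have equal length");
        }
        offsets = new int[numNodes + 1];
        for (int s : src) {
            offsets[s + 1]++;              // count the out-degree of each source node
        }
        for (int v = 0; v < numNodes; v++) {
            offsets[v + 1] += offsets[v];  // prefix sum turns counts into start positions
        }
        targets = new int[src.length];
        int[] next = Arrays.copyOf(offsets, numNodes);  // next free slot for each node
        for (int e = 0; e < src.length; e++) {
            targets[next[src[e]]++] = dst[e];
        }
    }

    int numNodes() { return offsets.length - 1; }

    int numEdges() { return targets.length; }

    int outDegree(int v) { return offsets[v + 1] - offsets[v]; }

    /** Returns a copy of v's out-neighbors, in the order the edges were supplied. */
    int[] neighbors(int v) {
        return Arrays.copyOfRange(targets, offsets[v], offsets[v + 1]);
    }
}
```

For example, `new CompactGraph(4, new int[] {0, 0, 1, 3}, new int[] {1, 2, 2, 0}).neighbors(0)` yields the neighbors 1 and 2. The trade-off of this layout is that the graph is fixed once built; adding edges means rebuilding the arrays.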

What's the plan going forward?

As the above should indicate, the graphs package was something we found we needed more than something we planned to build years ago. However, now that we're bought in, we're moving forward pretty steadily. We're adding new functionality regularly internally and plan to push those updates out a couple of times per year. New codes are added as the need arises, but always with the goal of implementing a published algorithm that should scale very well. In the near term, I expect we'll be adding k-Core, and possibly some other forms of node labeling or multi-layer graph support (although these last ones are fuzzier, as we have submitted proposals in those areas but haven't heard back yet).
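For readers unfamiliar with k-Core, mentioned above as a likely addition: the k-core of an undirected graph is the largest subgraph in which every node has at least k neighbors inside that subgraph. A minimal sketch of the standard peeling approach follows. The class and method names are hypothetical, and this is not the Foundry's implementation.

```java
import java.util.ArrayDeque;
import java.util.List;

/** Hypothetical k-core sketch for an undirected graph given as adjacency lists. */
final class KCore {
    /**
     * Returns inCore[v] == true when node v belongs to the k-core.
     * adj.get(v) lists v's neighbors; each undirected edge must appear in both lists.
     */
    static boolean[] kCore(List<List<Integer>> adj, int k) {
        int n = adj.size();
        int[] degree = new int[n];
        boolean[] inCore = new boolean[n];
        ArrayDeque<Integer> toRemove = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            degree[v] = adj.get(v).size();
            inCore[v] = degree[v] >= k;
            if (!inCore[v]) {
                toRemove.add(v);  // too few neighbors from the start
            }
        }
        // Peel: removing a node lowers its neighbors' degrees, which may push them below k.
        while (!toRemove.isEmpty()) {
            int v = toRemove.poll();
            for (int u : adj.get(v)) {
                if (inCore[u] && --degree[u] < k) {
                    inCore[u] = false;
                    toRemove.add(u);
                }
            }
        }
        return inCore;
    }
}
```

Each node is queued at most once and each adjacency list is scanned at most once, so the work is linear in the number of nodes plus edges, which is the kind of scaling behavior described above.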

Thanks,
Jeremy

Stephen Mallette

Mar 29, 2017, 11:07:32 AM

to cognitiv...@googlegroups.com

As a brief overview, Apache TinkerPop is a graph abstraction for different graph databases and graph processors. With respect to graph databases, it is implemented by just about every major graph database provider - both open source and commercial - including IBM Graph, Neo4j, OrientDB, DataStax Enterprise Graph, and others. For graph processors, there are implementations over Spark, Hadoop, and Giraph. The point of all the abstraction is that all of your code is just the Gremlin Graph Traversal Language. The same line of Gremlin will traverse a graph in an OLTP style over Neo4j as it will in OrientDB. Similarly, that same line of Gremlin could also be executed using Spark as its engine to execute a traversal over a multi-billion edge graph in parallel. It's all just Gremlin!

There are lots of examples of Gremlin in our documentation, but a simple traversal basically looks like this:

gremlin> marko = g.V().has('name','marko').next() // find me a person named "marko"
==>v[1]
gremlin> g.V(marko).out('knows').values('name') // get me the names of people "marko" knows
==>vadas
==>josh

It's worth noting that Gremlin has both imperative and declarative aspects. You've seen the imperative style above, but for declarative traversals, you can use match(). The following traversal answers the question: "Who created a project named 'lop' that was also created by someone who is 29 years old? Return the two creators.":

gremlin> g.V().match(
           __.as('a').out('created').as('b'),
           __.as('b').has('name', 'lop'),
           __.as('b').in('created').as('c'),
           __.as('c').has('age', 29)).
         select('a','c').by('name')
==>[a:marko,c:marko]
==>[a:josh,c:marko]
==>[a:peter,c:marko]
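
To make the declarative pattern above concrete, here is a rough plain-Java equivalent over a toy edge list. It does not use TinkerPop: `MatchSketch`, `Created`, and `coCreators` are names invented for this illustration, and the nested loops only mimic what the four match() clauses express, not how Gremlin actually evaluates them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the match() traversal, using plain collections. */
final class MatchSketch {
    /** Toy edge: person --created--> project. */
    static final class Created {
        final String person;
        final String project;

        Created(String person, String project) {
            this.person = person;
            this.project = project;
        }
    }

    /**
     * Finds every {a, c} pair such that a created the named project,
     * c also created it, and c has the given age.
     */
    static List<String[]> coCreators(List<Created> edges, Map<String, Integer> ages,
                                     String projectName, int age) {
        List<String[]> results = new ArrayList<>();
        for (Created ab : edges) {                    // a -created-> b
            if (!ab.project.equals(projectName)) {    // b has name == projectName
                continue;
            }
            for (Created cb : edges) {                // c -created-> the same b
                if (!cb.project.equals(ab.project)) {
                    continue;
                }
                Integer cAge = ages.get(cb.person);
                if (cAge != null && cAge == age) {    // c has age == age
                    results.add(new String[] {ab.person, cb.person});
                }
            }
        }
        return results;
    }
}
```

Given "created" edges from marko, josh, and peter to lop, with marko as the only 29-year-old, `coCreators(edges, ages, "lop", 29)` returns the same three pairs as the console output above. The Gremlin version states only the pattern and leaves the evaluation order to the engine, whereas this sketch hard-codes one particular join order.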

We've also started expanding Gremlin to other languages beyond Java and currently have a pure Python representation as well, with JavaScript and other languages to hopefully follow.

TinkerPop doesn't really build "algorithms", though it has a few as reference implementations, and it certainly lets users develop their own, which you can read about here:

http://tinkerpop.apache.org/docs/current/reference/#vertexprogram

I only briefly studied the graph code in Foundry, but an immediate use case that came to mind was for Foundry to implement the Graph API interfaces so that a user could use our subgraph step:

http://tinkerpop.apache.org/docs/current/reference/#subgraph-step

to pop off a subgraph from a traversal into Foundry, at which point they could then run Foundry's algorithms on it. Since you're talking about Foundry scaling to graphs in the millions of vertices, analysis over some local subgraph might be a nice fit. I'm not sure if there's other value to integration, but that's the first one that came to mind. Of course, if you implement the Graph API interfaces, you then get the ability to execute arbitrary Gremlin traversals over your "FoundryGraph" - you basically get a dynamic query language for free plus everything else the TinkerPop stack has. Note that implementing the Graph API is not a massive load of work - we've seen these interfaces implemented in several days' worth of effort - you can read more about this here:

http://tinkerpop.apache.org/docs/current/dev/provider/#graph-system-provider-requirements
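
As a sketch of the hand-off described above, assume the receiving library wants nodes numbered 0..n-1. The subgraph step yields an edge-induced subgraph, so the minimum a receiver needs is the list of edges with vertex ids remapped to dense indices. `SubgraphIndex` below is hypothetical: it is part of neither TinkerPop nor the Foundry, and the integration proposed above would go through TinkerPop's Graph API interfaces instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: remaps arbitrary vertex ids to dense ints 0..n-1 as edges arrive. */
final class SubgraphIndex {
    private final Map<Object, Integer> denseIds = new LinkedHashMap<>();
    private final List<int[]> edges = new ArrayList<>();

    /** Records one edge of the extracted subgraph, indexing unseen endpoints on the fly. */
    void addEdge(Object fromId, Object toId) {
        int from = index(fromId);
        int to = index(toId);
        edges.add(new int[] {from, to});
    }

    /** Returns the dense index for an id, assigning the next free index if the id is new. */
    private int index(Object id) {
        Integer existing = denseIds.get(id);
        if (existing != null) {
            return existing;
        }
        int assigned = denseIds.size();
        denseIds.put(id, assigned);
        return assigned;
    }

    int numNodes() { return denseIds.size(); }

    /** Edges as {from, to} pairs of dense indices, in insertion order. */
    List<int[]> edges() { return edges; }
}
```

Feeding it the edges of an extracted subgraph one at a time produces a node count and an edge list that an array-based algorithm could consume directly.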

Anyway, I've probably provided plenty for you to digest at this point. Look forward to hearing if you have any further thoughts on the matter.

Take care,

Stephen
