Idea for a centralized, graph-based log (tracing/diagnostics) for distributed systems.

larsw

unread,

Aug 3, 2012, 4:33:05 PM8/3/12

to distsys...@googlegroups.com

Background:

...

For the last couple of years, my former employer (I left last month), has built a fairly complex

distributed system for handling customer master data, in our context, which was debt/collection & financial services.

Keywords are: logical services, CQRS/ES.

Technology-wise; .NET 4, Neuron ESB, IIS7, MSMQ, Topshelf, SQL Server, WCF Data Services (OData - for read services), log4net

In all command & event handlers, we used log4net for logging - and used a simple convention to tag up the contents

in the log message to be able to easily extract / post-process it later. We had both file (log4net) appenders, as well as an

async appender to a centralized database (SQL Server).

An example of an entry in the log table:

Pseudo log entry:

TimeStamp, Service, Logger, MessageId, AggregateId, Message

Where the message could contain some tags (w/ data) and the message payload (xml/json), like (ex):

[PUBLISH:cd5e97eb-1263-4c0d-a85a-491aa0734535] [TOPIC:SomethingHasHappened] [TYPE:urn:company:project::SomeLogicalService:writeservices:events:SomethingHasHappened:1] <xmlPayload />

As you can see, there are "tags" inside the content [Identifier:Data]. The key-value pairs are trivial to extract with a simple regex.

Idea:

For a former customer, one of our consultants had great success creating a log viewer on top of a structure like this, for debugging/diagnostic purposes.

We started to brainstorm how we could build something similiar, but with forward and backwards navigation built in. By navigation, I need to elaborate a bit more;

In a message-based system, the messages normally describes something that is supposed to happen (commands) and something that has happened (events). For a given use-case/process flow, it is possible to relate which commands that spun off which events etc. By treating the log table as a staging table, and do some post-processing on it, it should be possible to not only do forward navigation (e.g. do new queries against the log table with a related message/aggregate ID) but also backwards navigation.

Although it is feasible creating something like this in a relational database, it would be much more suitable to use a graph database (like Neo4j or FlockDB) to insert the log projections into.

When the graph is up, it should be fairly easy building some nice UIs (D3.js/tree or something) on top of it.

Thoughts?

--larsw

Kelly Sommers

unread,

Aug 20, 2012, 12:42:28 PM8/20/12

to distsys...@googlegroups.com

I like this idea a lot. I know we have a lot of people on the mailing list from different ecosystems, is anyone persisting their distributed systems communication in a graph database? I'm with Lars, it feels like it would be a great way to debug what happened in a system.

Lars, I imagine you would need association persisted as well. Did you store any kind of correlation-id?

Leon Guzenda

unread,

Aug 29, 2012, 5:06:05 PM8/29/12

to distsys...@googlegroups.com

You could do this very easily in a distributed graph database, such as InfiniteGraph.They have a free download and you can even build production systems with up to a million nodes and edges.

Leon

Hoop Somuah

unread,

Aug 30, 2012, 4:50:41 PM8/30/12

to distsys...@googlegroups.com

My boss just sent me this, curious as to whether anyone here has looked into Titan: http://thinkaurelius.com/2012/08/06/titan-provides-real-time-big-graph-data/

This deck provides a nice overview of graph computing options and in slides 88 thru 94 contrasts the different classes of graph database engine out there:

http://www.slideshare.net/slidarko/titan-the-rise-of-big-graph-data

They also talk about tinkerpop which I hadn't heard of before but I like the way they break up thinking about graph databases.

Titan is backend agnostic, I believe the current implementation supports backing your data either with Cassandra or HBase and provides

--
You received this message because you are subscribed to the Google Groups "Distributed Systems" group.
To post to this group, send email to distsys...@googlegroups.com.
To unsubscribe from this group, send email to distsys-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/distsys-discuss/-/2E8l2SCVtMsJ.

For more options, visit https://groups.google.com/groups/opt_out.

kell.s...@gmail.com

unread,

Aug 30, 2012, 4:59:44 PM8/30/12

to distsys...@googlegroups.com

I've heard of Titan but haven't used it yet. I have a lot to learn about graph databases still but I believe Titan when used with Cassandra uses a graph based partitioner instead of the often default of random partitioner so that graph nodes are closer together on the same physical cluster node.

The problem that I haven't figured out is at some point the problem of distributed joins will come into play even if a graph based partitioner is used to mitigate some.

I could be wrong but that's my interpretation so far.

Very interesting though. I definitely want to check it out soon.

Anyone else have some insights into Titan?

Yves Reynhout

unread,

Dec 6, 2012, 1:57:55 PM12/6/12

to distsys...@googlegroups.com

I think I came to a similar conclusion today. I was building a read-only REST api on top of my *gasp* relational event store. Resources I had fleshed out were streams (a collection of event streams, where an event stream is a set of correlated events that happened in a particular order), events (a collection of all the events themselves), causations (a collection of things (identifiers, command identifiers, if you will) that caused the events to happen), correlations (a collection of things (identifiers) that stitch together multiple causations and events). While building representations of these resources, it struck me how navigable I wanted them to be, i.e. including "named" hyperlinks to make it obvious how resources related to one another. Yet the underlying store was never optimized to answer these questions (let alone quickly). This is where graph dbs come in., I think. This is uncharted territory for me ... I started drawing what could be an abstract way of navigating event sources/streams (diagram). This would make for a great "ops" story, but I don't think it ends there. Typical of aggregate oriented design, and probably even more so with event sourcing, is the referencing of other entities using some form of identity. Coupled with the semantics of such references, messages (events, commands alike) could be decomposed into graphs, allowing to answer "entry level" BI questions.

Anyway, just felt like sharing this.

Greg Young

unread,

Dec 6, 2012, 2:03:28 PM12/6/12

to distsys...@googlegroups.com

im curious can you come to london tuesday on the train?

--

You received this message because you are subscribed to the Google Groups "Distributed Systems" group.
To post to this group, send email to distsys...@googlegroups.com.
To unsubscribe from this group, send email to distsys-discu...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msg/distsys-discuss/-/Zzl-8ulcolcJ.

larsw

unread,

Dec 6, 2012, 2:05:46 PM12/6/12

to distsys...@googlegroups.com

I'm definitely interested in hearing about your progress / how you choose to implement it.

If anything ends up in an open source project, let me know.

--larsw

Yves Reynhout

unread,

Dec 6, 2012, 2:11:45 PM12/6/12

to distsys...@googlegroups.com

Doubt it ... in this place from 9/12 til 20/12.

Greg Young

unread,

Dec 6, 2012, 2:16:36 PM12/6/12

to distsys...@googlegroups.com

im interested as well in partivular with overlap..

--

You received this message because you are subscribed to the Google Groups "Distributed Systems" group.
To post to this group, send email to distsys...@googlegroups.com.
To unsubscribe from this group, send email to distsys-discu...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msg/distsys-discuss/-/6osZosGcelgJ.

Yves Reynhout

unread,

Dec 6, 2012, 2:17:16 PM12/6/12

to distsys...@googlegroups.com

OSS-ing my brainfarts, there's an idea.

esa...@cloudera.com

unread,

Mar 10, 2013, 3:41:21 PM3/10/13

to distsys...@googlegroups.com

This sounds suspiciously like Dapper[1] by Google. They obviously used BigTable for storage, banking on sane sparse row support for storage of the graph. Each "span" (akin to a stack frame, but across layers of the system) is stored in a cell, horizontally rather than vertically. This makes graph traversal trivial given the access model and locality of data within a row. The row is keyed by a trace ID, presumably containing the fields you mention; I don't recall the details, but that seems logical enough. Since rows are sorted by key, and there's support for start/end keys with table range scans, it's also easy to see how one could find all traces for a given service, or within a given time frame, depending on how the key is structured.

Of course, they'll follow the typical denormalize-and-brute-force path, but that isn't to say a typical star schema in a relational store doesn't work, at least logically. The problem will just be the way one would need to structure queries to pull back what you want. If you're willing to consider a specialized store (e.g. graph), it might make sense to jump directly to a BigTable-workalike[2] and use the same trick as Dapper. In this case, the graph of a trace isn't equivalent to other graphs where Neo et al excel. If you think about the relationship of trace data, while it is a graph, it is only locally (i.e. within a trace) connected. That is, all children of instrumentation point A are guaranteed to have no other parent. Typical graph databases support functionality used to efficiently store and navigate a graph with much wider connectedness (graph experts please excuse my ignorance of proper terminology). As we know something special about our graphs:

* They aren't incredibly deep.

* They're always accessed by their root node.

* They are small and directed in a very specific manner.

...I'd be inclined to focus on a storage model that is hyper-efficient at retrieval of a graph by only its root and tooling that then navigates within the selected graph (which is far smaller than for what a generalized solution might optimize).

[1] http://bit.ly/13POpY8
[2] Full disclosure: I work at Cloudera who has a horse (HBase) in this race.

Michael Rose

unread,

Mar 10, 2013, 4:00:38 PM3/10/13

to distsys...@googlegroups.com

Twitter has an implementation of Dapper

https://github.com/twitter/zipkin

--

You received this message because you are subscribed to the Google Groups "Distributed Systems" group.

To unsubscribe from this group and stop receiving emails from it, send an email to distsys-discu...@googlegroups.com.

To post to this group, send email to distsys...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msg/distsys-discuss/-/G5JEbZmP8-wJ.

For more options, visit https://groups.google.com/groups/opt_out.

--

--
Michael Rose (@Xorlev)
Senior Platform Engineer, FullContact
mic...@fullcontact.com

Reply all

Reply to author

Forward