Eiffel Persistence


Azeem Ahmad (PhD candidatem LiU)

Feb 19, 2018, 11:43:09 PM
to Eiffel Community
Hi,

As Ola Leifler mentioned earlier, many groups from different universities and industry partners, in collaboration with Linköping University, are working during this sprint on different plugins for Eiffel event generation (e.g. JIRA, Bitbucket or Gerrit). In the end, the events generated through these plugins will be integrated with the Eiffel-vici tool for further evaluation. My job in this sprint is to provide a solution for Eiffel persistence and its integration with the Eiffel-vici tool. I have explored Redis, MongoDB, RabbitMQ and Neo4j as candidate persistence solutions, and my findings are below:

1. (Not Selected) Neo4j will create overhead because of its specific data storage schema, and this overhead increases when it comes to integrating Eiffel-vici with Neo4j. As an independent solution for viewing events as a graph, Neo4j can be acceptable.

2. (Not Selected) Redis, an in-memory database with its own queue system, is a good option that provides fast access because data is held in RAM. However, there are a few problems, ranging from “just a cache” and “no encryption” to “lower performance with large amounts of data” and “lost data at crashes”. Its performance is highly dependent on the size of RAM, and its documentation and user community are insufficient compared to MongoDB's (https://redis.io/topics/persistence).

3. (Selected) RabbitMQ+MongoDB will be selected for Eiffel event persistence. Teams working on different plugins can generate events and publish them to a RabbitMQ queue, and their job is done. My solution is to consume events from RabbitMQ and persist them in MongoDB so they can be saved and integrated with Eiffel-vici. This provides the following advantages:
       a. Teams do not need to work with persistence and can focus on their specific functionality.
       b. If an event is not consumed, it will not be lost and will return to the queue until it is consumed properly. There are a few specific cases I need to handle, such as those mentioned in http://tebros.com/2011/07/data-consistency-with-asynchronous-queues-and-mongodb/

Please share if you have any other specific tool for persistence in mind that can serve the purpose for Eiffel events.

regards
Azeem

Magnus Bäck

Feb 20, 2018, 2:30:19 AM
to Azeem Ahmad, Eiffel Community
On Tuesday, February 20, 2018 at 05:43 CET,
"Azeem Ahmad (PhD candidatem LiU)" <aze...@gmail.com> wrote:

> As Ola Leifler mentioned earlier that many groups from different
> Universities/Industry with collaboration of Linkoping University,
> during this sprint, are working on different plugins for Eiffel
> event’s generation (i.e. JIRA, BitBucket or Gerrit). At the end,
> events, generated through these different plugins, will be integrated
> with Eiffel-vici tool for further evaluation. My job, in this sprint,
> is to provide solution for Eiffel persistence and integration with
> Eiffel-vici tool. I have explored the possibility for Redis, MongoDB,
> RabbitMQ and Neo4j as a candidate persistence solution and my findings
> are as below:
> 1. (Not Selected) Neo4j will create an overhead due to specific data
> storage schema and these overhead increases, when it comes to
> integrate Eiffel-vici with Neo4j. Neo4J, as an independent solution
> for viewing events, in graph, can be acceptable.

Could you elaborate on this overhead? I haven't looked into Neo4j in
detail but find its graph-based data model well suited for storing
Eiffel events given the kind of queries one would want to make.

[...]

> 3. (Selected) RabbitMQ+MongoDB will be selected for Eiffel event’s
> persistence. Teams, working on different plugins, can generate events
> and publish on RabbitMQ queue and their job is done.

But... this is the general idea regardless of the persistence backend,
is it not? Producers talking directly to the backend has never been on
the table from my point of view.

> My solution is to consume events from RabbitMQ and persist on MongoDB
> so it can be saved and integrated with Eiffel-vici. This will provide
> following advantages:
> a. Teams do not need to work with persistence and they can focus
> on specific functionality.
> b. If the event is not consumed, it will not be lost and will
> return to the queue, until it is consumed properly. There are few
> specific cases, I need to control such as mentioned
> in http://tebros.com/2011/07/data-consistency-with-asynchronous-queues-and-mongodb/
> Please share, if you have any other specific tool for persistence, in
> mind, that can serve the purpose for Eiffel’s events.

If you're looking for a JSON document store that's moderately
well-suited for graphs then Elasticsearch would be a viable option.

Of course, given today's dearth of an Eiffel-integrated storage backend
I'd take anything.

--
Magnus Bäck | Software Engineer, Development Tools
magnu...@axis.com | Axis Communications

Daniel Ståhl

Feb 20, 2018, 3:24:23 AM
to eiffel-c...@googlegroups.com
Hi,

First, a quick status update and an apology :) I'm well aware that open sourcing of the Ericsson implementation of event persistence has been promised. Nobody would like to see this happen more than I would, but there's an issue of competing priorities. I still hope we can make good on this promise sooner rather than later, and then supply a sandbox image demonstrating the various components working together.

Second, great input, Azeem! I think there are quite a few more potential options for the underlying database. The thing about event queries is that they're both regular object lookups and graph traversals, so the choice of paradigm isn't self-evident. We are actually starting a Master's thesis project on this as we speak, expected to run through the spring, evaluating a series of options. I'll be sure to put that student in contact with you, Azeem, once he gets started!


> > 3. (Selected) RabbitMQ+MongoDB will be selected for Eiffel event’s
> > persistence. Teams, working on different plugins, can generate events
> > and publish on RabbitMQ queue and their job is done.
>
> But... this is the general idea regardless of the persistence backend,
> is it not? Producers talking directly to the backend has never been on
> the table from my point of view.

Agreed, reading from a message bus (e.g. RabbitMQ) is a given. Choice of database technology should be completely orthogonal to choice of message bus technology.
 
> > b. If the event is not consumed, it will not be lost and will
> > return to the queue, until it is consumed properly. There are few
> > specific cases, I need to control such as mentioned
> > in http://tebros.com/2011/07/data-consistency-with-asynchronous-queues-and-mongodb/
> > Please share, if you have any other specific tool for persistence, in
> > mind, that can serve the purpose for Eiffel’s events.

Why would you return messages to the queue if not consumed? It's not the database's responsibility to keep track of whether there's anyone around to listen in real time. The database always consumes everything (barring some filtering of irrelevant events), regardless of whether anyone else is listening.

Daniel
 

Azeem Ahmad (PhD candidatem LiU)

Feb 20, 2018, 4:35:08 AM
to Eiffel Community
Thank you, Daniel, for your feedback. I look forward to hearing more about the master's thesis student. Can you share an expected date for when he or she starts the thesis?

The event should return to the queue if the database server is not responding or not available.

Azeem Ahmad (PhD candidatem LiU)

Feb 20, 2018, 4:56:31 AM
to Eiffel Community


On Tuesday, February 20, 2018 at 10:30:19 AM UTC+3, Magnus Bäck wrote:
> On Tuesday, February 20, 2018 at 05:43 CET,
> "Azeem Ahmad (PhD candidatem LiU)" <aze...@gmail.com> wrote:
>
> > As Ola Leifler mentioned earlier that many groups from different
> > Universities/Industry with collaboration of Linkoping University,
> > during this sprint, are working on different plugins for Eiffel
> > event’s generation (i.e. JIRA, BitBucket or Gerrit). At the end,
> > events, generated through these different plugins, will be integrated
> > with Eiffel-vici tool for further evaluation. My job, in this sprint,
> > is to provide solution for Eiffel persistence and integration with
> > Eiffel-vici tool. I have explored the possibility for Redis, MongoDB,
> > RabbitMQ and Neo4j as a candidate persistence solution and my findings
> > are as below:
> > 1. (Not Selected) Neo4j will create an overhead due to specific data
> > storage schema and these overhead increases, when it comes to
> > integrate Eiffel-vici with Neo4j. Neo4J, as an independent solution
> > for viewing events, in graph, can be acceptable.
>
> Could you elaborate on this overhead? I haven't looked into Neo4j in
> detail but find its graph-based data model well suited for storing
> Eiffel events given the kind of queries one would want to make.

My plan is to use the Eiffel-vici tool for event/graph visualization. To store Eiffel events in Neo4j, we need to convert each event into the specific graph model supported by Neo4j. After saving these events, I need to convert them back to JSON to visualize them in the Eiffel-vici tool (in this particular case, Neo4j purely as persistence is not a good option as long as I do not use its capabilities for querying and traversing graphs for visualization). Currently, there are also some problems in Neo4j with processing and storing bulk JSON data.

After my last discussion with Ola Leifler, I am also considering Neo4j for both persistence and visualization, assuming that a user must have the technical capability to run queries in Neo4j if I do not find a better front-end. What do you suggest?

Daniel Ståhl

Feb 20, 2018, 5:03:25 AM
to Eiffel Community
Hi Azeem,

Start date is "as soon as possible". Presumably this month. I'll send him your way on day one, promise :)

Perhaps I misinterpreted your original statement. What you're saying is that your database will only acknowledge (and therefore remove from the queue) messages once stored (and presumably you envision a durable queue), not that messages will be resent unless consumed by some other listener? If so, that makes a lot more sense, with the caveat that this is the default behavior you would expect from any consumer of event messages... which I guess is what threw me when I first read your initial post. Sorry for the mixup.
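
The acknowledge-only-once-stored behaviour described above could look roughly like the callback below. This is a minimal sketch under stated assumptions, not the code of any existing Eiffel component: the callback uses the argument order of a pika consumer callback, `collection` is assumed to behave like a pymongo collection, the `make_on_message` helper is hypothetical, and using `meta.id` as the document `_id` is an assumption chosen so that a redelivered message overwrites itself instead of creating a duplicate.

```python
import json


def make_on_message(collection):
    """Build a consumer callback that acks a message only after it is stored.

    `collection` is assumed to behave like a pymongo Collection; only
    `replace_one(filter, replacement, upsert=True)` is used.
    """

    def on_message(channel, method, properties, body):
        # Same argument order as a pika consumer callback.
        try:
            event = json.loads(body)
            event_id = event["meta"]["id"]
        except (ValueError, KeyError, TypeError):
            # Malformed event: requeueing would loop forever, so reject it.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)
            return

        try:
            # Upsert keyed on the Eiffel event id, so redelivery is idempotent.
            collection.replace_one(
                {"_id": event_id}, {"_id": event_id, **event}, upsert=True
            )
        except Exception:
            # Database down or unreachable: hand the message back to the queue.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
            return

        # Only now is it safe to let the broker drop the message.
        channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message
```

With pika 1.x this would presumably be registered on a durable queue via `channel.basic_consume(queue=..., on_message_callback=on_message)`, leaving automatic acknowledgement switched off so that the explicit `basic_ack` above is what removes the message.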

Regarding the interface of the persistence solution, Vici makes certain assumptions regarding that interface. Will you be following those assumptions? Essentially, the basic functionality that's needed from an event database is to fetch events matching a filter, and any events linked from or linked to using specified link types. For instance,

<uri>/_search?q=meta.type:EiffelArtifactCreatedEvent

This would return all EiffelArtifactCreatedEvents. Then particular convenience end-points, like id would simply be special cases of this:


<uri>/id/abcdef...

would be short-hand for:


<uri>/_search?q=meta.id:abcdef...

but would still return an array of matches.


With this setup, you can easily tack on upstream and downstream searches to the result via query parameters. First you fetch any matches, then you add anything along the specified upstream and downstream links to the result. To exemplify:


<uri>/_search?q=meta.type:EiffelArtifactCreatedEvent&dlTypes[]=IUT&ulTypes[]=ARTIFACT&ulTypes[]=ELEMENT

This would first fetch all EiffelArtifactCreatedEvents. Let's say this produces an array of two events:


[
  EiffelArtifactCreatedEvent1,
  EiffelArtifactCreatedEvent2
]



Then everything found along downstream IUT links would be added. Let's say one test case execution is found:


[
  EiffelArtifactCreatedEvent1,
  EiffelArtifactCreatedEvent2,
  EiffelTestCaseTriggeredEvent1
]



... and so on and so forth for the specified ulTypes. Does that make any sense?
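
To make the fetch-then-expand semantics above concrete, here is a small in-memory sketch. It is only an illustration under assumptions: events are plain dicts whose `links` field is a list of `{"type", "target"}` entries, the `search` function and the short ids are hypothetical, and the traversal is assumed to be transitive and to start from the initial matches in each direction. "Downstream" means events that link to a match, "upstream" means events a match links to.

```python
from collections import deque


def search(events, predicate, ul_types=(), dl_types=()):
    """Return events matching `predicate`, followed by events reachable
    from those matches along the given upstream/downstream link types."""
    by_id = {e["meta"]["id"]: e for e in events}

    # Reverse index for downstream lookups: target id -> [(link type, source)].
    linked_from = {}
    for e in events:
        for link in e.get("links", []):
            linked_from.setdefault(link["target"], []).append((link["type"], e))

    def upstream(event):
        for link in event.get("links", []):
            if link["type"] in ul_types and link["target"] in by_id:
                yield by_id[link["target"]]

    def downstream(event):
        for link_type, source in linked_from.get(event["meta"]["id"], []):
            if link_type in dl_types:
                yield source

    matches = [e for e in events if predicate(e)]
    result = list(matches)
    seen = {e["meta"]["id"] for e in matches}

    # One breadth-first traversal per direction, each starting at the matches.
    for neighbours in (downstream, upstream):
        visited = {e["meta"]["id"] for e in matches}
        queue = deque(matches)
        while queue:
            for found in neighbours(queue.popleft()):
                found_id = found["meta"]["id"]
                if found_id in visited:
                    continue
                visited.add(found_id)
                queue.append(found)
                if found_id not in seen:
                    seen.add(found_id)
                    result.append(found)
    return result


events = [
    {"meta": {"id": "a1", "type": "EiffelArtifactCreatedEvent"}, "links": []},
    {"meta": {"id": "a2", "type": "EiffelArtifactCreatedEvent"}, "links": []},
    {"meta": {"id": "t1", "type": "EiffelTestCaseTriggeredEvent"},
     "links": [{"type": "IUT", "target": "a1"}]},
]
found = search(
    events,
    lambda e: e["meta"]["type"] == "EiffelArtifactCreatedEvent",
    ul_types={"ARTIFACT", "ELEMENT"},
    dl_types={"IUT"},
)
print([e["meta"]["id"] for e in found])  # ['a1', 'a2', 't1']
```

The result is a flat array with the direct matches first, which mirrors the two-step example above; a real backend would replace the in-memory indexes with database queries.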


Perhaps we should have interfaces projects where we get together to define common interfaces of particular types of actors. We can have any number of implementations of persistence, potentially, but it would be a good idea to have a shared understanding of how to communicate with them.


Best regards,
Daniel

Magnus Bäck

Feb 21, 2018, 5:44:51 AM
to Azeem Ahmad, Eiffel Community
On Tuesday, February 20, 2018 at 10:56 CET,
"Azeem Ahmad (PhD candidatem LiU)" <aze...@gmail.com> wrote:

> ---- My plan is to use Eiffel-vici tool for event/graph visualization.
> In order to store eiffel events in Neo4J, we need to convert each event
> into specific graph model, supported by Neo4J.

Yes, that needs to be done. I was hoping one could simply map all fields
from the Eiffel payload to fields/attributes/whatever-they-are-called in
the Neo4j nodes.

> After saving these events, I need to convert them again to JSON to
> visualize in Eiffel-vici tool

Can't you just store the whole JSON blob to avoid conversion back and
forth?
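
One way to combine both ideas (a graph for traversal, the untouched JSON blob for retrieval) is sketched below. Everything here is an assumption for illustration: the `Event` label, the `LINK` relationship, the property names and the `event_to_statements` helper are hypothetical, and `links` is assumed to be a list of `{"type", "target"}` entries. The function only builds parameterised Cypher statements; it does not talk to a database.

```python
import json

# Relationship types cannot be passed as Cypher parameters, so the Eiffel
# link type is kept as a property on a generic LINK relationship.
MERGE_EVENT = (
    "MERGE (e:Event {id: $id}) "
    "SET e.type = $type, e.json = $json"
)
MERGE_LINK = (
    "MERGE (s:Event {id: $source}) "
    "MERGE (t:Event {id: $target}) "
    "MERGE (s)-[:LINK {type: $type}]->(t)"
)


def event_to_statements(event):
    """Turn one Eiffel event into (cypher, parameters) pairs.

    The node carries only the fields needed for lookups plus the original
    event serialised as a string, so nothing has to be converted back.
    """
    meta = event["meta"]
    statements = [(MERGE_EVENT, {
        "id": meta["id"],
        "type": meta["type"],
        "json": json.dumps(event, sort_keys=True),
    })]
    for link in event.get("links", []):
        statements.append((MERGE_LINK, {
            "source": meta["id"],
            "target": link["target"],
            "type": link["type"],
        }))
    return statements
```

Each pair could then be executed with a Neo4j client, for example by passing the query and its parameters to a session's `run` method in the official Python driver. Using `MERGE` for the link target means an event can be stored before the event it points to has arrived.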

> (in this particular case, Neo4J, only as persistence, is not a good
> option as long as I do not use its capabilities for querying and
> traversing graphs for visualization).

Sure, the graph query ability is the whole point of using Neo4j. For
plain JSON blob storage there are better options.

> Currently, there are some problems in Neo4J with respect to
> process/store bulk data of JSON as well.

Could you elaborate?

> After my last discussion with Ola Leifler, I am also considering Neo4J
> for persistence and visualization assuming a person must have technical
> capabilities to run queries in Neo4J if i do not find some better
> front-end. What do you suggest?

I view Neo4j as a storage backend only. If its built-in frontend can be
used for end-user queries or visualizations that's great but it's not my
expectation.

Azeem Ahmad (PhD candidatem LiU)

Feb 28, 2018, 11:02:27 AM
to Eiffel Community
Hello Daniel,

Apologies for the late reply. I was conducting a pilot study with Neo4j to explore its applicability for Eiffel persistence.

> Start date is "as soon as possible". Presumably this month. I'll send him your way on day one, promise :)
>
> Perhaps I misinterpreted your original statement. What you're saying is that your database will only acknowledge (and therefore remove from the queue) messages once stored (and presumably you envision a durable queue), not that messages will be resent unless consumed by some other listener? If so, that makes a lot more sense, with the caveat that this is the default behavior you would expect from any consumer of event messages... which I guess is what threw me when I first read your initial post. Sorry for the mixup.
>
> Regarding the interface of the persistence solution, Vici makes certain assumptions regarding that interface. Will you be following those assumptions?

I am aware of some of the assumptions made by Vici, but could you please point me to some reading to improve my understanding of them?

 
> Essentially, the basic functionality that's needed from an event database is to fetch events matching a filter, and any events linked from or linked to using specified link types. For instance,
>
> <uri>/_search?q=meta.type:EiffelArtifactCreatedEvent
>
> This would return all EiffelArtifactCreatedEvents. Then particular convenience end-points, like id would simply be special cases of this:
>
> <uri>/id/abcdef...
>
> would be short-hand for:
>
> <uri>/_search?q=meta.id:abcdef...
>
> but would still return an array of matches.
>
> With this setup, you can easily tack on upstream and downstream searches to the result via query parameters. First you fetch any matches, then you add anything along the specified upstream and downstream links to the result. To exemplify:
>
> <uri>/_search?q=meta.type:EiffelArtifactCreatedEvent&dlTypes[]=IUT&ulTypes[]=ARTIFACT&ulTypes[]=ELEMENT
>
> This would first fetch all EiffelArtifactCreatedEvents. Let's say this produces an array of two events:
>
> [
>   EiffelArtifactCreatedEvent1,
>   EiffelArtifactCreatedEvent2
> ]
>
> Then everything found along downstream IUT links would be added. Let's say one test case execution is found:
>
> [
>   EiffelArtifactCreatedEvent1,
>   EiffelArtifactCreatedEvent2,
>   EiffelTestCaseTriggeredEvent1
> ]
>
> ... and so on and so forth for the specified ulTypes. Does that make any sense?


Yes, this makes a lot of sense now, and I can see that Neo4j provides this type of event fetching in an efficient way. But I was wondering: do we assume that this event fetching (particularly in the case of Neo4j) requires some technical or language-specific skills to fetch and view events, or, as you mentioned, do we need to define interfaces for particular types of actors? What do you think? Which actor should we address first, or do you think we can create some sort of actor classification? Meanwhile, we can discuss actors and interfaces. Do you think we can generalize actors and interfaces?


> Perhaps we should have interfaces projects where we get together to define common interfaces of particular types of actors. We can have any number of implementations of persistence, potentially, but it would be a good idea to have a shared understanding of how to communicate with them.


I am working on a small project that uses the static Eiffel events repository used by the current Vici tool to demonstrate some visualization and persistence. The most time-consuming task is writing a wrapper that converts Eiffel events to the Neo4j-specific format (nodes and relationships). Another challenge is an aggregated view of these events, apart from fetching specific events. I have my own vision for an aggregated view and would like to know whether you have envisioned one in any way?

> Best regards,
> Daniel

Azeem Ahmad (PhD candidatem LiU)

Feb 28, 2018, 11:06:14 AM
to Eiffel Community


> > ---- My plan is to use Eiffel-vici tool for event/graph visualization.
> > In order to store eiffel events in Neo4J, we need to convert each event
> > into specific graph model, supported by Neo4J.
>
> Yes, that needs to be done. I was hoping one could simply map all fields
> from the Eiffel payload to fields/attributes/whatever-they-are-called in
> the Neo4j nodes.

The good news is that I am working on it :-)

> > After saving these events, I need to convert them again to JSON to
> > visualize in Eiffel-vici tool
>
> Can't you just store the whole JSON blob to avoid conversion back and
> forth?

I am conducting a pilot study with Neo4j, as I have a static repository of these events. I will update you.

> > (in this particular case, Neo4J, only as persistence, is not a good
> > option as long as I do not use its capabilities for querying and
> > traversing graphs for visualization).
>
> Sure, the graph query ability is the whole point of using Neo4j. For
> plain JSON blob storage there are better options.

> > Currently, there are some problems in Neo4J with respect to
> > process/store bulk data of JSON as well.
>
> Could you elaborate?

We need a little extra effort to load a JSON file into Neo4j, as mentioned in https://neo4j.com/blog/cypher-load-json-from-url/

> > After my last discussion with Ola Leifler, I am also considering Neo4J
> > for persistence and visualization assuming a person must have technical
> > capabilities to run queries in Neo4J if i do not find some better
> > front-end. What do you suggest?
>
> I view Neo4j as a storage backend only. If its built-in frontend can be
> used for end-user queries or visualizations that's great but it's not my
> expectation.

What are your plans for visualization? Which tool do you plan to use to view these events?

Daniel Ståhl

Mar 1, 2018, 1:34:22 AM
to Eiffel Community

> I am aware of some assumptions, made by Vici, but can you please refer me some reading to enhance my understanding about these assumptions?

The assumptions I was thinking of were in terms of the API, as I described. I don't think there's any formal description of what Vici uses, but I would hope to address that in an implementation architecture description.

> Yes, this makes a lot of sense now and I can see that neo4j provides this types of event fetching in an efficient way. but I was wondering, do we assume that this event fetching (particularly in the case of neo4j) requires some technical/specific language capabilities to fetch/views events/information or, as you mentioned, we need to define interfaces for particular types of actor? what do you think? which actor to address first or do you think, we can create some sort of actor classification? Meanwhile, we can discuss actors and interfaces. do you think, we can generalize actors and interfaces?

Sorry, I'm not following. I'll be visiting the LiU campus this afternoon on other business. Let's try to meet up to talk this through.

Daniel

Azeem Ahmad (PhD candidatem LiU)

May 2, 2018, 4:44:27 AM
to Eiffel Community
Dear Magnus,

I remember that, during our chat here, you mentioned an API that can look up an event in the Eiffel persistence store. I believe you have looked at https://github.com/eiffel-community/eiffel-store, a dynamic persistence store that receives events from RabbitMQ, stores them in MongoDB and updates Vici accordingly (if I am not mistaken, the current Eiffel sandbox does not provide the source code for Eiffel persistence). I am planning to provide a REST API that allows a user to look up an event's information given its event id. Since you are working on this project, your expert judgment on the input and output of this API would be really appreciated. What information do you want to retrieve, given a specific input?

warm regards
Azeem

Magnus Bäck

May 9, 2018, 3:36:31 AM
to Azeem Ahmad, Eiffel Community
On Wednesday, May 02, 2018 at 10:44 CEST,
Azeem Ahmad <aze...@gmail.com> wrote:

> I remember, during our chat here, you mentioned an API that can
> look-up an event from Eiffel Persistence store.

Yes! Unless Eiffel event ids can be passed downstream we're going to
have a need to look up ids given the entities we _do_ have. For example,
a CI system that triggers on source control events will have information
like the commit id but nothing on the EiffelSourceChangeSubmittedEvent
event.

> I believe, you have looked at
> https://github.com/eiffel-community/eiffel-store that is a dynamic
> persistence store which receives events from RabbitMQ, store in
> MongoDB and update vici accordingly (If I am not wrong, current Eiffel
> sandbox does not provide the source code for Eiffel persistence).

That's great!

> I am planning to provide a REST API that can allow user to look-up an
> event information, given event id. Since, you are working on this
> project, your expert judgment in term of input and output of this API
> would be really appreciated. What information you want to retrieve,
> given specific input?

I think the API should be capable of accepting queries expressed
in a reasonable and ideally non backend-specific form and returning
a list of matching Eiffel events. In other words, raw SQL or
MongoDB's own query language would be unfortunate as it'd be very
implementation-specific. GraphQL would be an interesting choice.
Of course, any query language is better than nothing at all so I
don't think we should spend too much time on that now. Here's a
simple example (using the Lucene query string syntax):

GET /events?q=meta.type:EiffelSourceChangeSubmittedEvent+AND+data.GitIdentifier:66f8d665323158a58595938c5b1722dc6c2c70ee
{
  "result": "ok",
  "matches": [
    {"meta": {...}, "data": {...}, "links": {...}},
    {"meta": {...}, "data": {...}, "links": {...}},
    ...
  ]
}

(If we're going to pick a query language, I think the Lucene query
language would be a reasonable choice. It's native to Elasticsearch
and, obviously, Lucene, and it wouldn't be hard to implement a parser
for a subset of the language and translate queries to a backend-native
representation.)
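
As a rough illustration of how small such a translator could be, the sketch below handles only the tiny subset used in the example above: bare `field:value` terms joined by `AND`. It is an assumption-laden sketch rather than a proposal for the real API: the `lucene_to_mongo` name is hypothetical, MongoDB is picked as the target only because it is the backend discussed in this thread, and quoting, `OR`, `NOT`, ranges and wildcards are deliberately rejected.

```python
import re

# A bare term: dotted field name, a colon, then a value without whitespace.
_TERM = re.compile(r"^([A-Za-z_][\w.]*):(\S+)$")


def lucene_to_mongo(query):
    """Translate `field:value AND field:value ...` into a MongoDB filter.

    Field names are passed through unchanged, relying on MongoDB's dot
    notation for nested documents (e.g. "meta.type").
    """
    clauses = []
    for part in re.split(r"\s+AND\s+", query.strip()):
        match = _TERM.match(part)
        if match is None:
            raise ValueError(f"unsupported query term: {part!r}")
        field, value = match.groups()
        clauses.append({field: value})
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


print(lucene_to_mongo(
    "meta.type:EiffelSourceChangeSubmittedEvent AND "
    "data.GitIdentifier:66f8d665323158a58595938c5b1722dc6c2c70ee"
))
```

The query string is shown after URL decoding, where each `+` has become a space. The resulting dict could be handed to a pymongo `find` call; a different backend would need its own translation of the same parsed terms.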