next release?

Michael Hrivnak

Apr 24, 2016, 10:02:49 PM

to cayley-users

This looks like a great project. In particular, it seems that it would be relatively easy to get this packaged and included in mainstream Linux distributions. Currently the options are very limited if you want to use a graph DB that's already part of your distro, so it would be wonderful to have Cayley as an option. I'm talking with a Fedora packager who is interested in packaging it there.

But is this project still active? I see the last release was 1 year ago, and the past month has seen no PRs merged or issues closed. Is this just a side-project for one person, or is there a developer community able to move things forward? I see that there are 196 un-released commits on master, so that's a good sign that progress has been made.

Thanks,

Michael

Denys Smirnov

Apr 25, 2016, 11:47:39 AM

to cayley-users, Barak Michener

Hi Michael,

The project is alive and progress is being made, but the current maintainer (Barak Michener) is busy right now, so most commits and PRs are frozen. I think a new version will be released as soon as he is able to review some changes.

I'm also wondering what kind of use case you have for the database. It's pretty surprising that it might be included in a Linux distribution :)

All the best,
Denys

On Monday, April 25, 2016 at 5:02:49 UTC+3, Michael Hrivnak wrote:

Michael Hrivnak

Apr 25, 2016, 4:40:00 PM

to cayley-users, m...@barakmich.com

Thank you for the quick response. It sounds like it may be time to add one or two more maintainers to keep the project moving.

To your surprise about Linux distribution inclusion, I think it's perfectly reasonable. Many people want to try out graph DBs. Pick your favorite Linux distribution, and there are likely numerous databases ready for easy installation. If I want to create a web app, everything is available to form the usual stacks people like: PostgreSQL, MySQL/MariaDB, MongoDB, Redis, and more are all usually available. Strikingly, the options for graph DBs are almost non-existent. You have to manually install one from a third-party source (even if that third party is the vendor), or build it yourself. Having a DB packaged with a distribution has key advantages:

1) Ease of getting started. "dnf install cayley" and I'm off to the races. I don't have to worry about dependencies, will it build, will it run in this environment, how do I start it, what about SELinux, etc.

2) Maintenance. What versions will continue to be available? Who is keeping it up-to-date? Are they keeping the version/API stable? Sometimes that means backporting fixes...

3) Inclusive integration. If my project is packaged in a distribution, its dependencies should also be packaged in that distribution. Most distros have policies to that effect. The project I work on for my job happens to be one of those (http://pulpproject.org). A graph database could help us solve at least one big problem, but I don't have options I can use.

As for my use case, I'll try to describe it succinctly. Given a huge number of servers, with lots of variation in what software is installed and what repositories they have access to, I need to determine which systems need which software updates. Each system may have thousands of packages installed, and have access to tens of thousands. We are solving it now by pre-calculating a lot of data in very specific formats, which can then be queried quickly. But the calculation takes a lot of time. As a total newbie to graph DBs, it seems like a promising use case. Feedback on that is certainly welcome though.
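
One way to picture this use case as a graph problem is sketched below in plain JavaScript. It is only an illustration under stated assumptions: the predicate names (`installed`, `has_repo`, `contains`, `updates`), the sample servers and packages, and the helper functions `out` and `neededUpdates` are all invented here. It is not Pulp's data model and it does not use Cayley's API.

```javascript
// Each fact is a (subject, predicate, object) triple.
// All names and data below are hypothetical sample values.
const triples = [
  ["server1", "installed", "bash-4.3"],
  ["server1", "installed", "curl-7.40"],
  ["server1", "has_repo", "updates-repo"],
  ["server2", "installed", "bash-4.3"],
  ["server2", "has_repo", "base-repo"],
  ["updates-repo", "contains", "bash-4.4"],
  ["base-repo", "contains", "bash-4.3"],
  ["bash-4.4", "updates", "bash-4.3"],
];

// Return the objects reachable from `subject` over `predicate`.
function out(subject, predicate) {
  return triples
    .filter(([s, p]) => s === subject && p === predicate)
    .map(([, , o]) => o);
}

// A system needs an update when a package in one of its repositories
// "updates" a package that the system has installed.
function neededUpdates(system) {
  const installed = new Set(out(system, "installed"));
  const needed = [];
  for (const repo of out(system, "has_repo")) {
    for (const candidate of out(repo, "contains")) {
      for (const old of out(candidate, "updates")) {
        if (installed.has(old)) needed.push({ from: old, to: candidate });
      }
    }
  }
  return needed;
}

const server1Updates = neededUpdates("server1"); // one entry: bash-4.3 to bash-4.4
const server2Updates = neededUpdates("server2"); // empty: its repo has nothing newer
```

The point of the sketch is that the answer falls out of following edges (system to repository to package to the package it updates) at query time, rather than from a pre-calculated table. Whether that is fast enough at the scale described is a separate question this sketch does not address.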

Thanks,

Michael

Navpreet Singh

May 1, 2016, 2:36:37 PM

to cayley-users, m...@barakmich.com

Hi Denys,

Is there any way to limit how much data Cayley will send or process? If I run g.V().All() to fetch all the data at once (more than 4 lakh, i.e. 400,000, records in total), the Cayley process dies. If someone runs a heavy query, it should return a recovery conflict error or some other error rather than killing the Cayley process.

Denys Smirnov

May 1, 2016, 3:11:37 PM

to Navpreet Singh, cayley-users

All() is meant to return all results at once. You may use GetLimit(500) instead to set an actual limit, or use ForEach to process results one by one. I know that this will not prevent an end user from running an All query and waiting for the process to be OOM-killed. As an alternative, we can implement a configurable limit that affects all Gremlin queries. You may also use Cayley as a library to get full control of execution.
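
To make the difference between the three calls concrete, here is a minimal sketch in plain JavaScript. It does not talk to Cayley: `makePath` and `fakeVertices` are hypothetical stand-ins that only borrow the names All, GetLimit and ForEach, layered over a lazy generator, to show why a capped read or a one-at-a-time read keeps memory bounded while All materializes every result.

```javascript
// Hypothetical stand-in for a query path; NOT Cayley's implementation.
// `source` is a function that returns a fresh iterator of results.
function makePath(source) {
  return {
    // Collects every result into one array: memory grows with result size.
    All() {
      return Array.from(source());
    },
    // Stops pulling from the iterator once `limit` results are collected.
    GetLimit(limit) {
      const results = [];
      if (limit <= 0) return results;
      for (const item of source()) {
        results.push(item);
        if (results.length >= limit) break;
      }
      return results;
    },
    // Hands results to `callback` one by one; nothing is accumulated here.
    ForEach(callback) {
      for (const item of source()) callback(item);
    },
  };
}

// Lazily yields `count` fake vertices, so nothing is built up front.
function* fakeVertices(count) {
  for (let i = 0; i < count; i++) yield { id: "node" + i };
}

const path = makePath(() => fakeVertices(400000));

const firstPage = path.GetLimit(500); // holds at most 500 results
let seen = 0;
path.ForEach(() => { seen += 1; });   // visits every result, keeps none
```

In this sketch only `All()` would need room for all 400,000 objects at once. The other two calls bound what is held in memory by the caller, which is the role GetLimit(500) and ForEach play in the advice above.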

On Sun, May 1, 2016 at 21:36, Navpreet Singh <navpreet....@gmail.com> wrote:

Navpreet Singh

May 2, 2016, 4:12:35 AM

to cayley-users, navpreet....@gmail.com

Thanks Denys, it would be great to include the limit in the configuration.
