using core.logic with datomic

using core.logic with datomic Michael Bradley, Jr. 10/14/12 10:49 PM
Stuart,

At the Datomic workshop preceding the Strange Loop conference, you made a comment regarding some work David Nolen had done showing how core.logic and Datomic can be used together.  I don't remember exactly what you said, but I think the basic idea was that there is a "better" (more efficient? idiomatic?) way to accomplish that end than what David had demonstrated. (I have in mind a page from core.logic's github wiki, but I'm not sure that's what you were referring to.)

I'm about to start experimenting with core.logic in the context of an application that makes heavy use of Datomic, and was hoping you could restate, and perhaps further explain, what you said at the workshop.

By the way, my little team is really enjoying Datomic, and I'm sure you won't be surprised when I say that it's leading us to think about our data and algorithms in ways that were barely conceivable (for us, at least) only a few weeks ago.
Re: using core.logic with datomic David Nolen 10/15/12 6:49 AM
I've added datomic support to core.logic
http://github.com/clojure/core.logic/commit/b0f3b251ea9b4efd80d2b31d4e9eb1e9d096a1bf

I believe these changes should address the issues Stuart brought up -
I now pass along components to limit the number of datoms returned.
I've also made the API more sensible - you pass along a connection,
the index, and a 4 element vector that you want to unify with the
datom.

Feedback welcome!

David
Re: using core.logic with datomic Rich Hickey 10/15/12 7:54 AM
That's cool David, thanks!

There are a few things we could do to make that better and faster:

    It shouldn't be using the private datomic.db.Datum type, but the public datomic.Datom interface.

    datomic-rel should take a db (data), not a connection (communications).

    Taking the index explicitly is an anti-pattern - I think we could instead look at what is bound in a/q and pick the best index based on that.

There are probably others, but I'm not familiar with the core.logic internals.

I don't have time right at the moment to work on this, but would be happy to help someone who did.

Rich

Re: using core.logic with datomic David Nolen 10/15/12 8:38 AM
On Mon, Oct 15, 2012 at 10:53 AM, Rich Hickey <richh...@datomic.com> wrote:
> That's cool David, thanks!
>
> There are a few things we could do to make that better and faster:
>
>     It shouldn't be using the private datomic.db.Datum type, but the public datomic.Datom interface.

Easy enough to fix.

>     datomic-rel should take a db (data), not a connection (communications).

Done!

>     Taking the index explicitly is an anti-pattern - I think we could instead look at what is bound in a/q and pick the best index based on that.

This would have to happen by examining q - it's probably best then for
q to be a map and not a vector. Does this sound right to you?

Thanks,
David
Re: using core.logic with datomic Rich Hickey 10/15/12 9:03 AM
I'm not entirely sure of the role of q. Will people also put constants there? Don't-care values?

The (EAVT) nth-indexes work fine as indexes, a map wouldn't fit with the patterny aspect. Also the best index depends on 'a' as well, right?


Re: using core.logic with datomic David Nolen 10/15/12 9:16 AM
>> On Mon, Oct 15, 2012 at 10:53 AM, Rich Hickey <richh...@datomic.com> wrote:
> I'm not entirely sure of the role of q. Will people also put constants there? Don't-care values?

q is a Clojure vector but really it's treated just like a Prolog-style
4 element tuple of fresh logic vars and/or constants.

We run the Datomic query using whatever ground vars / constants are
provided in q as the components to the Datomic datoms call - this will
return a seq of datoms. We then unify each one in turn with q - this
will filter out anything that doesn't unify.
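The filtering step described above can be sketched in plain Clojure. This is only a toy model: datoms are plain `[e a v t]` vectors, fresh logic vars are stood in for by the keyword `:_`, and `toy-rel` and `matches?` are hypothetical names, not part of core.logic or Datomic. The real datomic-rel would first narrow the scan through Datomic's datoms call before unifying.

```clojure
;; Sketch only: datoms are modelled as plain [e a v t] vectors and fresh
;; logic vars as the keyword :_ ; neither is how core.logic or Datomic
;; actually represents them.

(def datoms
  [[1 :person/name "Ann" 100]
   [1 :person/age  31    100]
   [2 :person/name "Bob" 101]])

(defn ground?
  "True when x stands for a constant or bound var rather than a fresh var."
  [x]
  (not= x :_))

(defn matches?
  "True when every ground position of q equals the same position of datom."
  [q datom]
  (every? true?
          (map (fn [qx dx] (or (not (ground? qx)) (= qx dx)))
               q
               datom)))

(defn toy-rel
  "Returns the datoms that match the 4-element pattern q."
  [datoms q]
  (filter #(matches? q %) datoms))

(toy-rel datoms [:_ :person/name :_ :_])
;; => ([1 :person/name "Ann" 100] [2 :person/name "Bob" 101])
```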

> The (EAVT) nth-indexes work fine as indexes, a map wouldn't fit with the patterny aspect. Also the best index depends on 'a' as well, right?

I think I see the API you want. For example we could imagine a q
vector that looks like so in terms of fresh and ground logic vars.
Again by fresh we just mean logic vars that have no value (yet).
ground means either we have a logic var bound to a value or a
constant.

[fresh ground fresh ground]

In this case we should use the index most sensible for this q, which
here is :aevt.

Does this sound right?

David
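As a rough illustration of inferring the index from what is ground, here is a minimal sketch. The function name `choose-index` and the preference order are assumptions made for this example, not the rule core.logic uses. A real implementation would also have to check that the attribute is actually covered by the chosen index.

```clojure
;; Sketch only: the preference order below is an assumption for illustration.
;; The keywords are Datomic's four index names.

(defn choose-index
  "Takes a 4-element vector of booleans saying whether e, a, v and t are
  ground, and returns an index keyword."
  [[e? a? v? _t?]]
  (cond
    e?          :eavt    ; entity known: look at that entity's datoms
    (and a? v?) :avet    ; attribute and value known
    a?          :aevt    ; only the attribute known
    v?          :vaet    ; only a value known
    :else       :eavt))  ; nothing known: fall back to a full scan

;; [fresh ground fresh ground], as in the message above
(choose-index [false true false true])
;; => :aevt
```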
Re: using core.logic with datomic Rich Hickey 10/15/12 9:27 AM
Yes, but might we not also find more things bound in the 'a' passed to the closure with which we can further constrain the index?



Re: using core.logic with datomic David Nolen 10/15/12 9:43 AM
On Mon, Oct 15, 2012 at 12:27 PM, Rich Hickey <richh...@datomic.com> wrote:
> Yes, but might we not also find more things bound in the 'a' passed to the closure with which we can further constrain the index?

I don't follow, is it possible to constrain the index by providing
something more than e, a, v, t?

To be clear, the datomic-rel will pass along as many components as we
can find in q.

To be extra clear, if you're asking if we lookup the values for e, a,
v, t provided in q by substituting any values we find for them in 'a'
... yes we do :)

David
Re: using core.logic with datomic Rich Hickey 10/15/12 9:55 AM
Again, I don't know the internal flow of core.logic. But, let's say [e a v t] are all unbound in q, but e is bound in 'a' - the logic for choosing the index should leverage that, not just the bound/ground contents of 'q'.





Re: using core.logic with datomic David Nolen 10/15/12 10:04 AM
On Mon, Oct 15, 2012 at 12:55 PM, Rich Hickey <richh...@datomic.com> wrote:
> Again, I don't know the internal flow of core.logic. But, let's say [e a v t] are all unbound in q, but e is bound in 'a' - the logic for choosing the index should leverage that, not just the bound/ground contents of 'q'.

Ok we're talking about the same thing :) What is and isn't bound in q
is fully determined by the contents of 'a', and datomic-rel already
takes this into account.

So I'll change the index to be inferred from the ground components of
q (which are determined through 'a') and I think we're good to go.

David
Re: using core.logic with datomic Rich Hickey 10/15/12 10:13 AM
For my edification, is that because walk(*) has effects on the logic vars in q, or due to the context of use of the closure? Because it's not functionally apparent in the code.

Re: using core.logic with datomic David Nolen 10/15/12 10:38 AM
On Mon, Oct 15, 2012 at 1:13 PM, Rich Hickey <richh...@datomic.com> wrote:
> For my edification, is that because walk(*) has effects on the logic vars in q, or due to the context of use of the closure? Because it's not functionally apparent in the code.

'a' is a substitution map - kind of like an environment in an interpreter. It maintains a list of bindings from logic vars to their values (if they exist).

"goals" in core.logic are just closures - these closures take a single parameter 'a' which is just the substitution map. goals update the substitution map in a purely functional way and return it so it can be threaded into the next goal. So for example:

(run* [q]
  (fresh [e a v t]
    (== q [e a v t])             ;; 'a' = {q [e a v t]}
    (== e 10)                    ;; 'a' = {q [e a v t], e 10}
    (datomic-rel db [e a v t])))


walk* is simply a recursive walk, it returns the second argument with all the logic vars replaced by whatever values we found for them in 'a'.

so (walk* 'a' q) in this case will result in [10 a v t]
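To make the substitution lookup concrete, here is a minimal sketch of walk and walk* in plain Clojure. It assumes logic vars are modelled as symbols and the substitution as an ordinary map; core.logic uses its own types for both, so this shows the idea rather than the library's code.

```clojure
;; Sketch only: logic vars are symbols, the substitution is a plain map.

(defn walk
  "Follows bindings for v in substitution s until reaching a value or an
  unbound var."
  [s v]
  (if (and (symbol? v) (contains? s v))
    (recur s (get s v))
    v))

(defn walk*
  "Like walk, but also descends into vectors and walks their elements."
  [s v]
  (let [v (walk s v)]
    (if (vector? v)
      (mapv #(walk* s %) v)
      v)))

;; The substitution after the two == goals in the example above
(def s '{q [e a v t], e 10})

(walk* s 'q)
;; => [10 a v t]
```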

David
Re: using core.logic with datomic Michael Bradley, Jr. 10/15/12 10:45 AM
On Monday, October 15, 2012 8:49:52 AM UTC-5, David Nolen wrote:
On Mon, Oct 15, 2012 at 1:49 AM, Michael Bradley, Jr.
<michaels...@gmail.com> wrote:
<snip> 


I've added datomic support to core.logic
http://github.com/clojure/core.logic/commit/b0f3b251ea9b4efd80d2b31d4e9eb1e9d096a1bf

I believe these changes should address the issues Stuart brought up -
I now pass along components to limit the number of datoms returned.
<snip> 


David, thanks for working on this!  I'll be giving it a whirl shortly.
Re: using core.logic with datomic David Nolen 10/15/12 11:24 AM
On Mon, Oct 15, 2012 at 1:45 PM, Michael Bradley, Jr. <michaels...@gmail.com> wrote:
> David, thanks for working on this!  I'll be giving it a whirl shortly.

Might want to wait till I get the suggested improvements in later this evening :) I'll cut another beta shortly thereafter.
Re: using core.logic with datomic David Nolen 10/15/12 7:03 PM
Ok I've made the suggested changes. We now use the datomic.Datom interface for unification. We use a db not a connection. We now infer which index to use based on the contents of the query tuple. You can see the results at the bottom of the file:


Working on this helped me to understand some things conceptually about Datomic that I was unclear about. The core.logic version is pretty low level: we have to extract the entids and we need to run two queries. It's neat, but a lot of work would need to be done for core.logic's Datomic queries to be as concise as Datomic's Datalog syntax :)

Further suggestions welcome!

David
Re: using core.logic with datomic Rich Hickey 10/16/12 5:52 AM
Cool. I'll have to give you some logic for the index determination/prioritization. Not all attributes have AVET or VAET indexes available.

The entid resolving can be moved under the hood, and can be done for all E and A, and even V where the attr type is ref.
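A minimal sketch of what resolving entids "under the hood" could look like. The lookup table `idents`, the set `ref-attrs`, and the function names are hypothetical stand-ins for information a real implementation would read from the Datomic db.

```clojure
;; Sketch only: `idents` and `ref-attrs` are made-up data standing in for
;; what would be looked up in the database.

(def idents    {:person/name 63, :person/friend 64, :ann 1001})
(def ref-attrs #{:person/friend})

(defn resolve-id
  "Replaces a keyword ident with its entity id; leaves anything else alone."
  [x]
  (if (keyword? x) (get idents x x) x))

(defn resolve-pattern
  "Resolves idents in the e and a positions, and in v only when a is a
  ref-typed attribute."
  [[e a v t]]
  [(resolve-id e)
   (resolve-id a)
   (if (ref-attrs a) (resolve-id v) v)
   t])

(resolve-pattern '[e :person/friend :ann t])
;; => [e 64 1001 t]
```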

As far as two queries, do you mean the pairs of:

    (== q0 [e attr-id a])
    (datomic-rel db q0)

Before, you were passing the vector right into datomic-rel.

A shorter name than datomic-rel might help with the concision :)

Rich

Re: using core.logic with datomic David Nolen 10/16/12 6:04 AM
On Tue, Oct 16, 2012 at 8:52 AM, Rich Hickey <richh...@datomic.com> wrote:

> Cool. I'll have to give you some logic for the index determination/prioritization. Not all attributes have AVET or VAET indexes available.

Excellent.

> The entid resolving can be moved under the hood, and can be done for all E and A, and even V where the attr type is ref.

Ah right!

> As far as two queries, do you mean the pairs of:
>
>     (== q0 [e attr-id a])
>     (datomic-rel db q0)
>
> Before, you were passing the vector right into datomic-rel.

Yes, I could pass the vectors directly in - my point was more an observation that we have to make two calls, whereas Datomic Datalog knows to do the join.

> A shorter name than datomic-rel might help with the concision :)

Noted.
Re: using core.logic with datomic Rich Hickey 10/16/12 6:26 AM
I see. Well, presumably people are using core.logic instead of datalog for a (good) reason. Doing anything that Datomic datalog could do (e.g. straight queries) is not a good reason :) But good reasons include logic programs that are generative, or search giant permutation spaces, and that produce values incrementally. In such programs it is likely an anti-pattern to do fully realized internal joins.

If we stick with core.logic's evaluation strategy, keep the entry point overhead low (that multimethod is not long for this world :) and fully leverage Datomic's indexes, core.logic + Datomic should give people a powerful logic tool coupled with durable and large (and dynamically updated, and well cached and ...) datasets, with good performance.

Rich

Re: using core.logic with datomic David Nolen 10/16/12 3:42 PM
On Tue, Oct 16, 2012 at 9:26 AM, Rich Hickey <richh...@datomic.com> wrote:
>
>
> If we stick with core.logic's evaluation strategy, keep the entry point overhead low (that multimethod is not long for this world :) and fully leverage Datomic's indexes, core.logic + Datomic should give people a powerful logic tool coupled with durable and large (and dynamically updated, and well cached and ...) datasets, with good performance.
>
> Rich


We'll see :) 


More improvements, it actually looks pretty fun now :)

David
Re: using core.logic with datomic kovasb 10/16/12 10:36 PM
This sounds like neat stuff.

Could anyone venture an explanation of how I might use this?

I'm having a hard time wrapping my head around what this interop means.

Stu in his ORM talk termed it as "tuple at a time processing"
complementary to the set-oriented datomic approach. I admit I didn't
grasp that either.

It would be great to have a toy/sample problem a la the canonical
constraint puzzle problem to make this concrete, at least for me.
Thanks!
Re: using core.logic with datomic Rich Hickey 10/17/12 5:11 AM
Prolog-style logic engines (like core.logic) can be used to write generative programs like schedulers.

E.g. you might have a db of classrooms, classes, professors (and what they teach) and students (and what they want to take).

The number of schedule permutations is huge. A generative program can be used to spit out schedules one at a time that meet some basic constraints (no double-booking of people or places), while some downstream function assesses for other criteria (how many students get what they want, is it balanced for professors etc). Such a process will stop after it exhausts some (usually time-based) budget, yielding the best answer found so far.
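The generate-then-assess loop described above can be sketched with lazy sequences in plain Clojure. Everything here is made up for illustration (the classes, slots, rooms, the constraint and the scoring function), and it uses neither core.logic nor Datomic; it only shows candidates being produced one at a time and consumed until a time budget runs out.

```clojure
;; Sketch only: toy data, a toy constraint and a toy scoring function.

(def classes [:logic :databases])
(def slots   [:mon-9 :mon-10])
(def rooms   [:r1])

(defn candidate-schedules
  "Lazily yields every assignment of classes to [slot room] pairs."
  [classes]
  (if (empty? classes)
    [{}]
    (for [rest-sched (candidate-schedules (rest classes))
          slot       slots
          room       rooms]
      (assoc rest-sched (first classes) [slot room]))))

(defn no-double-booking?
  "Basic constraint: no two classes share the same slot and room."
  [sched]
  (or (empty? sched) (apply distinct? (vals sched))))

(defn score
  "Hypothetical downstream criterion: prefer :logic at :mon-9."
  [sched]
  (if (= :mon-9 (first (sched :logic))) 1 0))

(defn best-within
  "Consumes schedules until the time budget (in ms) is exhausted and
  returns the best one seen so far."
  [budget-ms scheds]
  (let [deadline (+ (System/currentTimeMillis) budget-ms)]
    (reduce (fn [best s]
              (cond
                (> (System/currentTimeMillis) deadline) (reduced best)
                (or (nil? best) (> (score s) (score best))) s
                :else best))
            nil
            scheds)))

(best-within 50 (filter no-double-booking? (candidate-schedules classes)))
```

Because the candidates are generated lazily, stopping early never pays for the schedules that were not examined.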

This is different from traditional non-generative queries (as implemented by Datalog). While Prolog-style engines can be used for queries, they are usually less good at it than set-at-a-time engines like Datalog and SQL, having issues with both performance and termination.

The ability to have both kinds of engines work on the same database is tremendous power.

Because Datomic is not monolithic, there is actually nothing about its indexing and storage that has anything to do with Datalog. A query engine is yet another orthogonal component, once you no longer have to situate it in a central server.