Thanks for taking the initiative, Soren.
Like Paul said, a smarter solver could be nice. Say I have the following example:
(defn tap-accessor [tap]
  (<- [?a ?b]
      (tap ?t)
      (get-a ?t ?a)
      (get-b ?t ?b)))
;; input-tap is a placeholder for whatever source tap you read from
(?<- output-tap [?a] ((tap-accessor input-tap) ?a _))
In the current Cascalog version, #'get-b is still called even though its result is discarded. A smarter solver could probably run this more efficiently by not calling #'get-b at all.
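To make the idea concrete, here is a rough sketch in plain Clojure of the kind of pruning I mean. It is not how Cascalog plans queries and it does not use core.logic; the predicate maps, tap-accessor-body and prune-unused are names I made up for illustration. It also assumes the predicates are listed producers first and that a dropped operation like get-b always emits exactly one value, so removing it cannot change which tuples come out.

```clojure
(require '[clojure.set :as set])

;; Each predicate is plain data: the operation name, the variables it reads
;; and the variables it binds. This mirrors the body of tap-accessor.
(def tap-accessor-body
  [{:op 'tap   :in []    :out ['?t]}
   {:op 'get-a :in ['?t] :out ['?a]}
   {:op 'get-b :in ['?t] :out ['?b]}])

(defn prune-unused
  "Keeps only the predicates needed to produce `wanted` (a set of variables).
  Walks the predicates from last to first: a predicate is kept when one of its
  outputs is wanted directly or is read by a predicate that was already kept."
  [predicates wanted]
  (loop [remaining (reverse predicates)
         needed    (set wanted)
         kept      '()]
    (if-let [[p & more] (seq remaining)]
      (if (some needed (:out p))
        ;; p is needed, so its inputs become needed as well
        (recur more (set/union needed (set (:in p))) (conj kept p))
        (recur more needed kept))
      (vec kept))))
```

Calling (prune-unused tap-accessor-body #{'?a}) keeps the tap and get-a predicates and drops get-b, which is what I would like the solver to figure out when ?b is bound to _.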
Something a core.logic solver could also help with is better error messages, maybe even going as far as 'did you mean this' suggestions when you have written a query that is unsolvable. It could also offer hints for making a query more efficient.
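As a toy illustration of such a 'did you mean' message, the sketch below checks whether every output variable is bound by some predicate and, if not, suggests the closest bound name. This is plain Clojure, not core.logic and not anything Cascalog does today; edit-distance, unbound-var-hints and the predicate maps are hypothetical names of mine.

```clojure
(defn edit-distance
  "Levenshtein distance between strings a and b, computed row by row."
  [a b]
  (let [step (fn [prev-row ca]
               (reduce (fn [row [diag above cb]]
                         (conj row (min (inc (peek row))   ; insertion
                                        (inc above)        ; deletion
                                        (+ diag (if (= ca cb) 0 1))))) ; substitution
                       [(inc (first prev-row))]
                       (map vector prev-row (rest prev-row) b)))]
    (peek (reduce step (vec (range (inc (count b)))) a))))

(defn unbound-var-hints
  "For every output variable that no predicate binds, returns a map with the
  variable and the closest bound variable as a suggestion."
  [out-vars predicates]
  (let [bound (sort (distinct (mapcat :out predicates)))]
    (for [v out-vars
          :when (not (some #{v} bound))]
      {:unbound v
       :did-you-mean (when (seq bound)
                       (apply min-key
                              #(edit-distance (name v) (name %))
                              bound))})))
```

For a query that asks for ?nmae while the only generator binds ?name and ?age, (unbound-var-hints '[?nmae] '[{:op person :out [?name ?age]}]) reports ?nmae as unbound and proposes ?name. A real solver would of course know much more than variable names, but even this level of feedback would beat a stack trace.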
Another thing that would be nice is tools or hooks for profiling. Maybe we could add certain information to the log output that could be analysed later, or, when everything runs in a single Java process, put this information in an atom. I often find myself guessing or applying 'best' practices to make a query more efficient, without being able to measure the effect.
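A minimal sketch of the atom variant, assuming local mode where everything shares one JVM (on a cluster each task has its own JVM, so the numbers would have to go to the logs instead). profile-stats and profiled are hypothetical helpers, not existing Cascalog functions.

```clojure
;; Hypothetical helper, not part of Cascalog: all stats live in one atom,
;; keyed by a label chosen by the caller.
(def profile-stats (atom {}))

(defn profiled
  "Wraps f so that each call adds to a call count and a total elapsed time
  in nanoseconds, stored under `label` in profile-stats."
  [label f]
  (fn [& args]
    (let [start   (System/nanoTime)
          result  (apply f args)
          elapsed (- (System/nanoTime) start)]
      (swap! profile-stats update label
             (fn [{:keys [calls total-ns] :or {calls 0 total-ns 0}}]
               {:calls (inc calls) :total-ns (+ total-ns elapsed)}))
      result)))
```

You would wrap an operation once, e.g. (def get-a* (profiled :get-a get-a)), run the query, and then inspect @profile-stats to see how often each operation was called and how much time it took in total.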
Other Ideas
------------------
One of the things that has bothered me the most is the Cascading Taps part of Cascalog. For example, I am a user of the maple library, which is very inefficient for large Postgres databases. To fix this, though, I would need to rewrite a large part of the library in Java. It would be nice if we could provide Clojure abstractions that make creating Cascading Taps easier while still being efficient. This is probably more something for Cascalog-contrib, but I would still be interested in what others think: does it make sense?
Maybe not a feature, but still important: what about documentation? I think it would help the Cascalog community if there were one place where we could put tutorials, from getting started to using Cascalog Checkpoint, from building your own tap to deploying on EMR. Currently this information is either absent or scattered across Nathan's blog, the Wiki, this mailing list and random blog posts.
Another, maybe wilder, idea that could be interesting is adding support for nREPL, so you don't have to SSH into an (EMR) machine. I am not even sure this is a real benefit, just throwing some ideas out there.
Jeroen