sketching out meta among other things

Doug Tangren

Feb 21, 2013, 10:42:04 AM
to adep...@googlegroups.com
First of all, thanks for outlining so much of this. I'm interested in everything, but I am going to try to sketch out a proof-of-concept impl of the meta section this weekend, with conscript (as a user interface) and jgit (as a DVCS tool), sketching out other impls as needed. I like working with something tangible (and having something to play with). I'll post questions and progress along the way. Huge, huge, huge thanks for this very detailed outline of goals and suggestions. I am a huge fan of this whole idea.

One thing that seems like it will come up (but does not need to be decided now) is the serialized format of the metadata. If all we need are keys and values, I'm fine with the Java properties style format. If we want more structure, I was going to move toward JSON. Are there any concerns with that? JSON is a well-adopted exchange and serialization format and has decent support in the Scala community, e.g. http://json4s.org/. The only downside is that it's kind of awkward to author by hand. I am going to assume, though, that this metadata will be machine authored and read, so that's not that big of an issue.
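
To make the trade-off concrete, here is a minimal Java sketch of the properties-style option. The keys (organization, name, version, dependencies.N.*) and the class MetaFormatSketch are made up for illustration and are not part of any agreed adept format; only java.util.Properties is real. It suggests that flat key/value pairs load easily, while anything nested has to be emulated with a key-prefix convention that a structured format such as JSON would express directly.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class MetaFormatSketch {
    // Hypothetical metadata for one module in java.util.Properties syntax.
    // Nesting is only emulated through dotted key prefixes.
    static final String SAMPLE = String.join("\n",
        "organization=com.example",
        "name=demo",
        "version=1.0.0",
        "dependencies.0.name=lib-a",
        "dependencies.0.version=2.1",
        "dependencies.1.name=lib-b",
        "dependencies.1.version=0.3");

    // Parse properties text held in memory.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Collect all entries whose key starts with prefix, with the prefix
    // stripped, sorted by the remaining key.
    static Map<String, String> group(Properties props, String prefix) {
        Map<String, String> out = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                out.put(key.substring(prefix.length()), props.getProperty(key));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println(props.getProperty("name"));
        System.out.println(group(props, "dependencies."));
    }
}
```

Grouping by prefix is the part a reader of this format would have to reimplement everywhere; with JSON the dependencies would simply be an array of objects.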

I expect to have something like Mark's example usage this weekend but no promises. I may try to make it through an initial cache impl as well.

Wes Freeman

Feb 21, 2013, 10:51:09 AM
to Doug Tangren, adept-dev
+1 for JSON; it already feels like some of those metadata items could be grouped together. Down with XML!

Yeah, thanks for fleshing out your outline, Mark. I plan to contribute but it sounds like Doug will beat me to it, which is great. :)

Wes

Fredrik Ekholdt

Feb 21, 2013, 10:59:10 AM
to Doug Tangren, adep...@googlegroups.com
Hey everyone!
Agree that this is a great idea that hopefully will save future me from hours of frustration :)

I also have the beginnings of a concept here (Mark and I took the name from this project): https://github.com/freekh/adept
As you can see in the README, there is an outline of the commands I want to implement. I have started on a prototype as well, but it can only do init and add (some error handling is still missing as well). I hope to have 'describe' done by today (time permitting).

I see I have a slightly different basis for the metadata, since I am storing it in an embedded DB.
I started out with the idea of using git and text files as well, but realized that I ended up needing too much database functionality not to use a database. I want fast searching, data corruption checks and a way to update from one version of the metadata to another (when we update adept).
With idempotency, I think that versioning (updating the repository and merging local changes) should not demand too much effort.

Feedback is welcome :)

I am going to start a thread on what is important when it comes to the actual files as well.


- Fredrik

Doug Tangren

Feb 21, 2013, 11:19:16 AM
to Fredrik Ekholdt, adep...@googlegroups.com

Thanks, a good point about using an embedded DB. What parts are you working on, so I know what to shy away from to avoid overlap?

Indrajit Raychaudhuri

Feb 21, 2013, 11:56:55 AM
to adept-dev
Great stuff going on!

Btw, how do you guys feel about HOCON [1]? On one hand it does a lot more than what might be necessary, but on the other hand it offers an interesting unification of the JSON and properties formats.

- Indrajit

[1] https://github.com/typesafehub/config/blob/master/HOCON.md

Fredrik Ekholdt

Feb 21, 2013, 12:04:26 PM
to Doug Tangren, adep...@googlegroups.com
On Feb 21, 2013, at 5:19 PM, Doug Tangren wrote:

> Thanks a good point about using an embedded db. What parts are you working on so I know what to shy away from to avoid overlap?

Hmm.. I am still trying to get my head around this, and as such I am all over the place :/
Currently I just want to make the "describe"/"read" method work, which should be easy enough. Then I should continue to see if versioning really works, and then I have to test versioning together with pushing and pulling. I think the work on pushing and pulling with a DB is at least a week away, so you would not overlap if you were to work on a prototype related to that! :)

Doug Tangren

Feb 21, 2013, 12:43:37 PM
to Fredrik Ekholdt, adep...@googlegroups.com
Question. I noticed that you (and Mark) work for Typesafe and that the organization is Typesafe-namespaced ( https://github.com/freekh/adept/blob/master/client/build.sbt#L5 ).

Is this going to be a "Typesafe" project? My main interest was producing something from the community. If Typesafe is paying people to work on it, I'm a little less motivated.

Wes Freeman

Feb 21, 2013, 2:04:40 PM
to Doug Tangren, Michael Hunger, Fredrik Ekholdt, adept-dev
I'm interested to hear some Typesafe responses to Doug's question--although in the end I'm OK with it being namespaced under Typesafe, largely because I think it would gain support faster, and that's sort of critical for this sort of thing to succeed. From the nescala presentation, it sounded like Mark was "never" going to work on this. But he didn't mention Fredrik, who appears to have already put some effort into it. :P

And back to the other stuff. Speaking of embedded databases... have you guys considered using an embedded graph database like Neo4j? It would be a nice fit, being able to handle dependencies via directed relationships. Fetching a whole subgraph could be just a single query instead of traversing over the children via SQL--maybe Slick abstracts this out into lazy table queries?

START project=node:Projects("organization:com.typesafe AND name:sbt")
MATCH project-[:dependsOn*]->dependencies
RETURN collect(distinct dependencies)

Which would give you a collection of the dependency nodes and their properties (for the whole dependency subgraph). You could even build some resolution rules into that sort of query if you wanted.
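
For comparison, roughly the same traversal without a graph database can be sketched in Java as a breadth-first walk over an in-memory adjacency map. This is only an illustration under assumed names: the class DependencyClosureSketch, the method transitiveDependencies and the sample module names are hypothetical, and it is not meant as an exact equivalent of the Cypher query above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DependencyClosureSketch {
    // Returns every module reachable from root by following dependsOn edges,
    // excluding root itself. The seen set also guards against cycles.
    static Set<String> transitiveDependencies(Map<String, List<String>> dependsOn, String root) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String dep : dependsOn.getOrDefault(current, List.of())) {
                // Set.add returns false for modules we have already visited.
                if (!dep.equals(root) && seen.add(dep)) {
                    queue.add(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical graph: app depends on lib-a and lib-b, which share lib-c.
        Map<String, List<String>> graph = Map.of(
            "app", List.of("lib-a", "lib-b"),
            "lib-a", List.of("lib-c"),
            "lib-b", List.of("lib-c"),
            "lib-c", List.of());
        System.out.println(transitiveDependencies(graph, "app"));
    }
}
```

With relational storage, each loop iteration would likely become one lookup per module, which is the cost the single graph query is meant to avoid.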

Perhaps a premature optimization, but it fits the graph db use case extremely well. :P Just throwing the idea out there. Would give me an excuse to do a better job supporting embedded neo from my library (currently it's targeted for just the REST server).

Also, I noticed Fredrik mentioned code searching in the adept README--how were you thinking of implementing that?

Wes

Mark Harrah

Feb 21, 2013, 7:45:01 PM
to adep...@googlegroups.com
I think both human-readable and machine-readable are important. The program has to read and write the metadata of course and I think preserving structure in the metadata is essential for reliable automation. Human readable has the advantage of reducing the up-front work required. A binary format requires tooling to debug and manipulate, whereas a text format can be diffed and manually generated or edited. Let's specify requirements and then find a solution and not just pick an existing solution. No one should refrain from using something like JSON or HOCON for prototyping of course, but it might help to remember that XML was once the hot thing.

-Mark

Mark Harrah

Feb 21, 2013, 7:45:57 PM
to adep...@googlegroups.com
On Thu, 21 Feb 2013 16:59:10 +0100
Fredrik Ekholdt <fre...@gmail.com> wrote:

> Hey everyone!
> Agree that this is a great awesome idea that hopefully will save future me from hours of frustration :)
>
> Also have the beginnings of a concept here (me and mark took the name from this project): https://github.com/freekh/adept
> As you can see in the README there is an outline to commands which I want to implement. I have started on a prototype as well, but it can only do init and add (there is some error handling that is missing as well). I hope to have the 'describe' by today (time permitting).
>
> I see I have slightly different basis for the metadata since I am storing it in an embedded DB.
> I started out with the idea of using git and text files as well, but realized that I ended up needing too much database functionality to not use a database. I want fast searching, data corruption checks and a way to update from a version of metadata to another (when we update adept).

Can you elaborate? For example, what are the advantages/disadvantages of the native representation being a database vs. a locally generated index? How are the database contents transported between machines?

-Mark

Mark Harrah

Feb 21, 2013, 7:48:47 PM
to adep...@googlegroups.com
On Thu, 21 Feb 2013 10:51:09 -0500
Wes Freeman <freem...@gmail.com> wrote:

> +1 for JSON; it already feels like some of those metadata items could be
> grouped together. down with XML!
>
> Yeah, thanks for fleshing out your outline, Mark. I plan to contribute but
> it sounds like Doug will beat me to it, which is great. :)

I'm pretty sure there will be plenty of work to go around ;)

-Mark

Mark Harrah

Feb 21, 2013, 7:50:56 PM
to adep...@googlegroups.com
On Thu, 21 Feb 2013 12:43:37 -0500
Doug Tangren <d.ta...@gmail.com> wrote:

> Is this going to be a "typesafe" project. My main interested was producing
> something from the community. If typesafe is paying people to work on it
> I'm a little less motivated.

Can you elaborate?

-Mark

Josh Suereth

Feb 21, 2013, 11:03:19 PM
to Mark Harrah, adep...@googlegroups.com

Indexing for quick retrieval is the reason to use a database or SSTable. Don't knock the size of this metadata; Maven Central will not fit in memory...

Fredrik Ekholdt

Feb 22, 2013, 9:00:32 AM
to adep...@googlegroups.com, Fredrik Ekholdt


On Thursday, 21 February 2013 18:43:37 UTC+1, Doug Tangren wrote:

> Question. I noticed that you (and mark ) work for typesafe and that the organization was typesafe namespaced ( https://github.com/freekh/adept/blob/master/client/build.sbt#L5 )
>
> Is this going to be a "typesafe" project. My main interested was producing something from the community. If typesafe is paying people to work on it I'm a little less motivated.

It is true that both Mark and I work for Typesafe. Dependency management in general has in fact been brought up on our mailing lists, but AFAIK nobody has been assigned to work on dependency management in general or on this project in particular.
This project originated when Mark and I started talking at breakfast at our company meeting and noticed that we shared many opinions about dependency management. I had already given it a bit of thought in my own repo. We decided to see if we could concretize our ideas and share them with the community, to see if there is interest and possibly to start hacking on it. Mark, the way I understood him, believes that this project needs strong community support to happen, and I agreed. This is why this Google group was started.
To be honest, I haven't checked whether there are any contractual issues with my participation. I would be surprised if there is a problem, but I will figure it out and tell you when I know for sure.

Fredrik Ekholdt

Feb 22, 2013, 9:42:30 AM
to adep...@googlegroups.com, Doug Tangren, Michael Hunger, Fredrik Ekholdt


On Thursday, 21 February 2013 20:04:40 UTC+1, Wes Freeman wrote:

> And back to the other stuff. Speaking of embedded databases... have you guys considered using an embedded graph database like neo4j? Would be a nice fit being able to handle dependencies via directed relationships. Fetching a whole subgraph could be just a single query instead of traversing over the children via SQL--maybe SLICK abstracts this out into lazy table queries?

AFAIK, Slick doesn't support Neo4j yet. Nothing restricts us to use Slick though. I haven't used Neo4j for more than a tutorial, so I do not really know whether it is a better fit.

> Also, I noticed Fredrik mentioned code searching on the adept readme--how were you thinking of implementing that?

Ah, yes :) Hehe, so that is my BIG idea of having a GitHub-like website on top of a repo. This site would provide searching etc. The idea is that it would index the source code from the sources jars.

Fredrik Ekholdt

Feb 22, 2013, 9:58:00 AM
to adep...@googlegroups.com, Doug Tangren, Michael Hunger, Fredrik Ekholdt


On Friday, 22 February 2013 15:42:30 UTC+1, Fredrik Ekholdt wrote:

> AFAIK, Slick doesn't support Neo4j yet. Nothing restricts us to use Slick though.

EDIT: I meant: nothing restricts us from NOT using Slick though

Fredrik Ekholdt

Feb 22, 2013, 9:59:35 AM
to adep...@googlegroups.com


On Friday, 22 February 2013 01:45:57 UTC+1, Mark Harrah wrote:

> Can you elaborate?  For example, what are the advantages/disadvantages of the native representation being a database vs. a locally generated index, for example?  How are the database contents transported between machines?

I am not familiar with the idea of a locally generated index (I guess a database could be used as a simple index), so it is hard for me to say.
The reason I went for a DB in the prototype I am building is that I needed somewhere to persist data. Rather than having to build my own solutions for indexing, searching and keeping things consistent (say, if multiple processes are running adept), I figured a DB gives me this for free.
The disadvantage of a DB is that it is a binary format; on the other hand, debugging and inspecting/changing data is very straightforward in a DB (through JDBC).

I thought that copying database contents could be done either by directly querying the other DB file (if it is on the same machine) or by REST calls if it is on a remote server. Exactly what the REST calls look like depends on what perf requirements we have. The simplest approach I was planning on taking would be to ask the server for its latest version, then request and insert the data between the local version and the latest remote version.
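
That simplest approach could be sketched in Java roughly as follows. Everything here is an assumption for illustration: the classes Remote, Local and Change, the integer version counter and the in-memory log stand in for the real server, the REST calls and the embedded DB, none of which are specified yet.

```java
import java.util.ArrayList;
import java.util.List;

final class IncrementalSyncSketch {
    // One metadata change, tagged with the repository version that introduced it.
    static final class Change {
        final int version;
        final String payload;

        Change(int version, String payload) {
            this.version = version;
            this.payload = payload;
        }
    }

    // Stand-in for the remote server; a real one would sit behind REST calls.
    static final class Remote {
        private final List<Change> log = new ArrayList<>();

        void publish(String payload) {
            log.add(new Change(log.size() + 1, payload));
        }

        int latestVersion() {
            return log.size();
        }

        // Changes with fromExclusive < version <= toInclusive.
        List<Change> changesBetween(int fromExclusive, int toInclusive) {
            return new ArrayList<>(log.subList(fromExclusive, toInclusive));
        }
    }

    // Stand-in for the local repository.
    static final class Local {
        int version = 0;
        final List<String> data = new ArrayList<>();

        // Pull only what is missing; returns the number of changes applied.
        int pull(Remote remote) {
            int latest = remote.latestVersion();
            if (version >= latest) {
                return 0; // already up to date, so pulling again is a no-op
            }
            List<Change> missing = remote.changesBetween(version, latest);
            for (Change change : missing) {
                data.add(change.payload);
                version = change.version;
            }
            return missing.size();
        }
    }

    public static void main(String[] args) {
        Remote remote = new Remote();
        remote.publish("add com.example:demo:1.0.0");
        remote.publish("add com.example:demo:1.0.1");
        Local local = new Local();
        System.out.println(local.pull(remote) + " change(s) applied, now at version " + local.version);
    }
}
```

Because a second pull at the same version applies nothing, repeated syncs are harmless, which fits the idempotency goal mentioned earlier in the thread. Merging local changes is deliberately left out of this sketch.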


Mark Harrah

Feb 22, 2013, 6:22:40 PM
to adep...@googlegroups.com
On Thu, 21 Feb 2013 23:03:19 -0500
Josh Suereth <joshua....@gmail.com> wrote:

> Indexing for quick retrieval is the reason to use a database or sstable.

The two options were a) database as the native representation b) database populated from some other representation, like git. There was no c) no database, ever.

> Don't knock the size of this metadata, maven central will not fit in
> memory...,

I'm not sure when you'd ever try to fit central or any other repository into memory with or without a database.

-Mark

Josh Suereth

Feb 22, 2013, 7:51:54 PM
to adept-dev
Ah, my misconception.