Tim,
I think the final two bits we need to work out here are blob storage
and making sure we don't have to pull down all the binaries when we do
a pull. The other is the metadata in one file.
On blobs,
I have done a bit of research and I can't figure out how to pull down
only certain subdirectories/repo parts. I think it might be possible
to do it below the git API layer, that is, by implementing parts of
git or using a library to interact directly with the server. I could be
very wrong there, though. Going down into the deeps might be acceptable,
but it makes me nervous.
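As an aside, one route worth checking before going below the API layer (a sketch only, not verified against github, which disables upload-archive on its servers, so --remote would only work against hosting we control): git archive can export a single tree path without a clone. Demonstrated here against a throwaway local repo; all paths and names are made up.

```shell
# Sketch only: pulling a single subdirectory without a full clone.
# Build a throwaway repo with two subdirectories to demonstrate.
tmp=$(mktemp -d)
git init -q "$tmp"
mkdir -p "$tmp/organizations" "$tmp/binaries"
echo erlware > "$tmp/organizations/erlware.config"
echo blob > "$tmp/binaries/os_env-0.0.1.tar"
git -C "$tmp" add .
git -C "$tmp" -c user.email=e@x -c user.name=eric commit -qm init

# Export only the 'organizations' tree; nothing under binaries/ comes down.
# Against a real server this would be: git archive --remote=<url> HEAD organizations
git -C "$tmp" archive --format=tar HEAD organizations | tar -tf -
```

The same command with `--remote=` works over ssh where the server runs git-upload-archive, which is exactly the limitation to check against github.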
On metadata in a single file.
The only negative that I can bring up on this is that you lose the
ability to introspect what's in the library trivially with ls or tree.
That is, in the multiple-file version I can ls the organizations
directory and see all the orgs in the repo. In the single-file model
we need to implement introspection in the tool. I am not at all sure
this is a big enough benefit to worry about, though.
Eric
I have neither the interest nor the intention of making it pure git on
the first release. I just want to make sure that we can eventually go
to a pure git solution in the not-too-distant future, preferably
without a ground-up rewrite.
I have been living too much in the porcelain I guess.
> I get the impression that with a combination of this and the
> fetch-pack/git-upload-pack commands, it should be possible to obtain the
> index and pack file (containing the data) for just one SHA
> - 00485fd50ae1a78496cc2159c51a94302697da07d760 clearly contains *only* the
> pack data for the tag, and the tag was built from a branch where we
>
> * deleted the index
> * added the specific artefact/version __only__
>
> So the pack file should contain *only* the data we want for os_env-0.0.1 and
> nothing else. Managing this on the client shouldn't be terrible, as there is
> an ssh capability in OTP - although I suspect we may fall foul of deliberate
> github API limitations if we're not careful. Also the publication (locally
> into your own organisation repo) shouldn't be too difficult, as it's just a
> matter of branching, tagging and a bit of file system manipulation, plus a
> 'git push' when you're ready to make the changes public.
Actually, now that you mention it: we could probably do this trivially
by just sticking the binary on its own branch and merging that branch
as needed into the 'core' working branch. I don't know why that didn't
occur to me.
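A sketch of the binary-on-its-own-branch idea, combined with the immutable-tag point from above: the artefact lives on an orphan branch (no shared history with the metadata branch) plus a tag, so the main branch carries only metadata. All names here (os_env-0.0.1 and so on) are illustrative.

```shell
# Sketch: publish one artefact on its own orphan branch with an immutable tag.
tmp=$(mktemp -d)
git init -q "$tmp"
git -C "$tmp" -c user.email=e@x -c user.name=eric commit -q --allow-empty -m 'index metadata only'
main=$(git -C "$tmp" symbolic-ref --short HEAD)

# Orphan branch: shares no history with the metadata branch at all.
git -C "$tmp" checkout -q --orphan os_env-0.0.1-branch
echo fake-binary > "$tmp/os_env-0.0.1.tar.gz"
git -C "$tmp" add .
git -C "$tmp" -c user.email=e@x -c user.name=eric commit -qm 'publish os_env-0.0.1'
git -C "$tmp" tag os_env-0.0.1      # immutable handle for consumers
git -C "$tmp" checkout -q "$main"   # metadata branch stays binary-free
```

Publication is then just a 'git push' of the branch and tag, as Tim describes.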
>
>
> Ok fair enough, I can go with your point of view here and actually I agree
> with your point about discovery and introspection being much easier. Based
> on what I've said above about the use of low level git commands, we'll be
> able to checkout the master branch with the complete index at will, pulling
> only the index metadata without the published binaries. The individual
> binaries will be accessible separately using the low level git commands and
> these can be stored either in a parallel location or whatever.
This is good enough, I think. Let's start migrating this to a document
so we can follow up on the Erlang mailing list. If I can get some time
this weekend I will mine the history to do just that.
> Eric
>
>
Hang on a minute: how will that work for people consuming the repository? If they do git clone <repo> they get everything by default, even if the binaries aren't merged into the main branch. Keeping them in separate branches (plus immutable tags) provides a cleaner separation and allows downloading only the bits required.
>>
>>
>> Ok fair enough, I can go with your point of view here and actually I agree
>> with your point about discovery and introspection being much easier. Based
>> on what I've said above about the use of low level git commands, we'll be
>> able to checkout the master branch with the complete index at will, pulling
>> only the index metadata without the published binaries. The individual
>> binaries will be accessible separately using the low level git commands and
>> these can be stored either in a parallel location or whatever.
>
> This is good enough, I think. Let's start migrating this to a document
> so we can follow up on the Erlang mailing list. If I can get some time
> this weekend I will mine the history to do just that.
>
Ok cool, thanks sounds good.
>> Eric
>>
>>
Whatever works. It's actually pretty trivial to clone just a single
branch. Though you are right that a default clone pulls down the
entire repo.
Whatever accomplishes the goal of pulling down only the required
binary is fine with me.
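For reference, this is what the single-branch clone looks like from the consumer's side (a sketch with made-up branch names; --single-branch has been in git since 1.7.10):

```shell
# Sketch: a consumer pulls only the metadata branch, not the binary branches.
tmp=$(mktemp -d)
src="$tmp/src"
git init -q "$src"
git -C "$src" -c user.email=e@x -c user.name=eric commit -q --allow-empty -m 'index'
main=$(git -C "$src" symbolic-ref --short HEAD)
git -C "$src" branch binaries-os_env   # a branch the consumer does NOT want

# Only the named branch is fetched; binaries-os_env never comes down.
git clone -q --single-branch --branch "$main" "$src" "$tmp/consumer"
git -C "$tmp/consumer" branch -r
```

The remote-tracking refs in the clone show only the requested branch, which is the separation Tim is after.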
>>>
>>> Actually, now that you mention it: we could probably do this trivially
>>> by just sticking the binary on its own branch and merging that branch
>>> as needed into the 'core' working branch. I don't know why that didn't
>>> occur to me.
>>>
>>
>> Hang on a minute: how will that work for people consuming the repository? If they do git clone <repo> they get everything by default, even if the binaries aren't merged into the main branch. Keeping them in separate branches (plus immutable tags) provides a cleaner separation and allows downloading only the bits required.
>
> Whatever works. It's actually pretty trivial to clone just a single
> branch. Though you are right that a default clone pulls down the
> entire repo.
>
> Whatever accomplishes the goal of pulling down only the required
> binary is fine with me.
>
Good. I think as you say, it's time to write it up and see what kind of feedback we get from the community. Want me to do some writing up of bits as well, or would you rather put the initial draft together?
On 16 Mar 2012, at 14:21, Eric Merritt wrote:
> On Fri, Mar 16, 2012 at 7:11 AM, Tim Watson <watson....@gmail.com> wrote:
>> I don't actually know why we have to make this 'pure git only' for a first
>> release, as I thought we were going to suck up being tied to github
>> initially. In the 'tied to github' case, you can find and download
>> individual blobs easily using the github REST api.
>
> I have neither the interest nor the intention of making it pure git on
> the first release. I just want to make sure that we can eventually go
> to a pure git solution in the not-too-distant future, preferably
> without a ground-up rewrite.
Sweet. Let's start getting this written up and pushed out (I may have
already mentioned this). I can do the general stuff if you want to
write something specific on repo organization and handling.
Eric
Just one other thing I wanted to cover before we finalise and start documenting. For packages that contain native code, I feel that the publisher should be able to override the auto-selected 'supported-platform', or perhaps add additional 'supported-platforms', so that we can manually distinguish between builds that only work on certain flavours of linux, versus generic linux, versus generic (posix-compliant) unix platforms, e.g. any platform supporting glibc >= version X. This will make it much easier when we know we can produce a binary that will work across various unix-based platforms.
In order for that to work, I think the OS hierarchy will need to have basic support for something like:
{os_platforms, [
{windows, [.....]},
{unix, [
generic, %% no version information required....
{linux, [
{generic, [">= 2.6"]},
{linux_<flavour>, [">= 2"]}
]},
{bsd, [
{darwin, ["10.6.8"]},
{free_bsd, [...]}
%% etc
]}
]}
]}.
Thoughts???
The first thing that comes to my head (and I am far from sure this is
valid) is that you will have a fair amount of mapping with this
approach. That is, the information you will get back from erlang
or uname will be something like linux, free_bsd, darwin, etc., so with
a hierarchical structure you will need to query someplace to find which
'family' a particular thing belongs to. That is, I don't believe the
family information is provided through any API. Again, that mapping
should be pretty static, and having the hierarchy is probably a win
there.
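To make the mapping concern concrete, the kind of static table meant here would look something like the following Erlang sketch. The family assignments are illustrative, not researched; in particular, where darwin sits relative to the bsd sub-family would need checking against the proposed hierarchy.

```erlang
%% Sketch only: a static name -> family table. The left-hand atoms are
%% the sort of thing uname / os:type() report; the right-hand side is
%% the top level of the proposed hierarchy.
-spec os_family(atom()) -> unix | windows | unknown.
os_family(linux)    -> unix;
os_family(darwin)   -> unix;   %% bsd sub-family within unix
os_family(free_bsd) -> unix;
os_family(nt)       -> windows;
os_family(_Other)   -> unknown.
```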
On a side note (and I realize this is just an example), I am not a big
fan of including the constraint in the version string. It just
introduces a parsing problem. We can easily have a tuple there; it
should be just as readable and have no parsing issue at all.
Sounds right to me.
> On a side note (and I realize this is just an example), I am not a big
> fan of including the constraint in the version string. It just
> introduces a parsing problem. We can easily have a tuple there; it
> should be just as readable and have no parsing issue at all.
Indeed. I was just hacking up an example, but I do concur that {atom(), predicate(), semver()} is a much cleaner approach, where we've got something like...
predicate() :: equals | greater_than | greater_than_or_equals | less_than | lteq... | '=' | '>' | '>=' | '<' | '=<'.
I also think that a two tuple should be shorthand for equals, so that these two definitions are semantically equivalent: {Thing, equals, Vsn} === {Thing, Vsn}.
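Pulling that together, a possible -type fragment (all names are placeholders, nothing settled, and the semver() representation is deliberately left as a plain string for now):

```erlang
%% Sketch of the constraint types; names are assumptions, not decisions.
-type predicate()  :: equals | greater_than | greater_than_or_equals
                    | less_than | less_than_or_equals
                    | '=' | '>' | '>=' | '<' | '=<'.
-type semver()     :: string().   %% e.g. "0.0.1"
-type constraint() :: {atom(), predicate(), semver()}
                    | {atom(), semver()}.  %% shorthand for {Thing, equals, Vsn}
```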
This is exactly what sinan's constraint solver does. So I am on board
with all of this. :P
Awesome. :D
I will try to come up with some candidates
Eric
> I think we should start working on a name for the suite. It may sound
> trivial but I think it's actually important.
>
Yes I think you're right and it does really matter.
> I will try to come up with some candidates
>
Ok cool.