Git-Backed Racket Packages, git export

Eric Eide

Jun 13, 2019, 1:24:07 PM
to Racket Users
As described here <https://docs.racket-lang.org/pkg/git-workflow.html>,

> When a Git repository is specified as a package source, then a copy of the
> repository content is installed as the package implementation. That
> installation mode is designed for package consumers, who normally use a
> package without modifying it.

My understanding is that "the copy of the repository content" is produced
simply, e.g., by checking out the repository at the appropriate commit and then
discarding the `.git` directory.

An alternative would be to produce the repository content by running `git
archive`. This would allow certain transformations to be made on the content,
e.g., inserting the commit hash into one of the exported files. One could do
other minor tricks as well, like excluding "junk" files. See the ATTRIBUTES
section of the man page at <https://git-scm.com/docs/git-archive>.

Is there a reason why the Racket package system doesn't run `git archive` to
produce the content for the "non-developer" version of a package?

Thanks ---

Eric.

--
-------------------------------------------------------------------------------
Eric Eide <ee...@cs.utah.edu> . University of Utah School of Computing
http://www.cs.utah.edu/~eeide/ . +1 (801) 585-5512 voice, +1 (801) 581-5843 FAX

Eric Eide

Jun 13, 2019, 1:33:24 PM
to Racket Users
Argh, I screwed up the subject line of my question email :-/.

I always have to remind myself that the command is `git archive`, not `git
export`. Sorry for any confusion.

Matthew Flatt

Jun 13, 2019, 2:30:03 PM
to Eric Eide, Racket Users
At Thu, 13 Jun 2019 11:24:03 -0600, Eric Eide wrote:
> As described here <https://docs.racket-lang.org/pkg/git-workflow.html>,
>
> > When a Git repository is specified as a package source, then a copy of the
> > repository content is installed as the package implementation. That
> > installation mode is designed for package consumers, who normally use a
> > package without modifying it.
>
> My understanding is that "the copy of the repository content" is
> produced simply, e.g., by checking out the repository at the
> appropriate commit and then discarding the `.git` directory.

In case it helps clarify: The repository content is obtained not using
`git` in a shell, which would create portability and dependency
problems for Racket, but using `git-checkout` from the
`net/git-checkout` library. The `git-checkout` function doesn't create
a ".git" subdirectory for metadata.

Since only the "content" (is there a better technical term?) of a
commit is checked out, there's no ".git" to remove. Operationally,
though, I think you mean that it's equivalent to `git clone` followed
by removing the ".git" directory, which sounds right.

> An alternative would be to produce the repository content by running `git
> archive`. This would allow certain transformations to be made on the content,
> e.g., inserting the commit hash into one of the exported files. One could do
> other minor tricks as well, like excluding "junk" files. See the ATTRIBUTES
> section of the man page at <https://git-scm.com/docs/git-archive>.
>
> Is there a reason why the Racket package system doesn't run `git archive` to
> produce the content for the "non-developer" version of a package?

The simplistic answer is that `git-checkout` doesn't support a `git
archive`-like mode. And a practical answer is that no one is likely to
implement it in the near term. :)

Note that there's a notion of "source package" and "built package"
pruning at the Racket package-system level, and it at least includes a
`git archive`-like option in "info.rkt" for omitting files.[*] By
default, bundling a package skips a ".git" subdirectory. There could be
a `git archive` layer in addition, though.

[*] https://docs.racket-lang.org/pkg/strip.html

Eric Eide

Jun 13, 2019, 2:59:23 PM
to Racket Users
Matthew Flatt <mfl...@cs.utah.edu> writes:

> The simplistic answer is that `git-checkout` doesn't support a `git
> archive`-like mode. And a practical answer is that no one is likely to
> implement it in the near term. :)

Thanks for the explanation!

As you might have guessed, my goal is to figure out how to insert a git commit
hash into a package installed in the "non-developer way."

Sam Tobin-Hochstadt

Jun 13, 2019, 5:18:03 PM
to Eric Eide, Racket Users
What do you need the hash for? Could you get the hash from the package
system, for example:

> (require pkg/lib)
> (pkg-info-checksum (hash-ref (installed-pkg-table) "z3"))
"84059a4428454cc6edd57865befaedb1d29dedce"

Sam

Eric Eide

Jun 13, 2019, 6:17:00 PM
to Racket Users
Sam Tobin-Hochstadt <sa...@cs.indiana.edu> writes:

> What do you need the hash for? Could you get the hash from the package
> system, for example:

I want to know the ("a") hash so that I can reliably reproduce outputs,
diagnose crashes, etc.

The package hash might work; thank you for pointing it out! I'll have to
investigate how I can easily navigate from a package hash to a particular
version of the package. (I.e., the moral equivalent of `git checkout <hash>`.)

Eric.

Philip McGrath

Jun 13, 2019, 7:45:33 PM
to Eric Eide, Racket Users
According to §2.2 Package Sources, for a Git package source in the syntax ‹scheme›://‹host›/.../‹repo›[.git][/][?path=‹path›][#‹rev›], "the package’s checksum is the hash identifying ‹rev› if ‹rev› is a branch or tag, otherwise ‹rev› itself serves as the checksum."
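
A minimal sketch of how the quoted rule could be checked from Racket, assuming the package is recorded in the default scope's table. The helper name `pkg-provenance` is hypothetical; `installed-pkg-table`, `pkg-info-orig-pkg`, and `pkg-info-checksum` come from `pkg/lib`. For a package installed from a Git source, the second element should be the commit hash.

```racket
#lang racket/base
(require pkg/lib)

;; Hypothetical helper: report the recorded source of an installed package
;; together with the checksum it was installed with.
;; Returns #f if `name` is not in the default scope's package table.
(define (pkg-provenance name)
  (define info (hash-ref (installed-pkg-table) name #f))
  (and info
       (list (pkg-info-orig-pkg info)
             (pkg-info-checksum info))))
```

For example, `(pkg-provenance "z3")` would return the recorded source description and checksum for the "z3" package if it is installed.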


-Philip



Eric Eide

Jun 13, 2019, 11:24:09 PM
to Racket Users
Philip McGrath <phi...@philipmcgrath.com> writes:

> According to §2.2 Package Sources, for a Git package source in the syntax
> ‹scheme›://‹host›/.../‹repo› [.git][/][?path=‹path›][#‹rev›], "the package’s
> checksum is the hash identifying ‹rev› if ‹rev› is a branch or tag, otherwise
> ‹rev› itself serves as the checksum."

Thanks! This seems to be exactly what I want!

Eric Eide

Jun 16, 2019, 10:11:24 PM
to Racket Users
Sam Tobin-Hochstadt <sa...@cs.indiana.edu> writes:

> What do you need the hash for? Could you get the hash from the package
> system, for example:
>
>> (require pkg/lib)
>> (pkg-info-checksum (hash-ref (installed-pkg-table) "z3"))
> "84059a4428454cc6edd57865befaedb1d29dedce"

This trick does not work (in my tests so far) when a git sandbox is linked as a
package. When installed in that way, the package checksum is #f :-(.

Sam Tobin-Hochstadt

Jun 17, 2019, 11:09:55 AM
to Eric Eide, Racket Users
I guess I don't fully understand what you're trying to accomplish. In
general, a Racket package might live inside some git repository on the
file system, but Racket wouldn't necessarily know anything about that.

If you want "best-effort checksum associated with this code in some
way" then combining the information from `pkg-info` with calling the
git binary is probably necessary. If you want "the checksum used to
install this package", then it doesn't exist when the package checksum
is #f.
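
A minimal sketch of the "best-effort" combination described above, under stated assumptions: a `git` executable is on the PATH, and a package whose recorded checksum is #f (for example, a linked directory) lives inside a Git working tree. The names `git-head-hash` and `best-effort-checksum` are hypothetical; `installed-pkg-table`, `pkg-info-checksum`, and `pkg-directory` come from `pkg/lib`.

```racket
#lang racket/base
(require pkg/lib
         racket/port
         racket/string
         racket/system)

;; Ask the `git` binary for the HEAD commit of the working tree containing `dir`.
;; Returns the hash as a string, or #f if `git` is not found or the command fails
;; (for example, because `dir` is not inside a repository).
(define (git-head-hash dir)
  (define git (find-executable-path "git"))
  (and git
       (let* ([out (open-output-string)]
              [ok? (parameterize ([current-output-port out]
                                  [current-error-port (open-output-nowhere)])
                     (system* git "-C" dir "rev-parse" "HEAD"))])
         (and ok? (string-trim (get-output-string out))))))

;; Prefer the checksum recorded by the package system; fall back to Git.
;; Returns #f when neither source yields an answer.
(define (best-effort-checksum pkg-name)
  (define info (hash-ref (installed-pkg-table) pkg-name #f))
  (define dir (pkg-directory pkg-name))
  (or (and info (pkg-info-checksum info))
      (and dir (git-head-hash dir))))
```

Calling `(best-effort-checksum "z3")` would return the recorded checksum when there is one. Note that the Git fallback only reports HEAD, so uncommitted changes in a linked working tree are not reflected in the result.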

Sam

Eric Eide

Jun 17, 2019, 12:50:18 PM
to Racket Users
Sam Tobin-Hochstadt <sa...@cs.indiana.edu> writes:

> If you want "best-effort checksum associated with this code in some
> way" then combining the information from `pkg-info` with calling the
> git binary is probably necessary.

Yes, "best-effort checksum" is what I'm trying to do. Thanks again for the
help!

Eric Eide

Jun 17, 2019, 1:23:45 PM
to Racket Users
Robby Findler <ro...@cs.northwestern.edu> writes:

> But stepping back a little bit, I'm curious why you're doing this more
> generally. Is this a way to communicate with users about what version
> they are using somehow or to tell them how to get specific versions
> that aren't the version listed on pkgs.racket-lang.org?

Communicate with users. I want to put the appropriate git hash into the output
of my program (Xsmith-based random program generators) so that I can attempt to
reproduce the output, if necessary.

Robby Findler

Jun 17, 2019, 1:49:26 PM
to Eric Eide, Racket Users
On Mon, Jun 17, 2019 at 12:23 PM Eric Eide <ee...@cs.utah.edu> wrote:
>
> Robby Findler <ro...@cs.northwestern.edu> writes:
>
> > But stepping back a little bit, I'm curious why you're doing this more
> > generally. Is this a way to communicate with users about what version
> > they are using somehow or to tell them how to get specific versions
> > that aren't the version listed on pkgs.racket-lang.org?
>
> Communicate with users. I want to put the appropriate git hash into the output
> of my program (Xsmith-based random program generators) so that I can attempt to
> reproduce the output, if necessary.

OIC. That's neat.

Robby

Robby Findler

Jun 17, 2019, 2:20:55 PM
to Eric Eide, Racket Users
But stepping back a little bit, I'm curious why you're doing this more
generally. Is this a way to communicate with users about what version
they are using somehow or to tell them how to get specific versions
that aren't the version listed on pkgs.racket-lang.org?

Robby

Matthew Butterick

Jun 17, 2019, 9:09:12 PM
to Eric Eide, Racket Users

On Jun 17, 2019, at 10:23 AM, Eric Eide <ee...@cs.utah.edu> wrote:

> Communicate with users. I want to put the appropriate git hash into the output
> of my program (Xsmith-based random program generators) so that I can attempt to
> reproduce the output, if necessary.

Once upon a time I tried to do the same for Pollen and couldn't sort it out. AFAICT the git hash isn't generated until the commit is made. Instead, I ended up adding a git push hook that writes a timestamp into a "ts.rktd" file as part of the commit. Then when Pollen is installed, that timestamp can be baked into the version number as a "build number". [1] So it is not the git hash, but it still identifies a particular commit. I'm sure this offends common decency, but it has indeed been useful for pinpointing user problems.
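
A minimal sketch of the reading side of such a scheme. This is not Pollen's actual code: it assumes a hook has already written a single Racket datum (for example, seconds since the epoch) into the timestamp file, and the names `read-build-number` and `version-string` are hypothetical.

```racket
#lang racket/base
(require racket/file)

;; Hypothetical helper: read the single datum that a hook stored in `ts-path`.
;; Returns #f when the file is missing.
(define (read-build-number ts-path)
  (and (file-exists? ts-path)
       (file->value ts-path)))

;; Hypothetical helper: append the build number to a base version string,
;; or return the base version unchanged if no timestamp file is present.
(define (version-string base ts-path)
  (define build (read-build-number ts-path))
  (if build
      (format "~a.~a" base build)
      base))
```

A program could then print something like `(version-string "1.0" "ts.rktd")` in its output so that a report can be traced back to a particular commit.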



Eric Eide

Jun 17, 2019, 9:27:12 PM
to Racket Users
Matthew Butterick <m...@mbtype.com> writes:

> Once upon a time I tried to do the same for Pollen and couldn't sort it
> out. AFAICT the git hash isn't generated until the commit is made.

That's true; you can't figure out the hash before the commit is made.

I think I have the pieces of a workable (for me) implementation now, but I
haven't actually implemented it yet. When I do I'll send you an email with a
pointer to the relevant code.

Thanks for the pointer to the timestamp-based technique! I'll take a look.