Proposal: Permit specifying version requirement when declaring Mix dependencies via Git.

Frerich Raabe

Oct 23, 2021, 10:03:18 AM
to elixir-lang-core
Hi all,

specifying dependencies in `mix.exs` files via Git repositories could be made
more powerful (and arguably reduce the need for self-hosted Hex repositories)
by adding support for version requirements. To support this, Mix could
automatically map Git tags to versions which can be consumed by the dependency
resolution algorithm.


Motivation
==========

Declaring dependencies of Elixir projects via Git, e.g. by defining

    {:foobar, git: "https://github.com/elixir-lang/foobar.git", tag: "0.1"}

is very convenient. In many cases, dependencies are already stored in Git
repositories and (when working with private repositories) you get
authentication for free (e.g. via SSH keys).

However, there are at least two noteworthy downsides to this approach:

1. The dependency version is opaque: Mix has no idea which version of the
   dependency is used; the value of the 'tag' option (or 'branch' or 'ref')
   carries no semantic meaning as far as Mix is concerned. This makes life
   harder for the dependency resolution algorithm.

2. The dependency version is overly specific: in the above example, we
   expressed that we'd like a checkout of the repository as referenced by the
   '0.1' version tag. However, maybe *any* 0.1.x release is suitable (e.g.
   0.1.1, 0.1.2 etc.). There is no way to express this, so bugfix releases of
   the :foobar application won't get picked up when rebuilding our project.

Thus, users tend to turn to private[1] or self-hosted[2] Hex repositories.
Doing so allows defining a proper version requirement which is understood by
Mix, e.g. "== 1.0". Alas, new Hex repositories come with their own challenges,
such as

* How are new package versions published: can you use `mix hex.publish`, or do
  you instead copy plain tarballs around (cf. `mix hex.registry build`)?

* Where is the repository hosted, who maintains it?

* How does authorization work, in case it is not desirable to have packages
  available to everyone?


Proposed Solution
=================

Much like Hex repositories, many Git repositories already have a notion of
'versions', usually implemented via Git tags - Mix just doesn't make use of
this. By making Mix fetch all release tags from the given repository, it could
construct a set of available versions which could then be considered by the
dependency resolution algorithm to select an appropriate version for the build.

For example, a dependency in the form of

    {:foobar, git: "https://github.com/elixir-lang/foobar.git", tag: "0.1"}

could be rewritten as

    {:foobar, "== 0.1", git: "https://github.com/elixir-lang/foobar.git"}

In particular, this would enable using the powerful `~>` operator to express
things such as 'the latest stable 0.1.x release':

    {:foobar, "~> 0.1.0", git: "https://github.com/elixir-lang/foobar.git"}
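
To illustrate how a requirement could select among tags, here is a minimal
sketch using only the existing `Version` module. The tag list is made up, and
`TagResolution.pick/2` is a hypothetical helper, not part of Mix. Note that
`~> 0.1.0` admits only 0.1.x releases, whereas `~> 0.1` also admits 0.2.0 and
anything else below 1.0.0.

```elixir
defmodule TagResolution do
  # Hypothetical helper: returns the highest version among `tags` which
  # satisfies `requirement`, or nil if none does. Tags that are not valid
  # versions (e.g. "nightly") are ignored.
  def pick(tags, requirement) do
    tags
    |> Enum.flat_map(fn tag ->
      case Version.parse(tag) do
        {:ok, version} -> [version]
        :error -> []
      end
    end)
    |> Enum.filter(&Version.match?(&1, requirement))
    |> Enum.sort(Version)
    |> List.last()
  end
end

# Made-up tag list for illustration.
tags = ["0.1.0", "0.1.1", "0.1.2", "0.2.0", "nightly"]

IO.inspect(TagResolution.pick(tags, "~> 0.1.0"))
IO.inspect(TagResolution.pick(tags, "~> 0.1"))
```

This only covers choosing a tag for a single dependency; it says nothing about
the requirements of that dependency's own dependencies.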

Adding this feature would alleviate the need to reach for self-hosted Hex
repositories: the proven dependency resolution algorithm would kick in,
enabling more flexible forms of specifying versions (via ~>) and allowing
Mix to dynamically pick a matching version. Consequently, none of the
challenges associated with using a self-hosted Hex repository need to be
tackled.


Implementation Considerations
=============================

Interaction With Other Git-Specific Options
-------------------------------------------
When defining a Git repository as a dependency using a version requirement,
specifying the `tag`, `branch` or `ref` options would be prohibited and
result in a build time error (much as is already the case when specifying
more than one of those three options).

Custom Tag Formats
------------------
Many repositories use different tag formats than those which are understood by
Version.parse/1[3]. Thus, it would be convenient to support an optional mapping
function of the form

    (String.t() -> Version.t() | nil)

which can be specified by the user to implement a custom mapping of Git tags to
Version structs (or `nil`, in case the given tag does not reference a released
version).

Maybe this optional mapping could be defined via an optional `tag_mapping:`
option. For example, to process tags in the form `vX.Y.Z`:

    {:foobar, "== 0.1", git: "https://github.com/elixir-lang/foobar.git", tag_mapping: &strip_leading_v/1}

    def strip_leading_v("v" <> version) do
      case Version.parse(version) do
        {:ok, version} -> version
        :error -> nil
      end
    end

    def strip_leading_v(_), do: nil
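
As a rough sketch of how such a mapping might then be applied (the module
`TagMapping` and the function `versions/2` are hypothetical, not part of Mix),
the tags could be collected into a map from `Version` to tag name, dropping
every tag for which the mapping returns `nil`:

```elixir
defmodule TagMapping do
  # Example mapping for tags of the form vX.Y.Z, as shown above.
  def strip_leading_v("v" <> version) do
    case Version.parse(version) do
      {:ok, version} -> version
      :error -> nil
    end
  end

  def strip_leading_v(_), do: nil

  # Hypothetical: builds a %{Version.t() => tag} map. The assignment inside
  # the comprehension acts as a filter, so tags mapped to nil are skipped.
  def versions(tags, mapping) do
    for tag <- tags, version = mapping.(tag), into: %{} do
      {version, tag}
    end
  end
end

# Made-up tag list for illustration.
tags = ["v0.1.0", "v0.1.1", "docs-preview"]
IO.inspect(TagMapping.versions(tags, &TagMapping.strip_leading_v/1))
```

Keeping the original tag name next to each version would let Mix check out the
right tag once a version has been selected.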

Efficiently Fetching Tags
-------------------------
The git-ls-remote[4] command permits fetching all (or just some) references
from a remote repository without cloning the repository. Something like

    git ls-remote --tags --refs https://github.com/elixir-lang/foobar.git

can be used to get all tags in the given repository, based on which the set of
all available versions could be constructed.
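
A minimal sketch of how the output of that command could be turned into a list
of tag names follows. It assumes one `<sha><TAB>refs/tags/<name>` entry per
line; the module `RemoteTags` is hypothetical, and `fetch/1` needs `git` on
the PATH as well as network access.

```elixir
defmodule RemoteTags do
  # Runs `git ls-remote` against `url` and returns the tag names on success.
  def fetch(url) do
    case System.cmd("git", ["ls-remote", "--tags", "--refs", url], stderr_to_stdout: true) do
      {output, 0} -> {:ok, parse(output)}
      {output, _status} -> {:error, output}
    end
  end

  # Extracts tag names from lines of the form "<sha>\trefs/tags/<name>".
  # Lines which do not match that shape are skipped.
  def parse(output) do
    for line <- String.split(output, "\n", trim: true),
        [_sha, "refs/tags/" <> tag] <- [String.split(line, "\t", parts: 2)] do
      tag
    end
  end
end

# Made-up sample output for illustration.
sample = "abc123\trefs/tags/v0.1.0\ndef456\trefs/tags/v0.1.1\n"
IO.inspect(RemoteTags.parse(sample))
```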


I'd love to hear your thoughts on this!

- Frerich


Austin Ziegler

Oct 23, 2021, 11:21:26 AM
to elixir-l...@googlegroups.com
I believe that this already works, as I have one dependency in a
project where I am not able to do this because the maintainer
explicitly leaves the git version as `0.0.0` and applies the correct
version during their release process.

https://github.com/rzane/file_store/issues/24

-a

--
Austin Ziegler • halos...@gmail.com • aus...@halostatue.ca
http://www.halostatue.ca/ • http://twitter.com/halostatue

José Valim

Oct 23, 2021, 11:23:52 AM
to elixir-l...@googlegroups.com
Hi Frerich Raabe,

Thank you for the detailed proposal.

Unfortunately, the issue with ~> 0.1 is that people may expect us to perform proper dependency resolution. The dependency resolution does not consider only how to download the package "foo" but rather the package "foo" and all of its dependencies. You have detailed how we can prefetch tags but the dependency resolution also needs to know all of the dependencies (and their requirements) for said tag.

For example, imagine that we specify ~> 0.1 for a project and tag v0.3.0 is available. We check out v0.3.0, load the Mix.Project, and then find a version conflict in its deps. Now, in order to find the next version to try out, we could go on and check the next tag, load the next Mix.Project, and so on. We will do it for all matching tags until we find a matching set of deps, abort, or backtrack. This process is going to be extremely slow, because every tag attempt requires a checkout, loading a file, compiling a module, and getting the deps to try out.

Contrast this to Hex.pm, where all versions, their deps and requirements are available and cached upfront. Still, even with this, Hex dependency resolution can still be slow in some pathological cases. I can't even imagine how much slower they would be if trying out every combination of the resolution required a checkout, file parsing, etc.

Due to this, I am afraid we simply cannot support dependency resolution with git repos. It is going to quickly become too slow as the number of versions and dependencies grows.

Frerich Raabe

Oct 25, 2021, 5:08:17 AM
to elixir-lang-core
Hi,

thanks for your feedback! I now have a better understanding of the challenge. :-)

You're totally right - I didn't consider that the dependencies need to be considered _recursively_, but it makes perfect sense of course. I see how this might be a real performance issue.

I briefly considered somehow exposing this information in the Git repository: my idea was to have a dedicated Git namespace[1] specifically for refs used by Mix. Some to-be-defined Mix task could then be run within the repository, mapping release tags to so-called annotated tags[2] which can have a commit message. This message could be used to describe the dependency versions for each specific release.

When accessing a Git repository (as part of fetching dependencies), Mix could then enumerate the annotated tags in the custom namespace and parse the commit messages to figure out all the (recursive) dependencies.

However, I guess at this point it's not really worth it: a much easier approach might be to just put the files generated by 'mix hex.registry build' into a (potentially private) Git repository which permits HTTPS access, e.g. a GitHub repository. I can then use that as the repository URL with 'mix hex.repo add ...'. :-)

- Frerich
