Expressing dependencies on PyPI


Lukács T. Berki

Feb 2, 2018, 5:15:02 AM
to bazel-si...@googlegroups.com, Klaus Aehlig, Dmitry Lomov, Luis Fernando Pino Duque, Rosica Dejanovska
Hey there,

(I'd especially like the opinion of Klaus and Dmitry, since they have all the WORKSPACE logic in their heads at the moment)

Looking at the new rules_pyz repository, the way it works is that it parses requirements.txt, then emits a WORKSPACE file that downloads the transitive dependencies of whatever is in that requirements.txt. While it works for the exact use case it was conceived for, it does have a few issues.

AFAIU it only works for the dependencies of the top-level repository; if I have a tool that uses rules_pyz, its dependencies don't end up in my WORKSPACE.

It does not handle cases where two binaries require two different versions of the same package; in fact, the name of the WORKSPACE rule is a function of only the name of the package, so pulling in two different versions of the same package is impossible.

It requires running a tool to build the WORKSPACE file, which makes the most convenient "git clone; bazel build" use case reliant on checking in its output (third_party/pypi/pypi_rules.bzl), and one has to remember to re-run that tool whenever the requirements change.

It assumes that the architecture one is building for is the one where pip_generate_wrapper is run. This makes both cross-compilation and checking out the code and building it on a different platform impossible. Ideally, one would download the wheels / binaries only for the platform one needs them on.

WDY'allT? How do we do this so that it meshes well with the way we think WORKSPACE files should work?

--
Lukács T. Berki | Software Engineer | lbe...@google.com | 

Google Germany GmbH | Erika-Mann-Str. 33  | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891

Evan Jones

Feb 2, 2018, 10:14:50 AM
to Lukács T. Berki, bazel-si...@googlegroups.com, Klaus Aehlig, Dmitry Lomov, Luis Fernando Pino Duque, Rosica Dejanovska
I am definitely not an expert here. I copied the concept of generating WORKSPACE rules from bazel-deps, and I stole bits of code from the rules_python repo :) It is definitely very flawed, but I'm happy it inspired some discussion: My mission has been accomplished, so thanks. My thoughts:


Checking in results of dependency resolution: I think this is a "good" way to work, and the WORKSPACE.resolved proposal seems to agree, from what I understand? I like that the "logical" dependencies (pip requirements.txt in this case) are separated in some way from the "concrete" versions (pypi_rules.bzl in this case). I like that when a teammate pulls my changes, they will get the exact same dependencies that I tested/deployed. Someone needs to take a "manual" action to update the dependencies, which can be recorded as a change in version control.

The place this tends to break is "libraries" and similar tools, managed in separate repositories, where you can't constrain your consumer's choice of versions. I have few opinions here: I'm a fan of monorepos, and at Bluecore I think it makes sense that we will share a single set of PyPI libraries, each with a single version.


Version conflicts: So far, we use a single version of any library across our different services/targets. We aren't big enough that this has failed for us yet, so again I don't have strong opinions here. I think this is a fairly common assumption in Python projects, where most projects use a single virtualenv to run all the stuff inside that project. However, your example use case should work: I should be able to run a py_binary from another WORKSPACE, even if it conflicts with the version I want to use for my py_binaries in my WORKSPACE. (Minor point: it is a bug that pip_generate's WORKSPACE wheel names are not versioned)


Cross platform: This is a huge pain: We develop primarily on Mac OS X and deploy on Linux, so we support both. pip_generate has a hack to try and detect native dependencies and select the right one (e.g. a grpc example). It doesn't work for wheels where PyPI does not contain a binary.


Thanks!




Lukács T. Berki

Feb 14, 2018, 4:24:57 AM
to Evan Jones, bazel-si...@googlegroups.com, Klaus Aehlig, Dmitry Lomov, Luis Fernando Pino Duque, Rosica Dejanovska
Hm, I've been persistently waiting for people to chime in with their opinions, without much success, so I'll try to prod the discussion along.



On Fri, 2 Feb 2018 at 16:14, Evan Jones <evan....@bluecore.com> wrote:
I am definitely not an expert here. I copied the concept of generating WORKSPACE rules from bazel-deps, and I stole bits of code from the rules_python repo :) It is definitely very flawed, but I'm happy it inspired some discussion: My mission has been accomplished, so thanks. My thoughts:


Checking in results of dependency resolution: I think this is a "good" way to work, and the WORKSPACE.resolved proposal seems to agree, from what I understand? I like that the "logical" dependencies (pip requirements.txt in this case) are separated in some way from the "concrete" versions (pypi_rules.bzl in this case). I like that when a teammate pulls my changes, they will get the exact same dependencies that I tested/deployed. Someone needs to take a "manual" action to update the dependencies, which can be recorded as a change in version control.
Yep, that sounds like the exact problem WORKSPACE.resolved is designed to solve. The only problem I see is that having a requirements.txt is not quite how Bazel works; I was thinking one could express the same things as a requirements.txt, but in WORKSPACE file syntax. WDYT? Or else we could have a WORKSPACE rule that refers to a requirements.txt.
 

The place this tends to break is "libraries" and similar tools, managed in separate repositories, where you can't constrain your consumer's choice of versions. I have few opinions here: I'm a fan of monorepos, and at Bluecore I think it makes sense that we will share a single set of PyPI libraries, each with a single version.
Yep. Unfortunately, I don't think we can ignore this use case, and solving it would also fix the version conflict issue. How about a plan like this: there will be a rule in the WORKSPACE file that references a requirements.txt. That rule generates two kinds of things:
  • A repository for each transitive package the dependency resolution comes up with, whose name is either some sort of checksum or the concatenation of the package name and the version. This way, if the processing of two requirements.txt files ends up fetching the same package, they neither conflict nor end up with two repositories fetching the same package.
  • A single repository whose name is the same as the name of the rule in the WORKSPACE file that generated it, containing one alias rule for each package specified in the requirements.txt that points to the appropriate versioned instance of the rule.
Example: if the WORKSPACE file looks like this:

pypi_requirements(name="alice_deps", requirements=["alice_requirements.txt"])

and alice_requirements.txt contains this:

Foo >= 1.4

and Foo depends on Bar and the result of package resolution is that Foo should be version 1.4.1 and Bar should be version 2.0, the following BUILD files are generated:

@alice_deps//BUILD:
alias(name="Foo", actual="@pypi_foo_1_4_1//:Foo")

@pypi_foo_1_4_1//BUILD:
py_library(name="Foo", deps=["@pypi_bar_2_0//:Bar"])

@pypi_bar_2_0//BUILD:
py_library(name="Bar")

Then in your py_library rules, you would write:
py_library(name="alice", deps=["@alice_deps//:Foo"])
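
For illustration, a minimal Starlark sketch of the "alias repository" half of this plan; the rule name, the attribute, and the resolution step that fills in the mapping are all made up here:

# Hypothetical sketch of the repository that a pypi_requirements() rule could
# generate to hold one alias per requested package. Names are illustrative only.
def _alias_repo_impl(ctx):
    lines = []
    for pkg, versioned_repo in ctx.attr.resolved.items():
        # e.g. pkg = "Foo", versioned_repo = "pypi_foo_1_4_1"
        lines.append('alias(name="%s", actual="@%s//:%s", visibility=["//visibility:public"])' % (pkg, versioned_repo, pkg))
    ctx.file("BUILD", "\n".join(lines) + "\n")

pypi_alias_repo = repository_rule(
    implementation = _alias_repo_impl,
    attrs = {
        # Package name -> versioned repository, as produced by some earlier
        # dependency-resolution step (not shown here).
        "resolved": attr.string_dict(),
    },
)

A sibling repository rule would then create the versioned @pypi_foo_1_4_1-style repositories, each wrapping the fetched package in a py_library.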


 


Version conflicts: So far, we use a single version of any library across our different services/targets. We aren't big enough that this has failed for us yet, so again I don't have strong opinions here. I think this is a fairly common assumption in Python projects, where most projects use a single virtualenv to run all the stuff inside that project. However, your example use case should work: I should be able to run a py_binary from another WORKSPACE, even if it conflicts with the version I want to use for my py_binaries in my WORKSPACE. (Minor point: it is a bug that pip_generate's WORKSPACE wheel names are not versioned)
 


Cross platform: This is a huge pain: We develop primarily on Mac OS X and deploy on Linux, so we support both. pip_generate has a hack to try and detect native dependencies and select the right one (e.g. a grpc example). It doesn't work for wheels where PyPI does not contain a binary.
Oh yes, totally a PITA. I was thinking that if there is a way to figure out which operating systems a PyPI package supports, there could be an alias in the repository generated for that package. I.e. in the above example, instead of

py_library(name="Foo")

one would have

alias(name="Foo", actual=select({"i_am_osx": ":Foo_osx", "i_am_linux_x86": ":Foo_linux_x86"}))

This hinges upon two things: us being able to determine which OSes a PyPI package supports, and mapping the concept of an "OS" in PyPI to Bazel constraints. The second is a manageable, albeit fiddly, affair; as for the first, I don't know whether PyPI supports that kind of query. Does it?
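
As a sketch of the Bazel side of that mapping, the "i_am_*" conditions in the select() above could be backed by config_setting targets; the names mirror the example and the --cpu values below are just Bazel's stock ones:

# Hypothetical config_setting definitions backing the select() above.
# Mapping PyPI's notion of a platform onto conditions like these is the fiddly part.
config_setting(
    name = "i_am_osx",
    values = {"cpu": "darwin"},
)

config_setting(
    name = "i_am_linux_x86",
    values = {"cpu": "k8"},
)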
 


Thanks!




Evan Jones

Feb 14, 2018, 9:30:37 AM
to Lukács T. Berki, Bazel/Python Special Interest Group, Klaus Aehlig, Dmitry Lomov, Luis Fernando Pino Duque, Rosica Dejanovska
On Wed, Feb 14, 2018 at 4:24 AM, Lukács T. Berki <lbe...@google.com> wrote:
On Fri, 2 Feb 2018 at 16:14, Evan Jones <evan....@bluecore.com> wrote:
Checking in results of dependency resolution: I think this is a "good" way to work, and the WORKSPACE.resolved proposal seems to agree, from what I understand? I like that the "logical" dependencies (pip requirements.txt in this case) are separated in some way from the "concrete" versions (pypi_rules.bzl in this case). I like that when a teammate pulls my changes, they will get the exact same dependencies that I tested/deployed. Someone needs to take a "manual" action to update the dependencies, which can be recorded as a change in version control.
Yep, that sounds like the exact problem WORKSPACE.resolved is designed to solve. The only problem I see is that having a requirements.txt is not quite how Bazel works; I was thinking one could express the same things as requirements.txt but in WORKSPACE file syntax. WDYT? Or else we can have a WORKSPACE rule that refers to a requirements.txt .

Personally: I have no strong preference on the particular syntax/files. There is some minor advantage to using "what normal Python projects use", but there is also an advantage to standardizing on "What Bazel projects use."  Whatever makes it easier to implement is fine by me.


The place this tends to break is "libraries" and similar tools, managed in separate repositories, where you can't constrain your consumer's choice of versions. I have few opinions here: I'm a fan of monorepos, and at Bluecore I think it makes sense that we will share a single set of PyPI libraries, each with a single version.
Yep. Unfortunately, I don't think we can ignore this use case and solving this would also fix the version conflict issue. How about a plan like this: there will be a rule in the WORKSPACE file that references a requirements.txt . That rule generates two kinds of things:
  • A repository for each transitive package the dependency resolution comes up whose names are either some sort of checksum or the concatenation of the package name and the version. This is so that if the processing of two requirements.txt files end up fetching the same package, they neither conflict nor do they end up with two repositories fetching the same package.
  • A single repository whose name is the same as the name of the rule in the WORKSPACE file that generated it and one alias rule for each package specified in the requirements.txt that points to the appropriate versioned instance of the rule.
Example: if the WORKSPACE file looks like this:

pypi_requirements(name="alice_deps", requirements=["alice_requirements.txt"])

and alice_requirements.txt contains this:

Foo >= 1.4

and Foo depends on Bar and the result of package resolution is that Foo should be version 1.4.1 and Bar should be version 2.0, the following BUILD files are generated:

@alice_deps//BUILD:
alias(name="Foo", actual="@pypi_foo_1_4_1//:Foo")

@pypi_foo_1_4_1//BUILD:
py_library(name="Foo", deps=["@pypi_Bar_2_0//:Bar")

@pypi_bar_2_0/BUILD:
py_library(name="Bar")

Then in your py_library rules, you would write
py_libary(name="alice", deps=["@alice_deps//:Foo"])

This seems like a very reasonable proposal to me. I can't see any problems with it. I think it is important for py_libraries to be able to specify their dependency on "unversioned" packages, like you have here with "@alice_deps//:Foo", so I'm glad to see that. It is a huge pain if, as part of a third-party library version upgrade, you also need to search and replace every rule that depends on it.


Cross platform: This is a huge pain: We develop primarily on Mac OS X and deploy on Linux, so we support both. pip_generate has a hack to try and detect native dependencies and select the right one (e.g. a grpc example). It doesn't work for wheels where PyPI does not contain a binary.
Oh yes, totally a PITA. I was thinking that if there is a way to figure out which operating systems a PyPI package supports, there could be an alias in the repository generated for that package. I.e. in the above example, instead of

py_library(name="Foo")

one would have

alias(name="Foo", actual=select({"i_am_osx": ":Foo_osx", "i_am_linux_x86": "Foo_linux_x86"})

This hinges upon two things: us being able to determine which OSes a PyPI package supports and mapping the concept of a "OS" in PyPI to Bazel constraints. The second is a manageable, albeit fiddly affair and I don't know if PyPI supports the latter kind of query. Does it? 

You should check with someone who understands Python package management more intimately than I do. In the Python world there are many fun issues:

Python Version: Python2 only? Python 3 only? Both Python 2 and Python 3?
Operating System: Cross platform/pure python? Platform specific?

In my battles with this, when you query PyPI, you are going to find one of the following cases:

1. A source package only, which is cross-platform (however, you can't tell without executing setup.py). Example: pycparser: https://pypi.python.org/pypi/pycparser
2. A source package only, which is platform-specific (again, must execute setup.py to find it references native code). Example: MySQL-Python: https://pypi.python.org/pypi/MySQL-python/1.2.5
3. Wheels with some platform-specific versions, and a generic fallback. Example: protobuf: https://pypi.python.org/pypi/protobuf/3.5.1
4. Wheels with platform-specific versions. Example: grpcio: https://pypi.python.org/pypi/grpcio/1.9.1

This is a huge mess and I don't know how to best deal with it. Good luck! :)
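
For cases 3 and 4, at least, the platform information is mechanically recoverable from the wheel filenames themselves (the PEP 427 tags). A small illustrative Starlark helper, not part of any existing rule set:

# Splits a wheel filename into its PEP 427 tags; this is the information a
# rule would need in order to decide whether an artifact matches the target platform.
def parse_wheel_filename(filename):
    if not filename.endswith(".whl"):
        fail("not a wheel: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python_tag-abi_tag-platform_tag
    return struct(
        name = parts[0],
        version = parts[1],
        python_tag = parts[-3],
        abi_tag = parts[-2],
        platform_tag = parts[-1],
    )

# parse_wheel_filename("grpcio-1.9.1-cp27-cp27mu-manylinux1_x86_64.whl").platform_tag
# is "manylinux1_x86_64"; a "py2.py3-none-any" wheel is the pure-Python case above.
# Sdist-only packages (cases 1 and 2) expose nothing comparable without running setup.py.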

Evan


Doug Greiman

Feb 16, 2018, 8:46:51 PM
to Bazel/Python Special Interest Group
Here's a strawman proposal:

[A] Fetching and using Python packages not created by Bazel

(Note that PyPI is pronounced like "pie pee eye", because "pie pie", aka PyPy, is a different unrelated thing.  I use "pip_" below because it sounds better in my head, but any naming suggestions are welcome.)

1) WORKSPACE rule: pip_mirror()

By default, packages are fetched from "www.pypi.org".  Users can add a directive in their WORKSPACE to control this:

pip_mirror(
    # Where to query and fetch packages from.
    # None means "don't access the internet for my builds".
    index = "pypi.mycompany.com",
    # Directory to look for and fetch packages from.
    # Bazel doesn't know or care if this is in source control, a temporary dir, mounted NFS, etc.
    vendor_dir = "some local directory",
    # Additional security and caching attributes...
)

There are also two "policy" attributes added:

  pip_policy_allow_precompiled_binaries = True|False

If True, then Bazel is allowed to use wheels that contain precompiled extension modules and native libraries.  Otherwise, Bazel must fetch the package's 'sdist', if available, and compile it.

  pip_policy_allow_compilation = True|False

If True, and Bazel encounters a pip package that needs to be compiled, either because that package doesn't have a precompiled binary available or Bazel isn't allowed to use it, then Bazel is allowed to compile it from source.  If False, this situation is an error.
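
(If the answer to Q2 below turns out to be "yes", this could be spelled roughly as follows; purely illustrative:)

pip_mirror(
    index = "pypi.mycompany.com",
    # Only use artifacts we compile ourselves, never prebuilt binaries.
    pip_policy_allow_precompiled_binaries = False,
    pip_policy_allow_compilation = True,
)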

Questions:

Q1) Should these be in WORKSPACE or in .bazelrc?
Q2) Should the 'policy' attributes be part of pip_mirror()?
Q3) Policies are basically: Allow downloads/Don't allow downloads, and Allow compiled code From The Internet/Don't allow compiled code from the internet.  Are there others that people care about?

2) Python Dependency Specification (https://www.python.org/dev/peps/pep-0508/)

The atomic unit of dependency for Python Distribution Packages is a line like this:

  some_package

which means "I depend on the Python Distribution Package named 'some_package'".  A more complex, but entirely realistic example is:

  some_package [security,tests] >= 2.8.1, == 2.8.* ; python_version < "2.7" and os_name == "nt"

which means "I depend on 'some_package' if I'm running Python 2.6 or below on Windows, and I need a version in the range  [2.8.1, 2.9,0)".

[security,tests] are "extras".  It's probably easiest to handle these as separate packages called, e.g., "some_package[security]", which have implicit relationships between themselves and the parent package "some_package".

3) BUILD rule: pip_dependency_set()

This is a list of dependency specifications, equivalent to a "requirements.txt" file.

pip_dependency_set(
    name = "my_requirements",
    # A direct list of dependency specifications
    specs = [spec1, spec2, ...],
    # Path to a standard 'requirements.txt' file, containing a list of specs.
    # 'specs' and 'requirements_path' are mutually exclusive.
    requirements_path = "/path/to/requirements.txt")

Ideally, the specs would be an unordered set, but there are currently some implementation limitations where order makes a difference, so we preserve it here.

4) Bazel concept: pip_resolved_package_set()

A pip_dependency_set() must be evaluated in a particular context, which will be represented in Bazel by a py_toolchain().  The result of this evaluation is an object called pip_resolved_package_set().

The contents of a pip_resolved_package_set() are a concrete list of files/urls representing the artifacts to use.  These artifacts can either be 'sdists' or 'wheels'.  Wheels can be Universal Wheels, Pure Python Wheels, or Platform Wheels.  For this discussion, we ignore the problems of producing "manylinux" wheels inside Bazel, since we're just consuming wheels, not producing them.  This list might look like:

["numpy-1.14.0.tar.gz",
 "grpcio-1.0.0-cp34-cp34m-win_amd64.whl", 
 "nox_automation-0.18.2-py2.py3-none-any.whl"]

This list is generated recursively, since each dependency has its own dependencies, and these sub-dependencies are different for different versions of the parent dependency.  E.g. grpc 1.0 depends on protobuf 3.0, while grpc 2.0 depends on protobuf 4.0.

For each dependency spec, the active pip_mirror() is queried.  This returns a list of versions, and for each version, a list of artifacts.  For example, "numpy" might have available versions: [1.0.0, 1.1.0, ..., 1.14.0].  For the version "1.14.0", it might have available artifacts: ["numpy-1.14.0.tar.gz", "numpy-1.14.0-cp35-cp35m-manylinux1_i686.whl", ...].  For each candidate version, Bazel determines which artifacts are applicable to the current py_toolchain() environment.  Some of this work will be done by the 'pip' tool, but some will need to be reimplemented inside Bazel, because 'pip' does not handle any kind of cross-compilation scenario.  I.e. it's not possible for 'pip' running on Windows under Python 2 to evaluate specs for Linux and Python 3.
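
To make the "query the index for versions and artifacts" step a bit more concrete, here is a rough repository-rule sketch. It assumes pypi.org's /pypi/<name>/json metadata endpoint and a json.decode helper being available to Starlark; both are assumptions, and the actual artifact-selection logic is elided:

def _pypi_metadata_impl(ctx):
    # Hypothetical: fetch the index's JSON metadata for one package and record
    # which artifacts exist for each version. Picking the artifact that matches
    # the active py_toolchain() is the hard part and is not shown here.
    ctx.download(
        url = "%s/pypi/%s/json" % (ctx.attr.index, ctx.attr.package),
        output = "metadata.json",
    )
    meta = json.decode(ctx.read("metadata.json"))
    lines = ["# available artifacts for %s" % ctx.attr.package]
    for version, files in meta["releases"].items():
        lines.append("# %s: %s" % (version, ", ".join([f["filename"] for f in files])))
    ctx.file("BUILD", "")
    ctx.file("available.bzl", "\n".join(lines) + "\n")

pypi_metadata = repository_rule(
    implementation = _pypi_metadata_impl,
    attrs = {
        "package": attr.string(mandatory = True),
        "index": attr.string(default = "https://pypi.org"),
    },
)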

5) Bazel concept: pip_installed_package_set()

A pip_resolved_package_set() has a list of artifacts, but we need an actual bunch of files on disk, possibly requiring compilation and/or other tasks defined in setup.py.  We call this a pip_installed_package_set(), which is a "virtualenv"-type directory tree into which all the packages have been installed, and possibly compiled.  This tree should ideally match what would happen if a user ran "pip install -r requirements.txt" outside of Bazel.

However, we might use a "runfiles" structure instead, since that's what we already have.  This has a few problems; for example, namespace packages don't work when not installed in a "virtualenv" or "site-packages" directory.  In particular, the "google" package is a namespace package, so this is kind of a big deal.  We can work around this with various hacks, but it's not an ideal user experience.


Implementation note: Different platforms will vary on exactly how much sharing/copying/symlinking of these directory trees can or should be done.

6) BUILD attribute: py_binary(deps)

The py_* rules will accept a 'deps' argument of type 'pip_dependency_set'.  py_library() targets will simply propagate these deps to other targets.  py_binary() and py_test() targets will, at build time, evaluate the pip_dependency_set in the context of the appropriate py_toolchain() to generate a pip_resolved_package_set(), pip_installed_package_set(), and ultimately a "runfiles" tree and/or "virtualenv" tree.

If multiple targets have the same dependency set, then they will share the resulting directory trees to the extent the platform allows (e.g. no symlinks on Windows).  Conversely, if a particular target is built in multiple configurations (e.g. host vs target, or Python 2 vs Python 3), each configuration will have its own installed package set.
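
Putting 3) and 6) together, end-to-end usage might look roughly like this, assuming the rules exist as proposed:

pip_dependency_set(
    name = "my_requirements",
    requirements_path = "requirements.txt",
)

py_binary(
    name = "server",
    srcs = ["server.py"],
    # Resolved at build time against the active py_toolchain() into a
    # pip_resolved_package_set() and an installed tree.
    deps = [":my_requirements"],
)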

Lukács T. Berki

Feb 19, 2018, 9:10:26 AM
to Doug Greiman, bazel-si...@googlegroups.com
On Sat, 17 Feb 2018 at 02:46, 'Doug Greiman' via Bazel/Python Special Interest Group <bazel-si...@googlegroups.com> wrote:
Here's a strawman proposal:

[A] Fetching and using Python packages not created by Bazel

(Note that PyPI is pronounced like "pie pee eye", because "pie pie", aka PyPy, is a different unrelated thing.  I use "pip_" below because it sounds better in my head, but any naming suggestions are welcome.)

1) WORKSPACE rule: pip_mirror()

By default, packages are fetched from "www.pypi.org".  Users can add a directive in their WORKSPACE to control this:

pip_mirror(
     # Where to query and fetch packages from.
    #  None means "don't access the internet for my builds"
    index = "pypi.mycompany.com",
    # Directory to look for and fetch package from.
    # Bazel doesn't know or care if this is in source control, temporary dir, mounted NFS, etc.
    vendor_dir = "some local directory",
    # Additional security and caching attributes... 
)
Is this necessary for an MVP?
 

There are also two "policy" attributes added:

  pip_policy_allow_precompiled_binaries = True|False

If True, then Bazel is allowed to use wheels that contain precompiled extension modules and native libraries.  Otherwise, Bazel must fetch the package's 'sdist', if available, and compile it.

  pip_policy_allow_compilation = True|False

If True, and Bazel encounters a pip package that needs to be compiled, either because that package doesn't have a precompiled binary available or Bazel isn't allowed to use it, then Bazel is allowed to compile it from source.  If False, this situation is an error.
This requires Bazel to compile possibly-binary packages from pip, which is a nontrivial endeavor. Do we need to support compiling binary packages?
 

Questions:

Q1) Should these be in WORKSPACE or in .bazelrc?
*Definitely* not in .bazelrc.

Q2) Should the 'policy' attributes be part of pip_mirror()?
Q3) Policies are basically: Allow downloads/Don't allow downloads, and Allow compiled code From The Internet/Don't allow compiled code from the internet.  Are there others that people care about?
Are even these policies necessary? Both eventually and for an MVP? I'd much rather omit flags like this and build them in later, if and when needed so that we can get something useful off the ground earlier.
 

2) Python Dependency Specification (https://www.python.org/dev/peps/pep-0508/)

The atomic unit of dependency for Python Distribution Packages is a line like this:

  some_package

which means "I depend on the Python Distribution Package named 'some_package'".  A more complex, but entirely realistic example is:

  some_package [security,tests] >= 2.8.1, == 2.8.* ; python_version < "2.7" and os_name == "nt"

which means "I depend on 'some_package' if I'm running Python 2.6 or below on Windows, and I need a version in the range  [2.8.1, 2.9,0)".

[security,tests] are "extras".  It's probably easiest to handle these as separate packages called, e.g., "some_package[security]", which have implicit relationships between themselves and the parent package "some_package".

3) BUILD rule: pip_dependency_set()

This is a list of dependency specifications, equivalent to a "requirements.txt" file.

pip_dependency_set(
    name = "my_requirements",
    # A direct list of dependency specifications
    specs = [spec1, spec2, ...],
    # Path to a standard 'requirements.txt' file, containing a list of specs.
    # 'specs' and 'requirements_path' are mutually exclusive.
    requirements_path = "/path/to/requirements.txt")

Ideally, the specs would be an unordered set, but there are currently some implementation limitations where order makes a difference, so we preserve it here.
This must live in the WORKSPACE file, since it requires fetching and/or compiling things, that is, non-hermetic activities.


4) Bazel concept: pip_resolved_package_set()

A pip_dependency_set() must be evaluated in a particular context, which will be represented in Bazel by a py_toolchain().  The result of this evaluation is an object called pip_resolved_package_set().
How would this be represented in Bazel? My best idea would be a repository that contains a rule for each specifically requested package, which Python rules in BUILD files can depend on, plus rules with private visibility for the transitive dependencies of the requested packages.
 

The contents of a pip_resolved_packaged_set() are a concrete list of files/urls representing the artifacts to use.  These artifacts can either be 'sdists', or 'wheels'.  Wheels can be Universal Wheels, Pure Python Wheels, or Platform Wheels.  For this discussion, we ignore the problems of producing "manylinux" wheels inside Bazel, since we're just consuming wheels, not producing them.  This list might look like:

["numpy-1.14.0.tar.gz",
 "grpcio-1.0.0-cp34-cp34m-win_amd64.whl", 
 "nox_automation-0.18.2-py2.py3-none-any.whl"]

This list is generated by recursively, since each dependency has its own dependencies, and these sub-dependencies are different for different versions of the parent dependency.  I.e. grpc 1.0 depends on protobuf 3.0, while grpc 2.0 depends on protobuf 4.0.

For each dependency spec, the active pip_mirror() is queried.  This returns a list of versions, and for each version, a list of artifacts.  For example, "numpy" might have available versions: [1.0.0, 1.1.0, ..., 1.14.0].  For the version "1.14.0", it might have available artifacts: ["numpy-1.14.0.tar.gz", "numpy-1.14.0-cp35-cp35m-manylinux1_i686.whl", ...].  For each candidate version, Bazel determines which artifacts are applicable to the current py_toolchain() environment.  Some of this work will be done by the 'pip' tool, but some will need to be reimplemented inside Bazel, because 'pip' does not handle any kind of cross-compilation scenario.  I.e. it's not possible for 'pip' running on Windows under Python 2 to evaluate specs for Linux and Python 3.
Whoa, so we need to reimplement the version resolution logic of "pip" in Bazel? That doesn't seem like a path to happiness :(
 

5) Bazel concept: pip_installed_package_set()

A pip_resolved_package_set() has a list of artifacts, but we need an actual bunch of files on disk, possibly requiring compilation and/or other tasks defined in setup.py.  We call this a pip_installed_package_set(), which is a "virtualenv"-type directory tree in which all the packages have been installed to, and possibly compiled in.  This tree should ideally match what would happen if a user ran "pip install -r requirements.txt" outside of bazel.

However, we might use a "runfiles" structure instead, since that's what we already have.  This has a few problems, for example, namespace packages don't work when not installed in a "virtualenv" or "site-package" directory.  In particular, the "google" package is a namespace package, so this is kind of big deal.  We can work around this with various hacks, but it's not an ideal user experience.


Implementation note: Different platforms will vary on exactly how much sharing/copying/symlinking of these directory trees can or should be done.

6) BUILD attribute: py_binary(deps)

The py_* rules will accept a 'deps' argument of type 'pip_dependency_set'.  py_library() targets will simply propagate these deps to other targets.  py_binary() and py_test() targets will, at build time, evaluate the pip_dependency_set in the context of the appropriate py_toolchain() to generate a pip_resolved_package_set(), pip_installed_package_set(), and ultimately a "runfiles" tree and/or "virtualenv" tree.

If multiple targets have the same dependency set, then they will share the resulting directory trees to the extent the platform allows (e.g. no symlinks on Windows).  Conversely, if a particular target is built in multiple configurations (e.g. host vs target, or Python 2 vs Python 3), each configuration will has its own installed package set.
By "same dependency set" do you mean the same pip_dependency_set rule?
 



Hyrum Wright

Feb 19, 2018, 9:38:06 AM
to Doug Greiman, Bazel/Python Special Interest Group
On Fri, Feb 16, 2018 at 8:46 PM, 'Doug Greiman' via Bazel/Python Special Interest Group <bazel-sig-python@googlegroups.com> wrote:
Here's a strawman proposal:

[A] Fetching and using Python packages not created by Bazel

(Note that PyPI is pronounced like "pie pee eye", because "pie pie", aka PyPy, is a different unrelated thing.  I use "pip_" below because it sounds better in my head, but any naming suggestions are welcome.)

1) WORKSPACE rule: pip_mirror()

By default, packages are fetched from "www.pypi.org".  Users can add a directive in their WORKSPACE to control this:

pip_mirror(
     # Where to query and fetch packages from.
    #  None means "don't access the internet for my builds"
    index = "pypi.mycompany.com",
    # Directory to look for and fetch package from.
    # Bazel doesn't know or care if this is in source control, temporary dir, mounted NFS, etc.
    vendor_dir = "some local directory",
    # Additional security and caching attributes... 
)

There are also two "policy" attributes added:

  pip_policy_allow_precompiled_binaries = True|False

If True, then Bazel is allowed to use wheels that contain precompiled extension modules and native libraries.  Otherwise, Bazel must fetch the package's 'sdist', if available, and compile it.

  pip_policy_allow_compilation = True|False

If True, and Bazel encounters a pip package that needs to be compiled, either because that package doesn't have a precompiled binary available or Bazel isn't allowed to use it, then Bazel is allowed to compile it from source.  If False, this situation is an error.

Questions:

Q1) Should these be in WORKSPACE or in .bazelrc?

In the WORKSPACE file.  This is something I want as part of the project configuration, not set independently on a per-user basis.
 
Q2) Should the 'policy' attributes be part of pip_mirror()?
Q3) Policies are basically: Allow downloads/Don't allow downloads, and Allow compiled code From The Internet/Don't allow compiled code from the internet.  Are there others that people care about?

2) Python Dependency Specification (https://www.python.org/dev/peps/pep-0508/)

The atomic unit of dependency for Python Distribution Packages is a line like this:

  some_package

which means "I depend on the Python Distribution Package named 'some_package'".  A more complex, but entirely realistic example is:

  some_package [security,tests] >= 2.8.1, == 2.8.* ; python_version < "2.7" and os_name == "nt"

which means "I depend on 'some_package' if I'm running Python 2.6 or below on Windows, and I need a version in the range  [2.8.1, 2.9,0)".

[security,tests] are "extras".  It's probably easiest to handle these as separate packages called, e.g., "some_package[security]", which have implicit relationships between themselves and the parent package "some_package".

3) BUILD rule: pip_dependency_set()

This is a list of dependency specifications, equivalent to a "requirements.txt" file.

pip_dependency_set(
    name = "my_requirements",
    # A direct list of dependency specifications
    specs = [spec1, spec2, ...],
    # Path to a standard 'requirements.txt' file, containing a list of specs.
    # 'specs' and 'requirements_path' are mutually exclusive.
    requirements_path = "/path/to/requirements.txt")

Ideally, the specs would be an unordered set, but there are currently some implementation limitations where order makes a difference, so we preserve it here.

Would this be referenceable from an external repository?

Consider an organization with a number of different python repositories which wants to build on a common set of third-party dependencies.  In this scenario, it makes sense to put a vetted collection of third-party dependencies and requirements in a separate repository, and then just reference that repository everywhere dependencies are required.  This is currently semi-doable, but runs into problems like https://github.com/bazelbuild/rules_python/issues/47 where *all* the requirements are built, even if only a few are needed.
 


Lukács T. Berki

Feb 19, 2018, 9:41:25 AM
to Hyrum Wright, Doug Greiman, bazel-si...@googlegroups.com



On Mon, 19 Feb 2018 at 15:38, Hyrum Wright <hwr...@duolingo.com> wrote:


On Fri, Feb 16, 2018 at 8:46 PM, 'Doug Greiman' via Bazel/Python Special Interest Group <bazel-si...@googlegroups.com> wrote:
Q1) Should these be in WORKSPACE or in .bazelrc?

In the WORKSPACE file.  This is something I want as part of the project configuration, not set independently on a per-user basis.
+1
 
 
3) BUILD rule: pip_dependency_set()

This is a list of dependency specifications, equivalent to a "requirements.txt" file.

pip_dependency_set(
    name = "my_requirements",
    # A direct list of dependency specifications
    specs = [spec1, spec2, ...],
    # Path to a standard 'requirements.txt' file, containing a list of specs.
    # 'specs' and 'requirements_path' are mutually exclusive.
    requirements_path = "/path/to/requirements.txt")

Ideally, the specs would be an unordered set, but there are currently some implementation limitations where order makes a difference, so we preserve it here.

Would this be referenceable from an external repository?

Consider an organization with a number of different python repositories which wants to build on a common set of third-party dependencies.  In this scenario, it makes sense to put a vetted collection of third-party dependencies and requirements in a separate repository, and then just reference that repository everywhere dependencies are required.  This is currently semi-doable, but runs into problems like https://github.com/bazelbuild/rules_python/issues/47 where *all* the requirements are built, even if only a few are needed.
Representing pip_dependency_set as a single external repository would neatly solve this problem; there would be a single @my_requirements repository with a target for each of the explicit requirements. Then if a Python binary doesn't depend on a target in the repository, it won't get built and won't be in its runfiles tree / virtualenv / whatever we come up with.
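
As a sketch (the package names and versions below are made up), the generated @my_requirements//BUILD could then be little more than:

alias(
    name = "Foo",
    actual = "@pypi_foo_1_4_1//:Foo",
    visibility = ["//visibility:public"],
)

alias(
    name = "numpy",
    actual = "@pypi_numpy_1_14_0//:numpy",
    visibility = ["//visibility:public"],
)

Only the aliases a py_binary actually depends on would pull their versioned repositories into the build.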

 

Evan Jones

Feb 19, 2018, 3:29:58 PM
to Lukács T. Berki, Doug Greiman, Bazel/Python Special Interest Group
On Mon, Feb 19, 2018 at 9:10 AM, 'Lukács T. Berki' via Bazel/Python Special Interest Group <bazel-si...@googlegroups.com> wrote:
There are also two "policy" attributes added:

  pip_policy_allow_precompiled_binaries = True|False

If True, then Bazel is allowed to use wheels that contain precompiled extension modules and native libraries.  Otherwise, Bazel must fetch the package's 'sdist', if available, and compile it.

  pip_policy_allow_compilation = True|False

If True, and Bazel encounters a pip package that needs to be compiled, either because that package doesn't have a precompiled binary available or Bazel isn't allowed to use it, then Bazel is allowed to compile it from source.  If False, this situation is an error.
This requires Bazel to compile possibly binary packages from PIP, which is a nontrivial endeavor. Do we need to support compiling binary packages? 

For an initial version, maybe not, but you won't get very far in the PyPI universe if you can't use a "source" package (although maybe only pure Python source packages would be fine?). For example, see the top 360 most downloaded PyPI packages, and their current wheel status: https://pythonwheels.com/ . For the Google Cloud APIs, you are going to get blocked by this package published by Google: https://pypi.python.org/pypi/googleapis-common-protos


For each dependency spec, the active pip_mirror() is queried.  This returns a list of versions, and for each version, a list of artifacts.  For example, "numpy" might have available versions: [1.0.0, 1.1.0, ..., 1.14.0].  For the version "1.14.0", it might have available artifacts: ["numpy-1.14.0.tar.gz", "numpy-1.14.0-cp35-cp35m-manylinux1_i686.whl", ...].  For each candidate version, Bazel determines which artifacts are applicable to the current py_toolchain() environment.  Some of this work will be done by the 'pip' tool, but some will need to be reimplemented inside Bazel, because 'pip' does not handle any kind of cross-compilation scenario.  I.e. it's not possible for 'pip' running on Windows under Python 2 to evaluate specs for Linux and Python 3.
Whoa, so we need to reimplement the version resolution logic of "pip" in Bazel? That doesn't seem like a path to happiness :(

If this is only necessary for cross-compilation, I would recommend deferring it if that simplifies things. It would be very nice to have working cross-compilation, but I can live without it if that helps get something working shipped sooner.


6) BUILD attribute: py_binary(deps)

The py_* rules will accept a 'deps' argument of type 'pip_dependency_set'.  py_library() targets will simply propagate these deps to other targets.  py_binary() and py_test() targets will, at build time, evaluate the pip_dependency_set in the context of the appropriate py_toolchain() to generate a pip_resolved_package_set(), pip_installed_package_set(), and ultimately a "runfiles" tree and/or "virtualenv" tree.

If multiple targets have the same dependency set, then they will share the resulting directory trees to the extent the platform allows (e.g. no symlinks on Windows).  Conversely, if a particular target is built in multiple configurations (e.g. host vs target, or Python 2 vs Python 3), each configuration will has its own installed package set.
By "same dependency set" do you mean the same pip_dependency_set rule?

I'm also a bit confused by this. In addition:

* It is useful to list a set of Python requirements, then have Bazel py_library/py_binary targets depend on individual specific packages, but have the resolution be "consistent". E.g. maybe I have one thing that depends on numpy, and another that depends on both scipy and numpy; I'd like scipy and numpy to be resolved together and be compatible, and I'd like these two targets to use the same version of numpy.

* Even if it's better for some reason to represent fully resolved "sets" of Python packages, Bazel will still have to deal with diamond dependency conflicts, unless I'm confused. E.g., to use your example, I could define one set that depends on grpcio==1.0 and another that depends on grpcio==2.0, so what happens if I list both in my deps? I assume the answer is "error, because they both try to write the same set of files"?


The rest of this proposal seems fine to me. I'm not picky about the specific details. :)

Evan


Hyrum Wright

Feb 19, 2018, 4:16:38 PM
to Evan Jones, Lukács T. Berki, Doug Greiman, Bazel/Python Special Interest Group


On Mon, Feb 19, 2018 at 3:29 PM, Evan Jones <evan....@bluecore.com> wrote:

On Mon, Feb 19, 2018 at 9:10 AM, 'Lukács T. Berki' via Bazel/Python Special Interest Group <bazel-sig-python@googlegroups.com> wrote:
There are also two "policy" attributes added:

  pip_policy_allow_precompiled_binaries = True|False

If True, then Bazel is allowed to use wheels that contain precompiled extension modules and native libraries.  Otherwise, Bazel must fetch the package's 'sdist', if available, and compile it.

  pip_policy_allow_compilation = True|False

If True, and Bazel encounters a pip package that needs to be compiled, either because that package doesn't have a precompiled binary available or Bazel isn't allowed to use it, then Bazel is allowed to compile it from source.  If False, this situation is an error.
This requires Bazel to compile possibly binary packages from PIP, which is a nontrivial endeavor. Do we need to support compiling binary packages? 

For an initial version, maybe not, but you won't get very far in the PyPI universe if you can't use a "source" package (although maybe only pure Python source packages would be fine?). For example, see the top 360 most downloaded PyPI packages, and their current wheel status: https://pythonwheels.com/ . For the Google Cloud APIs, you are going to get blocked by this package published by Google: https://pypi.python.org/pypi/googleapis-common-protos

Agreed.  Almost every wrapper around some native library needs native compilation support, as well as things like grpc and protobuf which require native code for fast implementations on platforms their wheels don't support.

This does actually raise the question of what to use as input for local compilation of things like pycairo, which have traditionally just used the headers installed by the third-party package on the local system.  In an ideal world, that third-party package becomes part of the input to the Python targets and gets compiled from source for the target platform, but that may be more work than most users are willing to put in.  (Today, we just install the wrapper library in the target Docker image using the system package manager, and that pulls in the native library.)

For each dependency spec, the active pip_mirror() is queried.  This returns a list of versions, and for each version, a list of artifacts.  For example, "numpy" might have available versions: [1.0.0, 1.1.0, ..., 1.14.0].  For the version "1.14.0", it might have available artifacts: ["numpy-1.14.0.tar.gz", "numpy-1.14.0-cp35-cp35m-manylinux1_i686.whl", ...].  For each candidate version, Bazel determines which artifacts are applicable to the current py_toolchain() environment.  Some of this work will be done by the 'pip' tool, but some will need to be reimplemented inside Bazel, because 'pip' does not handle any kind of cross-compilation scenario.  I.e. it's not possible for 'pip' running on Windows under Python 2 to evaluate specs for Linux and Python 3.
Whoa, so we need to reimplement the version resolution logic of "pip" in Bazel? That doesn't seem like a path to happiness :(

If this is only necessary for cross-compilation, I would recommend deferring this if it simplifies things. It would be very nice to have working cross-complication, but I can deal with this if it helps get something working shipped sooner.

Also agreed.  Cross-compilation is a nice-to-have, not a need-to-have at this point.
 

Chad Moore

Feb 20, 2018, 11:23:32 AM
to Hyrum Wright, Evan Jones, Lukács T. Berki, Doug Greiman, Bazel/Python Special Interest Group
I need an internal index for packages.  Our CI system is not going to rely on the stability of an external server.  Even for internal ones, we need a way to (optionally?) pin version requirements rather than leaving them loose (>= 1.0, etc.) similar to how there's a sha256 sum for http_archives.

Can pip be used under the hood?

For instance, there could be a pip_runtime or something that gets/has a specific pip version, perhaps from another new_local_repository, etc., and that pip can be used as a dep by the other rules.  With pip, --index-url works inside requirements.txt and so does == version.  (Note that I see modifying requirements.txt as just a workaround until the parameters can be exposed, rather than a long-term solution.)

For runfiles, can the rules create a venv-compatible tree (perhaps actually via pip) and then just symlink it into runfiles and add the path?

In general, I like the way this thread is going so far.

Thanks,
Chad

Lukács T. Berki

Feb 28, 2018, 7:07:40 AM
to cmo...@uber.com, Hyrum Wright, Evan Jones, Doug Greiman, bazel-si...@googlegroups.com
On Tue, 20 Feb 2018 at 17:23, Chad Moore <cmo...@uber.com> wrote:
I need an internal index for packages.  Our CI system is not going to rely on the stability of an external server.  Even for internal ones, we need a way to (optionally?) pin version requirements rather than leaving them loose (>= 1.0, etc.) similar to how there's a sha256 sum for http_archives.

Can pip be used under the hood?
That sounds like the best option. If we didn't use pip, we'd have to re-implement it; and if we do, we get full requirements.txt support without any effort.
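
A very rough sketch of what "pip under the hood" could look like as a repository rule; everything here is hypothetical (including where the pinned pip comes from), and it papers over the cross-compilation problem discussed earlier:

def _pip_fetch_impl(ctx):
    # Shell out to a pinned pip to download the wheels/sdists for a
    # requirements.txt into this external repository.
    result = ctx.execute([
        ctx.attr.pip_binary,  # e.g. provided by a pip_runtime-style repository
        "download",
        "--dest", "wheels",
        "--index-url", ctx.attr.index_url,
        "-r", str(ctx.path(ctx.attr.requirements)),
    ])
    if result.return_code != 0:
        fail("pip download failed: %s" % result.stderr)
    ctx.file("BUILD", 'filegroup(name="wheels", srcs=glob(["wheels/*"]), visibility=["//visibility:public"])\n')

pip_fetch = repository_rule(
    implementation = _pip_fetch_impl,
    attrs = {
        "requirements": attr.label(allow_single_file = True),
        "index_url": attr.string(default = "https://pypi.org/simple"),
        "pip_binary": attr.string(default = "pip"),
    },
)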
 

For instance, there could be a pip_runtime or something that gets/has a specific pip version, perhaps from another new_local_repository, etc. and that pip can be used as a dep to the other rules.  With pip, --index-url works inside requirements.txt and so does == version.  (Note that I see modifying requirements.txt as just a workaround until the parameters can be exposed rather than a long-term solution.)

For runfiles, can the rules create a venv-compatible tree (perhaps actually via pip) and then just symlink it into runfiles and add the path?

In general, I like the way this thread is going so far.

Thanks,
Chad