How can I build and run command-line tools with bazel?

aus...@zoox.com

Apr 27, 2017, 12:47:19 AM
to bazel-discuss
I'm porting some command-line tools that are used for analyzing log files from CMake to Bazel. For the sake of example, let's call this tool grok.

In CMake, I used add_executable, target_link_libraries, and install to build, link, and install grok; it was eventually installed into /usr/local/bin (or wherever the user wanted to install it) and was therefore on the user's PATH.

Then I would run `grok foo.log foo.summary` to analyze the foo.log file in my working directory and produce a summary file called foo.summary, again in my working directory.

Now that I've ported this tool to bazel, I can build it with `bazel build grok`, but running it is problematic.

Running it with `bazel run grok -- foo.log foo.summary` cannot find foo.log (because it's now running in the runfiles directory) and now the user can't find foo.summary, because it is created in the runfiles directory.

I can run `bazel run grok -- $(pwd)/foo.log $(pwd)/foo.summary` and get the previous result, but this is a lot of extra typing and tab-completion no longer works, so it slows down the user's workflow considerably.

Further, for the sake of argument, let's say that grok relies on some data file (say, a pre-built database of analysis rules) that the bazel-ported version finds using a workspace-relative path. Previously grok was using the absolute, installed path of this data file, but now grok relies on being executed within its runfiles tree to find its resources correctly.

This leads to the following questions:

* Does bazel provide a mechanism for installing binaries, or some other way to put them on the PATH? How is this achieved within google?
* Does bazel provide a working-directory agnostic mechanism for finding resource files? How is this done within google?
* How should I pass file arguments to my program? Do I always need to pass absolute paths or is there a better way?

Any advice would be appreciated,

Thanks,
-Austin

Damien Martin-Guillerez

Apr 27, 2017, 4:31:28 AM
to aus...@zoox.com, bazel-discuss
On Thu, Apr 27, 2017 at 6:47 AM <aus...@zoox.com> wrote:
> I'm porting some command-line tools that are used for analyzing log files from CMake to Bazel. For the sake of example, let's call this tool grok.
>
> In CMake, I used add_executable, target_link_libraries, and install to build, link, and install grok; it was eventually installed into /usr/local/bin (or wherever the user wanted to install it) and was therefore on the user's PATH.
>
> Then I would run `grok foo.log foo.summary` to analyze the foo.log file in my working directory and produce a summary file called foo.summary, again in my working directory.
>
> Now that I've ported this tool to bazel, I can build it with `bazel build grok`, but running it is problematic.
>
> Running it with `bazel run grok -- foo.log foo.summary` cannot find foo.log (because it's now running in the runfiles directory) and now the user can't find foo.summary, because it is created in the runfiles directory.
>
> I can run `bazel run grok -- $(pwd)/foo.log $(pwd)/foo.summary` and get the previous result, but this is a lot of extra typing and tab-completion no longer works, so it slows down the user's workflow considerably.

You can also run the built binary directly: `bazel-bin/grok foo.log foo.summary`


> Further, for the sake of argument, let's say that grok relies on some data file (say, a pre-built database of analysis rules) that the bazel-ported version finds using a workspace-relative path. Previously grok was using the absolute, installed path of this data file, but now grok relies on being executed within its runfiles tree to find its resources correctly.
>
> This leads to the following questions:
>
> * Does bazel provide a mechanism for installing binaries, or some other way to put them on the PATH? How is this achieved within google?

We generally build bundled binaries that are self-contained (e.g. par for python archive) and just deploy that binary. We do not have a proper mechanism for installation, but you can build one: http://stackoverflow.com/questions/43549923/how-do-i-install-a-project-built-with-bazel
 
> * Does bazel provide a working-directory agnostic mechanism for finding resource files? How is this done within google?

IIUC foo.log is not really a resource, more an input. A resource would be put inside the data attribute of your target and ends up in the runfiles directory (which is working-directory agnostic).
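
As a rough illustration of locating such a resource at run time, the Python sketch below looks for a data file under the binary's runfiles tree. It assumes the conventional layout `<binary>.runfiles/<workspace>/<path>` and honors a `RUNFILES_DIR` environment variable if one happens to be set. The workspace name `my_workspace` and the helper `find_runfile` are hypothetical, not part of any Bazel API.

```python
import os
import sys


def find_runfile(rel_path, workspace="my_workspace", argv0=None, env=None):
    """Return the path of a data dependency inside the runfiles tree, or None.

    `workspace` is a placeholder; use the name declared in your WORKSPACE file.
    """
    env = os.environ if env is None else env
    argv0 = sys.argv[0] if argv0 is None else argv0

    candidates = []
    # Assumption: some launchers export RUNFILES_DIR; prefer it when present.
    if env.get("RUNFILES_DIR"):
        candidates.append(env["RUNFILES_DIR"])
    # Fallback: the tree that sits next to the binary, e.g. bazel-bin/grok.runfiles
    candidates.append(os.path.abspath(argv0) + ".runfiles")

    for root in candidates:
        path = os.path.join(root, workspace, rel_path)
        if os.path.exists(path):
            return path
    return None
```

Because the lookup starts from the binary's own location rather than the current directory, it gives the same answer whether the tool is started via `bazel run` or as `bazel-bin/grok`.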
 
> * How should I pass file arguments to my program? Do I always need to pass absolute paths or is there a better way?

Yes, if you go through `bazel run`. Alternatively, you can invoke your program directly: `bazel-bin/grok foo.log foo.summary`
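
If the tool should also accept relative paths under `bazel run`, one option is to resolve them inside the tool itself. The sketch below relies on the `BUILD_WORKING_DIRECTORY` environment variable, which `bazel run` sets in Bazel releases newer than the one discussed in this thread; treat its availability in your version as an assumption. The helper name `resolve_user_path` is hypothetical.

```python
import os


def resolve_user_path(path, env=None):
    """Resolve a user-supplied path against the directory bazel was invoked from."""
    env = os.environ if env is None else env
    if os.path.isabs(path):
        return path
    # Under `bazel run` the process cwd is the runfiles tree, so prefer the
    # directory the user was in. When the binary is started directly the
    # variable is absent and the ordinary cwd is the right base.
    base = env.get("BUILD_WORKING_DIRECTORY") or os.getcwd()
    return os.path.normpath(os.path.join(base, path))
```

With every input and output path passed through such a helper, `bazel run grok -- foo.log foo.summary` and `bazel-bin/grok foo.log foo.summary` would read and write the same files.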


> Any advice would be appreciated,
>
> Thanks,
> -Austin

--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/f27e134d-d6bb-47b7-9d32-4a02b058b1ec%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matthew Woehlke

Apr 27, 2017, 11:22:33 AM
to bazel-...@googlegroups.com
On 2017-04-27 04:31, 'Damien Martin-Guillerez' via bazel-discuss wrote:
> On Thu, Apr 27, 2017 at 6:47 AM <aus...@zoox.com> wrote:
>> * Does bazel provide a mechanism for installing binaries, or some other
>> way to put them on the PATH?

No.

> We generally build bundled binaries that are self-contained (e.g. par for
> python archive) and just deploy that binary.

This is completely unacceptable for any sane Linux distribution. (I know
absolutely that Fedora will not accept any packages like this. AFAIK the
same is true for most major distributions.)

> We do not have a proper mechanism for installation but you can build
> one:
> http://stackoverflow.com/questions/43549923/how-do-i-install-a-project-built-with-bazel

This needs to be added to Bazel. In particular, this:

"do checksums to check if there are change to the installed file and
does the actual copy of the files if they have changed"

...needs to be part of Bazel itself, for the sake of portability.

CMake does all these things for you. Personally, if you have a choice, I
would recommend staying with CMake, at least until Bazel is more mature
and ready to be used in an open source ecosystem.

I *might* be able to work on some of these problems in the
not-so-distant future (our customer for the project I am working on is
very insistent about using Bazel in spite of its shortcomings, and they
are running into these same issues).

Right now I have a stand-alone shell script that has a bunch of
hard-coded logic to copy stuff around into a "scratch" install tree
which can then be turned into a tarball and/or copied (with
modified-content checks, per above) to a user specified install tree.
This is a really ugly approach, however, as it is a) not portable (at
least not to Windows), and b) puts the what-to-build and what-to-install
logic in widely divergent locations.
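
For readers who want a more portable starting point than a shell script, the staging step described above might look roughly like this in Python. The manifest format (pairs of built file and install-relative destination) and both function names are assumptions made for illustration; this is not the script referred to in the message.

```python
import os
import shutil
import tarfile


def stage_install_tree(manifest, scratch_dir):
    """Copy each built file to its install-relative path under scratch_dir.

    `manifest` is an iterable of (source_path, relative_destination) pairs.
    """
    for src, rel_dest in manifest:
        dest = os.path.join(scratch_dir, rel_dest)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also carries over mode bits and timestamps


def make_tarball(scratch_dir, tarball_path):
    """Pack the staged tree; archive paths are relative to scratch_dir."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        tar.add(scratch_dir, arcname=".")
```

A call such as `stage_install_tree([("bazel-bin/grok", "bin/grok")], "scratch")` followed by `make_tarball("scratch", "grok.tar.gz")` would cover the tarball case. It still keeps the what-to-install list separate from the BUILD files, which is the second objection raised above.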

What I want from Bazel - what I *expect* from a modern build tool - is
the moral equivalent of CMake's `install` commands. That is, I want a
command that I can write next to (or even as part of?) `cc_binary` that
says 'this target/file should be installed' and (optionally) 'this
target should be available for import into other projects'.

(I'd anticipate the way this would be implemented is that each such call
records information that is later used by another process that collects
all such information and generates a build and/or run target to generate
the export information and perform the install. It's okay if that
includes having to call an extra function/macro for that purpose.)
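
The record-then-collect idea in the parenthetical above can be modelled in a few lines of plain Python. This is only a toy model of the proposed design, not Bazel code; the class `InstallRegistry` and its methods are hypothetical names.

```python
class InstallRegistry:
    """Collects install declarations made next to the targets they describe."""

    def __init__(self):
        self._entries = []

    def install(self, target, dest, export=False):
        # Only record the request; nothing is copied at declaration time.
        self._entries.append({"target": target, "dest": dest, "export": export})

    def manifest(self):
        """Everything to install, in declaration order."""
        return list(self._entries)

    def exported_targets(self):
        """Targets that should be importable by other projects."""
        return [e["target"] for e in self._entries if e["export"]]
```

A later step (the "extra function/macro" mentioned above) would read `manifest()` to generate the install action and `exported_targets()` to generate the export information.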

--
Matthew

Ulf Adams

Apr 27, 2017, 1:00:55 PM
to Matthew Woehlke, bazel-...@googlegroups.com
On Thu, Apr 27, 2017 at 5:22 PM, Matthew Woehlke <matthew...@kitware.com> wrote:
> On 2017-04-27 04:31, 'Damien Martin-Guillerez' via bazel-discuss wrote:
>> On Thu, Apr 27, 2017 at 6:47 AM <aus...@zoox.com> wrote:
>>> * Does bazel provide a mechanism for installing binaries, or some other
>>> way to put them on the PATH?
>
> No.
>
>> We generally build bundled binaries that are self-contained (e.g. par for
>> python archive) and just deploy that binary.
>
> This is completely unacceptable for any sane Linux distribution. (I know
> absolutely that Fedora will not accept any packages like this. AFAIK the
> same is true for most major distributions.)

Bazel doesn't require building self-contained binaries, although that's how it's commonly used at Google.
 

>> We do not have a proper mechanism for installation but you can build
>> one:
>> http://stackoverflow.com/questions/43549923/how-do-i-install-a-project-built-with-bazel
>
> This needs to be added to Bazel. In particular, this:
>
> "do checksums to check if there are change to the installed file and
> does the actual copy of the files if they have changed"
>
> ...needs to be part of Bazel itself, for the sake of portability.
>
> CMake does all these things for you. Personally, if you have a choice, I
> would recommend staying with CMake, at least until Bazel is more mature
> and ready to be used in an open source ecosystem.

Not everyone has the same requirements. In particular, not everyone who's using Bazel needs to ship packages for 'normal' linux distributions.

That said, are you saying that CMake "checksums to check if there are change to the installed file and does the actual copy of the files if they have changed"? That seems to be what you imply, but I am confused - wouldn't that be duplicating the task of the package manager (if one exists)?


> I *might* be able to work on some of these problems in the
> not-so-distant future (our customer for the project I am working on is
> very insistent about using Bazel in spite of its shortcomings, and they
> are running into these same issues).

Can you be more specific about what you believe to be Bazel's shortcomings?
 

> Right now I have a stand-alone shell script that has a bunch of
> hard-coded logic to copy stuff around into a "scratch" install tree
> which can then be turned into a tarball and/or copied (with
> modified-content checks, per above) to a user specified install tree.
> This is a really ugly approach, however, as it is a) not portable (at
> least not to Windows), and b) puts the what-to-build and what-to-install
> logic in widely divergent locations.

You could write a rule that describes the binary and the install tree, and that generates the shell script, and it could also do different things on Windows vs. other platforms.
 

> What I want from Bazel - what I *expect* from a modern build tool - is
> the moral equivalent of CMake's `install` commands. That is, I want a
> command that I can write next to (or even as part of?) `cc_binary` that
> says 'this target/file should be installed' and (optionally) 'this
> target should be available for import into other projects'.

Instead of a moral equivalent of CMake's 'install', wouldn't it be better to use the local package manager for installing? Also, why couldn't that be written as a Bazel rule?
 

> (I'd anticipate the way this would be implemented is that each such call
> records information that is later used by another process that collects
> all such information and generates a build and/or run target to generate
> the export information and perform the install. It's okay if that
> includes having to call an extra function/macro for that purpose.)
>
> --
> Matthew

Matthew Woehlke

Apr 27, 2017, 1:21:05 PM
to bazel-...@googlegroups.com
On 2017-04-27 11:22, Matthew Woehlke wrote:
> CMake does all these things for you. Personally, if you have a choice, I
> would recommend staying with CMake, at least until Bazel is more mature
> and ready to be used in an open source ecosystem.

Ahem. Disclaimer: my employer maintains CMake. (However, I am not a
CMake developer.)

That said... I mentioned CMake because a) it does solve the OP's
problems, and b) the OP mentioned that his project previously used
CMake. Also, my preference for CMake predates my current employment.

I don't know that I can *recommend* autotools (can anyone? :-)), but I
*would* recommend that an autotools project not switch to Bazel.

I'm going to resist turning this into a dissertation on everything I
think is wrong with Bazel. The short version however can be boiled down
to two points.

1. It's immature. If I've learned anything from years of building a
number of software projects, it's that software has many, many more
corner cases than the people that want to reinvent the wheel think.
CMake is complicated because it handles a great many corner cases (the
same is true for autotools, though some of autotools' complexity has
become superfluous). Any young build system is going to have major
problems; early adopters need to be willing to pay the cost of dealing
with those. Any build system that claims to be "simple" will probably be
unable to handle a corner case that your project needs.

2. It isn't designed for "real world" software. Of course, this needs a
*lot* of clarification. Obviously, it works for Google's internal
software. It would probably work well for proprietary software that is
distributed as a self-contained binary blob. (Here, you may notice we
are back on the subject matter of the original post...) Where it falls
apart is in assuming that the entire world lives in the One True
Repository: there are no external dependencies that the software uses,
and there are no external consumers of the software. This is true for
Google internally, but *not* for most open source projects.

Bazel does *try* to provide a versatile environment (though I've already
been bit by lack of regular expression support), but it's still "too
young". Windows support is very much a work in progress. Depending on
Java makes it INCREDIBLY bloated compared to something implemented in
pure C++. Worst, I see Bazel making some of the same mistakes that have
historically plagued other projects (e.g. using literal flags instead of
semantic attributes, which is one of the main problems with pkg-config).
Given that, I can virtually guarantee that, unless it goes through a few
cycles of uncompromising, compatibility-breaking refactoring, it *will*
wind up every bit as messy as any other tool within a few years.

All that said... being able to install the software you just built, and
other projects being able to use that software, seem like fairly basic
requirements for a build tool. I'd like to see Bazel develop some first
class support here. Until it does, I really can't do other than
recommend that developers use a tool that *does* provide these basic
features.

The good news is... it's still in the 0.x stage. If the developers are
willing to *learn* the lessons other tools have learned, and to be
ruthless about refactoring, there is still hope :-).

--
Matthew

Matthew Woehlke

Apr 27, 2017, 1:59:11 PM
to Ulf Adams, bazel-...@googlegroups.com
On 2017-04-27 13:00, Ulf Adams wrote:
> On Thu, Apr 27, 2017 at 5:22 PM, Matthew Woehlke wrote:
>> On 2017-04-27 04:31, 'Damien Martin-Guillerez' via bazel-discuss wrote:
>>> We generally build bundled binaries that are self-contained (e.g. par for
>>> python archive) and just deploy that binary.
>>
>> This is completely unacceptable for any sane Linux distribution. (I know
>> absolutely that Fedora will not accept any packages like this. AFAIK the
>> same is true for most major distributions.)
>
> Bazel doesn't require building self-contained binaries, although that's how
> it's commonly used at Google.

Sure, but from what I've seen so far, being able to consume external
dependencies could use some work.

>> In particular, this:
>>
>> "do checksums to check if there are change to the installed file and
>> does the actual copy of the files if they have changed"
>>
>> ...needs to be part of Bazel itself, for the sake of portability.
>>
>> CMake does all these things for you. Personally, if you have a choice, I
>> would recommend staying with CMake, at least until Bazel is more mature
>> and ready to be used in an open source ecosystem.
>
> Not everyone has the same requirements. In particular, not everyone who's
> using Bazel needs to ship packages for 'normal' linux distributions.

True, but being *able* to be shipped by Linux distributions really,
really ought to be a goal of anyone developing open source software. (It
does look to me like Bazel is a much better fit for proprietary software
than for open source.)

> That said, are you saying that CMake "checksums to check if there
> are change to the installed file and does the actual copy of the
> files if they have changed"?

I'm actually not familiar with the implementation of `file(INSTALL...)`,
but not touching previously installed files if they are unchanged is
certainly the intent. (And this definitely at least *partly* works; just
look at the number of 'up to date' notices from running `make install`
twice in a row.)

> That seems to be what you imply, but I am confused - wouldn't that be
> duplicating the task of the package manager (if one exists)?

Definitely not, and TBH I'm not even sure package managers bother; they
might just overwrite everything.

Incremental installs are actually much more important when *not* using a
package manager, especially if that is your mechanism for downstream
users to consume a project.

The usual workflow is:

- Build project A (e.g. something using Bazel)
- Build project B that consumes A
- Make changes to A
- Repeat

The 'repeat' step needs to be as inexpensive as possible. The only
answer Bazel provides to this currently (AFAICT) is to combine A and B
into one repository and build both with Bazel.

Another alternative is to make A usable from its build tree, but that
can get messy, since the build tree layout can look radically different
from the install tree layout. (In Bazel, more so than other build
systems, even...)

This is one of three reasons for providing local installs (i.e. not
through creating and subsequently installing a package). The others are
user convenience and distribution packaging. The first is probably the
least important. The second can be worked around by creating a package,
installing the package in the build environment, and then packing what
got installed... not the end of the world, but it sure feels silly ;-).

>> I *might* be able to work on some of these problems in the
>> not-so-distant future (our customer for the project I am working on is
>> very insistent about using Bazel in spite of its shortcomings, and they
>> are running into these same issues).
>
> Can you be more specific about what you believe to be Bazel's shortcomings?

Local installs should be as easy to implement in *stock*¹ Bazel as they
are in CMake or autotools.

(¹ Meaning, whatever extra "plug-ins" are needed should be part of Bazel
proper with a single, canonical instance, not something users have to go
out of their way to add to their project that gets replicated and forked
all over the place.)

>> Right now I have a stand-alone shell script that has a bunch of
>> hard-coded logic to copy stuff around into a "scratch" install tree
>> which can then be turned into a tarball and/or copied (with
>> modified-content checks, per above) to a user specified install tree.
>> This is a really ugly approach, however, as it is a) not portable (at
>> least not to Windows), and b) puts the what-to-build and what-to-install
>> logic in widely divergent locations.
>
> You could write a rule that describes the binary and the install tree, and
> that generates the shell script, and it could also do different things on
> Windows vs. other platforms.

Yes, I could. But that rule should be *part of Bazel itself*. If I have
to write anything (significantly²) more complicated than:

    install(
        target="//my:library",
        dest="lib", # `builtin.libdir` would be better
        export=... # `true` might be too simple
    )

...then Bazel has failed. Critically, as I noted above, *Bazel itself*
needs to build in a mechanism to compare file content in order to only
install a file in case of modified content. There is no `sha256sum` on
Windows, and even on other platforms, I'm not sure how portable it is.

Moreover, it seems like there should not be any great difficulty here;
Bazel already has a ton of caching logic and I believe already can
compute file hashes (for download verification), no? This just needs to
be integrated with an install facility.
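
To make the requirement concrete, here is a minimal, portable sketch of an install step that leaves unchanged files alone. It is written in Python only because its standard library ships SHA-256 on every platform. It is not how Bazel or CMake implement anything, and the function names are hypothetical.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file's content, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_if_changed(src, dest):
    """Copy src to dest unless dest already has identical content.

    Returns True when a copy happened, False when dest was up to date.
    """
    if os.path.exists(dest) and file_digest(src) == file_digest(dest):
        return False  # leave dest (and its timestamp) untouched
    os.makedirs(os.path.dirname(os.path.abspath(dest)), exist_ok=True)
    shutil.copy2(src, dest)
    return True
```

Skipping the copy keeps the installed file's timestamp stable, so a downstream project that consumes the install tree does not rebuild when nothing it depends on has changed.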

(² This is actually a *little* bit oversimplified, but only a little.)

>> What I want from Bazel - what I *expect* from a modern build tool - is
>> the moral equivalent of CMake's `install` commands. That is, I want a
>> command that I can write next to (or even as part of?) `cc_binary` that
>> says 'this target/file should be installed' and (optionally) 'this
>> target should be available for import into other projects'.
>
> Instead of a moral equivalent of CMake's 'install', wouldn't it be better
> to use the local package manager for installing?

NO!! What if I don't have root? What if I want to choose where to
install the package? (What if I don't *have* a local package manager...
or, at least, don't have one supported by Bazel?)

(I'd say something about "superbuilds" here, but... while they *are* a
real world use case, they have their own problems and aren't a good
*motivating* use case.)

> Also, why couldn't that be written as a Bazel rule?

It *could*, but one of the problems to be solved is separation of the
declarations to build an artifact and to install that artifact. If these
aren't in the same place, they are harder to maintain.

At least it needs to be possible for `install_library` to appear
immediately after `cc_library`. However, I suspect it would be best
(least maintenance, least possibility to do something wrong) if
installation was integrated with the commands that actually generate files.

--
Matthew