Please help me make sense of the filter function in bazel query


Konstantin

May 10, 2021, 7:54:34 PM
to bazel-discuss
According to the Bazel query documentation, there is a function filter(pattern, input) which is supposed to remove all items from "input" that do not match the regular expression "pattern". Sounds simple enough, but I still struggle to understand the results I get from it.

For example, here is a query which does not filter anything (the pattern is an empty string):
bazel cquery filter('', kind(rule, deps(mytarget))) --notool_deps --noimplicit_deps
And here is a fragment of the output:
//modules/platform/tabcoredata:test_tabcoredata (013a75e)
//thirdparty:Qt5__Network (013a75e)
//thirdparty:boostlib_filesystem (013a75e)
@tab_toolchains//cc/sanitizers:asan (013a75e)
@tab_toolchains//conditions:windows (013a75e)
//modules/platform/tabcoredata:testlib_tabcoredata (013a75e)
@Qt5__Gui__windows_release_x64//:Qt5__Gui__windows_release_x64 (013a75e)
@gmock__windows_release_x64//:gmock__windows_release_x64 (013a75e)
@platforms//os:windows (013a75e)

Pattern '//modules' nicely gives me everything starting with modules:
//modules/platform/tabcoredata:test_tabcoredata (013a75e)
//modules/platform/tabcoredata:src_unittest_folder (013a75e) 
Etc.

Here comes the question: what pattern should I use to get all the packages from my current workspace and skip all external workspaces?

Local labels start with two slashes and external ones start with @, so naturally I tried to filter for '^//' and it returns everything! Why?

On this page I learned that instead of ^ I can use \A as the beginning of the input, so I set my pattern to '\A//'. It produces everything starting with //modules, //codegen, etc., but for reasons beyond my comprehension it misses everything starting with //thirdparty! I am puzzled.

What is the magic word I don't know? Why does the empty filter let //thirdparty through while '\A//' blocks it? Why does ^ not work? Help!

Konstantin

Andrew Allen

May 11, 2021, 12:30:10 AM
to Konstantin, bazel-discuss
I think the query

`bazel cquery '@//...'` will return all the targets defined by the current workspace. Could you try that out and report back?

/** ~Andrew Z Allen */



Konstantin

May 11, 2021, 1:35:55 AM
to bazel-discuss
Andrew, 

yes, bazel cquery '@//...' does indeed return all the targets defined by the current workspace. As a matter of fact, even bazel cquery '//...' is sufficient.
But it is not exactly what I am looking for. I actually need to get kind(rule, deps(mytarget)) and then filter it down to only the targets in the current workspace.

Well, thinking about it a little, the following construction using intersect achieves the goal: bazel cquery kind(rule, deps(mytarget)) intersect //...
So the problem at hand is solved! Thank you very much!

But I am still curious about that inexplicable behavior of the filter(pattern, input) function.

Konstantin

Alex Humesky

May 11, 2021, 5:30:20 PM
to Konstantin, bazel-discuss

> I try to filter for '^//' and it returns me everything

That's strange, it seems to be working for me:

Everything (fragment):

$ bazel cquery "filter('', deps(example_test))"
INFO: Analyzed target //:example_test (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
//:example_test (86d23d2)
@bazel_tools//tools/test:collect_coverage (HOST)
@bazel_tools//tools/test:xml_writer (HOST)
@bazel_tools//tools/test:coverage_support (86d23d2)
//:test.cc (null)
@bazel_tools//tools/test:test_wrapper (HOST)
@local_config_platform//:host (86d23d2)
//thirdparty:data_files (86d23d2)
@bazel_tools//tools/test:test_xml_generator (HOST)
@bazel_tools//tools/cpp:toolchain_type (86d23d2)
@bazel_tools//tools/def_parser:def_parser (HOST)
@bazel_tools//tools/test:coverage_report_generator (HOST)
@bazel_tools//tools/cpp:grep-includes (HOST)
@bazel_tools//tools/test:runtime (HOST)
@local_config_cc//:cc-compiler-k8 (HOST)
@bazel_tools//tools/test:collect_cc_coverage (HOST)
@bazel_tools//tools/test:test_setup (HOST)
@bazel_tools//tools/cpp:toolchain (86d23d2)
@bazel_tools//tools/cpp:malloc (86d23d2)
@platforms//os:linux (86d23d2)
@bazel_tools//tools/cpp:interface_library_builder (HOST)
@bazel_tools//src/conditions:host_windows (HOST)
@bazel_tools//tools/test:collect_cc_coverage.sh (null)
//thirdparty:data_file (null)
<snip>

Then with ^//

$ bazel cquery "filter('^//', deps(example_test))"
INFO: Analyzed target //:example_test (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
//:example_test (86d23d2)
//:test.cc (null)
//thirdparty:data_files (86d23d2)
//thirdparty:data_file (null)
INFO: Elapsed time: 0.106s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions

And adding a kind() filter:

$ bazel cquery "filter('^//', kind(rule, deps(example_test)))"
INFO: Analyzed target //:example_test (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
WARNING: Targets were missing from graph: [ConfiguredTargetKey{label=@bazel_tools//tools/jdk:dummy_toolchain, config=BuildConfigurationValue.Key[8b28ed5ee57dc9b78aaec3f4fb7264907b07f6739b1397efa9b80d227c5fc91e]}]
//:example_test (86d23d2)
//thirdparty:data_files (86d23d2)
INFO: Elapsed time: 0.106s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions
And using \A:

$ bazel cquery "filter('\A//', kind(rule, deps(example_test)))"
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:example_test (28 packages loaded, 332 targets configured).
INFO: Found 1 target...
//:example_test (86d23d2)
//thirdparty:data_files (86d23d2)
INFO: Elapsed time: 8.361s
INFO: 0 processes.
INFO: Build completed successfully, 0 total actions


One difference I see is that you don't have quotes around the query expression. I get a syntax error from bash if I remove the quotes: 

$ bazel cquery filter('^//', kind(rule, deps(example_test)))
-bash: syntax error near unexpected token `('

Is it possible your shell is doing something with the caret character or \A?

Also, for what it's worth, Bazel does have some code that treats packages starting with "third_party" (but not "thirdparty") specially, but this is only for a legacy license-checking framework that I believe is disabled.

Konstantin

May 11, 2021, 10:26:37 PM
to bazel-discuss
Hey Alex, long time no see!
My experiments were on Windows, where the outer double quotes don't seem to be necessary, but I tried them anyway and to my surprise got a different result! So filter('^//', ... started to work for me (albeit in some weird way) once I enclosed the cquery expression in double quotes. So let's consider one problem down and see what remains.
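
A possible explanation for the difference, sketched under the assumption that the unquoted command was run from cmd.exe (the thread only says Windows): cmd.exe treats ^ as an escape character outside double quotes and does not treat single quotes as quoting, so '^//' would reach Bazel as '//', which occurs in every label. The Python below only illustrates that effect. The helpers strip_cmd_carets and filter_labels are hypothetical, and re.search stands in for the unanchored matching that filter() is described as using later in this thread. It does not account for the '\A//' result.

```python
import re

# Sample labels taken from the output fragments quoted in this thread.
LABELS = [
    "//modules/platform/tabcoredata:test_tabcoredata",
    "//thirdparty:Qt5__Network",
    "@tab_toolchains//cc/sanitizers:asan",
    "@platforms//os:windows",
]


def strip_cmd_carets(arg):
    """Hypothetical helper: mimic cmd.exe consuming ^ as an escape
    character when the argument is not inside double quotes."""
    return re.sub(r"\^(.)", r"\1", arg)


def filter_labels(pattern, labels):
    """Keep labels in which the pattern is found anywhere (unanchored)."""
    return [label for label in labels if re.search(pattern, label)]


unquoted = strip_cmd_carets("^//")  # assumed result without double quotes: "//"
quoted = "^//"                      # inside double quotes the caret survives

print(filter_labels(unquoted, LABELS))  # every sample label contains "//"
print(filter_labels(quoted, LABELS))    # only labels that start with "//"
```

If this assumption holds, wrapping the whole query expression in double quotes is what keeps the caret intact.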

The second problem is that I definitely observe some kind of mystery around the folder named "thirdparty", which is not supposed to be special in any way.
My first query
    bazel cquery "kind(rule, deps(example)) intersect //..."
produces the expected result: all targets defined in the current workspace.

My second query 
    bazel cquery "filter('^//', kind(rule, deps(example)))"
is supposed to achieve the same result, but for some mysterious reason it skips all the targets from //thirdparty!

This is really weird, because as you said, that folder is not supposed to be any different from the others. Could you please create such a folder (if you don't have one) and see if the phenomenon reproduces for you? If it does not, I will work on a dedicated repro.

Problem #3 is that I need to filter out (remove from the query output) all targets whose names start with two underscores. I am trying to do it the way this article recommends. Let me know if it can be done more easily.
    bazel cquery "filter('((?!:__).)*',  my_previous_query_here)"
And what I observe is that negative filters do not alter the query output at all, i.e. I get exactly the same results with and without such a filter.
Any clue how I should write a negative filter so that it actually works?

Thanks!
Konstantin

Alex Humesky

May 11, 2021, 11:10:45 PM
to Konstantin, bazel-discuss
I put a "thirdparty" directory in my test above, and it seems to be included:
"//thirdparty:data_files"

So it seems there must be something else going on here.

Regarding problem 3, unfortunately "zero-width negative lookahead" is a somewhat more advanced regex feature than I've ever used.
My guess is that this is because filter() uses Java's Matcher.find(), which in the usage here seems to mean that filter() will keep a target if the regex matches any part of its label.

Playing around with the regex, it seems that that pattern will match parts of "//foo/bar:__baz":

Playing around with the regex some more, it looks like .*:((?!__).)+ works:

I haven't tested it exhaustively though
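
The difference between the two patterns can be checked outside Bazel. This is a minimal sketch, assuming Python's re.search behaves like Java's find() for these particular patterns. The labels other than //foo/bar:__baz and //thirdparty:data_files are made up, filter_labels is a hypothetical stand-in for filter(), and the SHORTER pattern is an extra suggestion that does not come from this thread.

```python
import re

LABELS = [
    "//foo/bar:__baz",          # label discussed above
    "//foo/bar:baz",            # made-up label
    "@ext//pkg:__hidden",       # made-up label
    "//thirdparty:data_files",  # label from the earlier test output
]


def filter_labels(pattern, labels):
    """Hypothetical stand-in for filter(): keep a label if the
    pattern is found anywhere in it."""
    return [label for label in labels if re.search(pattern, label)]


# The pattern from Problem #3. The trailing * allows an empty match,
# so an unanchored search succeeds on every label.
ORIGINAL = r"((?!:__).)*"

# The pattern that worked: it must consume the colon and then at least
# one character that does not begin a "__" sequence.
WORKING = r".*:((?!__).)+"

# Assumption (not from the thread): a shorter equivalent for labels
# that contain exactly one colon.
SHORTER = r":(?!__)"

for name, pattern in [("original", ORIGINAL), ("working", WORKING), ("shorter", SHORTER)]:
    print(name, filter_labels(pattern, LABELS))
```

The first pattern filters nothing because find()-style matching is satisfied by any partial match, including an empty one. The other two only match when the colon is not followed by two underscores.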

Konstantin

May 11, 2021, 11:47:25 PM
to bazel-discuss
I have tried to make a repro for the elusive "thirdparty" problem and of course I could not, so definitely something else is at play, BUT on a hunch I tried Bazel 4.1.0 rc4 and it worked as expected! So whatever was hitting me is fixed in 4.1.0, and that's as much as I want to know. Problem #2 is fixed by the Bazel upgrade. Great!

As for Problem #3, I am not a fan of regular expressions myself, and that "zero-width negative lookahead" thingy is way more advanced than I ever wanted to know about regex, but unfortunately it seems to be the only way to get subtractive filtering working.
I have looked at your examples on regexr (nice site!) and can see that the last example indeed works, but why it does and the previous one does not beats me.
Anyway, your last example provides a usable solution, so we can consider Problem #3 successfully solved.
Case closed. Thanks a lot! :-)

ahum...@google.com

May 12, 2021, 1:27:19 PM
to bazel-discuss
Very strange about the "thirdparty" issue.... but glad to help! 