Now that the ACTION* macros have been checked in and I've seen some
people using them, I'd like to proceed with some macros for defining
matchers. They will be very much like ACTION* in their syntax and
usage.
Motivation
==========
Sometimes you need non-trivial logic when validating a value. For
example:
// Blah("b") must be outside of the range [x, y].
EXPECT_TRUE(Blah("b") < x || Blah("b") > y);
However, this leads to evaluating Blah("b") twice, which is
inefficient and can be problematic when Blah() has side effects; you
will also have a hard time knowing why the assertion fails as the
values of x, y, and Blah("b") aren't printed in the failure message.
You can try to eliminate the double evaluation using a variable:
int n = Blah("b");
EXPECT_TRUE(n < x || n > y);
but this makes the failure message even worse, as it doesn't tell you
from which expression n is evaluated.
A custom matcher can solve the above problem. Suppose you have
defined matcher IsNotInClosedRange(min, max), you can write:
EXPECT_THAT(Blah("b"), IsNotInClosedRange(x, y));
which can generate a very helpful failure message:
Value of: Blah("b")
Expected: is not in closed range (5, 9)
Actual: 6
It tells you:
- which expression is being validated (Blah("b"))
- what value it has (6)
- why it fails (6 doesn't satisfy the expectation "is not in closed
range (5, 9)").
The only problem, however, is that defining the IsNotInClosedRange()
matcher requires lots of boilerplate code. This proposal cuts the
boilerplate almost to non-existent.
Basic Matchers
==============
The syntax
MATCHER(name) { statements; }
defines a matcher with the given name. In 'statements' you can refer
to the value being tested as 'arg'. The bool value returned by
'statements' indicates whether the match succeeds. For example,
MATCHER(IsEven) { return (arg % 2) == 0; }
allows you to write:
// Expects mock_foo.Bar(n) to be called where n is even.
EXPECT_CALL(mock_foo, Bar(IsEven()));
or,
// Verifies that the value of some_expression is even.
EXPECT_THAT(some_expression, IsEven());
If the above assertion fails, it will print something like:
Value of: some_expression
Expected: is even
Actual: 7
where the description "is even" is calculated from the matcher name
IsEven.
Parameterized Matchers
======================
The MATCHER_P macro allows you to parameterize your matcher. For
instance, given:
MATCHER_P(AbsoluteValueIs, x) { return abs(arg) == x; }
the assertion
EXPECT_THAT(Blah("a"), AbsoluteValueIs(n));
may lead to the following failure (assuming n is 10):
Value of: Blah("a")
Expected: absolute value is 10
Actual: -9
The type of a parameter named 'param' is available as 'param_type', as
in the case of ACTION_P.
If you need more than one parameter, you can use MATCHER_P2,
MATCHER_P3, and so on. For example, the IsNotInClosedRange matcher we
mentioned in the "motivation" section can be defined as:
MATCHER_P2(IsNotInClosedRange, min, max) { return arg < min || arg > max; }
That's it - a one-liner.
Conclusion
==========
As I have shown, by using a custom matcher in assertions and
expectations, you can get intuitive test code that reads like English,
and informative error messages (since both the value being validated
and the values of all parameters will be printed). MATCHER* allow you
to define custom matchers trivially, and I expect them to be used a
lot once implemented.
Thoughts? Thanks,
--
Zhanyong
--
Zhanyong
Glad you like it!
> I have but one nit. I've missed my chance to comment on the ACTION macro but
> would like to ask at least now. Since MATCHER is quite a generic name, and
> macros don't support namespaces, don't you think it makes sense to simulate
> them by giving it the name with, say, GMOCK_ prefix?
Yeah, I saw this coming. :-)
I expect this to be a frequently used macro, and thus want to keep it
short. Note that we don't have the GTEST_ or GMOCK_ prefix in the
following macros: TEST, TEST_F, ASSERT_TRUE, EXPECT_CALL, ON_CALL,
MOCK_METHODn, ACTION, and etc, even though some of them are generic
names. I'd like to be consistent with them and give our users a
better experience.
--
Zhanyong
Unfortunately, no. ACTION* have the same limitation.
This is a result of the irregularity of the C++ language, and I don't
see a way around it.
OTOH, this may not be too bad, as it forces you to make these member
variables parameters of the matcher, which can improve the error
message.
--
Zhanyong
On Fri, Jan 16, 2009 at 3:30 PM, Vlad Losev <vl...@google.com> wrote:Glad you like it!
> I think this is very cool. It comes as close to the lambda syntax as you can
> reasonably get.
Yeah, I saw this coming. :-)
> I have but one nit. I've missed my chance to comment on the ACTION macro but
> would like to ask at least now. Since MATCHER is quite a generic name, and
> macros don't support namespaces, don't you think it makes sense to simulate
> them by giving it the name with, say, GMOCK_ prefix?
I expect this to be a frequently used macro, and thus want to keep it
short. Note that we don't have the GTEST_ or GMOCK_ prefix in the
following macros: TEST, TEST_F, ASSERT_TRUE, EXPECT_CALL, ON_CALL,
MOCK_METHODn, ACTION, and etc, even though some of them are generic
names. I'd like to be consistent with them and give our users a
better experience.
I don't see any collision at least inside Google.
I'm being a little bit selfish (on behalf of our users) here. If we
don't claim the short name MATCHER now, some day, some other popular
library will claim it. I'd rather we be the one to own it. It's like
registering popular domain names before other people do. (half kidding
:-)
> think it makes sense to provide alternatively named (with GTEST_ or GMOCK_
> prefix) macros and an ability to undefine the ones that have a collision
> with the user code?
That's extra complexity for hypothetical concerns. We will worry
about it when the problem is real. Thanks,
--
Zhanyong
I have a further argument in favor of exceedingly simple names -- the
macro namespace is most important when you are likely to keep
including headers down very deeply in the code base; an API header is
a good example. Testing code, even testing helper code designed to be
shared amongst many tests, is rarely included in such a prolific
manner (and likely bad design when it is), and so I think the risk
incurred by filling the global namespace is very manageable. Tests are
also one of the places where we don't need to optimize quite as
extremely for absolute lack of ambiguity in naming and patterns.
-Chandler
I have finished the implementation, which is being reviewed. However,
I'm thinking about a small change to the design, which I'd like to
discuss here.
In the current design, the matcher description is automatically
generated from the matcher name. For example, the description for
IsEven() will be "is even", and the description for
IsNotInClosedRange(m, n) will be "is not in closed range (5, 9)"
(assuming m is 5 and n is 9). As long as the person defining the
matcher picks a descriptive name, we get the description (used in
error messages) for free.
When reviewing it, Chandler pointed out that sometimes it's desirable
to override this default description. For example, you may want to
keep the matcher name reasonably short, but you don't want to
sacrifice the description string. Or you may want to specify how the
parameters are printed (e.g. "[5, 9]" instead of "(5, 9)" in the above
example).
I think this is a good point. Therefore I'm considering the following
change to the API:
We'll add a format string argument to each of the MATCHER* macros.
If it's empty (""), the default description generated from the matcher
name will be used; otherwise it overrides the default.
In the example given by Chandler, a user wants to define matcher
IsFib() to check that the number is in the Fibonacci sequence. He
would do this by:
MATCHER(IsFib, "is in the Fibonacci sequence") { ... }
To support describing parameterized matchers, we will allow
Python-style interpolation in the format string:
%% prints a single '%' character
%* prints all parameters as a tuple
%(foo)s prints the parameter named 'foo'
For example, given:
MATCHER_P2(IsNotInClosedRange, min, max,
"is not in range [%(min)s, %(max)s]") {
return arg < min || arg > max;
}
matcher IsNotInClosedRange(5, 9)'s description will be "is not in
range [5, 9]".
Notes:
- To select the default description, the user needs to supply an empty
format string (MATCHER(IsFib, "") { ... }), which is a bit annoying.
I considered using variadic macro to make the format string argument
optional, but decided that it's a bad idea to rely on this
not-yet-standard C++ feature. Another option is to use two macros
for default vs custom description, but I think that's a bigger crime
than having the user writing "".
- We use "%(foo)s" instead of "%foo" to refer to the value of 'foo', as
the latter doesn't work when you want to print some letters right
after the value, for example.
- I don't plan to support format modifiers (like field width,
precision, and type specifiers) at this time, as that's a lot of
work and rarely needed. If a user wants to be precise about the
format, he can instead define the matcher using the native
MatcherInterface API.
- However, we may want to support format modifiers (e.g. "%(foo)03d
%(bar)g") in the future. That's why today we insist on the 's' in
"%(foo)s". If you write anything other than 's' after "%(foo)",
you'll get a syntax error. This ensures that we don't break existing
code when we add support for more format modifiers in the future. (For
those of you unfamiliar with Python, 's' means to print the value as a
string. The value itself doesn't have to be a string.)
Thoughts? Thanks,
On Thu, Jan 15, 2009 at 8:48 PM, Zhanyong Wan (λx.x x) <w...@google.com> wrote:
--
Zhanyong
- We use "%(foo)s" instead of "%foo" to refer to the value of 'foo', as
the latter doesn't work when you want to print some letters right
after the value, for example.
- I don't plan to support format modifiers (like field width,
precision, and type specifiers) at this time, as that's a lot of
work and rarely needed. If a user wants to be precise about the
format, he can instead define the matcher using the native
MatcherInterface API.
- However, we may want to support format modifiers (e.g. "%(foo)03d
%(bar)g") in the future. That's why today we insist on the 's' in
"%(foo)s". If you write anything other than 's' after "%(foo)",
you'll get a syntax error. This ensures that we don't break existing
code when we add support for more format modifiers in the future. (For
those of you unfamiliar with Python, 's' means to print the value as a
string. The value itself doesn't have to be a string.)
Thoughts? Thanks,
I'm considering a minor adjustment:
>> %* prints all parameters as a tuple
I'd like to change this to "%(*)s" to be consistent with "%(foo)s" and
allow format modifiers in the future.
>> %(foo)s prints the parameter named 'foo'
--
Zhanyong
This doesn't seem controversial. I'm gonna add the format string
argument to the MATCHER* macros now. Initially we only allow the
format string to be "". In the next patch I'll implement the proposed
functionality properly. Please let me know if you object to this
plan. Thanks,
--
Zhanyong
For reference, this is what I had envisioned when originally
requesting this feature -- a _D or some other suffix to allow
descriptions. However, I see both sides of this argument. =] I'm glad
to watch how it plays out.
-Chandler
The TEST vs TEST_F case is different, as technically we cannot merge
the two into a single construct. And that's just one more macro.
If we add the _D variation, we are adding 11 new macros (admittedly
they are in the same family, but still it takes up 11 more names from
every namespace).
If your concern is the terseness of the macros, I don't think a couple
of characters are gonna make a difference. We are talking about
MATCHER(foo) and MATCHER_D(foo, description)
vs
MATCHER(foo, "") and MATCHER(foo, description)
here.
While I do think that a *lot* of the time the default description can
serve us well, I'm not sure it's *most* of the time. Even if you have
picked a very descriptive name for a matcher, chances are that the
error message can benefit from a custom description.
The benefit of your approach is that sometimes you don't have to write
"". I think this is a pretty small gain for the cost. I'm not just
talking about the cost of implementation here. We also need to
consider the cost of user education and cognitive burden: the _D
suffix is certainly not intuitive, so the users have to remember it.
I would rather that they use their brain on something more signficant.
We can make the suffix _WITH_DESCRIPTION, but that's awkward and
clearly worse than writing "".
So, while the purpose of the _D variation is to hide details, it ends
up introducing more details by forcing the users to learn what _D means.
I'm not saying that people shouldn't be allowed to skip the custom
description if they think that's the right thing to do. We are just
choosing different syntaxes for expressing that: you are proposing
MATCHER_D(foo, description) vs MATCHER(foo), while I prefer
MATCHER(foo, description) vs MATCHER(foo, ""). The difference is
purely syntactical.
I also like Matt's argument that having the user write "" makes him
think more about the description. When you see "", naturally you know
that the string could be something else, and you are more likely to
consider if you should pick a different string. If you see the
MATCHER(foo) syntax and you are not already familiar with
MATCHER_D(foo, description), it will less likely come to your mind
that you have an option to provide a custom description.
--
Zhanyong
On Mon, Feb 2, 2009 at 5:16 PM, Vlad Losev <vlad...@gmail.com> wrote:> Hi, Zhanyong,
>The TEST vs TEST_F case is different, as technically we cannot merge
>> - To select the default description, the user needs to supply an empty
>> format string (MATCHER(IsFib, "") { ... }), which is a bit annoying.
>> I considered using variadic macro to make the format string argument
>> optional, but decided that it's a bad idea to rely on this
>> not-yet-standard C++ feature. Another option is to use two macros
>> for default vs custom description, but I think that's a bigger crime
>> than having the user writing "".
>
> Why do you think so? After all, we have both TEST and TEST_F macros in
> Google Test. IMHO, most of the time people will not need to write custom
> descriptions, and if possible, they should be able to skip them. Like in
> gtest, it's possible to have both MATHER and MATCHER_D macros.
the two into a single construct. And that's just one more macro.
If we add the _D variation, we are adding 11 new macros (admittedly
they are in the same family, but still it takes up 11 more names from
every namespace).
If your concern is the terseness of the macros, I don't think a couple
of characters are gonna make a difference. We are talking about
MATCHER(foo) and MATCHER_D(foo, description)
vs
MATCHER(foo, "") and MATCHER(foo, description)
here.
While I do think that a *lot* of the time the default description can
serve us well, I'm not sure it's *most* of the time. Even if you have
picked a very descriptive name for a matcher, chances are that the
error message can benefit from a custom description.
The benefit of your approach is that sometimes you don't have to write
"". I think this is a pretty small gain for the cost. I'm not just
talking about the cost of implementation here. We also need to
consider the cost of user education and cognitive burden: the _D
suffix is certainly not intuitive, so the users have to remember it.
I would rather that they use their brain on something more signficant.
We can make the suffix _WITH_DESCRIPTION, but that's awkward and
clearly worse than writing "".
So, while the purpose of the _D variation is to hide details, it ends
up introducing more details by forcing the users to learn what _D means.
I'm not saying that people shouldn't be allowed to skip the custom
description if they think that's the right thing to do. We are just
choosing different syntaxes for expressing that: you are proposing
MATCHER_D(foo, description) vs MATCHER(foo), while I prefer
MATCHER(foo, description) vs MATCHER(foo, ""). The difference is
purely syntactical.
I also like Matt's argument that having the user write "" makes him
think more about the description. When you see "", naturally you know
that the string could be something else, and you are more likely to
consider if you should pick a different string. If you see the
MATCHER(foo) syntax and you are not already familiar with
MATCHER_D(foo, description), it will less likely come to your mind
that you have an option to provide a custom description.
--
Zhanyong
> We provide the TEST macro so our users don't have to write
>
> class TestCase : public Test {};
> TEST_F(TestCase, TestName) { ... }
>
> when they just want to write a test. I think it is a brilliant idea. It
> makes a level of effort required to write a test so low, it's practically
> non-existent. This encourages the use of tests and increases their adoption.
> I argue it's the same with the matchers.
Compared with the example above, the first instance of
TEST(TestCase, TestName) { ... }
saves 35 characters (more if you consider "testing::"), and each
additional TEST() will save 2 characters. So the savings are much
more significant than MATCHER(foo, "") vs MATCHER(foo) here.
Also, I believe TEST() is used much, much, much, much more often than
MATCHER() will be, as in orders of magnitudes more. Count how many
matchers gmock defines; then count how many TESTs use gmock.
Admittedly the MATCHER* macros will give a boost to the number of
user-defined matchers, but I don't think that will change the overall
picture, as when you define a matcher, you probably want to reuse it
in many tests.
>> If we add the _D variation, we are adding 11 new macros (admittedly
>> they are in the same family, but still it takes up 11 more names from
>> every namespace).
>>
> Those names are not going to be nearly as commonly used as MATCHER, and you
> had no problems with claiming that one. ;-)
We use a macro when we have to. The _D macros are not a must, and
don't have a huge pay-off either.
The biggest problem I have with *_D is that its meaning is not obvious
at all. Good names should be self-documenting and don't require a
trip to the manual. A relatively frequent criticism on TEST_F I heard
from some people is that they have no idea what _F means (false?
failure?). In the end I still decided to call it TEST_F as I believe
it will be used often enough that most users will know its meaning by
heart. Since MATCHER_D will be used much less than TEST_F, a user is
less likely to know what it means.
>> If your concern is the terseness of the macros, I don't think a couple
>> of characters are gonna make a difference. We are talking about
>>
>> MATCHER(foo) and MATCHER_D(foo, description)
>> vs
>> MATCHER(foo, "") and MATCHER(foo, description)
>>
>> here.
>>
>> While I do think that a *lot* of the time the default description can
>> serve us well, I'm not sure it's *most* of the time. Even if you have
>> picked a very descriptive name for a matcher, chances are that the
>> error message can benefit from a custom description.
>
> My concern is that on one hand, it stops an engineer from writing the test
> for a moment to recall what this parameter is and whether it has to be
> anything other then an empty string. It is a burden. A small one, I agree,
> but it can still make a difference.
I argue this is a good thing. A short moment spent on improving the
error messages pays for itself the first time the matcher is used.
Remember that you define a matcher once, and many people can use it,
many times.
> On the other hand, when reading the code
> it is also a cognitive burden. One has to parse it in every matcher
> definition. In the modern editors with syntax coloring the string -- they
> are usually colored with red or magenta -- stands out and immediately
> diverts attention from the matcher name. Then you realize that it actually
> contains no information and have to look elsewhere for the it. The end
> result is that reading code with such definitions is slightly annoying.
I think the _D suffix is even more distracting and cryptic. Plus,
matcher definitions are few compared with the tests using them so you
are talking about the rare case here.
>> The benefit of your approach is that sometimes you don't have to write
>> "". I think this is a pretty small gain for the cost. I'm not just
>> talking about the cost of implementation here. We also need to
>> consider the cost of user education and cognitive burden: the _D
>> suffix is certainly not intuitive, so the users have to remember it.
>>
>> I would rather that they use their brain on something more signficant.
>> We can make the suffix _WITH_DESCRIPTION, but that's awkward and
>> clearly worse than writing "".
>>
>> So, while the purpose of the _D variation is to hide details, it ends
>> up introducing more details by forcing the users to learn what _D means.
>
> The same way TEST_F is less intuitive then TEST, and we use both. Whoever
> needs an advanced case uses an advanced macro. Use of MATCHER_D will hide
> details in the simple case - when users don't care about the description. In
> this case there is no cost. And like with TEST_F, in the advanced case users
> will utilize the advanced macro.
As said earlier, the name TEST_F is a compromise and I'm not
pretending to like the suffix. I think the compromise was worthwhile
since the macro is used a lot. Compared with TEST_F, MATCHER_D is
saving too little to be worth the extra 11 macros.
>> I'm not saying that people shouldn't be allowed to skip the custom
>> description if they think that's the right thing to do. We are just
>> choosing different syntaxes for expressing that: you are proposing
>> MATCHER_D(foo, description) vs MATCHER(foo), while I prefer
>> MATCHER(foo, description) vs MATCHER(foo, ""). The difference is
>> purely syntactical.
>>
>> I also like Matt's argument that having the user write "" makes him
>> think more about the description. When you see "", naturally you know
>> that the string could be something else, and you are more likely to
>> consider if you should pick a different string. If you see the
>> MATCHER(foo) syntax and you are not already familiar with
>> MATCHER_D(foo, description), it will less likely come to your mind
>> that you have an option to provide a custom description.
>
> I actually disagree with Matt about the utility of making users think about
> descriptions all the time. When a user is writing her tests the best we can
> do is to provide her with the tools and then get out of her way. We
> definitely should educate our users about advanced features of the
> framework, but not when they are trying to concentrate on the task at hand.
I agree with the principle but think the worry is out-of-proportion
here. Writing an empty string literal is trivial. Is it ideal? No,
but I prefer it much more than the other option, for the reasons I
stated.
> I do see your point about MATCHER_D bringing its own issues. It takes longer
> to type, and the name is less intuitive then MATCHER. The overall costs
> depend on how often users will need to write custom descriptions. If the
> simple case is not really that common, it makes sense to just go with
> three-argument MATCHER macro. But I think it is going to be quite common.
Don't forget how often MATCHERS* will be used as a whole. My
experience is that gtest and gmock's built-in assertions and matchers
are adequate most of the time, judging from the number of requests we
receive. Thus even if all uses of MATCHER* are the simple case, you
may not see a lot of benefit.
The bottom line is that the bar for macros are high. We don't add
them unless the pay-off is big. Given how often I expect MATCHERS* to
be used, I don't think the tiny benefit of MATCHER*_D justifies them.
Thanks,
--
Zhanyong
-Chandler