Regex-free notation and quotation marks


Vagif Abilov

Aug 21, 2012, 10:32:31 AM
to spec...@googlegroups.com
Hi,

I have revised most of my feature files to use the new convention introduced in SpecFlow 1.9 (without regex), and it looks like it does not handle quotation marks well.

Here is a Gherkin step:

Given I have this

Then the following step definition works fine:

[Given]
public void I_have_PARAM(string param)
{
    ScenarioContext.Current.Pending();
}

Now I want to pass a quoted string, so I change the Gherkin statement:

Given I have "this"

I expect the string "this" (with both surrounding quotes) to be sent to the step definition method. But instead the string this" (with trailing but not leading quote) is sent.

It looks like SpecFlow only matches text starting with alphanumeric characters; leading punctuation marks are discarded.
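To make the expectation concrete, here is a minimal sketch of how a method-name convention could be translated into a pattern. It is not SpecFlow's actual implementation: the `ConventionSketch` class and the `ToRegex` helper are hypothetical names, and the rule used (upper-case words that match a parameter name become a `(.*)` capture) is an assumption made for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ConventionSketch
{
    // Hypothetical helper (assumption, not SpecFlow code): turns a method name
    // such as I_have_PARAM into the pattern ^I have (.*)$.
    public static Regex ToRegex(string methodName, params string[] parameterNames)
    {
        var parts = methodName.Split('_').Select(word =>
            parameterNames.Any(p => p.ToUpperInvariant() == word)
                ? "(.*)"               // an upper-case parameter name becomes a capture group
                : Regex.Escape(word)); // every other word must match literally
        return new Regex("^" + string.Join(" ", parts) + "$");
    }

    public static void Main()
    {
        var regex = ToRegex("I_have_PARAM", "param");

        // Unquoted argument: the capture is the bare word.
        Console.WriteLine(regex.Match("I have this").Groups[1].Value);

        // Quoted argument: a plain (.*) capture keeps both surrounding quotes.
        Console.WriteLine(regex.Match("I have \"this\"").Groups[1].Value);
    }
}
```

With a plain `(.*)` capture the quoted argument arrives with both quotes, which is the behaviour expected above. Stripping both quotes would also be a consistent choice; dropping only the leading one is not.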

Vagif

Gáspár Nagy

Aug 21, 2012, 10:53:25 AM
to SpecFlow
Well, my expectation would be to send the string *this* (i.e. without the
quotes), but the current behaviour is buggy either way. Could you please file a
ticket on GitHub?

Why do you want to receive the quotes?

Darren Cauthon

Aug 21, 2012, 11:04:42 PM
to spec...@googlegroups.com
You refactored existing steps to fit the new format?

Why am I getting the sinking feeling that a year from now, most SpecFlow users are going to use the C# syntax over "regex"?

(shaking my head...)



Darren

Vagif Abilov

Aug 22, 2012, 2:51:39 AM
to spec...@googlegroups.com
Yes Darren, I spent some time refactoring existing steps. First, to get a feeling for how it works now, then to get rid of near-duplicate strings.
I recently attended the CukeUp conference, and several people expressed their dissatisfaction with regex. What is your view on this?

Vagif

Vagif Abilov

Aug 22, 2012, 2:52:32 AM
to spec...@googlegroups.com
Yes, I will file the ticket. If I get some time, I can even look at it.

Vagif

Darren Cauthon

Aug 22, 2012, 10:19:35 AM
to spec...@googlegroups.com

My view is, honestly, that people are making a mountain out of a mole-hill.  

The simplest thing to do in the world is to match a string to a string.  If I see this:

    Given my name is Darren

and

    Given("my name is Darren")

I'm using the powers gained in elementary school.  I read a sentence in English, I read the other sentence in English, and there's no conflict.  But if I match it to:

    public void my_name_is_NAME(string name)

I've left the real world and jumped into programmer world.  My brain has to process this sentence much differently.  Many programmers might say, "Oh, but I can take that stuff easily," but they're wrong -- they still have to process that sentence.  Even subconsciously, they're doing more work.  And if anybody wants to claim differently, I'd be happy to take a popular blog post, convert the entire thing to this_is_a_name type of syntax, and we can see what it really takes to read it.

Plus, to make it worse, it's completely out of the scope of regular C# syntax.  Though I write much more Ruby these days and I prefer the underscore syntax, that's not the camel case that C# is written in.  C# devs expect to see 

    public void MyNameIsName(string name)

and not

    public void my_name_is_NAME(string name)

It's not a natural fit.  We're already trying to bend C# to fit a natural language, and it's so awkward that we have to resort to breaking regular C# conventions, to the point where we're assigning meaning to ALL CAPS.  These non-regex method statements ooze with insider knowledge: any dev would look at one and know there are hidden rules and conventions.

But regex isn't a natural fit, either, right?  Well, that's one of the neat things about SpecFlow:  You don't have to know regex to write it, you just need to use (.*) as a wildcard.  I've written thousands of steps with SpecFlow, yet I've never gone past this:

    Given("my first name is '(.*)' and my last name is '(.*)'")

This is one case where a static language is useful, as it can be leveraged to convert a string to any primitive type.  And if one prefers, SpecFlow even offers the ability to control how that string is converted to *ANY* type. No need to do what I've seen the Cucumber guys do, with their fancy regex statements to turn strings to whatever.
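As a rough illustration of the wildcard-plus-conversion idea, the sketch below matches a step with `(.*)` and converts each capture to a target type using plain .NET calls. It is not SpecFlow's binding code: `WildcardSketch`, `Bind`, and the "years old" step are hypothetical and exist only for this example.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class WildcardSketch
{
    // Hypothetical helper (assumption, not SpecFlow code): matches the step text
    // against the pattern and converts each capture to the requested type.
    public static object[] Bind(string pattern, string stepText, params Type[] targetTypes)
    {
        var match = Regex.Match(stepText, "^" + pattern + "$");
        if (!match.Success)
            throw new ArgumentException("Step text does not match the pattern.");

        var args = new object[targetTypes.Length];
        for (int i = 0; i < targetTypes.Length; i++)
        {
            // Convert.ChangeType handles primitives such as int, double and bool.
            args[i] = Convert.ChangeType(
                match.Groups[i + 1].Value, targetTypes[i], CultureInfo.InvariantCulture);
        }
        return args;
    }

    public static void Main()
    {
        var names = Bind(
            "my first name is '(.*)' and my last name is '(.*)'",
            "my first name is 'Darren' and my last name is 'Cauthon'",
            typeof(string), typeof(string));
        Console.WriteLine(names[0] + " " + names[1]);

        // Illustrative step (not from the thread): the capture arrives as an int.
        var age = Bind("I am (.*) years old", "I am 30 years old", typeof(int));
        Console.WriteLine((int)age[0] + 1);
    }
}
```

The only regex knowledge the step author needs here is the `(.*)` wildcard; the conversion to `int` is driven by the declared type rather than by the pattern.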

SpecFlow was a big part of my personal advancement as a developer.  Early in my SpecFlow days, I was one of those guys who preferred the C# syntax over the natural language, but after using it for a while I started to see the benefits.  I started learning more about Cucumber, and everything I read in books, wikis, and blog posts could be ported to SpecFlow.  I've always advocated and taught other devs to "stick with the basics," to give up today's personal preferences for future gain, but I'm afraid most are going to forgo all of that for the new C# syntax.  They'll be closer to MSpec than Cucumber.

Haha, in fact, I can already see it now:  People adjusting the way they write their steps in the feature file to better fit the C# syntax.  (shedding a tear)


Darren

Vagif Abilov

Aug 23, 2012, 4:07:01 AM
to spec...@googlegroups.com
Hi Darren,

First of all, thank you for a comprehensive explanation. You definitely have a point here. But let's have another look at step definitions.

Gherkin:

Given my name is Darren

C#:
[Given("my name is (.*)")]
public void Given_my_name_is(string name)
{
}

Here we see three variations of the same phrase: "Given my name is XXX". First in Gherkin, then as an attribute property, and finally as a method name.

You suggest keeping it in an attribute property. Alright, but the phrase is also burnt into the method name in some form. We don't need to keep it that way; we could just rename the method to Given_<some guid>. In fact, if SpecFlow used this naming convention for generated C# code, I probably would not bother getting rid of attribute properties. But I cannot help myself: when I see three (almost identical) copies of the same phrase, I want to get rid of at least one of them. So I clear the Given property and put the binding logic into the method name. Moreover, we developers have a closer relationship to method names than to attribute properties, so I guess it may be more a matter of habit which one we choose to express the meaning.

So even though step definitions are auto-generated, they are manually implemented and maintained. We developers will always react to how they look, although you are absolutely right that from a practical point of view it's not necessary to pay attention to method names; one can just focus on the text attached to the attributes.

When it comes to the method naming convention, I agree that C# has different naming rules, but it should be possible to do something about that.

Vagif

Vagif Abilov

Aug 23, 2012, 4:15:58 AM
to spec...@googlegroups.com
Well, after reading your response in another thread I tend to agree even more: method names should not matter. The confusing part was having both method names and attributes repeating the same phrase. I will probably follow your approach in future and just rename methods. But I am pretty sure many developers will be sensitive about method names and negative about regexes. This is why it's good that they can now choose.

Vagif

Vagif Abilov

Aug 23, 2012, 4:38:49 AM
to spec...@googlegroups.com
We just discussed this internally, and one other point came up: method names can be important for stack traces, automated report generation, etc. So if treating method names as meaningless identifiers works for you, that's fine. However, I can imagine that some teams would have a use for method names in their development process.

Vagif

Darren Cauthon

Aug 23, 2012, 12:35:33 PM
to spec...@googlegroups.com

Ah, but stack traces, automated report generation.... do those fall outside of the scope of SpecFlow?  

Or to put it another way... would a team focused on such things be wrong?

Of course, this is totally my opinion, and I don't mean to sound gruff.  I just feel that SpecFlow is superbly, awesomely great at what it does -- turning business-readable features into automated tests.  Write the feature, get a yellow. Introduce the assertions, get the red.  Make code work, get green.  SpecFlow is a great way to lead development down that path.

There's something deep in the soul of the programmer (and I say this as one) that sometimes causes us to take situations that work and... we tinker.  We have this happy path, but a dev will say "Oh, but I like C# syntax better."  So they stop the happy train just long enough to refactor their strings to C# syntax.  Then someone says, "Oh, but what about how it appears in the stack trace?" so we stop the happy train and tinker with that.  Or someone says, "I have this report generator that reads SpecFlow, but it reads the C# syntax instead of the feature file itself."  So we... you get the point.  We start putting more work and complexity into something that could have been much, much simpler.

I don't know the situation, perhaps there are cases where it matters.  But I'm suspicious of those cases. :)


Darren

Vagif Abilov

Aug 24, 2012, 3:22:34 AM
to spec...@googlegroups.com
"Or to put it another way... would a team focused on such things be wrong?"

I don't know, maybe. But that's a different topic, isn't it? It's just that we discussed various alternatives, and one of my colleagues who spends a lot of time with reflection and call graphs said he would definitely choose method name clarity over the attribute. I don't question his reasons, but this is the way it is for many developers: attributes are considered second-class citizens compared to methods. A lot of stuff is built around type information.

Anyway, I think your points adjusted my initial reaction to this new feature in SpecFlow 1.9. I first jumped on it, and what I saw in my code made me quite happy; now I think that choosing the right alternative is context-dependent. For example, as Pedro pointed out, if steps are written in non-English languages (especially multiple ones), it makes much more sense to keep using attribute properties. And a combination of other factors (like a focus on reflection for some reason) can favour the method name convention.

Just my 2 copecks.

Vagif