Import scenario outline "examples" from CSV?


Jon Kruger

Dec 10, 2010, 11:09:03 PM
to Cukes
Is there a way to import the "Examples" section of a Scenario Outline
from a CSV file? There are two reasons I want to do this:

1) If I have a lot of data, it's messy if it's in the .feature file
2) If it's in CSV, I can give the CSV file to a business person and
they can fill in values using Excel

If there isn't a way to do this, I'd be willing to take a stab at it
if someone can point me in the right direction.

Jon

Matt Wynne

Dec 11, 2010, 4:02:34 AM
to cu...@googlegroups.com

This has come up more than once, and I think it would be a sweet feature in Cucumber. To implement it would involve making a change to gherkin[1], which is the library that parses your .feature files. Gherkin uses Ragel[2] to parse the files, so the change would be to alter this ragel code:

https://github.com/aslakhellesoy/gherkin/blob/master/ragel/lexer_common.rl.erb

So that it accepts either an inline table or a reference to a CSV file after the Examples: keyword.

Hacking on gherkin isn't entirely straightforward (see the readme), as it produces three versions (pure Ruby, C and Java) across all 40 spoken languages, but if you jump on #cucumber we could give you a hand getting up and running.

[1] https://github.com/aslakhellesoy/gherkin
[2] http://www.complang.org/ragel/

>
> --
> You received this message because you are subscribed to the Google Groups "Cukes" group.
> To post to this group, send email to cu...@googlegroups.com.
> To unsubscribe from this group, send email to cukes+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/cukes?hl=en.
>

cheers,
Matt

ma...@mattwynne.net
07974 430184

Mike Sassak

Dec 11, 2010, 9:02:33 AM
to cu...@googlegroups.com

Whoa there! This doesn't need to be implemented in the Gherkin lexer.
If you place a URL or some other kind of identifier in the name of the
Examples section or in its description, Gherkin will gladly send that
straight on to Cucumber, where it can be retrieved, parsed and
inserted into the internal representation of the feature.

I spent some time myself on plugins for Cucumber that would allow
something like this, but gave up after a time because 1) it was
becoming very frustrating, and 2) I realized that having Cucumber read
streams of Gherkin-formatted text from STDIN was a more elegant
solution. I haven't had the time to implement #2, but I think it's a
better way to do this sort of thing.

$0.02
Mike

Matt Wynne

Dec 11, 2010, 10:58:18 AM
to cu...@googlegroups.com

Don't you think it would be nice if either form surfaced out of Gherkin as a Table object though?

i.e.

Examples:
@foo/bar.csv

and

Examples:
| a | b |
| c | d |

Jon Kern

Dec 11, 2010, 1:50:39 PM
to cu...@googlegroups.com
Since you can write code in the steps, couldn't you simply use Ruby code?

Feature:
    Scenario: Search by multiple properties
        Given I enter search <vehicle_id>, <start_date>, <end_date> criteria  in "drives.csv"
        When I search
        Then I should see the proper drives listed

NOTE: the bracketed text is not really used in my example...

Steps:
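require 'fastercsv' # FasterCSV gem (Ruby 1.8); Ruby 1.9+ ships it as the standard CSV library

Given /^I enter search <vehicle_id>, <start_date>, <end_date> criteria  in "([^"]*)"$/ do |csv_file|
  # Resolve the CSV file relative to this step-definition file.
  path = File.join(File.dirname(__FILE__), csv_file)
  # :return_headers => true also yields the header row, so it is printed as well.
  FasterCSV.foreach(path, {:headers => :first_row, :return_headers => true}) do |row|
    puts "#{row["vehicle_id"]}: from #{row[1]} through #{row[2]}"
  end
end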

File:
vehicle_id, start_date, end_date
WAUBB1234, 11/15/10, 11/30/10
XZAUBB1234, 10/15/10, 11/01/10

Output:
jonsmac2-2:DART jon$ cucumber --tags @wip
Feature: Drive Search
  As a user
  I want to be able to search for drives by various properties
  So that I can review the details of that drive...

  Scenario: Search by multiple properties
vehicle_id: from  start_date through  end_date
WAUBB1234: from  11/15/10 through  11/30/10
XZAUBB1234: from  10/15/10 through  11/01/10
    Given I enter search <vehicle_id>, <start_date>, <end_date> criteria  in "drives.csv"
    Searching
    When I search
    Drives found...
    Then I should see the proper drives listed


jon
blog: http://technicaldebt.com
twitter: http://twitter.com/JonKernPA


Richard Lawrence

Dec 11, 2010, 2:06:58 PM
to cu...@googlegroups.com
That approach will only run once for all the data, while a scenario
outline runs the scenario once per row in the examples table with
values from that row. It uses a CSV file as a table argument, which
could be useful, but that solves a different problem.

Richard
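
The difference Richard describes can be shown with a small Ruby sketch of what a Scenario Outline does with its rows: every data row yields its own copy of the steps, with the `<placeholders>` filled in. This is only an illustration, not Cucumber's actual implementation; `expand_outline` and the step wording are hypothetical, and the data is taken from the CSV file shown earlier (without the spaces after the commas).

```ruby
require "csv"

# Hypothetical sketch (not Cucumber's real code): expand the steps of a
# Scenario Outline once per data row, substituting <placeholders>.
def expand_outline(steps, csv_text)
  table = CSV.parse(csv_text, headers: true)
  table.map do |row|
    steps.map do |step|
      # Replace each <name> with the value from the matching column;
      # placeholders without a matching column are left untouched.
      step.gsub(/<([^>]+)>/) { row[$1] || "<#{$1}>" }
    end
  end
end

steps = [
  "Given I search for vehicle <vehicle_id> from <start_date> to <end_date>",
  "Then I should see the proper drives listed",
]
csv_text = "vehicle_id,start_date,end_date\n" \
           "WAUBB1234,11/15/10,11/30/10\n" \
           "XZAUBB1234,10/15/10,11/01/10\n"

# Two rows in, two independent scenarios out.
expand_outline(steps, csv_text).each_with_index do |scenario, index|
  puts "Scenario #{index + 1}:"
  scenario.each { |step| puts "  #{step}" }
end
```

A step definition that loops over the CSV itself, by contrast, runs inside a single scenario, so one bad row fails the whole thing rather than one example.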

Jon Kruger

Dec 11, 2010, 3:42:26 PM
to cu...@googlegroups.com
Mike,

I wouldn't have to change the lexer code if we wanted to do it like this:

   Examples: my_values.csv

... but we would have to change it if we wanted to do something like this:

   Examples:
      @file: my_values.csv

Also, I would rather bring in a CSV file because part of the point of this is that I want to give the CSV file to a non-technical person who can edit it in Excel and then I could have those values define what my code is supposed to do.

Matt,

Any preference on how the syntax should look?

Jon

Matt Wynne

Dec 11, 2010, 4:26:45 PM
to cu...@googlegroups.com
Jon, Mike,
I personally think it would make sense to do this in Gherkin, so that Cucumber doesn't need to care - it just gets a table from Gherkin the same as it would if the table were specified in the feature file as normal. But I'd like to hear more about Mike's concerns. Mike - are you worried it will make Gherkin dirty? Don't you think we'll end up with more of a hack in Cucumber if we did it the other way?

If you feel confident, I'd just give it a crack, and we can see how it looks.

As far as syntax, I think what you've suggested above looks fine. Bear in mind that each keyword in gherkin has space for a multiline description after it, so you could have...

    Examples: This is the example name
      This is the example's description
      and so is this
      because it can span over multiple lines.

      @file: my_values.csv

Mike Sassak

Dec 12, 2010, 1:19:13 AM
to cu...@googlegroups.com

I was at least partially confused about what was being proposed, so
let's see if I can't clear up some of that confusion. When I responded
I thought Gherkin already happily consumed an Examples section without
a table following it, or an Examples table with only a multiline
description, e.g.:

Examples: Blah
Here is the multiline description of examples

Turns out I was completely mistaken. That's not the case at all and
all you get is a lexing error. Seeing as how I like the ability to do
that so much that I implemented it in my head, I think modifying the
lexer to make this work is a wonderful idea! I'll give as much help as
I can to get this done.

The second lexer question is whether we're talking about adding
support for an include keyword or token into the language itself, or
just parsing the multiline description for it. I think the latter is
good enough for now. Adding something to the lexer is a real vote of
confidence that I don't think is warranted at the moment. This also
has shades of GivenScenario that I'm uncomfortable with.

This leaves where to assign the responsibility of parsing the
multiline description, fetching the CSV contents and turning that into
a Table of one kind or another. I think this belongs in Cucumber, at
least for the time being. Two reasons: 1) my impression is that it
would simply be easier to slap something into GherkinBuilder than to
do it even half-way well in Gherkin, and 2) Gherkin doesn't really do
any IO at the moment--it's remarkably shy about its environment--and I
think that's a *huge* plus. Having it grab the contents of a CSV is a
step in the wrong direction. I think a keyword indicating a hook or
callback of some sort might be a cool way to get around this--a way to
get Gherkin to relinquish control to whatever is calling it and then
resume later with updated input, but that will make the lexer
considerably more complex. It might be worth it, but it's a much
bigger job than inserting a CSV.

> If you feel confident, I'd just give it a crack, and we can see how it
> looks.
> As far as syntax, I think what you've suggested above looks fine. Bear in
> mind that each keyword in gherkin has space for a multiline description
> after it, so you could have...
>     Examples: This is the example name
>       This is the example's description
>       and so is this
>       because it can span over multiple lines.
>       @file: my_values.csv
>

I don't think we should add this as an "official" part of the language
for the reasons mentioned above, but whatever is decided, I don't like
using @file because @ already means "tag", and this is not a tag.
There's no reason to overload our operators. If we do want some
special syntax I'd propose -> or a word like "include". Also, I think
we should use URLs to describe where something is. They're universal,
well understood, and supported everywhere. You would end up with
something like:

Examples: The Client Maintains these ones
-> file:///path/to/the.csv

or

Examples: Remotely Served
include(http://www.example.com/my_sweet_info.csv)

Hopefully this clarifies things from my end,
Mike
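
As a rough illustration of "just parsing the multiline description", the Ruby sketch below scans description text for either of the two spellings suggested above and hands back a URI. Both spellings are proposals from this thread, not real Gherkin syntax, and `find_reference` and `REFERENCE_PATTERNS` are hypothetical names.

```ruby
require "uri"

# Hypothetical patterns for the two spellings proposed in this thread.
REFERENCE_PATTERNS = [
  /\A->\s*(\S+)\z/,        # -> file:///path/to/the.csv
  /\Ainclude\((\S+)\)\z/,  # include(http://www.example.com/my_sweet_info.csv)
].freeze

# Return the first external reference found in an Examples description as a
# URI object, or nil when the description is plain prose.
def find_reference(description)
  description.each_line do |line|
    text = line.strip
    REFERENCE_PATTERNS.each do |pattern|
      match = pattern.match(text)
      return URI.parse(match[1]) if match
    end
  end
  nil
end

ref = find_reference("The Client Maintains these ones\n-> file:///path/to/the.csv\n")
puts ref.scheme
puts ref.path
```

Because the result is an ordinary URI, whoever consumes it (Cucumber, under the split of responsibilities argued for above) can dispatch on the scheme and decide how to fetch the data, while the lexer stays free of IO.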


Jon Kruger

Dec 12, 2010, 4:59:37 AM
to cu...@googlegroups.com
In most cases I'm probably just going to put the CSV file in my
features folder. I would much rather be able to say something like
include(myfile.csv) or include(/path/to/myfile.csv) if I really needed
an absolute path. I guess I'm fine with supporting both this and a
url, that way it would work for more people.

Also, would we need to put some kind of symbol in front of include,
like #include? If we just used a word, then I could see difficulties
parsing it or people getting confused and having the "include" end up
getting parsed as a multiline description.

Some other things to think of too...
- We would have to be able to parse file paths with spaces
- Would we want to support Windows file paths that use backslashes
instead of forward slashes? Lots of people are using cucumber on
Windows these days
- Would we support multiple #include directives in the same Examples section?

Jon
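
The three questions above (spaces in paths, Windows backslashes, several directives per section) can be explored with a short Ruby sketch. It assumes a made-up `#include <path>` line syntax in which everything after the keyword is the path; `INCLUDE_LINE` and `include_paths` are hypothetical names, and nothing here is existing Gherkin behaviour.

```ruby
# Hypothetical directive syntax: "#include <path>", one per line. Everything
# after the keyword up to the end of the line is the path, so spaces survive.
INCLUDE_LINE = /\A\s*#include\s+(.+?)\s*\z/

# Collect every include path in an Examples section, in order, converting
# Windows-style backslashes to forward slashes.
def include_paths(section_text)
  section_text.each_line.map do |line|
    match = INCLUDE_LINE.match(line.chomp)
    match && match[1].tr("\\", "/")
  end.compact
end

section = <<~'TEXT'
  Examples: Drives
    #include my values.csv
    #include C:\data\more values.csv
TEXT

p include_paths(section)
```

Taking the rest of the line as the path avoids any quoting rules for spaces, at the cost of not allowing anything else on the same line. Note also that `#` already starts a comment in Gherkin, so a real design would have to decide how the two interact.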

Mike Sassak

Dec 13, 2010, 10:17:51 AM
to cu...@googlegroups.com

Hi Jon,

We can hash out all the syntax questions later. If this is ever going
to work, the first step is to make tableless Examples legal in
Gherkin. I've created a branch [0] that contains a pending spec in
spec/gherkin/shared/lexer_group.rb. If you can get that to pass, we'll
be on our way.

Mike

[0] https://github.com/msassak/gherkin/tree/tableless-examples

Jon Kruger

Dec 13, 2010, 10:25:02 AM
to cu...@googlegroups.com

Sounds good, I'll take a look at your branch.

Jon

Matt Wynne

Dec 13, 2010, 10:51:58 AM
to cu...@googlegroups.com
Mike, Jon,

Mike, I don't mean to be a pain, but I would like to just re-iterate my preference for making this a change (or rather an extension) to the table syntax, rather than putting it into the name or description.

Right now, the name and description are all about human-readable stuff. We've recently made some changes so that the description can be in markdown format, but it doesn't have to be, and it's intended to be a free-text field.

What I'm suggesting is that we change things so we can say: "there are two ways to declare a table in gherkin: you can either express it inline, using pipes, or you can reference an external place where the table data is to be read from".

As you say, the precise syntax isn't important right now, but I feel like this point is important. I feel like it would be a mistake to start putting machine-readable stuff into the name or description fields, but that it would be OK, in the specific case of a table, to just have two ways to define it.

So I'm not talking about a generic include mechanism - that's a lot more than we need here. I'm simply talking about an extension to the Gherkin language to allow you to declare a table whose data is read in from somewhere else.

Am I making sense?

Jon Kruger

Dec 13, 2010, 11:11:16 AM
to cu...@googlegroups.com

I totally agree, that's what I had in mind too.

George Dinwiddie

Dec 13, 2010, 11:26:44 AM
to cu...@googlegroups.com
Agreeing with Matt...

On 12/13/10 10:51 AM, Matt Wynne wrote:
> So I'm not talking about a generic include mechanism - that's a lot
> more than we need here. I'm simply talking about an extension to the
> Gherkin language to allow you to declare a table whose data is read
> in from somewhere else.

Perhaps something like this:

Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers

Examples in path/to/some/file

--
Dec. 14 - Agile Richmond in Glen Allen, VA
http://georgedinwiddie.eventbrite.com/
----------------------------------------------------------------------
* George Dinwiddie * http://blog.gdinwiddie.com
Software Development http://www.idiacomputing.com
Consultant and Coach http://www.agilemaryland.org
----------------------------------------------------------------------

Matt Wynne

Dec 13, 2010, 11:38:34 AM
to cu...@googlegroups.com


That's very nice and readable.

I still think it would be OK to have the same keyword, name and description as normal, but just instead of the inline table, have a reference to some external source, e.g.

Scenario Outline: eating
Given there are <start> cucumbers
When I eat <eat> cucumbers
Then I should have <left> cucumbers

Examples: Small Numbers of Cucumbers

These are just some simple examples:

| start | eat | left |
| 10 | 1 | 9 |
| 2 | 2 | 0 |

Examples: European Union Cucumber Quantities

These are the trade figures from 2009-2010.

|>> file://eu_figures.csv <<|

Here I've used two examples tables, one with an inline table, and one with an externally sourced table.

Even though the actual Gherkin source is a bit less readable than George's example, this still allows people to use the name and description for each table of examples, as they'll be familiar with, and the parsed Gherkin objects passed to Cucumber will have the same interfaces as before.


Mike Sassak

Dec 13, 2010, 1:26:27 PM
to cu...@googlegroups.com

Hi Matt,

I think defining conventions for multiline description formats is an
easy and effective way to prototype new behaviors without reaching too
deeply into Gherkin to make changes that we might not have the best
feel for, but that's a syntactical question, and it's far less
important to me than the question of how the loading and parsing of
CSVs will be implemented. Implementation, particularly the assignment
of responsibility, is 95% of my concern.

So for the sake of argument let's assume we settle on the syntax
above: |>> file://eu_figures.csv <<|. The lexer will not need to be
modified to lex the content of that cell. What class do you think
should be responsible for loading and parsing that CSV? My aim is to
keep that responsibility out of the lexer. I'm not convinced it should
live in Gherkin anywhere (as opposed to Cucumber), but I can see how
that might make sense. It's the Lexer "knowing" about CSVs that really
bothers me. I can't think of a sane way to implement that at the Lexer
level that doesn't involve significant changes to it.

Mike

Matt Wynne

Dec 13, 2010, 1:35:14 PM
to cu...@googlegroups.com

I'm with you. So thinking about Aslak's idea of filters, can we imagine a layer of filters which Gherkin would throw that lexed single-cell table to, and allow the filter to transform it into another table, effectively expanding it by going and fetching the contents from somewhere and returning a new, populated Gherkin::Table?

That would mean we don't need to change the lexer, and we don't need to change Cucumber either, apart from perhaps configuring it to tell Gherkin to use the CsvTableExpander filter, which could be a totally independent library, or maybe come with gherkin.

WDYT?
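
To make the filter idea concrete, here is a minimal Ruby sketch. The class name `CsvTableExpander` comes from the message above, but its interface (an array of row arrays in, an array of row arrays out) is purely an assumption for illustration and is not Gherkin's real filter or table API. It also treats whatever follows `file://` as a path relative to a base directory, which is looser than a strict file URL.

```ruby
require "csv"
require "tmpdir"

# Hypothetical filter: rows in, rows out. Not Gherkin's real API.
class CsvTableExpander
  REFERENCE = %r{\A>>\s*file://(.+?)\s*<<\z}

  def initialize(base_dir)
    @base_dir = base_dir
  end

  # rows is an array of arrays of cell strings, as lexed from a pipe table.
  # A one-cell table such as [[">> file://eu_figures.csv <<"]] is replaced by
  # the rows of that CSV file; any other table passes through unchanged.
  def call(rows)
    return rows unless rows.size == 1 && rows.first.size == 1
    match = REFERENCE.match(rows.first.first.strip)
    return rows unless match
    CSV.read(File.join(@base_dir, match[1])).map do |row|
      row.map { |cell| cell.to_s.strip }
    end
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "eu_figures.csv"), "start,eat,left\n10,1,9\n2,2,0\n")
  filter = CsvTableExpander.new(dir)
  p filter.call([[">> file://eu_figures.csv <<"]])  # expanded from the CSV
  p filter.call([["a", "b"], ["c", "d"]])           # ordinary table, unchanged
end
```

Since the filter sees only already-lexed cells, the lexer needs no knowledge of CSV, and a consumer downstream receives the same row structure either way.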

Mike Sassak

Dec 13, 2010, 1:51:45 PM
to cu...@googlegroups.com

I think I just opened my browser to write you an email along the lines
of "Wait a second I think I can write a filter in about twenty lines
of code." Heh. So, a filter it is. It would be nice if they were
configurable. :-)

tpo - Tomáš

Dec 13, 2010, 1:54:37 PM
to Cukes
Um, this might be bikeshedding on my part, but I'd suggest that
*if* new syntax is added to Cucumber, it should be as nicely
human (read: customer) readable as the current Cucumber syntax. And
IMHO "|>> file://eu_figures.csv <<|" has nothing in common
with a human-readable syntax. It's occult programmer's language in
pure form (a mixture of a URL and shell). So, for whatever that's
worth, my cent balance:

matt_syntax -= 1
george_syntax += 1

IMHO
*t

Aslak Hellesøy

Dec 13, 2010, 1:51:59 PM
to cu...@googlegroups.com
Guys,

It would be far simpler to add a generic preprocessor #include directive, don't you think? More versatile too.

#include file:foo.txt

It would be substituted with the content behind the URL and it would be done prior to lexing. It could be done in either gherkin or cucumber. Additionally an include directive could specify a translator in case the URL contains a MIME type that is not text/plain:

#include file:foo.xls, xls2txt

...where we could supply some simple converters out of the box, and make it easy for people to write their own.

Aslak
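
A minimal Ruby sketch of such a preprocessor follows, under stated assumptions: the directive is written `#include file:PATH[, CONVERTER]` on a line of its own, paths are resolved against a base directory, and the converter names `text` and `csv2table` are made up for this example (the `xls2txt` converter mentioned above is not implemented). It performs plain text substitution before any lexing would take place.

```ruby
require "csv"
require "tmpdir"

# Hypothetical converters: each takes raw file content and returns Gherkin text.
CONVERTERS = {
  "text"      => lambda { |raw| raw.chomp },
  "csv2table" => lambda do |raw|
    CSV.parse(raw).map do |row|
      "| " + row.map { |cell| cell.to_s.strip }.join(" | ") + " |"
    end.join("\n")
  end,
}.freeze

DIRECTIVE = /\A(\s*)#include\s+file:([^,]+?)\s*(?:,\s*(\w+))?\s*\z/

# Replace each "#include file:PATH[, CONVERTER]" line with the converted file
# content, keeping the directive's indentation. All other lines pass through.
def preprocess(source, base_dir, converters = CONVERTERS)
  source.each_line.map do |line|
    match = DIRECTIVE.match(line.chomp)
    next line unless match
    indent, path, name = match[1], match[2], match[3] || "text"
    converter = converters.fetch(name) { raise ArgumentError, "unknown converter: #{name}" }
    content = converter.call(File.read(File.join(base_dir, path)))
    content.each_line.map { |text| indent + text.chomp + "\n" }.join
  end.join
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "foo.csv"), "start,eat,left\n10,1,9\n")
  feature = "  Examples:\n    #include file:foo.csv, csv2table\n"
  puts preprocess(feature, dir)
end
```

A converter here only has to produce Gherkin text, so writing one requires no knowledge of the parser's internals; the price is that the included data is parsed twice, once in its native format and once as Gherkin.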


Mike Sassak

Dec 13, 2010, 2:24:21 PM
to cu...@googlegroups.com

I thought of a pre-processor at first, but rejected the idea because
unless we're including gherkin source (a silly idea, I think), this
adds unnecessary translation and parsing steps to the process. The
external data would need to be retrieved, parsed in its native format,
converted into gherkin, then parsed by Gherkin and converted into a
series of events. The filter way allows you to synthesize gherkin
events from a native representation without needing to convert it to
gherkin first.

> Additionally an include directive could specify a translator in case the URL
> contains a MIME type that is not text/plain:
> #include file:foo.xls, xls2txt
> -where we could supply some simple converters ootb, and make it easy for
> people to write their own.

Why would we want to translate input for people? Seems easier to me to
just say, "Hey, if you point us at crap, everything will fail. Don't
point us at crap." The ability to insert a filter would side-step this
nicely. Sounds to me like a job for the much talked about but never
implemented stackable filters a la Rack API.

Mike

Matt Wynne

Dec 13, 2010, 3:03:24 PM
to cu...@googlegroups.com

I think Aslak sent this before we came up with the filter idea. Assuming we're understanding each other correctly, Mike, it seems like a great synthesis of our ideas.

The only concern is Tomáš' about the non-readability of it. I'm sure we can improve on the |>> file here <<| thing - it took me about 10 seconds to invent that.

aslak hellesoy

Dec 13, 2010, 6:34:42 PM
to cu...@googlegroups.com

#include:

1) retrieve CSV
2) parse CSV
3) convert to gherkin
4) parse gherkin
5) emit gherkin events

We'd have to do 1-3. 4-5 is already implemented.

> The filter way allows you to synthesize gherkin
> events from a native representation without needing to convert it to
> gherkin first.
>

filters:

1) parse gherkin
2) retrieve CSV
3) parse CSV
4) emit gherkin events

We'd have to do 2-4

Both approaches require retrieving and parsing the CSV. They differ in
whether we turn the parsed CSV into gherkin text or emit events.

In terms of performance I suppose the #include approach would be a
little slower than the filter approach. However, I don't think the
overhead would be noticeable, so I don't think speed is a strong
argument here.

I think it's more important to compare how easy it will be to
implement either architecture. We could implement a CSV to gherkin
translator (as #source or as filter events) and bundle it with gherkin
or cucumber. However, I'm sure some people would want to use other
formats, such as Excel, Google Spreadsheets or some proprietary Wiki.
That means they'll have to implement their own translator.

Implementing a filter based translator requires knowledge of the
Gherkin API. Implementing an #include based translator only requires
knowledge of the output gherkin.

For this reason I'm leaning towards #include. It would also be useable
anywhere in a gherkin file, not only for tables. Who knows, maybe
somebody wants to suck in pystrings?

>> Additionally an include directive could specify a translator in case the URL
>> contains a MIME type that is not text/plain:
>> #include file:foo.xls, xls2txt
>> -where we could supply some simple converters ootb, and make it easy for
>> people to write their own.
>
> Why would we want to translate input for people? Seems easier to me to
> just say, "Hey, if you point us at crap, everything will fail.

I can imagine some people might want to use google docs or excel.

> Don't
> point us at crap." The ability to insert a filter would side-step this
> nicely. Sounds to me like a job for the much talked about but never
> implemented stackable filters a la Rack API.
>

The two approaches are not mutually exclusive - in theory we could
support both. I just think #include is simpler in this case...

Aslak

Gregory Hnatiuk

Dec 13, 2010, 6:54:05 PM
to cu...@googlegroups.com
Or re-implement GivenScenario ;)

Mike Sassak

Dec 14, 2010, 1:10:42 AM
to cu...@googlegroups.com

I agree they're not mutually exclusive, but I disagree about their
relative complexity and usefulness. :-)

If you're leaning toward #include though, why not side-step a lot of
the issues surrounding it (to start I think it's too hacky to be the
official way of doing this) and add a --stdin flag to Cucumber? Right
around the time I was getting really frustrated with the plugins I was
lucky to have a chat with Dan North about what I was working on, and
his suggestion was to forget about plugins entirely and just make it
easy to pipe content into Cucumber like this:

$ wget -qO- http://example.com/feature.html | html2gherkin | cucumber --stdin

I thought this was such a good idea I wrote something to split apart
features passed in via stdin and wired that up quick and dirty to a
--stdin flag in Cucumber, and it worked pretty darn great for a
night's work. Unfortunately I haven't had the time or inclination
since then to finish the job. I still have the feature scanner here
though: https://gist.github.com/460971. I'd do things a bit
differently now, but the core packs a wallop in very few lines of
code, in my opinion. This way we can easily let a thousand formats and
converters bloom, and if any of them prove to be indispensable, we can
fold the best stuff into Gherkin proper.

WDYT?
Mike

Matt Wynne

Dec 14, 2010, 5:02:14 AM
to cu...@googlegroups.com

I think this sounds good in principle. Can you give me an example of how the OP will use it to get his CSV file into the Examples table?

Mike Sassak

Dec 14, 2010, 8:31:38 AM
to cu...@googlegroups.com

$ my-csv-expander | cucumber --stdin

my-csv-expander could easily read in the features according to what
the OP would like and expand CSV references into Gherkin tables. To
process a subset of features he could filter on a tag and have
Cucumber run the others directly, or he could pass them through
unchanged. There's not going to be much of a speed difference either
way.

Best of all from my point of view is that we wouldn't need to decide
on an official way to do this. If the OP wanted to publish
my-csv-expander on Github and provide support for doing it his way, he
could without having to listen to all those loud mouths on the
Cucumber ML. ;-) If after a time my-csv-expander works so well
everyone who Cukes uses it, we can include it or the best parts of it
(and maybe include the best parts of some competing solutions) into
Cucumber/Gherkin proper. Like Rails does with plugins and what-not.
Named scope, nested attributes, and err... Merb were all developed
elsewhere but proved they were so good they should take their place
among the included batteries.

Gherkin then could have as one of its responsibilities making this
type of thing easy (with filter composer and builder APIs), but if
that never materializes, well so what? Munging streams of text by hand
is definitely the Bell Labs solution, but there are advantages to it
that MIT never dreamed of. I bet you could even do crazy stuff in
Windows PowerShell with this approach. :-)

Mike
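
For a feel of what a stream-based `my-csv-expander` might look like, here is a hedged Ruby sketch. It assumes George's `Examples in path/to/some/file` spelling from earlier in the thread as the placeholder, and `expand_features` is a hypothetical name. It copies Gherkin text from one IO to another, swapping each placeholder line for a regular `Examples:` section built from the CSV. It is not the code in the gist linked in this thread.

```ruby
require "csv"
require "stringio"
require "tmpdir"

EXAMPLES_IN = /\A(\s*)Examples in (.+?)\s*\z/

# Copy Gherkin text from input to output, replacing each
# "Examples in PATH" line with an Examples: section built from that CSV.
def expand_features(input, output, base_dir = ".")
  input.each_line do |line|
    match = EXAMPLES_IN.match(line.chomp)
    if match
      indent, path = match[1], match[2]
      output.puts "#{indent}Examples:"
      CSV.foreach(File.join(base_dir, path)) do |row|
        cells = row.map { |cell| cell.to_s.strip }.join(" | ")
        output.puts "#{indent}  | #{cells} |"
      end
    else
      # Anything that is not a placeholder passes through untouched.
      output.print line
    end
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "cukes.csv"), "start,eat,left\n10,1,9\n2,2,0\n")
  feature = StringIO.new(<<~FEATURE)
    Scenario Outline: eating
      Given there are <start> cucumbers
      When I eat <eat> cucumbers
      Then I should have <left> cucumbers

      Examples in cukes.csv
  FEATURE
  expand_features(feature, $stdout, dir)
end
```

Wrapped in a script that calls `expand_features($stdin, $stdout)`, this could sit in front of the proposed `cucumber --stdin` in a pipeline, with Cucumber itself knowing nothing about CSV.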

Matt Wynne

Dec 14, 2010, 8:45:54 AM
to cu...@googlegroups.com

I see. So my-csv-expander is a black box that reads the files in the features directory, looks for some placeholder in the .feature files and expands them into valid gherkin features, then writes them to stdout?

This would work nicely for supporting George's syntax, I guess.

What does the OP think? Would you like this?

Mike Sassak

Dec 14, 2010, 9:48:56 AM
to cu...@googlegroups.com

So it's not quite so much a black box: https://gist.github.com/740509. :-)

Jon Kruger

Dec 14, 2010, 11:13:17 AM
to Cukes
The stdin approach is technically possible, but it doesn't seem to be
as user-friendly. I want my QA people to be able to write cucumber
tests. I want my QA people to be able to look at a book on Cucumber
and have it make sense to them. People like that wouldn't necessarily
know how to pipe something to a command, or where to pull the CSV
translator from github.

To me, either the preprocessor directive or the filter idea is better
because it encapsulates all of this inside of Cucumber/Gherkin so that
users don't need to know about the implementation (or how to call
it). It's really easy then for a user to read a book or a blog post
and say, "Hey, I can put #include myfile.csv (or whatever syntax we
end up with) in my feature file and it will create examples from my
CSV file!" It requires us to put the CSV parsing inside Cucumber/
Gherkin, but I think that's a good thing. I wouldn't think that that
code would be that hard to write.

I don't have a problem with also implementing the stdin approach
because then it would let people theoretically parse a feature file
and transform it in any way, but that's something different than what
I'm asking for. But if we were to implement this in Cucumber/Gherkin
for CSV, we're already most of the way to supporting pipe-delimited
files, Excel files, and other kinds of files (if that's something
people wanted down the road).

Jon

On Dec 14, 8:45 am, Matt Wynne <m...@mattwynne.net> wrote:
> On 14 Dec 2010, at 13:31, Mike Sassak wrote:
>
> > On Tue, Dec 14, 2010 at 4:02 AM, Matt Wynne <m...@mattwynne.net> wrote:
>
> >> On 14 Dec 2010, at 06:10, Mike Sassak wrote:
>
> >>> On Mon, Dec 13, 2010 at 5:34 PM, aslak hellesoy
> >>> <aslak.helle...@gmail.com> wrote:
> >>>> On Mon, Dec 13, 2010 at 7:24 PM, Mike Sassak <msas...@gmail.com> wrote:
> >>>>> On Mon, Dec 13, 2010 at 12:51 PM, Aslak Hellesøy
> >>>>> <aslak.helle...@gmail.com> wrote:
>
> >>>>>> On Dec 11, 2010, at 9:26 PM, Matt Wynne <m...@mattwynne.net> wrote:
>
> >>>>>> Jon, Mike,
> >>>>>> On 11 Dec 2010, at 20:42, Jon Kruger wrote:
>
> >>>>>> On Sat, Dec 11, 2010 at 9:02 AM, Mike Sassak <msas...@gmail.com> wrote:
>
> >>>>>>> On Sat, Dec 11, 2010 at 3:02 AM, Matt Wynne <m...@mattwynne.net> wrote:
>
> >>>>>>>> On 11 Dec 2010, at 04:09, Jon Kruger wrote:
>
> >>>>>>>>> Is there a way to import the "Examples" section of a Scenario Outline
> >>>>>>>>> from a CSV file?  There are two reasons I want to do this:
>
> >>>>>>>>> 1) If I have a lot of data, it's messy if it's in the .feature file
> >>>>>>>>> 2) If it's in CSV, I can give the CSV file to a business person and
> >>>>>>>>> they can fill in values using Excel
>
> >>>>>>>>> If there isn't a way to do this, I'd be willing to take a stab at it
> >>>>>>>>> if someone can point me in the right direction.
>
> >>>>>>>>> Jon
>
> >>>>>>>> This has come up more than once, and I think it would be a sweet feature
> >>>>>>>> in Cucumber. To implement it would involve making a change to gherkin[1],
> >>>>>>>> which is the library that parses your .feature files. Gherkin uses Ragel[2]
> >>>>>>>> to parse the files, so the change would be to alter this ragel code:
>
> >>>>>>>> https://github.com/aslakhellesoy/gherkin/blob/master/ragel/lexer_comm...
> >>>>>> #include http://foo/bar.txt
> >>> $ wget http://example.com/feature.html | html2gherkin | cucumber --stdin
> >>>>>> m...@mattwynne.net
> >>>>>> 07974 430184
>
> cheers,
> Matt
>
> m...@mattwynne.net
> 07974 430184

aslak hellesoy

unread,
Dec 14, 2010, 6:50:41 PM12/14/10
to cu...@googlegroups.com
On Tue, Dec 14, 2010 at 4:13 PM, Jon Kruger <goo...@jonkruger.com> wrote:
> The stdin approach is technically possible, but it doesn't seem to be
> as user-friendly.  I want my QA people to be able to write cucumber
> tests.  I want my QA people to be able to look at a book on Cucumber
> and have it make sense to them.  People like that wouldn't necessarily
> know how to pipe something to a command, or where to pull the CSV
> translator from github.
>
> To me, either the preprocessor directive or the filter idea are better
> because it encapsulates all of this inside of Cucumber/Gherkin so that
> users don't need to know about the implementation (or how to call
> it).  It's really easy then for a user to read a book or a blog post
> and say, "Hey, I can put #include myfile.csv (or whatever syntax we
> end up with) in my feature file and it will create examples from my
> CSV file!"  It requires us to put the CSV parsing inside Cucumber/
> Gherkin, but I think that's a good thing.  I wouldn't think that that
> code would be that hard to write.
>
> I don't have a problem with also implementing the stdin approach
> because then it would let people theoretically parse a feature file
> and transform it in any way, but that's something different than what
> I'm asking for.  But if we were to implement this in Cucumber/Gherkin
> for CSV, we're already most of the way to supporting pipe-delimited
> files, Excel files, and other kinds of files (if that's something
> people wanted down the road).
>

You're bringing up some good points Jon.

It looks like we have three directions that:

a) Can be used to solve the same problem of sucking in external content
b) Can be implemented independently without affecting the other
c) All have their value

I therefore propose we do them all. My suggestion:

* stdin: Mike, do it
* include: Aslak, do it
* filter: Matt, Mike or someone else, do it

Common for them all is that they require fairly little code, so I'm
not too worried about redundancy. I think they all have their own
merits.

One problem we have to be aware of with all solutions is line numbers.
As you all know, Cucumber reports gherkin files and line numbers, and
they still need to be correct. It would be unacceptable if cucumber
starts reporting incorrect line numbers and/or incorrect files. For
example, if an error happens in a file that came from a CSV, then the
error should point to that file, and (ideally) the right line within
it.
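
To make the line-number concern concrete, here is a rough sketch of how rows pulled from a CSV could carry their own file name and line number. Nothing in it is Cucumber or Gherkin API: `ExampleRow` and `rows_with_locations` are made-up names, and it assumes one record per physical line (no quoted fields containing newlines).

```ruby
require "csv"

# Hypothetical value object: one example row plus where it came from.
ExampleRow = Struct.new(:cells, :file, :line)

# Parse CSV text and remember the source file name and 1-based line number
# of every data row, so a failure can be reported against the CSV itself.
# Assumes one record per physical line.
def rows_with_locations(csv_text, file)
  rows = []
  csv_text.each_line.with_index(1) do |line, number|
    next if number == 1        # header row
    next if line.strip.empty?  # skip blank lines but keep counting
    rows << ExampleRow.new(CSV.parse_line(line.chomp), file, number)
  end
  rows
end

rows = rows_with_locations("role,screen\nadmin,wibble\n\nguest,login\n", "roles.csv")
rows.each { |row| puts "#{row.file}:#{row.line} #{row.cells.inspect}" }
```

A formatter could then report `roles.csv:4` for a failing row instead of a line in the .feature file.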

Aslak

Jon Kruger

unread,
Dec 14, 2010, 6:57:42 PM12/14/10
to cu...@googlegroups.com

I'm willing to do some of the work if someone can point me in the right direction.

Jon

Matt Wynne

unread,
Dec 16, 2010, 7:31:00 AM12/16/10
to cu...@googlegroups.com
On 14 Dec 2010, at 23:57, Jon Kruger wrote:

I'm willing to do some of the work if someone can point me in the right direction.

Jon

Good man! Which option would you prefer to work on?

Jon Kruger

unread,
Dec 16, 2010, 8:19:31 AM12/16/10
to cu...@googlegroups.com


On Dec 16, 2010 7:31 AM, "Matt Wynne" <ma...@mattwynne.net> wrote:
>
>
> On 14 Dec 2010, at 23:57, Jon Kruger wrote:
>
>> I'm willing to do some of the work if someone can point me in the right direction.
>>
>> Jon
>
> Good man! Which option would you prefer to work on?

I can do the filter option... are there any examples of filters in cucumber now?

Jon

Mike Sassak

unread,
Dec 16, 2010, 10:07:25 AM12/16/10
to cu...@googlegroups.com
On Thu, Dec 16, 2010 at 7:19 AM, Jon Kruger <j...@jonkruger.com> wrote:
>
> On Dec 16, 2010 7:31 AM, "Matt Wynne" <ma...@mattwynne.net> wrote:
>>
>>
>> On 14 Dec 2010, at 23:57, Jon Kruger wrote:
>>
>>> I'm willing to do some of the work if someone can point me in the right
>>> direction.
>>>
>>> Jon
>>
>> Good man! Which option would you prefer to work on?
>
> I can do the filter option... are there any examples of filters in cucumber
> now?
>

Look in Gherkin. There are a few layers in there, and I'm not sure
where exactly Matt was thinking to put the filter in, but if you poke
around you should see how Gherkin works by composing classes that
send, receive and transform streams of events. Any filter will do
something very similar. I imagine we'll also need to modify the lexer,
unless we want to go with the special table cell syntax Matt used as
an example. I'm partial to using the multiline description to store
stuff like this, at least at first. It's a low-cost way to play around
with extensions to the core language and imposes few burdens on other
users of the library.
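
A very rough sketch of the shape such a filter might take, assuming the "store it in the description" idea. This is not Gherkin's real listener interface: the event names (`examples`, `row`), the `csv:` marker and both classes are invented for illustration, and the file loader is injected so the sketch runs without touching disk.

```ruby
require "csv"

# Hypothetical downstream listener: just records the events it receives.
class RecordingListener
  attr_reader :events

  def initialize
    @events = []
  end

  def examples(name, description)
    @events << [:examples, name, description]
  end

  def row(cells)
    @events << [:row, cells]
  end
end

# Hypothetical filter: forwards every event unchanged, and when an Examples
# description contains a line such as "csv: roles.csv" it also emits one row
# event per CSV record.
class CsvExamplesFilter
  def initialize(listener, loader)
    @listener = listener
    @loader = loader # callable that returns the CSV text for a path
  end

  def examples(name, description)
    @listener.examples(name, description)
    path = description[/^\s*csv:\s*(\S+)/, 1]
    return unless path

    CSV.parse(@loader.call(path)).each { |cells| @listener.row(cells) }
  end

  def row(cells)
    @listener.row(cells)
  end
end

listener = RecordingListener.new
loader = ->(_path) { "role,allowed\nadmin,yes\n" }
CsvExamplesFilter.new(listener, loader).examples("Roles", "csv: roles.csv")
p listener.events
```

The point is only that the filter sits between two stages and the downstream stage never needs to know the rows came from a file.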

Mike

Andrew Premdas

unread,
Dec 18, 2010, 6:43:29 AM12/18/10
to cu...@googlegroups.com
This is a real long thread that I've skipped through, and these questions might be a bit out of order, but anyhow here goes.

What is the business benefit of doing this in cucumber?

Why can't we just delegate this to step definitions instead?

This is based on the following ideas

1. Any table defined imperative feature can be expressed as a non-tabular declarative feature that encapsulates the table in a concept and can pass the processing of the table down to a step definition.
2. Any feature that needs to import a table into its examples will produce output that is unreadable by the business. 

All best

Andrew






Mike Sassak

unread,
Dec 18, 2010, 2:01:26 PM12/18/10
to cu...@googlegroups.com
On Sat, Dec 18, 2010 at 5:43 AM, Andrew Premdas <apre...@gmail.com> wrote:
> This is a real long thread that I've skipped through, and these questions
> might be a bit out of order, but anyhow here goes.
> What is the business benefit of doing this in cucumber?

The big benefit as I see it is to make it easier for the business side
of the product team to contribute to the feature suite. One of the
major aims of Cucumber has always been for non-developers to
contribute to and write features on their own, but so far that's been
the exception to the rule. These additions address part of that
problem.

> Why can't we just delegate this to step definitions instead?
> This is based on the following ideas
> 1. Any table defined imperative feature can be expressed as a non-tabular
> declarative feature that encapsulates the table in a concept and can pass
> the processing of the table down to a step definition.

This is true in terms of computation, but it abstracts away from the
expressive possibilities of inline tables. Knowing when to inline a
table vs. when to encapsulate it in a step is something that every
team can decide on its own according to their own context and best
judgment. I see these additions in much the same light. These changes
will give product teams using Cucumber more choices for how they fit
it into their workflow. Right now Cucumber dictates more of a workflow
than it needs to, and for many teams a tool that makes you change your
workflow to fit the tool, rather than vice versa, is a flawed tool.

> 2. Any feature that needs to import a table into its examples will produce
> output that is unreadable by the business.

I don't think I understand your point. Can you elaborate on this? If
you pull in a CSV in the manner we're describing the pretty formatter
should inline the contents of that CSV, properly indented and
formatted, right into the output.
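
As an illustration of that inlining step, a minimal sketch that turns CSV text into an aligned, pipe-delimited table. `csv_to_gherkin_table` is a hypothetical helper, not something the pretty formatter provides; it assumes every row has the same number of cells and does not escape pipe characters inside cells.

```ruby
require "csv"

# Turn CSV text into an aligned, pipe-delimited Gherkin-style table.
# `indent` is the number of leading spaces in front of each row.
# Assumes rectangular data (every row has the same number of cells).
def csv_to_gherkin_table(csv_text, indent: 6)
  rows = CSV.parse(csv_text).map { |row| row.map { |cell| cell.to_s.strip } }
  return "" if rows.empty?

  # Widest cell in each column decides that column's width.
  widths = rows.transpose.map { |column| column.map(&:length).max }
  rows.map do |row|
    cells = row.each_with_index.map { |cell, i| cell.ljust(widths[i]) }
    "#{' ' * indent}| #{cells.join(' | ')} |"
  end.join("\n")
end

puts csv_to_gherkin_table("role,screen,allowed\nadmin,wibble,yes\nguest,wibble,no\n")
```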

Mike

> All best
> Andrew

Gregory Hnatiuk

unread,
Dec 18, 2010, 4:34:58 PM12/18/10
to cu...@googlegroups.com
I actually think it's less that the output will be unreadable than that the input, as a Gherkin feature file,
no longer contains the full description of the expected behavior. When I share and 
collaborate on features, it's usually at the input level, not at the output. 

Anyone else concerned that we might be losing something by splitting that specification into more than one place? 

While I'm quite interested in being able to specify features in ways other than straight-up Gherkin 
(and Mike's got some cool Builder stuff going which should make that easier),
it doesn't feel quite right to me that feature files be the place to do that.

Greg

Matt Wynne

unread,
Dec 18, 2010, 6:16:23 PM12/18/10
to cu...@googlegroups.com
I'm with you Greg, but I think in the special case of examples tables, I can see that it would be nice to use a 'native' table editor like Excel for large sets of data rather than faffing around formatting text tables in gherkin files.

If I used this, I would still want the team to keep the CSV files in source control with the features.


 

Matt Wynne

unread,
Dec 18, 2010, 6:21:46 PM12/18/10
to cu...@googlegroups.com
Hi Andrew,

On 18 Dec 2010, at 11:43, Andrew Premdas wrote:

> This is a real long thread that I've skipped through, and these questions might be a bit out of order, but anyhow here goes.
>
> What is the business benefit of doing this in cucumber?
>
> Why can't we just delegate this to step definitions instead?
>
> This is based on the following ideas
>
> 1. Any table defined imperative feature can be expressed as a non-tabular declarative feature that encapsulates the table in a concept and can pass the processing of the table down to a step definition.

I think you might not have noticed, but this thread is specifically about reading Scenario Outline Examples tables from an external file. If you have a bunch of different examples to pump through the same scenario (however declarative) I can see that it could be handy to store and edit those examples in a spreadsheet rather than a text file. We've been asked for it a few times, and it's something that other tools like Fit do really well, I believe.

> 2. Any feature that needs to import a table into its examples will produce output that is unreadable by the business.

Yeah, there is a danger that you'd end up with a really wide table, I suppose, but there doesn't seem too much harm in giving people enough rope to hang themselves with :)

>
> All best
>
> Andrew

Andrew Premdas

unread,
Dec 18, 2010, 9:05:34 PM12/18/10
to cu...@googlegroups.com
My point is that if the table is big enough to need to be stored externally, then the output of the feature will be too big for someone to read. If nobody is going to read the output of the feature to see that each of the rows passes, then you might as well encapsulate all the data in the table into one concept and have a feature that says the concept passes. This will have greater business value than loads of lines of table data. So I guess I'm putting forward the idea that using the expressive possibilities of inline tables is actually pretty detrimental to expressing business value. I also don't think that the tables are that useful for anything else either, including debugging.

I realise this might be considered a bit radical and I don't want to be dogmatic about it. I just thought it might be an idea to try and prompt a little think about these things before committing to quite a big chunk of work.

Andrew

Andrew Premdas

unread,
Dec 18, 2010, 9:13:37 PM12/18/10
to cu...@googlegroups.com
On 18 December 2010 23:21, Matt Wynne <ma...@mattwynne.net> wrote:
Hi Andrew,

On 18 Dec 2010, at 11:43, Andrew Premdas wrote:

> This is a real long thread that I've skipped through, and these questions might be a bit out of order, but anyhow here goes.
>
> What is the business benefit of doing this in cucumber?
>
> Why can't we just delegate this to step definitions instead?
>
> This is based on the following ideas
>
> 1. Any table defined imperative feature can be expressed as a non-tabular declarative feature that encapsulates the table in a concept and can pass the processing of the table down to a step definition.

I think you might not have noticed, but this thread is specifically about reading Scenario Outline Examples tables from an external file. If you have a bunch of different examples to pump through the same scenario (however declarative) I can see that it could be handy to store and edit those examples in a spreadsheet rather than a text file. We've been asked for it a few times, and it's something that other tools like Fit do really well, I believe.

No, I noticed that, and I know that it's been asked for in Cukes a number of times. I don't think it belongs in Cukes features, and I think it would be more effective to load the table in a step definition.

> 2. Any feature that needs to import a table into its examples will produce output that is unreadable by the business.

Yeah, there is a danger that you'd end up with a really wide table, I suppose, but there doesn't seem too much harm in giving people enough rope to hang themselves with :)

:), but it wouldn't do much harm to have another little think about whether this quite large chunk of work is really beneficial

Andrew

aslak hellesoy

unread,
Dec 18, 2010, 9:15:24 PM12/18/10
to cu...@googlegroups.com

Are you assuming here that the main reason for storing a table
externally is that it's big?
-And by big, do you mean wide or tall?

I can think of at least one different reason than size: usability.

I know a bunch of people who'll happily edit a table in Excel, but
won't even go near a text editor.

Aslak

George Dinwiddie

unread,
Dec 18, 2010, 9:29:52 PM12/18/10
to cu...@googlegroups.com
On 12/18/10 9:15 PM, aslak hellesoy wrote:
> I can think of at least one different reason than size: usability.
>
> I know a bunch of people who'll happily edit a table in Excel, but
> won't even go near a text editor.

Yeah, I've seen that one a lot. Another reason is that the Excel table
already exists, and is actively maintained.

There's no one best way to reach satisfaction, and that makes our job
harder.

- George

--
----------------------------------------------------------------------
* George Dinwiddie * http://blog.gdinwiddie.com
Software Development http://www.idiacomputing.com
Consultant and Coach http://www.agilemaryland.org
----------------------------------------------------------------------

Andrew Premdas

unread,
Dec 19, 2010, 7:46:21 AM12/19/10
to cu...@googlegroups.com
Yes
 
-And by big, do you mean wide or tall?

Either 
 
I can think of at least one different reason than size: usability.

I know a bunch of people who'll happily edit a table in Excel, but
won't even go near a text editor.

I have a number of answers to this:

1. I'm not saying that tables in Excel shouldn't be used with Cucumber. I am saying the place  to use them is in stepdefs. 

2. Whilst making things more useful for your Excel people, pity the poor developer who has to debug the feature when it breaks.

3. What about CI and the location of the table? If it's in the project and under source control, then your Excel people probably won't be able to get to it or commit their changes. If it's outside the project/source control, then
   3.1. things start to get tricky when an edit to the table causes the feature to break
   3.2. locating the table becomes difficult when the feature is run in a different context, e.g. CI

4. If you process the table in a step definition you have the full power of ruby available to you to deal with location, users and other aspects of that table. For example if the table is stored outside the project  it might be a pretty good idea to use FasterCSV to create a fixture that mirrors the data in the table and use this mirror unless the sheet has been edited recently.
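
A minimal sketch of the mirroring idea in point 4, assuming the sheet has been exported to CSV somewhere outside the project. `refresh_mirror` and both paths are hypothetical; the sketch interprets "edited recently" as "the source file is newer than the mirror".

```ruby
require "fileutils"

# Copy the externally maintained export into the project as a fixture, but
# only when the mirror is missing or older than the source. If the source
# cannot be reached (as may happen on a CI box), fall back to the existing
# mirror. Returns the path the step definitions should read.
def refresh_mirror(source_path, mirror_path)
  if File.exist?(source_path)
    stale = !File.exist?(mirror_path) ||
            File.mtime(source_path) > File.mtime(mirror_path)
    if stale
      FileUtils.mkdir_p(File.dirname(mirror_path))
      FileUtils.cp(source_path, mirror_path)
    end
  elsif !File.exist?(mirror_path)
    raise ArgumentError, "neither #{source_path} nor #{mirror_path} exists"
  end
  mirror_path
end
```

A step definition could call this first and then parse the returned path with Ruby's CSV library (FasterCSV on Ruby 1.8), so the feature still runs when the shared location is unavailable.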

So overall I would question again the usefulness of table imports into features.

Jon Kruger

unread,
Dec 19, 2010, 8:17:45 AM12/19/10
to cu...@googlegroups.com

Here's an example: we want to make a spreadsheet (with help from the business) that says which roles are able to access a screen.  Breaking that up into lots of individual step defs doesn't make sense (there's not much to say in each one), and if I create one step def that does all the work, then I lose the benefit of examples where it shows which ones pass and which ones fail.

Jon

Andrew Premdas

unread,
Dec 20, 2010, 5:37:43 AM12/20/10
to cu...@googlegroups.com
Thanks for your example, it provoked some thinking ...

If your sheet was wibble_role_access_rules.excel then your feature might be something like

Scenario "Access wibble screen"
  Given I am a <role>
  When I view wibble screen
  Then I should see wibble
   
 Example: import wibble_role_access_rules.excel

My feature, which looks worse would be:

Scenario "Access wibble screen"
   When accessing wibble we should use "wibble_role_access_rules.excel"

If there were a hundred access rules then I'd speculate that cucumber output would not be read. If there were only 10 rules then I'd suggest that the problems of using an external artefact in the feature are greater than the cost of writing the example table in the feature.

If there was an error with a hundred output rules then I'd be concerned about the 99 output rules hiding the error. I think my feature could provide better error detection if the step definition was well written. Overall I think benefits of having the detailed output in the feature are marginal. However I'm probably prejudiced by my feeling that driving development using artefacts outside of a project in a proprietary format is a really bad thing.

Some things to consider about this scenario are:

- How does the application relate to the spreadsheet, is it using the spreadsheet to implement its rules?
- What are the effects of a bad edit in the spreadsheet
- Should the spreadsheet be defining the application behaviour, or should the application perhaps generate the spreadsheet as an artefact showing how it implements role/screen access.

Overall I would still suggest that the table import is a nice-to-have rather than a must-have; that using a scenario like the one I've shown is an acceptable workaround for a messy use case where an external artefact has to be used; and finally that encouraging the use of table imports in features is not a good thing for doing BDD with Cucumber. But hey, that's only my opinion.
 
All best

Andrew

byrnejb

unread,
Dec 21, 2010, 1:37:27 PM12/21/10
to Cukes


On Dec 18, 4:34 pm, Gregory Hnatiuk <ghnat...@gmail.com> wrote:

>
> I actually think it's less that the output will be unreadable than that the
> input, as a Gherkin feature file, no longer contains the full description
> of the expected behavior. When I share and collaborate on features,
> it's usually at the input level, not at the output.
>
> Anyone else concerned that we might be losing something by splitting that
> specification into more than one place?

+1

I have an uneasy sense that what is happening here is a fundamental
departure from the very thing that makes Cucumber so useful: its
plain descriptive language. I really seriously question what anything
is doing in a feature or scenario that requires a spreadsheet or csv
file to handle. This, to me, has the odour of procedural coding. For
that same reason I also have reservations about in-line scenario
tables.

When I started out with Cucumber I tended (overwhelmingly) to use
imperative scenario statements with lots of explicit variables. Now I
have gotten to the point that if I see an obvious variable in a
feature then I check if there is some other way of accomplishing the
same thing. Scenario tables and ancillary files seem to me simply
variables writ large; and suspect for that very reason.

Russell Blandamer

unread,
Jan 9, 2012, 12:57:06 PM1/9/12
to cu...@googlegroups.com
Hi,

I'm new here so apologies if this is a dumb question or the wrong forum or anything like that.

Does anyone know if any of the solutions proposed in this thread was implemented?  I ask because I want to do something like this:

Scenario Outline:

Given I am browsing the <brand> website
When I check the product page for <product_id>
Then the product description html should be <description>

Examples: "./product_description_examples.csv"

...where product_description_examples.txt is a large (1000s of lines) pipe-delimited table: exactly the format I would have used had I written it directly into the feature file.  I want to store it as an external file to make it easier to (1) edit in Excel, and (2) automatically update regularly as products appear and disappear on the website.

I'm running the above on Windows 7 x64, using Ruby 1.8.7-p352 and Cucumber 1.1.2.  I get a gherkin lexing error as follows:

editorial.feature: Lexing error on line 43: '%%_FEATURE_END_%%→♥☺'. See http://wiki.github.com/cucumber/gherkin/lexingerror for more information. (Gherkin::Lexer::LexingError)
C:/Ruby187/lib/ruby/gems/1.8/gems/gherkin-2.6.4-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb:23:in `scan'
C:/Ruby187/lib/ruby/gems/1.8/gems/gherkin-2.6.4-x86-mingw32/lib/gherkin/lexer/i18n_lexer.rb:23:in `scan'
C:/Ruby187/lib/ruby/gems/1.8/gems/gherkin-2.6.4-x86-mingw32/lib/gherkin/parser/parser.rb:31:in `parse'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/feature_file.rb:37:in `parse'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime/features_loader.rb:28:in `load'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime/features_loader.rb:26:in `each'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime/features_loader.rb:26:in `load'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime/features_loader.rb:14:in `features'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime.rb:132:in `features'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/runtime.rb:45:in `run!'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/cli/main.rb:43:in `execute!'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/../lib/cucumber/cli/main.rb:20:in `execute'
C:/Ruby187/lib/ruby/gems/1.8/gems/cucumber-1.1.2/bin/cucumber:14
C:/Ruby187/bin/cucumber:19:in `load'

C:/Ruby187/bin/cucumber:19


Incidentally, the characters following FEATURE_END_%% on the first line of the error change every time I try to run this.

I suspect the file isn't being read at all, and that the error is trying to tell me that line 43 of the file should have the examples table on it, whereas it actually has nothing on it (it doesn't exist: the Examples keyword is on the final line of the file, which is 42).

I'm hoping it's just that my syntax is wrong somewhere, because getting this working would be a really big gain for me, and I think it would be a good solution for the perennial ecommerce tester's problem of how to avoid their tests relying on product data that will sooner or later disappear.

Any help appreciated!

Cheers,

Russell

Matt Wynne

unread,
Jan 10, 2012, 6:56:10 AM1/10/12
to cu...@googlegroups.com
Sorry to deliver bad news, this feature has never got past the 'nice idea' stage.

cheers,
Matt

--
Freelance programmer & coach

Andrew Premdas

unread,
Jan 11, 2012, 6:44:01 PM1/11/12
to cu...@googlegroups.com
On 10 January 2012 11:56, Matt Wynne <ma...@mattwynne.net> wrote:
>
> On 9 Jan 2012, at 17:57, Russell Blandamer wrote:
>
> Hi,
>
> I'm new here so apologies if this is a dumb question or the wrong forum or
> anything like that.
>
> Does anyone know if any of the solutions proposed in this thread was
> implemented?  I ask because I want to do something like this:
>
> Scenario Outline:
>
> Given I am browsing the <brand> website
> When I check the product page for <product_id>
> Then the product description html should be <description>
>
> Examples: "./product_description_examples.csv"
>

Instead write

Given the products are loaded
When I check all the products
Then they should all have the correct descriptions

or

even

When I check all the products
Then there should be no product errors

You don't need cucumber to load a CSV and to loop. Cucumber is not
designed for this kind of stuff. However ruby is great at doing this,
so do the looping and loading in the step definitions. If you create a
module step helper you can do this no probs.


module ProductCheckHelper
  def load_all_products
    # ...
  end

  def visit_all_products
    # produces a report
    # ...
  end

  def report_product_errors
    # ...
  end
end
World(ProductCheckHelper)

implement the methods and call them from the appropriate steps and voila

HTH

Andrew


--
------------------------
Andrew Premdas
blog.andrew.premdas.org

Russell Blandamer

unread,
Jan 17, 2012, 6:05:16 AM1/17/12
to Cukes
Thanks for the help guys 8-)

Originally I didn't want to do the looping inside the Ruby code,
because I didn't want the whole test to bail on me the first time a
particular item out of the thousands I wanted to test failed. I was
also worried that the RSpec failure message wouldn't mean a lot, as it
wouldn't tell me which product had failed. However, I think I will
probably get round this by raising a custom exception after all the
products have been tested, which says exactly which ones failed,
rather than using a simple RSpec matcher.
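
A sketch of what that could look like, assuming each product is a hash with an `:id` key. `ProductCheckError` and `check_all_products` are made-up names, and the block stands in for whatever check the step actually performs.

```ruby
# Hypothetical error type that lists every failing product at once.
class ProductCheckError < StandardError
  def initialize(failures)
    lines = failures.map { |id, reason| "  #{id}: #{reason}" }
    super("#{failures.size} product(s) failed:\n#{lines.join("\n")}")
  end
end

# Run the block for each product, remember every failure, and raise only
# after the whole list has been checked. The block returns nil on success
# or a short reason string on failure.
def check_all_products(products)
  failures = {}
  products.each do |product|
    reason = yield(product)
    failures[product[:id]] = reason if reason
  end
  raise ProductCheckError.new(failures) unless failures.empty?
end
```

Because the error is raised once at the end, a single bad product no longer stops the remaining thousands from being checked, and the message names each failing ID.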

Cheers,

Russ

Ian Harrigan

unread,
Mar 23, 2015, 10:45:57 AM3/23/15
to cu...@googlegroups.com, russell....@gmail.com
I realise this is a very old thread, but I'm going to want to do something like this myself very soon. My issues are a little different, but basically:

* Using cucumber to test a messaging framework (and webapp)
* Performing 100s of different scenarios
* Testing 100s of messages in examples
* Various scenarios for various parts of the framework and webapp

Currently there are simply huge amounts of table data in our "Examples:" sections, and that data itself is duplicated all over the place. Some tests require certain bits of data, others require other bits, but generally there is an overlap (for instance <message name>). One reason it wouldn't make sense to move it all to step defs is that we would then need to write new step defs for each "thing" we wanted to test for all messages, rather than reusing our "building block" step defs in scenario outlines (we are testing 100s of "things" on 100s of messages).

I think the best approach (without changing the lexer) is:

  Scenario Outline:
    When I do something with <x>
    And I do something with <y>
    Then the result should be <z>

    Examples: import from messages if domain is "something"
      | x | y | z |

So I would read in the "messages" csv file (which would probably be an alias to an absolute location), and filter it by the domain field (using "something") and build the examples table in a ruby hook but only extracting the x, y, and z columns from the csv. I think this would help readability in the console output, and also, it seems examples: must have at least one entry.
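
A rough sketch of the row selection described above, assuming the CSV has a header row that includes a `domain` column. `examples_from_csv` is a hypothetical helper, and how its result would be wired into a hook is left open.

```ruby
require "csv"

# Keep only the rows whose "domain" field matches, then project the named
# columns in the order the Examples header lists them. The first element of
# the result is the header row itself.
def examples_from_csv(csv_text, domain:, columns:)
  table = CSV.parse(csv_text, headers: true)
  selected = table.select { |row| row["domain"] == domain }
  [columns] + selected.map { |row| columns.map { |name| row[name] } }
end

csv = <<~CSV_TEXT
  domain,x,y,z,notes
  something,1,2,3,first
  other,4,5,6,skipped
  something,7,8,15,second
CSV_TEXT

examples_from_csv(csv, domain: "something", columns: %w[x y z]).each do |row|
  puts "| #{row.join(' | ')} |"
end
```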

I was wondering what anyone here thought about this approach. I'm almost certain this is something we will have to do regardless, so I'm just sanity checking my approach.

Cheers,
Ian


Steve Tooke

unread,
Mar 23, 2015, 1:49:10 PM3/23/15
to cu...@googlegroups.com
I think the big problem with this approach is that it hides the Examples
from the _readers_ of this file.

Cucumber is primarily a tool that enables communication, and as such the
examples form an important part of the discussion that happens between
developers, testers and the product team. They aren't going to go
looking in csv files to find the data that they want.

The other issue with having a single data file that you use across many
features and scenarios is that those scenarios are now coupled to each
other through that data. If you need to change the data file for one
scenario, you can potentially break other scenarios - not due to a bug,
just because the data is no longer what was expected.

As an aside, I think having a lot of Scenario Outlines is a smell. Cucumber
isn't the best tool to _thoroughly_ verify that your application works.
Many scenario outlines might indicate that you are trying to test all of
the code paths using end-to-end cucumber tests, you may have fallen into
the Cucumber Test Trap[1].

Cheers,
Steve

[1]: http://tooky.co.uk/the-cucumber-test-trap/
--
E: st...@boxjump.co.uk
T: +44 7919 337 463
http://tooky.co.uk | http://kickstartacademy.io |
https://twitter.com/tooky


Avinash Duggirala

Oct 24, 2016, 4:58:50 AM
to Cukes
So has it been concluded that passing examples through CSV will not be available in Cucumber?

I agree this may create a gap in understanding among the stakeholders. But here we are not actually hiding the main content of the scenario outline, only the data injection part.

In the real-world projects I have worked on at different companies, no senior stakeholder is really interested in fetching the data and placing it in the Examples section; they will just give you a high-level scenario. It is the job of the SDET/automation engineer to handle test data. A feature like this would be a boon to SDETs.

Some projects have critical test cases that need to be run against thousands of data combinations. Does Cucumber offer any solution for the poor souls in that position?


Regards,
Avinash Duggirala

Wolfgang Hierl

Aug 20, 2018, 9:38:15 PM
to Cukes
There is also a Maven plugin which imports CSV data into feature files. It also supports environment-specific test data. All details can be found at https://bitbucket.org/idensitylab/cucumber-features-pp-maven-plugin/src/master/

Michael Cunningham

Nov 27, 2018, 10:24:58 AM
to Cukes
@Wolfgang I have downloaded your project from Bitbucket but cannot get it working because dependencies cannot be located at runtime: 'No implementation for org.apache.maven.repository.RepositorySystem was bound while locating org.apache.maven.project.DefaultProjectBuildingHelper.' I got that error while running testPlugin().
Do you have plans to put your artefact in the central Maven repository?

Wolfgang Hierl

Nov 28, 2018, 1:04:25 PM
to Cukes
Hi,

the artefact is already at maven central:

I will hopefully look at the problem in the next few days.
Can you add the full error message, please?

kind regards,
Wolfgang

Michael Cunningham

Nov 30, 2018, 6:24:34 AM
to Cukes
@Wolfgang
Many thanks for the link to mvnrepository. Your plugin works fine out of the box. Well done.
regards,

Michael

Arindam Chakraborty

Aug 9, 2019, 7:44:44 AM
to Cukes
Hi,
I was trying to use this, but my feature file gives me an error for the "#" I used. I saw in the issues section that someone closed a similar issue by changing the replace identifier in the POM, but I don't see any such thing in my POM.
Also, running with the error (Syntax Error: Expected one of comment, row, ta but got scenario) finds no scenarios. Here is the relevant portion of the Runcukes file; I have used @TestArindam1 for the scenario I am testing.

@RunWith(Cucumber.class)
@CucumberOptions(
        plugin = { "junit:target/cucumber-reports/test-report.xml", "json:target/cucumber-reports/test-report.json" },
        features = { "src/it/resources" }, tags = { "@TestArindam1","not @Ignore" },
        glue = { "com.jackson.jds.steps" },
        snippets = SnippetType.CAMELCASE)

Here is the feature file:
 @GetDocument @TestArindam1
  Scenario Outline: Client application is connected to JDS Client webservice and User submits a document request to get document details
    Given a valid document request with documentID as "<documentID>" and clientTrackingID as "<clientTrackingID>"
    When the document request is submitted with the documentID to get the document metadata
    Then the response returns "documentClassName" as "AGENT"
    And the response returns "documentId" as "2000024786"
    And Property "BATCHNAME" value is "TEST_02"
    And Property "STATE" value is "MI"
    And Property "COMPANY" value is "01"
    And Property "BOXNAME" value is "TEST_BOX"
    
    Examples:
    #@csv:testdata1.csv

The CSV:
documentID;clientTrackingID
2000024786;e8cf190b-a39d-b0a1-aa2d-2134cea6c91t