Beyond machinist

Xavier Shay

Mar 16, 2010, 2:54:18 AM
to rails-...@googlegroups.com
In which I wave my hands a lot. I haven't thought this through totally,
but thought I'd throw it out there.

Problem:
- My specs are slow oh no!
- Setting up objects/object trees takes heaps of time
- There's a *lot* of duplication in object creation in most test suites

Solution:
- Add 'smarts' to machinist* so that it auto-generates fixtures for
common objects and keeps them in the DB

Downside:
- You can't rely on your DB being 'clean'
... but if you're using should change_by and friends that often
won't be an issue
- It doesn't actually exist and may even be impossible


Idea:
- Either some sort of profiling, where machinist records object creation
and builds up a library (so first run will be just like normal, second
run will be optimized)

it 'a' do
  user = User.make
end

it 'b' do
  user = User.make
end

it 'c' do
  user = [User.make, User.make]
end

This spec should only require 2 creations, not 4. This example would be
trivial to implement - keep track of how many users are created in each
example, then before all the transactional stuff for the next suite run,
create a few users that can be used (fixtures) rather than creating new
ones. You can have smarts so that if a particular object is only needed
once or twice, then just create it on the fly rather than
pre-generating. More complex cases may need some 'hints' or whatnot for
it to work. Have to come up with more examples.
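
A rough sketch of the counting idea, in plain Ruby with no Rails or machinist involved. The names (CreationProfile, record, pool_sizes) are made up for illustration, and the min_examples cut-off for "only needed once or twice" is an assumption, not something any existing library does.

```ruby
# Hypothetical profiler: counts how many objects of each class every
# example creates, then works out how many fixtures to pre-generate.
class CreationProfile
  def initialize
    # example name => { class name => number of objects created }
    @counts = Hash.new { |hash, example| hash[example] = Hash.new(0) }
  end

  # Call this from the factory each time an object is made.
  def record(example, class_name)
    @counts[example][class_name] += 1
  end

  # A pool only needs to cover the hungriest single example, because
  # objects can be reused between examples. Classes used by fewer than
  # min_examples examples are skipped and left to be created on the fly.
  def pool_sizes(min_examples: 2)
    sizes = Hash.new(0)
    usage = Hash.new(0)
    @counts.each_value do |per_class|
      per_class.each do |class_name, count|
        sizes[class_name] = [sizes[class_name], count].max
        usage[class_name] += 1
      end
    end
    sizes.select { |class_name, _| usage[class_name] >= min_examples }
  end
end

profile = CreationProfile.new
profile.record('a', 'User')
profile.record('b', 'User')
2.times { profile.record('c', 'User') }
profile.record('c', 'Account') # only one example needs an Account

# Two users cover all three examples; Account is left to on-the-fly creation.
puts profile.pool_sizes.inspect
```

The pool size is the maximum needed by any one example rather than the total, which is where the "2 creations, not 4" comes from.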

I (and I know a few others) use a before(:all_transactional) hook to
fake this behavior in some of our specs - if object creation takes 1s
and you want to spec 15 attributes, that's a 15x speed up.

I imagine this would subtly change how you write specs, so may not be
able to be retrofitted on to big projects - but I reckon this could be a
big win for large functional suites.

Discuss! Already done it? Handy idea? What examples break it?

Xav

* wherever I say machinist just insert "some sort of object factory thing"

Ben Hoskings

Mar 16, 2010, 3:48:20 AM
to rails-...@googlegroups.com
On 16/03/2010, at 5:54 PM, Xavier Shay wrote:

> In which I wave my hands a lot. I haven't thought this through totally, but thought I'd throw it out there.
>
> Problem:
> - My specs are slow oh no!
> - Setting up objects/object trees takes heaps of time
> - There's a *lot* of duplication in object creation in most test suites

I was thinking about this exact problem just this afternoon. I've been writing a few specs for a project that has these same speed issues. Single specs taking 2-3 seconds to run :)


> Solution:
> - Add 'smarts' to machinist* so that it auto-generates fixtures for common objects and keeps them in the DB

I like this idea. I think it's doable.

A pool of fixture records, which specs can "check out" for use. If the pool dries up, machinist lazily creates more.
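
Something like this, as a minimal sketch in plain Ruby. FixturePool and its methods are hypothetical names, the block stands in for an expensive call such as User.make, and it ignores the problem of a spec modifying a record it has checked out.

```ruby
# Hypothetical fixture pool: specs check records out, and everything is
# returned to the pool when the example finishes.
class FixturePool
  attr_reader :created

  # The block stands in for an expensive factory call such as User.make.
  def initialize(&factory)
    @factory = factory
    @available = []
    @checked_out = []
    @created = 0
  end

  # Hand out an idle record, or lazily build one if the pool is dry.
  def check_out
    record = @available.pop || build
    @checked_out << record
    record
  end

  # Call after each example so the next one can reuse the records.
  def check_in_all
    @available.concat(@checked_out)
    @checked_out.clear
  end

  private

  def build
    @created += 1
    @factory.call
  end
end

pool = FixturePool.new { Object.new }

# Three examples that each need one record.
3.times do
  pool.check_out
  pool.check_in_all
end

# One example that needs two records.
pool.check_out
pool.check_out
pool.check_in_all

puts pool.created # 2 records built for 5 check-outs
```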

Time for a hack night? :)

—ben_h

Pat Allan

Mar 16, 2010, 3:55:36 AM
to rails-...@googlegroups.com
I think all is there to make it happen, you just need to track object
changes, which AR does with #changed? and such.

Would love to see it happen, though I'm afraid I don't have time atm
to help.

--
Pat


Alex Pooley

Mar 16, 2010, 7:20:34 AM
to Ruby or Rails Oceania
This sounds like an interesting challenge. But, I like to keep my
tests super stupid and your hand waving solution hurts my head. What
about throwing the tests at a cluster or something?

*hands waving* Buuut, say I was to bite... an easy win would be to
take a snapshot at the end of a setup block. In theory, that's all
your duplicated creations done with, and each test is then doing
something unique?

Alex.

David Lee

Mar 16, 2010, 9:00:48 AM
to rails-...@googlegroups.com
It might be worth thinking about this in a similar vein to the db:seed task (intended to be used to install data required for development / prod, but I've used it for test "fixtures" before). If you combine some records crafted with 'plausible / memorable scenarios' in mind with a sugary accessor or two, eg

User["impatient_customer"] # => User.find_by_username 'impatient.customer'

- then you're hopefully improving spec comprehensibility, as well as streamlining fixture creation. 
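
A minimal sketch of that accessor, runnable without Rails. The User class here is a stand-in with a fake in-memory table, and the underscore-to-dot mapping is an assumption read off the example above; in a real app find_by_username would be the model's own finder.

```ruby
# Stand-in for an ActiveRecord model so the sketch runs without Rails.
class User
  attr_reader :username

  def initialize(username)
    @username = username
  end

  # Fake table, seeded once up front (hypothetical scenario records).
  RECORDS = [new('impatient.customer'), new('patient.admin')].freeze

  # In a real app this would be the model's own finder.
  def self.find_by_username(username)
    RECORDS.find { |user| user.username == username }
  end

  # Sugary accessor: maps a scenario name to the seeded username by
  # swapping underscores for dots, then looks the record up.
  def self.[](scenario)
    find_by_username(scenario.to_s.tr('_', '.'))
  end
end

customer = User['impatient_customer']
puts customer.username
```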

Set up your fixtures up front once (rspec's before_suite block IIRC), set a named (postgresql) transaction savepoint here and snap back to it before :each spec .. should be pretty snappy - sounds less magic than what you were talking about, but maybe worth considering.

Finally, even with something like machinist, I find it's often pretty helpful (for some models anyway) to write a create_foo(options) method which has sensible defaults and deals with associations and preconditions for you - especially if you need to build a bunch of other records as prerequisites.
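
For illustration, a create_foo-style helper might look roughly like this. Customer, Invoice and create_invoice are hypothetical, and plain Structs stand in for real models so it runs on its own.

```ruby
# Hypothetical models, standing in for ActiveRecord classes.
Customer = Struct.new(:name)
Invoice = Struct.new(:customer, :status, :total)

# Builds an invoice with sensible defaults; callers override only the
# attributes their spec cares about.
def create_invoice(options = {})
  attrs = { status: 'draft', total: 0 }.merge(options)
  # Build the prerequisite customer only when the caller did not supply one.
  attrs[:customer] ||= Customer.new('Default Customer')
  Invoice.new(attrs[:customer], attrs[:status], attrs[:total])
end

plain = create_invoice
paid = create_invoice(status: 'paid', total: 120)

puts plain.status
puts paid.total
```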

There's still heaps of scope for improvement around spec fixtures / setup - it's a noble goal.

Hope that was passably coherent ...

--
cheers,
David Lee

Pete Yandell

Mar 16, 2010, 11:29:52 PM
to rails-...@googlegroups.com
On 16 March 2010 17:54, Xavier Shay <xavie...@rhnh.net> wrote:
> - Either some sort of profiling, where machinist records object creation and
> builds up a library (so first run will be just like normal, second run will
> be optimized)

Now that would be cool! But I have a question first:

Has anyone done any profiling on their Rails app tests to work out
where the time goes? I've always wondered how much of the time testing
Rails apps is wasted on the database syncing disk writes, etc. If you
threw your test database on a ramdisk, how much performance would you
gain?

- Pete

Anthony Richardson

Mar 16, 2010, 11:44:46 PM
to rails-...@googlegroups.com
On Wed, Mar 17, 2010 at 1:59 PM, Pete Yandell <pe...@notahat.com> wrote:

> Has anyone done any profiling on their Rails app tests to work out
> where the time goes? I've always wondered how much of the time testing
> Rails apps is wasted on the database syncing disk writes, etc. If you
> threw your test database on a ramdisk, how much performance would you
> gain?


I tried this once on our application (8min rspec runtime) and it made no difference. 
 A) I have an SSD drive
 B) I don't use Swap

I suspect that because the transactions aren't committed that they never hit the disk anyway. 

This is with Oracle, other DBs may behave differently.

Cheers,

Anthony 

Ben Hoskings

Mar 17, 2010, 12:32:02 AM
to rails-...@googlegroups.com
On 17/03/2010, at 2:29 PM, Pete Yandell wrote:

> Has anyone done any profiling on their Rails app tests to work out
> where the time goes? I've always wondered how much of the time testing
> Rails apps is wasted on the database syncing disk writes, etc. If you
> threw your test database on a ramdisk, how much performance would you
> gain?

Negligible time is wasted on disk: http://vimeo.com/10224566

This spec run causes some serious blueprint cascading. Although the DB is on fire (check the log whooshing by), the disk isn't really being touched (check reads in/out per sec).

The reads and writes hover around 30-60, and peak at about 160. That's on an SSD that can do about 1000 reads/sec, so it's not really having an impact.

—ben_h

David Lee

Mar 17, 2010, 2:35:25 AM
to rails-...@googlegroups.com
I've got something else to throw in: the sequential nature of cucumber stories is kind of one of its selling points for testing certain resource-intensive tasks.

You can write a feature as a long chain of little ops & assertions that assume and build upon the state of the previous ones - sometimes a feature runs lots faster than a comparable suite.

If you find yourself doing a lot of expensive setup :each time you run a series of specs, consider whether it might be better done as a cucumber story.

(I'm largely talking to myself here; I think I use cucumber less than I should).

John Barton

Mar 17, 2010, 4:57:29 PM
to Ruby or Rails Oceania

> Has anyone done any profiling on their Rails app tests to work out
> where the time goes? I've always wondered how much of the time testing
> Rails apps is wasted on the database syncing disk writes, etc. If you
> threw your test database on a ramdisk, how much performance would you
> gain?
>
> - Pete

I've tried using ruby-prof with rspec a number of times - both to test
spec speed, and to assert things about app speed as well and it was
very painful.

Because of the way rspec builds up its sets of examples and
everything, you end up with a very muddy call tree profile, and it's
really tricky to get anything meaningful out of it.

My next avenue of exploration (when I get some free time) will be to
open up rspec and add hooks to start/stop ruby-prof at will, so it will
only show me my own example code rather than all that rspec guff.
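
The shape of such a hook, sketched with a plain wall-clock timer from the standard library instead of ruby-prof so it runs without any gems. ExampleTimer is a hypothetical helper, not part of rspec; in rspec it could be called from an around(:each) hook that wraps example.run, and the timer calls could be swapped for profiler start/stop calls.

```ruby
# Hypothetical helper: measures only the block it is given, so time
# spent inside the test framework itself is not counted.
class ExampleTimer
  attr_reader :timings

  def initialize
    @timings = {} # example name => elapsed seconds
  end

  # Runs the block, records how long it took, and returns its value.
  def measure(name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    @timings[name] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end

  # Slowest examples first.
  def slowest(count = 5)
    @timings.sort_by { |_, seconds| -seconds }.first(count)
  end
end

timer = ExampleTimer.new
timer.measure('fast example') { 1 + 1 }
timer.measure('slow example') { sleep 0.05 }

timer.slowest.each do |name, seconds|
  puts format('%-14s %.3fs', name, seconds)
end
```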

-jb

Xavier Shay

Mar 17, 2010, 5:03:44 PM
to rails-...@googlegroups.com
In one of our apps, common objects (that cause a number of cascades)
take about 1s to .make (there's a lot of callbacks and whatnot going on
here also). Objects close to the bottom of the tree take 0.1s to create.

In a new app, each object creation is adding about 0.02s (sounds tiny
but adds up quick).

As a percentage these numbers are far bigger than most other operations
in our suite (we do have some app tasks that take a while, but we know
where those are).

Nicholas Faiz

Mar 19, 2010, 3:31:52 AM
to Ruby or Rails Oceania
In the past I've relied upon transactional fixtures, as a compromise,
to overcome this problem. As above, I typically prefer making sure
there is no data in the db before a test run, but when I've had to
handle a lot of data transactional fixtures have sped things up
greatly.

Pete Yandell

Mar 22, 2010, 7:06:05 PM
to rails-...@googlegroups.com
I like having an empty database for each test too, but I'd be more
than willing to trade that for faster tests.

If it's not solved before then, anyone fancy spending some time at
Railscamp working on making this happen?

- Pete

Ben Schwarz

Mar 22, 2010, 7:10:59 PM
to rails-...@googlegroups.com
Has anyone in this thread tried using an in-memory database rather than looking at using SSD or re-inventing machinist?
Both sqlite and mysql have in-memory databases or tables. I'm sure that other rdbms implementations would also.

I've used sqlite :memory: in the past when using datamapper and found it to greatly increase the runtime speed for the testing environment.
Surely this would prove greater gains than re-thinking machinist?

Of course, I don't want to get in the way of your hacking project.

David Lee

Mar 22, 2010, 7:44:23 PM
to rails-...@googlegroups.com
Last time I benchmarked it vs postgresql tuned for testing (no fsync, etc) PG came out on top on the suite I was running by a reasonable margin. If memory serves ... I think it was partly because sqlite is so crippled (no subselects, etc) it had to run (sometimes vastly) more queries to do the equivalent work.

That was a couple years ago - if anyone has better datapoints throw them in ... 

Interesting to know mysql has the same feature, as it's less handicapped than sqlite. Maybe I'll try it out - I'm certain few of my current projects would work w/ sqlite, but should on mysql.
--
cheers,
David Lee

Simon Russell

Mar 22, 2010, 7:48:53 PM
to rails-...@googlegroups.com
Hi there, my first post.

The Mysql in-memory database is pretty basic; sort of like MyISAM, but
with even fewer features. To me it sort of defeats most of the
purposes of testing to test against a completely different database
engine (that doesn't even try to support the same features). I think
some time ago (somewhere else) there was mention of another database
engine that could do in-memory stuff, I can't remember what it is.

Ben Schwarz

Mar 22, 2010, 8:21:31 PM
to rails-...@googlegroups.com
Another option may be to still use the same database engine but have a transparent caching layer in between.
I really wanted to get people to step back away from the problem with a solution and back to the problem itself.

Julio Cesar Ody

Mar 22, 2010, 8:31:38 PM
to rails-...@googlegroups.com
If you're hand-writing complex queries that for one reason or another
rely on features that are only available in one DBMS in particular,
then here's part of the cost for doing so.

I wouldn't say it means the idea of in-memory databases is a bad one
unless you're leaning a lot of your logic against the DB as opposed to
relying on an ORM. If you don't, then you can have a production system
running on PostgreSQL, in-memory SQLite databases for tests, and so
on.

--
http://crazyhollywood.org

James Sadler

Mar 22, 2010, 11:15:56 PM
to rails-...@googlegroups.com
Another possibility would be to create a RAM disk setup for testing -
that way you get proper database features but nice and fast access and
tests. I am not sure how straightforward this is, but I'm confident
that it's doable.

--
James

Ben Schwarz

Mar 22, 2010, 11:20:14 PM
to rails-...@googlegroups.com
Doable, but not really practical for a development environment.
Most of the laptops used by the ruby community will be SSD in a year, right?

James Sadler

Mar 22, 2010, 11:25:38 PM
to rails-...@googlegroups.com
Anything scriptable should be practical, assuming it doesn't take too
much time to run the script. But you should only have to run the
script when you boot up your machine, which is infrequent. It's just
some config that says "put your tables in this directory (on the RAM
disk) instead of the usual place".

Point taken regarding the SSDs, but RAM is still faster (perhaps not
meaningfully so, unless your test suite is ridiculously large).

Anthony Richardson

Mar 22, 2010, 11:44:32 PM
to rails-...@googlegroups.com
Guys, as noted earlier in this conversation, a couple of us have done this and it made no appreciable difference. The database engines just weren't dumb enough to write to the disk when the transactions are all being rolled back at the end of each test anyway.

Cheers,

Anthony

Xavier Shay

Mar 23, 2010, 3:23:19 AM
to rails-...@googlegroups.com
On 17/03/10 2:29 PM, Pete Yandell wrote:
> On 16 March 2010 17:54, Xavier Shay <xavie...@rhnh.net> wrote:
>> - Either some sort of profiling, where machinist records object creation and
>> builds up a library (so first run will be just like normal, second run will
>> be optimized)
>
> Now that would be cool!
hey guys stop trying to come up with big picture solutions!
code can solve all your problems!

http://github.com/xaviershay/machinist/tree/smart-fixtures

Ben H and I had a crack this evening. Totally Not Done. So broken on any
real project we have. However!

It works for:

before(:suite) do
  Machinist.use_smart_fixtures = true
end

# Only two INSERT statements will be generated
100.times do |x|
  it("test #{x}") { User.make }
end

after(:each) do
  Machinist.smart_fixture_teardown!
end

...and we've pushed dinner back about as far as we can stand.

This version builds up the fixtures as the test suite runs, we also
tried one that profiled the app and dumped out fixture data to a file
but that was more complicated. See commit history if you're interested.

Xav + Ben

Pete Yandell

Mar 24, 2010, 8:54:04 PM
to rails-...@googlegroups.com
Cool stuff, Xavier and Ben.

Hope you're at the Ruby meet tonight so we can chat about it!

- Pete
