Question on implementing rspec test using concurrency with threads, etc.


David Luu

Nov 13, 2015, 9:41:36 PM
to rspec
Say I have a test design like the sample code below. How should I structure it to get it working correctly within RSpec? I'm still a relative novice with RSpec and Ruby. I know that, the way I've written the example code, the it test block is not valid and needs to be put in the correct scope, but I don't know the best way to restructure it given the intent of the test. As regular Ruby code it should work fine, just not as an RSpec test.

To complicate matters more, the concurrency solution should be compatible with an rspec test of this kind of data driven design (unless you can provide an alternate data driven design approach to the linked example): http://stackoverflow.com/questions/31375083/structuring-a-csv-data-driven-test-with-rspec-the-basic-simple-way

require 'rspec'

describe "actual case scenario using concurrency in test process", :simple do

  def produce_stuff
    # the code, note that this code itself is not Ruby specific
    # but rather Ruby library code or even like system shell calls
    # to call some infrastructure stuff that produces streaming data
  end

  def consume_and_validate_the_stuff
    # test validation code within this block
    it "description here" do
      #consume/fetch the produced streaming live data, etc.
      #then assert/validate the actual data/state against expected
      expect(false).to eql(true)
    end
  end

=begin
Test a scenario that requires performing some actions
and concurrently validating the actions in tandem.

Assume not possible to validate after the action as the data
is processed live and not queued for test access after the fact,
e.g. live streaming data in test environment with multiple consumers
(besides the current test). And fetching that data as old archived data
via offsets is problematic to get it right in sync (e.g. using the right
offsets, etc.)
=end
  pt=Thread.new{produce_stuff()}
  ct=Thread.new{consume_and_validate_the_stuff()}
  pt.join
  ct.join

end

Myron Marston

Nov 14, 2015, 2:55:28 AM
to rs...@googlegroups.com
It's hard for me to answer your question because your code confuses me, so I'm not sure what your intent is.  If you can update it to be a little more concrete, that would help.  Here are a few thoughts, though:

  • It looks like you are trying to run an RSpec spec inside your `consume` method in a thread.  That's not going to work.  RSpec takes responsibility for running your specs at an appropriate time in the runner's lifecycle.
  • For this kind of logic, I find it works best to treat the threading/concurrency as an implementation detail that my tests are unaware of.  Instead, my test treats a bit of logic as a black box, invoking it via some public API and then making assertions after it completes about the return value or the produced side effects.  With this strategy, the threading would all be internal to your implementation, and in your spec you would just run a synchronous `perform_work` (or whatever) method that would do any threading stuff it wants internally, and then join the threads such that it doesn't return until the threads are complete.
  • If you're dealing with a problem that fits a producer/consumer pattern, consider using Ruby's Queue class.  It's threadsafe and makes communication between threads very easy.
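
A minimal sketch of the second and third points, under stated assumptions: `perform_work` is the placeholder name used above, and the doubling step and the `:done` sentinel are illustrative stand-ins for real processing. All threading stays inside the method, and the spec only sees a synchronous call.

```ruby
require 'thread' # no-op on current Rubies, loads Queue on very old ones

# Hypothetical black-box API: the threads are an internal detail.
def perform_work(inputs)
  queue = Queue.new
  results = []

  producer = Thread.new do
    inputs.each { |item| queue << item }
    queue << :done # sentinel telling the consumer to stop
  end

  consumer = Thread.new do
    while (item = queue.pop) != :done
      results << item * 2 # stand-in for the real processing
    end
  end

  [producer, consumer].each(&:join) # do not return until both threads finish
  results
end

# Registered only when RSpec is loaded, e.g. when run with the rspec command.
if defined?(RSpec)
  RSpec.describe "perform_work" do
    it "returns the processed items in order" do
      expect(perform_work([1, 2, 3])).to eq([2, 4, 6])
    end
  end
end
```

Because only the consumer thread appends to `results`, and the method joins both threads before returning, the example never has to know that threads were involved.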
HTH,
Myron


David Luu

Nov 14, 2015, 2:49:45 PM
to rspec
Thanks for the response. I'll look into your current suggestions.

This is my black box test scenario, for testing a stream processing microservice:

Test input via Rspec > microservice > output to validate with Rspec

This is testable sequentially via the message bus' queue (using offsets to fetch from the queue, to be exact and safe) when tested in isolation. That is, feed the input, then afterwards fetch the output via offsets to validate.

But in the actual, complete test environment, with multiple microservices connected to form the system, the isolated test approach doesn't work reliably, as other consumers consume the data live, and using offsets doesn't appear to sync quite right compared to the isolated environment. Plus, we will eventually implement scalability enhancements using the message bus technology that will make the consume-by-offset method even harder.

So I was considering running the code that feeds the test data and the output validation concurrently, validating the output as it comes out based on the input fed in. The input thread could finish first, and the main thread would wait for the output validation thread to complete before exiting, although in general the input and output threads should complete at around the same time, as the processing by the microservice happens in near real time.
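
One possible shape for this, as a rough sketch only: the threads just move data, and the expectations run afterwards in the example's own thread on the output collected live. `FakeStream` and `feed_and_collect` are hypothetical names; the in-memory class stands in for the real Kafka producer and consumer code, which is not shown here.

```ruby
require 'thread' # no-op on current Rubies, loads Queue on very old ones

# Hypothetical stand-in for the bus plus microservice: every message fed in
# comes out upcased. Real Kafka producer/consumer calls would replace this.
class FakeStream
  def initialize
    @output = Queue.new
  end

  def produce(message)
    @output << message.upcase
  end

  # Blocks until the next output message is available.
  def consume
    @output.pop
  end
end

# Starts consuming before any input is fed, then returns the collected output
# so the expectations can run in the example's own thread.
def feed_and_collect(stream, inputs, timeout = 5)
  collected = []
  consumer = Thread.new { inputs.size.times { collected << stream.consume } }
  producer = Thread.new { inputs.each { |message| stream.produce(message) } }

  producer.join
  # Thread#join(limit) returns nil if the thread has not finished in time.
  unless consumer.join(timeout)
    consumer.kill
    raise "consumer did not receive all output within #{timeout}s"
  end
  collected
end

# Registered only when RSpec is loaded, e.g. when run with the rspec command.
if defined?(RSpec)
  RSpec.describe "stream processing" do
    it "emits one processed message per input" do
      expect(feed_and_collect(FakeStream.new, %w[a b])).to eq(%w[A B])
    end
  end
end
```

The timeout keeps a missing output message from hanging the whole run; the value of 5 seconds is arbitrary.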

The Queue class suggestion, at least as shown in the example in the docs, doesn't quite work for me in the sense of adding to and removing from the queue, because I don't control the queue: the queue itself is the (Apache Kafka) message bus carrying the data that the microservice processes. The RSpec test simply uses Ruby code to attach to the bus to produce and consume the data. So dealing with the queue is only as reliable as dealing with Kafka from Ruby. And Kafka is meant as a real-time message bus, so ideally I should be producing and consuming in the same way for testing.

It occurred to me that I could also use just parts of RSpec, like the expectations and matchers, rather than the full framework to make structuring this test easier, although I do prefer to keep it full RSpec with the describe and it blocks.

I hope that adds clarity to my test intentions. I'm not directly testing concurrency; rather, I need to use it to facilitate the testing.

Myron Marston

Nov 15, 2015, 12:00:26 AM
to rs...@googlegroups.com
> This is my black box test scenario, for testing a stream processing microservice:
> Test input via Rspec > microservice > output to validate with Rspec

So you're trying to use RSpec in your production code to validate input and output, as a way of doing runtime assertions?  That's not a very common approach (I've not heard of anyone doing it before, honestly!) but it's certainly doable with rspec-expectations.  Just `include RSpec::Matchers` in your class and the typical expectation syntax (e.g. `expect(value).to whatever`) will be available in the instance methods of your class.

You also mention Apache Kafka causing difficulties for testing.  Here's my suggestion to deal with that:
  • Come up with an abstract interface that provides access to the minimal set of Apache Kafka features your app needs.
  • Write an implementation of that interface that works by wrapping Kafka and delegating to it internally.
  • Write an alternate implementation that instead uses something far simpler, easier to reason about and easier to control such as Ruby's Queue library.
  • Use this alternate implementation for your unit tests so that you can easily control the input to your system in your isolated unit tests.
  • Use the implementation that wraps Kafka in production.
  • Write a small number (potentially as few as 1!) of integration tests that use the Kafka implementation.  Use these tests to verify that your code works with the Kafka implementation and that it works properly when all wired together, but don't make detailed assertions about your logic -- that's what the unit tests are for.
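
The steps above might be sketched as follows. Every name here is an assumption made for illustration: the two-method interface (`publish`, `next_message`), the `Doubler` unit under test, and in particular `send_message` and `fetch_message`, which are placeholders and not the API of any real Kafka client.

```ruby
require 'thread' # no-op on current Rubies, loads Queue on very old ones

# In-memory implementation for isolated unit tests, backed by Ruby's Queue.
class InMemoryBus
  def initialize
    @queue = Queue.new
  end

  def publish(message)
    @queue << message
  end

  # Blocks until a message is available.
  def next_message
    @queue.pop
  end
end

# Production implementation. `client` is a placeholder for whatever Kafka
# client object the app uses; `send_message` and `fetch_message` are
# hypothetical method names, not a real library's API.
class KafkaBus
  def initialize(client)
    @client = client
  end

  def publish(message)
    @client.send_message(message)
  end

  def next_message
    @client.fetch_message
  end
end

# Hypothetical unit under test: it only knows the two-method interface.
class Doubler
  def initialize(input_bus, output_bus)
    @input_bus = input_bus
    @output_bus = output_bus
  end

  def process_one
    @output_bus.publish(@input_bus.next_message * 2)
  end
end
```

A unit test would build `Doubler` with two `InMemoryBus` instances, publish a value on the input side, call `process_one`, and read the result from the output side; the integration tests would pass `KafkaBus` instances instead.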
Myron
