So, yeah.
I'd like to break this out into a line-by-line test. The following code
doesn't work because each_line is a generator function, not an array:
def test_to_html
  # test the Calendar.to_html method
  cal = Calendar.new(@date_test)
  line_no = 0
  @html_reference.each_line.zip(cal.to_html.each_line).each do |test_line|
    line_no += 1
    # Test::Unit will then emit the reference line and the failed test line.
    assert_equal( test_line[0], test_line[1],
      "Calendar#to_html output incorrect at line #{line_no}" )
  end
end
Okay, so I want these each_lines done up as arrays.
ref_lines = []; @html_reference.each_line {|line| ref_lines << line }
cal_lines = []; cal.to_html.each_line {|line| cal_lines << line }
line_no = 0
ref_lines.zip(cal_lines).each do |test_line|
line_no += 1
assert_equal( test_line[0], test_line[1],
  "Calendar.to_html incorrect on line #{line_no}")
end
This smells very kludgy and procedural. I'm using an index/counter
variable, explicit Array declarations, and zipping two arrays together
only to split them apart again. Ugh!
Can someone help me make this more elegant? C++ has pooped Ruby's pants.
-dB
--
David Brady
ruby...@shinybit.com
C++ Guru. Ruby nuby. Apply salt as needed.
> cal_lines = []; cal.to_html.each_line {|line| cal_lines << line }
require 'enumerator'
cal_lines = cal.to_html.to_enum(:each_line).to_a
-Levin
You don't need the to_a, but zip will still create 3 arrays.
interesting...
Simon
> require 'enumerator'
> cal_lines = cal.to_html.to_enum(:each_line).to_a
>
>
Innnteresting. This condenses the loop initialization into a single
line... but it's pretty hairy:
@html_reference.to_enum(:each_line).
  zip(cal.to_html.to_enum(:each_line)).
  each_with_index do |test_line, line_no|
    assert_equal( test_line[0], test_line[1],
      "Calendar#to_html incorrect on line #{line_no+1}")
end
I don't think we've really cleaned much up, though. We've moved from
C++ pooping Ruby's pants to Ruby pooping its own pants. It's idiomatic,
but we haven't reached elegance yet.
This version does not build those arrays, but it's still not very Ruby-like:
-------------------------------------------------------------
require 'thread'
s1 = ['this', 'is', 'a', 'test'].join("\n")
s2 = ['this', 'is', 'the', 'test'].join("\n")
q1, q2 = SizedQueue.new(1), SizedQueue.new(1)
Thread.new {s1.each_line {|l| q1 << l}; q1 << nil}
Thread.new {s2.each_line {|l| q2 << l}; q2 << nil}
line_no = 0
while true
line_no += 1
t1, t2 = q1.pop, q2.pop
break if !t1 || !t2
puts "difference on line #{line_no}" if t1 != t2
end
-------------------------------------------------------------
perhaps better, but still kind of hardcore:
-------------------------------------------------------------
require 'enumerator'
require 'thread'
module Enumerable
def each_zip(other)
q = SizedQueue.new(1)
Thread.new {other.each{|l| q << l}}
self.each{|a| yield a, q.pop}
end
end
s1 = ['this', 'is', 'a', 'test'].join("\n")
s2 = ['this', 'is', 'the', 'test'].join("\n")
s1.to_enum(:each_line).enum_for(:each_zip,
s2.to_enum(:each_line)).each_with_index do |t, l|
puts "difference on line: #{l+1}" if t[0] != t[1]
end
-------------------------------------------------------------
food for thoughts...
Simon
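One caveat with each_zip as posted: if the two sides have different lengths, either q.pop waits for an item that never arrives or the producer thread is left blocked on a push. A minimal sketch of one way around that, assuming an end marker is pushed after the second sequence and missing items are padded with nil. The names each_zip_padded and done are made up for illustration and are not part of any library:

```ruby
# Hypothetical variant of each_zip (names are illustrative only).
# On Ruby 1.8 this would also need: require 'thread'
def each_zip_padded(first, second)
  done = Object.new                 # unique end-of-data marker
  q = SizedQueue.new(1)
  producer = Thread.new do
    second.each { |x| q << x }      # blocks while the queue is full
    q << done                       # signal that second is exhausted
  end
  exhausted = false
  first.each do |a|
    b = exhausted ? done : q.pop    # never pop again once the marker was seen
    exhausted = true if b.equal?(done)
    yield a, (exhausted ? nil : b)
  end
ensure
  producer.kill if producer         # release a producer still blocked on a push
end

pairs = []
each_zip_padded(%w{a b c}, %w{x}) { |a, b| pairs << [a, b] }
p pairs
```

The padding with nil is ambiguous if the second sequence can itself contain nil, so treat this as a starting point only.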
> food for thoughts...
require 'generator'
s1 = %w{this is the first line}.join("\n")
s2 = %w{this is the second line}.join("\n")
compare = SyncEnumerator.new(s1.to_enum(:each_line), s2.to_enum(:each_line))
compare.each { |a,b|
# ...
}
-Levin
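For readers on Ruby 1.9 or later: as far as I know the generator library (and SyncEnumerator with it) is no longer shipped there, but external enumerators allow the same lockstep walk. A rough sketch, assuming each_line without a block returns an Enumerator. The helper name each_line_pair is made up:

```ruby
# Hypothetical helper (not from the thread): walk two strings line by line
# in lockstep, yielding each pair of lines without building arrays.
def each_line_pair(a, b)
  ea = a.each_line   # without a block, each_line returns an Enumerator
  eb = b.each_line
  loop do            # Kernel#loop ends quietly when next raises StopIteration
    yield ea.next, eb.next
  end
end

s1 = %w{this is the first line}.join("\n")
s2 = %w{this is the second line}.join("\n")

diffs = []
line_no = 0
each_line_pair(s1, s2) do |x, y|
  line_no += 1
  diffs << line_no if x != y
end
p diffs
```

This stops as soon as either string runs out of lines, so a trailing extra line on one side goes unreported.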
Yes, perfect, that's what I was looking for (and implemented :) ).
The trivial solution is of course:
----------------------------------------------------------
s1 = ['this', 'is', 'a', 'test'].join("\n")
s2 = ['this', 'is', 'the', 'test'].join("\n")
(0...s1.size).inject(1) do |l, i|
puts "difference on line: #{l}" or break if s1[i] != s2[i]
l = (s1[i] == 10) ? l+1 : l
end
----------------------------------------------------------
but that's no fun..
Simon
Why not something simple like so:
require 'enumerator'
def compare_by_line( expected, actual, message )
expected_lines = expected.to_enum(:each_line)
actual_lines = actual.to_enum(:each_line)
expected_lines.zip(actual_lines).each_with_index do |pair, index|
assert_equal( *pair, "#{message} on line #{index + 1}")
end
end
compare_by_line( @html_reference, cal.to_html, "Calendar.to_html incorrect" )
There's no need for threads...
Jacob Fugal
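To try the zip approach outside Test::Unit, here is a minimal sketch that collects the mismatches instead of asserting. It assumes Ruby 1.9 or later, where each_line without a block returns an Enumerator. The name line_mismatches is made up:

```ruby
# Hypothetical helper: returns [line_number, expected_line, actual_line]
# for every line that differs. zip pads the shorter side with nil.
def line_mismatches(expected, actual)
  mismatches = []
  expected.each_line.zip(actual.each_line).each_with_index do |(exp, act), index|
    mismatches << [index + 1, exp, act] if exp != act
  end
  mismatches
end

p line_mismatches("a\nb\nc\n", "a\nx\nc\n")
```

Note that zip iterates over the receiver, so extra trailing lines in the actual text are not reported.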
(Sorry, for those in non-tree-threading clients, that was in reply to
ruby-talk:154785)
Jacob Fugal
Because zip converts each argument to an array, which isn't what you
want if there are many lines.
> There's no need for threads...
True, SyncEnumerator does it with continuations, but I wasn't ready for
this mind-bending stuff (yet). (And now I don't need to tie knots in my
brain, because Levin was so nice to tell us about SyncEnumerator.)
But what exactly isn't thread safe? SizedQueue is.
> Jacob Fugal
Threads aren't evil.
cheers
Simon
What's the second to last line for?
l = (s1[i] == 10) ? l+1 : l
That will always return false (since s1[i] is always a string, and
string/integer comparison is always false), so your line count will
always be 1.
Jacob Fugal
> [...]
> What's the second to last line for?
>
> l = (s1[i] == 10) ? l+1 : l
>
> That will always return false (since s1[i] is always a string, and
> string/integer comparison is always false), so your line count will
> always be 1.
s1[i] is in fact never a string:
str[fixnum] => fixnum or nil
http://www.ruby-doc.org/core/classes/String.html#M001328
so it is for counting newline chars. It could be written shorter
(s1[i] == 10) ? l+1 : l
without the assignment.
> Jacob Fugal
cheers
Simon
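Note that this depends on the Ruby version. On 1.8, String#[] with an integer index returns the character code. On 1.9 and later it returns a one-character string, so s1[i] == 10 is never true there. A sketch of the same character walk that avoids the difference, assuming String#getbyte and String#bytesize are available. The name first_difference_line is made up:

```ruby
# Hypothetical helper: line number of the first differing byte, or nil.
# Only walks the length of s1, so extra trailing text in s2 is not detected.
def first_difference_line(s1, s2)
  line = 1
  (0...s1.bytesize).each do |i|
    return line if s1.getbyte(i) != s2.getbyte(i)
    line += 1 if s1.getbyte(i) == 10   # 10 is the byte value of "\n"
  end
  nil
end

s1 = ['this', 'is', 'a', 'test'].join("\n")
s2 = ['this', 'is', 'the', 'test'].join("\n")
puts "difference on line: #{first_difference_line(s1, s2)}"
```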
Ah, yes. I'd probably do it with SyncEnumerator rather than zip as
well, then[1]. I just used zip since it's what you started with and you
didn't specify performance as an issue.
> True, SyncEnumerator does it with continuations, but I wasn't ready for
> this mind-bending stuff (yet). (And now I don't need to tie knots in my
> brain, because Levin was so nice to tell us about SyncEnumerator.)
Very true. Thanks, Levin.
> But what exactly isn't thread safe? SizedQueue is.
Unless SizedQueue is built with threads in mind and does alternating
blocking on pushes (<<) and pops. If it does, I apologize.
Assuming I'm not wasting time by not looking at the SizedQueue
documentation, the problem is that there's no guarantee that the main
thread won't run several pops before returning control to the thread
that's pushing into the queue.
> Threads aren't evil.
No, but naive non-thread-safe code is. Well, it's broken, at least...
not necessarily *evil*. :)
Jacob Fugal
Again, I need to proofread my posts better. That should read "Only if"
instead of "Unless".
Jacob Fugal
Ah, sorry, I missed the fact that you were iterating over characters
instead of lines. 10 == ?\n. Gotcha.
But isn't that different than the original functionality? He
originally wanted a warning (specifically assert_equal as part of
Test::Unit) for each line that differed. Your code will only match the
first such case then break out of the loop.
Jacob Fugal
It is a _sized_ queue which I initialized to be of size 1, so there is
at most 1 item in the queue at any given time. This comes from a lib
called 'thread', so I would assume that it is in fact "built with threads
in mind". It blocks if there is an item in it and you want to write, and
it blocks if there is no such item and you want to read.
cheers
Simon
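The blocking behaviour described above can be checked with a small sketch. This is an illustration only, not code from the thread. The counter pushed and the 0.1 second pause are arbitrary choices made for the demo:

```ruby
# On Ruby 1.8 this needs: require 'thread'
q = SizedQueue.new(1)
pushed = 0                       # how many pushes have completed so far

producer = Thread.new do
  %w{a b c}.each do |item|
    q << item                    # blocks while the queue already holds an item
    pushed += 1
  end
  q << nil                       # nil marks the end, as in the earlier example
end

sleep 0.1                        # give the producer time to run ahead if it could
pushed_before_first_pop = pushed # at most 1: the second push is still blocked

received = []
while (item = q.pop)             # pop blocks while the queue is empty
  received << item
end
producer.join

puts "completed pushes before the first pop: #{pushed_before_first_pop}"
p received
```

Because the queue holds at most one item, the producer cannot get more than one push ahead of the consumer, and the consumer cannot pop an item that has not been pushed yet.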
Hmm, true... on the other hand, assert_equal throws an
AssertionFailedError, so this is kind of breaking out of the loop also.
(But right, it's not the same thing.)
cheers
Simon
Performance would probably be the wrong motivation to switch to
SyncEnumerator. On my machine it is about 100 times slower than
Enumerable#zip.
sync-enum: 1.676
zip-enum: 0.018
sizedqueue: 0.226
-Levin
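The timing script itself was not posted. A rough sketch of how such a comparison could be set up with the standard Benchmark module, assuming Ruby 1.9 or later and using external enumerators in place of SyncEnumerator, so the numbers will not be comparable to the ones above. The sample text and the compare_with_* names are invented:

```ruby
require 'benchmark'

# Invented sample input: 2,000 short lines.
text = (1..2_000).map { |i| "line #{i}" }.join("\n")

# Count differing lines via zip (builds an array of pairs first).
def compare_with_zip(a, b)
  a.each_line.zip(b.each_line).count { |x, y| x != y }
end

# Count differing lines via external enumerators (no intermediate array).
def compare_with_next(a, b)
  ea, eb = a.each_line, b.each_line
  diffs = 0
  loop { diffs += 1 if ea.next != eb.next }
  diffs
end

zip_time  = Benchmark.realtime { compare_with_zip(text, text) }
next_time = Benchmark.realtime { compare_with_next(text, text) }
printf("zip:  %.4f s\nnext: %.4f s\n", zip_time, next_time)
```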
Indeed, I forgot about that as well. Most of my programming has been in
PHP lately (*shudder*) and the testing framework I use continues past
failed assertions, so you can have many failed assertions in one test.
So if you replaced the puts "..." or break if ... with an assertEqual,
you'd pretty much have it right on. :) And since you fail at the first
mismatched character, your prior newlines are sure to have matched up.
Cool beans.
Jacob Fugal
Yeah, but that performance is probably linear (again, I'm shooting
from the hip here). So in the extreme cases, SyncEnumerator would
outperform #zip. And even before it takes the lead in execution speed,
it definitely has an advantage in memory, which I was more concerned
about. SizedQueue however seems to be a much better choice than
SyncEnumerator for those situations.
Jacob Fugal.
Yeah, I should have looked into that before my first post. Again, my
apologies :)
Jacob Fugal
class Array
  # The differences between 2 arrays of equal size.
  # Returns [index, own_element, other_element] for each mismatch.
  def dif( other )
    ( (0...self.size).to_a.zip( self ) -
      (0...self.size).to_a.zip( other ) ).map{|x| x<<other[x[0]] }
  end
end
# dif is defined on Array, so convert both enumerators with to_a first.
@html_reference.to_enum(:each_line).to_a.dif(
    cal.to_html.to_enum(:each_line).to_a ).each{ |x|
  # Line numbering starts with 1.
  x[0] += 1
  print "Incorrect at line "; puts x; puts
}
Thanks for bringing this up.
The variant with each_zip in Enumerable needs only one thread:
                user     system      total        real
sync-enum:  1.594000   0.500000   2.094000 (  2.110000)
zip-enum:   0.016000   0.000000   0.016000 (  0.015000)
sizedqueue: 0.250000   0.000000   0.250000 (  0.250000)
each_zip:   0.125000   0.000000   0.125000 (  0.125000)
but no match against zip-enum in terms of speed.
Still interesting that threads are 10 times faster in this case than
continuations.
cheers
Simon
> So I have a function that generates like 300 lines of text and I
> want to unit test it. If it fails, it barfs a dozen pages of text
> then says there's a difference in there somewhere and we wish you
> the best of luck trying to find it.
Use unit_diff in ZenTest.
http://rubyforge.org/projects/zentest
--
Eric Hodel - drb...@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04