Twice this week, I've gone looking for the Ruby equivalent to a simple Perl module and had trouble finding what I was after. Both times I've peeked inside the source and been surprised at how trivial the operations are. "I could port that in no time," I thought. This quiz is my thinly disguised attempt to pass my homework on to others. :)
Seriously, this quiz is *not* intended to be a lot of work. Don't underestimate the power of a simple library. (See the "Rethinking Memoization" thread where we are trying to improve a very helpful library that is literally 10 lines of code, in one of the forms presented.)
Given all that, this is a build-it-yourself Ruby Quiz. Most of us are familiar with another language. Go into their libraries and find something you like, that is also simple, and port the library to Ruby. (You might want to search the RAA and RubyForge first, just to make sure someone hasn't done similar work already.) If a library is over 200 lines, forget it. This one is for the little guys!
If you'll allow a brief aside here, it can be interesting to consider what the word "port" means. Obviously, the goal of this is to build a library that does the same things for Ruby. Don't think that means you should copy every method verbatim, though. If you don't think a method is needed, leave it out. See a better way to do something, use your way. Most important though, remember to Rubyize the interface. It's fine to port your favorite Java library, but Ruby programmers don't want to call methodsNamedLikeThis(). Watch for chances to use blocks and jump on them *when they lead to a better experience*. Just remember the adage, "If it ain't broke, don't fix it."
A few more details: Please tell us what your library does and show an example of simple usage in your submission email. Be kind to your quiz summarizer. ;) Also, please credit the original library and author who worked so hard to give you something cool to play with!
Now, if you have no idea what to port, here are two suggestions. (Please feel free to post other suggestions to Ruby Talk. These are *not* spoilers!)
File::ReadBackwards
This is a Perl module (by Uri Guttman) for reading a file in reverse, line-by-line. This can often be helpful for things like log files, where the interesting information is usually at the end.
Don't worry about the Perl interface on this one, copy Ruby's File instead. Heck, all I really want is a foreach() iterator. Anything else is extra.
This module is so well commented, you should be able to understand how it works, even if you aren't familiar with Perl. Here's a link straight to the source:
WWW::RobotRules
This is another Perl module (by Gisle Aas) and it is actually over the 200 line limit. Trust me though, it doesn't need to be. :)
The idea here is that many web sites provide a /robots.txt file, telling spider programs which pages they should not visit. This module gives you a way to parse these rules and make queries about what you are allowed to visit. You can learn all about the interface and even the file format of /robots.txt at:
(By my calculations it's gone 48 hours now? Hope so, anyway...)
I did "Ruby Murray" - a port of Johan Lodin's Sub::Curry from Perl. It's not so useful in Ruby I guess but it's fun and pretty flexible. Below is the uncommented version, but since I can't sleep when I have undocumented code I did Rdoc it and made the commented version available at http://roscopeco.co.uk/code/ruby-quiz-entries/64/ .
Ruby Murray is about a hundred lines for the main Curry class, another forty or so for convenience methods and the like, and about 70 lines of tests. It could be smaller but I like (reasonably) readable code and the TDD makes it more verbose I guess...
Here's a couple of quick examples. See the cookbook and tests (either below or in the Rdoc linked above if formatting is broken) for more.
Obviously it's completely different from the Perl original under the hood but I tried to make it familiar enough while making good use of Ruby.
There's a few other ideas I'd like to have tried but I didn't want to get too far into it ;). One advantage this version has over Perl's is that it's easy to make custom Spice argument types (HOLE, BLACKHOLE, etc) so maybe there's some scope for hacking around in there...
=====[CURRY.RB]=====
require 'singleton'

class Curry
  WHITEHOLE = Object.new
  ANTIHOLE = Object.new
  def WHITEHOLE.inspect #:nodoc:
    "<WHITEHOLE>"
  end
  def ANTIHOLE.inspect #:nodoc:
    "<ANTIHOLE>"
  end

  class SpiceArg
    def initialize(name)
      @name = name
    end
    def spice_arg(args_remain)
      raise NoMethodError, "Abstract method"
    end
    def inspect
      "<#{@name}>"
    end
  end

  class HoleArg < SpiceArg #:nodoc: all
    include Singleton
    def initialize; super("HOLE"); end
    def spice_arg(args_remain)
      a = args_remain.shift
      if a == ANTIHOLE
        []
      else
        [a]
      end
    end
  end

  class BlackHoleArg < SpiceArg #:nodoc: all
    include Singleton
    def initialize; super("BLACKHOLE"); end
    def spice_arg(args_remain)
      if idx = args_remain.index(WHITEHOLE)
        args_remain.slice!(0..idx)[0..-2]
      else
        args_remain.slice!(0..args_remain.length)
      end
    end
  end

  class AntiSpiceArg < SpiceArg #:nodoc: all
    include Singleton
    def initialize; super("ANTISPICE"); end
    def spice_arg(args_remain)
      args_remain.shift
      []
    end
  end

  def initialize(*spice, &block)
    block = block || (spice.shift if spice.first.respond_to?(:call))
    raise ArgumentError, "No block supplied" unless block
    @spice, @uncurried = spice, block
  end

  def call(*args, &blk)
    @uncurried.call(*call_spice(args), &blk)
  end

  # This would be an alias, but it's documented along with call and
  # I couldn't :nodoc: an alias - how do we do that ?
  def [](*args) # :nodoc:
    call(*args)
  end

  def new(*spice)
    Curry.new(*merge_spice(spice), &@uncurried)
  end

  def to_proc
    @extern_proc ||= method(:call).to_proc
  end

  private

  def merge_spice(spice)
    largs = spice.dup

    res = @spice.inject([]) do |res, sparg|
      if sparg.is_a?(SpiceArg) && !largs.empty?
        res + sparg.spice_arg(largs)
      else
        res << sparg
      end
    end

    res + largs
  end

  def call_spice(args)
    sp = merge_spice(args)
    sp.map do |a|
      if a.is_a? SpiceArg
        nil
      else
        a
      end
    end
  end
end

# Undocumented alias for Perl familiarity
module Sub #:nodoc: all
  Curry = ::Curry
end

module Curriable
  def curry(*spice)
    Curry.new(self, *spice)
  end
end
  def test_perlish
    s = "str"
    s = Sub::Curry.new(s.method(:+), "ing")
    assert_equal "string", s.call
  end
end

  if ARGV.member?('--doc') || !File.exist?('doc')
    ARGV.reject! { |a| a == '--doc' }
    system("rdoc #{__FILE__} #{'currybook.rdoc' if File.exist?('currybook.rdoc')} --main Curry")
  end
end
> Now, if you have no idea what to port, here are two suggestions. (Please feel
> free to post other suggestions to Ruby Talk. These are *not* spoilers!)
> File::ReadBackwards
See ruby-talk:13185 and the following discussion. I was such a noob then.
> Seriously, this quiz is *not* intended to be a lot of work.
I just know people aren't going to believe me on this, so here's my attempt to put my code where my mouth is. This is my port of File::ReadBackwards. Translating the heart of the algorithm took me well under an hour, though I did spend a bit more time adding interface methods and documentation.
James Edward Gray II
#!/usr/local/bin/ruby -w

# elif.rb
#
# Created by James Edward Gray II on 2006-01-28.
# Copyright 2006 Gray Productions. All rights reserved.

#
# A File-like object for reading lines from a disk file in reverse order. See
# Elif::new and Elif#gets for details. All other methods are just interface
# conveniences.
#
# Based on Perl's File::ReadBackwards module, by Uri Guttman.
#
class Elif
  # The size of the reads we will use to add to the line buffer.
  MAX_READ_SIZE = 1 << 10 # 1024

  # Works just like File::foreach, save that the lines come in reverse order.
  def self.foreach( name, sep_string = $/ )
    open(name) do |file|
      while line = file.gets(sep_string)
        yield line
      end
    end
  end

  # Works just like File::open.
  def self.open( *args )
    file = new(*args)
    if block_given?
      begin
        yield file
      ensure
        file.close
      end
    else
      file
    end
  end

  #
  # Works just like File::readlines, save that the Array will be in
  # reverse order.
  #
  def self.readlines( name, sep_string = $/ )
    open(name) { |file| file.readlines(sep_string) }
  end

  #
  # The first half of the Elif algorithm (to read file lines in reverse order).
  # This creates a new Elif object, shifts the read pointer to the end of the
  # file, and prepares a buffer to hold read lines until they can be returned.
  # This method also sets the <tt>@read_size</tt> to the remainder of File#size
  # and +MAX_READ_SIZE+ for the first read.
  #
  # Technically +args+ are delegated straight to File#new, but you must open the
  # File object for reading for it to work with this algorithm.
  #
  def initialize( *args )
    # Delegate to File::new and move to the end of the file.
    @file = File.new(*args)
    @file.seek(0, IO::SEEK_END)

    # Record where we are.
    @current_pos = @file.pos

    # Get the size of the first read, the dangling bit of the file.
    @read_size = @file.pos % MAX_READ_SIZE
    @read_size = MAX_READ_SIZE if @read_size.zero?

    # A buffer to hold lines read, but not yet returned.
    @line_buffer = Array.new
  end

  #
  # The second half of the Elif algorithm (see Elif::new). This method returns
  # the next line of the File, working from the end to the beginning in reverse
  # line order.
  #
  # It works by moving the file pointer backwards +MAX_READ_SIZE+ at a time,
  # storing seen lines in <tt>@line_buffer</tt>. Once the buffer contains at
  # least two lines (ensuring we have seen one full line) or the file pointer
  # reaches the head of the File, the last line from the buffer is returned.
  # When the buffer is exhausted, this will return +nil+ (from the empty Array).
  #
  def gets( sep_string = $/ )
    #
    # If we have more than one line in the buffer or we have reached the
    # beginning of the file, send the last line in the buffer to the caller.
    # (This may be +nil+, if the buffer has been exhausted.)
    #
    return @line_buffer.pop if @line_buffer.size > 2 or @current_pos.zero?

    #
    # If we made it this far, we need to read more data to try and find the
    # beginning of a line or the beginning of the file. Move the file pointer
    # back a step, to give us new bytes to read.
    #
    @current_pos -= @read_size
    @file.seek(@current_pos, IO::SEEK_SET)

    #
    # Read more bytes and prepend them to the first (likely partial) line in the
    # buffer.
    #
    @line_buffer[0] = "#{@file.read(@read_size)}#{@line_buffer[0]}"
    @read_size = MAX_READ_SIZE # Set a size for the next read.

    #
    # Divide the first line of the buffer based on +sep_string+ and #flatten!
    # those new lines into the buffer.
    #
    @line_buffer[0] = @line_buffer[0].scan(/.*?#{Regexp.escape(sep_string)}|.+/)
    @line_buffer.flatten!

    # We have more data now, so try again to read a line...
    gets(sep_string)
  end

  # Works just like File#each, save that the lines come in reverse order.
  def each( sep_string = $/ )
    while line = gets(sep_string)
      yield line
    end
  end
  alias_method :each_line, :each # Works just like File#each_line.
  include Enumerable # Support all the standard iterators.

  # Works just like File#readline, save that the lines come in reverse order.
  def readline( sep_string = $/ )
    gets(sep_string) || raise(EOFError, "end of file reached")
  end

  #
  # Works just like File#readlines, save that the Array will be in
  # reverse order.
  #
  def readlines( sep_string = $/ )
    lines = Array.new
    while line = gets(sep_string)
      lines << line
    end
    lines
  end

  # Works just like File#close.
  def close
    @file.close
  end
end
There aren't any particular libraries I've used anytime recently... all my work is in-house code. But I took a quick glance over CPAN for something relatively small and simple... the latter because I stopped coding in Perl years ago and don't remember all the syntax very well, especially the stuff that has been added for objects.
In any case, I found a simple library called Trampoline by Steven Lembark which allows you to create an object but delay actual construction... which is useful to have something with expensive construction cost ready to go but not actually constructed until used.
Below I provide a really basic implementation that is probably not rock-solid and could probably be done better ... I'm still such a n00b, especially when it comes to metaclasses (or eigenclasses, or whatever they want to be called this week). It also doesn't do everything the Perl lib did, just what I found useful and could understand.
Anyway, here's the code (trampoline.rb), with a couple of use examples at the bottom.
module Trampoline
  # Instance methods
  class Bounce
    def initialize(cons, klass, *args)
      @klass, @cons, @args = klass, cons, args
    end

    def method_missing(method, *args)
      @obj = @klass.send(@cons, *@args) unless @obj
      @obj.send(method, *args)
    end
  end

  # Class methods
  class << Bounce
    alias_method :old_new, :new

    def new(*args)
      old_new(:new, *args)
    end

    def method_missing(method, *args)
      old_new(method, *args)
    end
  end
end
And now, example use. Obviously, this class is not in need of delayed construction; just using it as an example.
require 'trampoline'

class Logger
  def initialize(prefix)
    puts 'Constructing Logger...'
    @prefix = prefix
  end

  def Logger.make(prefix)
    Logger.new(prefix)
  end

  def log(msg)
    puts "#{@prefix}: #{msg}"
  end
end

puts "start"
errors = Trampoline::Bounce.new(Logger, 'ERROR')
puts "made bouncer, about to log message"
errors.log('Hello, world!')
puts "about to log second message"
errors.log('Goodbye, world!')
puts "message logged"

# This is really the same, but eventually calls Logger.make to construct.
puts "start"
warns = Trampoline::Bounce.make(Logger, 'WARNING')
puts "made bouncer, about to log message"
warns.log('Hello, world!')
puts "about to log second message"
warns.log('Goodbye, world!')
puts "message logged"
Output from the example code:
start
made bouncer, about to log message
Constructing Logger...
ERROR: Hello, world!
about to log second message
ERROR: Goodbye, world!
message logged
start
made bouncer, about to log message
Constructing Logger...
WARNING: Hello, world!
about to log second message
WARNING: Goodbye, world!
message logged
This is my first rubyquiz, and I am still learning Ruby. I decided to go with something simple but fun. So I did a search on the CPAN (I've used Perl before) for the Acme modules, and chose Acme::Bleach (http://search.cpan.org/~dconway/Acme-Bleach-1.12/lib/Acme/Bleach.pm) to implement. I couldn't find a Ruby version on either RAA or rubyforge.
Acme::Bleach is a module by Damian Conway, and it literally bleaches your program, whilst still leaving it in a runnable state. It's a really cool little module, and I stuck as close to the original as possible - even using nearly the same method names. Here it is in its entirety. Any suggestions, criticisms, etc. are highly welcome.
#Ruby port of Acme::Bleach - by Amran Gaye
#You can use this by doing a "require 'Bleach'" at the top of your program

$tie="\t"*8

def whiten(laundry)
  #Change laundry to binary 1s and 0s...
  #then change those to tab and space characters. Finally add newlines after every 9th character
  result = laundry.unpack('b*').first.tr('01'," \t").gsub(/(.{9})/,"\\1\n")
  return $tie + result #Add a tie to the washed shirt, and return it
end

def brighten(laundry)
  #Does the opposite of whiten
  laundry.sub!(/\t{8}/,'') #Remove tie
  laundry.tr!("\n",'') #Remove newlines
  [laundry.tr(" \t",'01')].pack('b*') #Change spaces and tabs to 0s and 1s, then repack them as binary
end

def dirty?(laundry)
  #Laundry is dirty only if it contains non-space characters
  laundry =~ /\S/
end

def proper?(laundry)
  #shirt is proper if it contains a tie
  laundry =~ /^#$tie/
end

shirt = IO.readlines($0).join #Read in current program
shirt.sub!("require 'Bleach'",'') #Remove require line

if(not dirty?(shirt) and proper?(shirt))
  eval brighten(shirt)
else
  file = File.new($0,"w")
  file.puts("require 'Bleach'")
  file.puts(whiten(shirt))
  file.close
end
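The encoding step can be tried on its own, without rewriting any file. The following is only a sketch of the same bits-to-whitespace idea, written for a current Ruby (it assumes String#unpack1, which is not in the original submission); the constant TIE and the sample string are just for illustration.

```ruby
# Sketch: round-trip a string through the "bleached" whitespace encoding.
TIE = "\t" * 8   # marker placed at the start of bleached text

def whiten(text)
  bits = text.unpack1('b*')                       # "0110..." bit string
  TIE + bits.tr('01', " \t").gsub(/(.{9})/, "\\1\n")
end

def brighten(laundry)
  bits = laundry.sub(/\A\t{8}/, '')               # drop the tie
                .delete("\n")                     # drop the line breaks
                .tr(" \t", '01')                  # back to a bit string
  [bits].pack('b*')                               # binary string with the original bytes
end

source   = "puts 'Hello, world!'"
bleached = whiten(source)
restored = brighten(bleached)
```

For plain ASCII input, restored compares equal to source, and bleached contains nothing but spaces, tabs and newlines.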
Soon after I posted this, I saw Rubyquiz #34 (Whiteout) and - much to my chagrin - it was the same problem! :( Seems I arrived too late to the party. Still, I hope someone found it interesting.
On Jan 29, 2006, at 10:23 AM, James Edward Gray II wrote:
> This is my port of File::ReadBackwards.
I just noticed that everyone provided sample usage (just as I asked them to), but me! Egad. Here's Elif at work:
$ cat sample_data.txt
This is line one.
This is line two.
This is line three.
..
$ ruby -r elif -e 'puts Elif.readlines(ARGV.first)' sample_data.txt
..
This is line three.
This is line two.
This is line one.
$ ruby -r elif -e 'Elif.foreach(ARGV.first) { |line| puts line if line =~ /t[a-z]+.$/ }' sample_data.txt
This is line three.
This is line two.
rules = robots_data.split(/[\015\012]+/).
        map { |rule| rule.sub(/\s*#.*$/, "") }
anon_rules = Array.new
my_rules = Array.new
current = anon_rules
rules.each do |rule|
  case rule
  when /^\s*User-Agent\s*:\s*(.+?)\s*$/i
    break unless my_rules.empty?

    current = if $1 == "*"
      anon_rules
    elsif $1.downcase.index(@user_agent)
      my_rules
    else
      nil
    end
  when /^\s*Disallow\s*:\s*(.*?)\s*$/i
    next if current.nil?

    if $1.empty?
      current << nil
    else
      disallow = URI.parse($1)

      next unless disallow.scheme.nil? or disallow.scheme == uri.scheme
      next unless disallow.port.nil? or disallow.port == uri.port
      next unless disallow.host.nil? or disallow.host.downcase == uri.host.downcase
I went browsing in CPAN to find something interesting, and came up with Algorithm::Merge. I don't use the perl version, but 3 way merging is something I do often since we allow concurrent access with our source control at work.
Merge.rb is a fairly straight port of the perl version. I did change a callback to a block, and added some symbols in place of numeric constants. I need to add better documentation, but I wanted to get this in before it was too late for the summary.
Usage:
original= "Ok,\n this is a test sentence\n which will be edited."
edited ="Ok,\n this is a sample phrase\n which has been edited."
change="Hello World,\n this is a test phrase\n which I edited."

yields:
Split by lines
--------------------
["r", "Ok,", "Ok,", "Hello World,"]
["c", " this is a test sentence", " this is a sample phrase", " this is a test phrase"]
["c", " which will be edited.", " which has been edited.", " which I edited"]
Hello World,
<!-- ------ START CONFLICT ------ -->
 this is a sample phrase
 which has been edited.
<!-- ---------------------------- -->
 this is a test phrase
 which I edited.
<!-- ------ END CONFLICT ------ -->}

Split by words
--------------------
["r", "Ok,", "Ok,", "Hello"]
["r", nil, nil, "World,"]
["u", "this", "this", "this"]
["u", "is", "is", "is"]
["u", "a", "a", "a"]
["l", "test", "sample", "test"]
["o", "sentence", "phrase", "phrase"]
["u", "which", "which", "which"]
["c", "will", "has", "I"]
["c", "be", "been", nil]
["u", "edited.", "edited.", "edited."]
Hello World, this is a sample phrase which << has been | I >> edited.
Bugs: - Merge::diff3(original,edited, change) - does a character-based diff, but returns inconsistent results (lines like [u, e, s, e]). I think this is because the callback_map has some no-ops where it should have valid callbacks, but it could be due to a porting error. I am still struggling to completely grok the use of the callback_map, with hopes of simplifying/clarifying it.
Question: Can I add to or replace the Perl license with the ruby one?
Source:
---- Merge.rb -----
module Merge

  # Module Merge
  # Three-way merge and diff
  #
  # based on perl's Algorithm::Merge
  # by James G. Smith, <jsm...@cpan.org>
  # Copyright (C) 2003 Texas A&M University. All Rights Reserved.
  # This module is free software; you may redistribute it and/or
  # modify it under the same terms as Perl itself.
  # ported to Ruby
  # by Adam Shelly <adam.she...@gmail.com>

  require 'diff/lcs'

  # Given references to three lists of items, diff3 performs a
  # three-way difference.
  # This function returns an array of operations describing how the
  # left and right lists differ from the original list. In scalar
  # context, this function returns a reference to such an array.
  #
  # Given the following three lists,
  #   original: a b c e f h i k
  #   left:     a b d e f g i j k
  #   right:    a b c d e h i j k
  #
  #   merge:    a b d e g i j k
  #
  # we have the following result from diff3:
  #
  #   [ 'u', 'a', 'a', 'a' ],
  #   [ 'u', 'b', 'b', 'b' ],
  #   [ 'l', 'c', undef, 'c' ],
  #   [ 'o', undef, 'd', 'd' ],
  #   [ 'u', 'e', 'e', 'e' ],
  #   [ 'r', 'f', 'f', undef ],
  #   [ 'o', 'h', 'g', 'h' ],
  #   [ 'u', 'i', 'i', 'i' ],
  #   [ 'o', undef, 'j', 'j' ],
  #   [ 'u', 'k', 'k', 'k' ]
  #
  # The first element in each row is the array with the difference:
  #   c - conflict (no two are the same)
  #   l - left is different
  #   o - original is different
  #   r - right is different
  #   u - unchanged
  # The next three elements are the lists from the original, left,
  # and right arrays respectively that the row refers to (in the synopsis,
  #

  def Merge::diff3( pivot, doc_a, doc_b)
    ret = []

    no_change = proc do |args|
      ret << ['u', pivot[args[0]], doc_a[args[1]], doc_b[args[2]] ]
    end

    conflict = proc do |args|
      p= pivot[args[0]] if args[0]
      a= doc_a[args[1]] if args[1]
      b= doc_b[args[2]] if args[2]
      ret << ['c', p, a, b]
    end

    diff_a = proc do |args|
      case args.size
      when 1
        ret << ['o',pivot[args[0]], nil, nil]
      when 2
        ret << ['o',nil, doc_a[args[0]], doc_b[args[1]]]
      when 3
        ret << ['o', pivot[args[0]], doc_a[args[1]], doc_b[args[2]]]
      end
    end

    diff_b = proc do |args|
      case args.size
      when 1
        ret << ['l', nil, doc_a[args[0]], nil]
      when 2
        ret << ['l', pivot[args[0]], nil, doc_b[args[1]]]
      when 3
        ret << ['l', pivot[args[0]], doc_a[args[1]], doc_b[args[2]]]
      end
    end

    diff_c = proc do |args|
      case args.size
      when 1
        ret << ['r', nil, nil, doc_b[args[0]]]
      when 2
        ret << ['r', pivot[args[0]], doc_a[args[1]], nil]
      when 3
        ret << ['r', pivot[args[0]], doc_a[args[1]], doc_b[args[2]]]
      end
    end

    traverse_sequences3(pivot, doc_a, doc_b,
                        {:NO_CHANGE=>no_change, :CONFLICT=>conflict,
                         :A_DIFF=> diff_a, :B_DIFF=>diff_b, :C_DIFF=>diff_c} )
    return ret
  end

  #callbacks for Diff::LCS
  class LCS_Traverse_Callbacks
    def initialize diffs
      @diffs = diffs
    end
    def [] l,r
      @diffs[@left=l]=[]
      @diffs[@right=r]=[]
      self
    end
    def match *args
    end
    def discard_a event
      @diffs[@left]<<event.old_position
    end
    def discard_b event
      @diffs[@right]<<event.new_position
    end
  end

  # constants for traverse_sequences
  D=nil
  AB_A=32
  AB_B=16
  AC_A=8
  AC_C=4
  BC_B=2
  BC_C=1
  CB_B=5 #not used in calculations
  CB_C=3 #not used in calculations
  @base_doc = {AB_A=>:A,AB_B=>:B,AC_A=>:A,AC_C=>:C,BC_B=>:B,BC_C=>:C}
  # callbacks#match::     Called when +a+ and +b+ are pointing
  #                       to common elements in +:A+ and +:B+.
  # callbacks#discard_a:: Called when +a+ is pointing to an
  #                       element not in +:B+.
  # callbacks#discard_b:: Called when +b+ is pointing to an
  #                       element not in +:A+.
  # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
  # and <tt>callbacks#discard_b</tt> are invoked with an event comprising
  # the action ("=", "+", or "-", respectively), the indices +ii+ and
  # +jj+, and the elements <tt>:A[ii]</tt> and <tt>:B[jj]</tt>. Return
  # values are discarded by #traverse_sequences.

    if (bc_different_len)
      Diff::LCS::traverse_sequences(cdoc, bdoc, ts_callbacks[CB_C,CB_B])
      Diff::LCS::traverse_sequences(bdoc, cdoc, ts_callbacks[BC_B,BC_C])

      if diffs[CB_B] != diffs[BC_B] || diffs[CB_C] != diffs[BC_C]
        puts "Diff::diff is not symmetric for second and third sequences - results might not be correct";

        #trim to equal lengths and try again
        b_len, c_len = bdoc.size, cdoc.size
        bdoc_save = bdoc.slice!(target_len..-1)
        cdoc_save = cdoc.slice!(target_len..-1)
        Diff::LCS::traverse_sequences(bdoc, cdoc, ts_callbacks[BC_B,BC_C])

        #mark the trimmed part as different and then restore
        diffs[BC_B] += (target_len..b_len).to_a if target_len < b_len
        diffs[BC_C] += (target_len..c_len).to_a if target_len < c_len
        bdoc.concat bdoc_save
        cdoc.concat cdoc_save
      end

    else # not bc_different_len
      Diff::LCS::traverse_sequences(bdoc, cdoc, ts_callbacks[BC_B,BC_C])
    end
    pos = {:A=>0,:B=>0,:C=>0}
    sizes ={:A=>adoc.size, :B=>bdoc.size, :C=>cdoc.size}
    matches=[]
    noop = proc {}

    # Callback_Map is indexed by the sum of AB_A, AB_B, ..., as indicated by @matches
    # this isn't the most efficient, but it's a bit easier to maintain and
    # read than if it were broken up into separate arrays
    # half the entries are not noop - it would seem then that no
    # entries should be noop. I need patterns to figure out what the
    # other entries are.
The great thing about porting a library is that you get to examine another programmer's ideas. If you're lucky, that may teach you a new trick or two. I'll use my experience as an example.
Using Buffers
When I decided to port File::ReadBackwards, the first question I asked myself was, how do you read a file backwards? I decided that you would need to put the read head at the end of the file, then work it backwards bit by bit. You can't return a line until you have the whole thing, so I knew I would need to buffer the reads. I guessed I would be sure I had a line when I had seen two line endings (whatever is between them is a line) or run into the beginning of the file (no more data). That actually turns out to be the rough process, but, luckily for me, Uri Guttman is smarter than me and reading Uri's code taught me a couple of tricks.
Here's a simplified version of my port, showing only the interesting methods:
# Based on Perl's File::ReadBackwards module, by Uri Guttman.
class Elif
  MAX_READ_SIZE = 1 << 10 # 1024
You can see the first trick Uri taught me in initialize(). Working the pointer backwards can be messy. You have to keep track of where you are, shift the read pointer back, read some, but always make sure you have that many bytes left. There's an easier way.
If you pick some number of bytes you are going to read, you can consider the file to be in chunks that size. For example, if we are going to read ten bytes at a time and have a twenty-four byte file, we can deal with it in chunks of ten, ten, and four. The only odd chunk size is at the end, where we start, so we can deal with that immediately and then all future reads can be whatever chunk size we selected.
That's what initialize() is doing: Open the file, jump to the end, and set that initial read size to handle the dangling partial chunk.
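To make the chunk arithmetic concrete, here is a small sketch that lists the read sizes a backwards reader would use, starting from the end of the file. The helper name backward_read_sizes is made up for this illustration and is not part of Elif; it only mirrors the modulo trick described above.

```ruby
# Hypothetical helper: the sizes of the reads, in the order they happen
# (last chunk of the file first).
def backward_read_sizes(file_size, chunk_size)
  return [] if file_size.zero?

  first = file_size % chunk_size        # the dangling partial chunk
  first = chunk_size if first.zero?     # file is an exact multiple of the chunk
  [first] + [chunk_size] * ((file_size - first) / chunk_size)
end

sizes = backward_read_sizes(24, 10)     # => [4, 10, 10]
```

Only the first read can be an odd size; every read after it is a full chunk, so the reader never has to check how many bytes are left.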
Now we can tackle gets() and I'll tell you about the other lesson Uri taught me. First, the function of gets() is pretty basic: If we know we have a line or that we are out of lines, return it or nil. Otherwise, read some more data, trying to make one of those two exit conditions true, and recurse. The only hard part is deciding when we have a line.
I was dreaming up a complicated solution to this, when reading Uri's code showed me the light. You can store the data in the buffer exactly like it is in the file (or whatever we've read of it so far). We will break it at lines though, because that's what we're interested in reading. At any given time, it's very likely the buffer holds a partial line (because we haven't seen the rest). However, if we have at least two lines buffered, we can return one immediately. One of them is likely partial, sure, but the last one, the one the user wants, must be full now. This is easy to code, as you can see above.
In gets(), I prepend each read to the first (likely partial) line. Then I use scan() to find all the lines in there, creating an Array, then flatten!() to fold those lines back into the buffer, discarding the extra Array.
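The prepend, scan() and flatten!() steps can be watched in isolation. This is a sketch with made-up sample data and a hypothetical helper name (prepend_chunk); the regular expression is the one used in Elif#gets.

```ruby
# Hypothetical helper: fold a freshly read chunk into the front of the buffer.
def prepend_chunk(line_buffer, chunk, sep = "\n")
  # Glue the new bytes onto the first (likely partial) line...
  line_buffer[0] = "#{chunk}#{line_buffer[0]}"
  # ...split that string into lines (this leaves a nested Array in slot 0)...
  line_buffer[0] = line_buffer[0].scan(/.*?#{Regexp.escape(sep)}|.+/)
  # ...and fold the nested Array back into the buffer.
  line_buffer.flatten!
  line_buffer
end

buffer = ["e two.\n", "This is line three.\n"]   # first entry is a partial line
chunk  = "This is line one.\nThis is lin"        # bytes read from earlier in the file

prepend_chunk(buffer, chunk)
# buffer is now:
# ["This is line one.\n", "This is line two.\n", "This is line three.\n"]
```

The buffer always stays in file order, so pop() hands back the line closest to the end of the file.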
Thanks for the lessons Uri!
You really pick these insights up from reading the code of others, which is why I think reading code is important. This is one of the big perks of running the Ruby Quiz. I get to read a lot of code and the submissions are always teaching me things. This week was no different, so now let me tell you what I learned from others.
Lazy Evaluation
This next library came at just the right time for me. I'm reading Higher-Order Perl, by Mark Jason Dominus, and trying to apply the ideas I am learning there to my Ruby usage. One of the big concepts in the book is "lazy" code, which is just a fancy way of saying, I want to run this... later.
There are a lot of advantages to something like this. If an operation is expensive in computational terms, we can assure that it doesn't happen until it is needed. The advantage to that is that it may not be needed at all, which keeps us from wasting time.
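As a tiny illustration of "run this... later", here is a sketch of a value that is computed on first use and then remembered. The class name LazyValue is invented for this example and does not come from any of the submissions.

```ruby
# Hypothetical class: wraps a block and runs it only when the value is needed.
class LazyValue
  def initialize(&computation)
    @computation = computation
    @computed    = false
  end

  def value
    unless @computed
      @value    = @computation.call   # the expensive work happens here, once
      @computed = true
    end
    @value
  end
end

calls  = 0
answer = LazyValue.new { calls += 1; 6 * 7 }

calls_before = calls          # 0, nothing has run yet
first        = answer.value   # 42, the block runs now
second       = answer.value   # 42, served from the cached result
```

If value() is never called, the block never runs at all, which is exactly the saving described above.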
Another less-used example is that we can delay evaluation until we have more information. The standard PStore library is a great example here. You pass it the path to the cache file in initialize(), but it waits until transaction() to actually open the file. The reason is that you can start a "read-only" transaction, or a normal transaction that allows reading and writing. By waiting to open the file, PStore knows the right mode to use, the right level of file locking to apply, etc.
The tricky part to lazy evaluation, for me at least, is just getting your head around it. When you decide you're ready though, here is an excellent first step:
module Trampoline
  # Instance methods
  class Bounce
    def initialize(cons, klass, *args)
      @klass, @cons, @args = klass, cons, args
    end

    def method_missing(method, *args)
      @obj = @klass.send(@cons, *@args) unless @obj
      @obj.send(method, *args)
    end
  end

  # Class methods
  class << Bounce
    alias_method :old_new, :new

    def new(*args)
      old_new(:new, *args)
    end

    def method_missing(method, *args)
      old_new(method, *args)
    end
  end
end
That's Matthew Moss's port of the Perl Trampoline module, by Steven Lembark. Let's break down what this does.
First, the instance methods. It seems that initialize() doesn't do anything except store some information. We will find out what for in method_missing().
Remember that method_missing() will be called for any message we haven't defined. In this case, that's pretty much everything. (More on that in a minute...) When called, method_missing() makes sure @obj is defined. If it's not, it is created by calling the proper cons(tructor) with the args initialize() tucked away. Then the message is just forwarded to @obj. That means the object is built just before the first method call, then reused to handle all future method calls.
The only problem with the above strategy is that Bounce includes some default Ruby methods inherited from Object. This means that something like a to_s() call isn't forwarded. You can fix this by adding something like the following to Bounce:
instance_methods.each { |m| undef_method m unless m =~ /^__/ }
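On a current Ruby there is another way to get a nearly empty proxy: inherit from BasicObject, which defines almost no methods of its own. The following is a sketch under that assumption (BasicObject arrived in Ruby 1.9, after this quiz ran); LazyProxy is a hypothetical name and not part of the Trampoline port.

```ruby
# Hypothetical lazy proxy: builds the real object on the first message.
class LazyProxy < BasicObject
  def initialize(&builder)
    @builder = builder
  end

  def method_missing(name, *args, &block)
    # Note: ||= would rebuild if the builder returned nil or false.
    @target ||= @builder.call
    @target.__send__(name, *args, &block)
  end
end

built = 0
proxy = LazyProxy.new { built += 1; [3, 1, 2] }

built_before = built       # 0, the Array has not been constructed
text         = proxy.to_s  # forwarded, because BasicObject has no to_s
sorted       = proxy.sort  # forwarded to the same Array
```

Here to_s() reaches the wrapped Array instead of being answered by the proxy itself, which is the behavior the undef_method line above is after.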
Now let's look at the class methods. You can see that new() is moved and redefined, to change its interface for calling code. Now we see another method_missing(), this time for the class itself. It works just like the redefined new(), forwarding the message to the constructor. Remember, not all objects are constructed with new(). Singleton objects often use instance(), for example. This method allows for that. Whatever is called will later be used to build the object.
That's a great introduction to lazy evaluation. When that sinks in a bit and you're ready for more, see the excellent lazy.rb library by MenTaLguY:
Another interesting technique discussed in Higher-Order Perl was also represented in the solutions to this quiz. Ross Bamford ported Perl's Sub::Curry module by Johan Lodin. Currying is basically the process of using functions to manufacture other functions, as seen in these simple examples:
require "curry"
scale = lambda { |size, object| object * size }
puts "3 scaled to a size of 10 is #{scale[10, 3]}." # 30
puts

puts "4 doubled is #{double[4]}." # 8
puts "1 tripled is #{triple[1]}." # 3
puts "Half of 10 is #{halve[10]}." # 5.0
The great thing about this library is that it can handle much more complicated argument setups than this. For example, what if the arguments to scale had been reversed? No problem:
scale = lambda { |object, size| object * size }
puts "3 scaled to a size of 10 is #{scale[10, 3]}." # 30
puts

# we can leave "holes" in the argument list
double = scale.curry(Curry::HOLE, 2)
triple = scale.curry(Curry::HOLE, 3)
halve = scale.curry(Curry::HOLE, 0.5)

puts "4 doubled is #{double[4]}." # 8
puts "1 tripled is #{triple[1]}." # 3
puts "Half of 10 is #{halve[10]}." # 5.0
That works exactly the same, but if you're not impressed yet we can get even fancier. The library already supports a bunch of "spices", like HOLE above, but you can also add your own:
log_now["First Message."] # => [12:47:53 PM 02/01/06] First Message.
sleep 3
log_now["Second Message."] # => [12:47:56 PM 02/01/06] Second Message.
Notice how the LazySpice isn't evaluated until the time of the call. That lazy execution makes sure our message is stamped with the time it was actually logged.
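The call-time evaluation can be imitated in a few lines. In this sketch LazyValue and partial are hypothetical names standing in for the spice and the currying step, and a counter replaces the clock so the output is predictable:

```ruby
# Hypothetical stand-in for a lazy spice: wraps a block to run later.
class LazyValue
  def initialize(&block)
    @block = block
  end

  def value
    @block.call
  end
end

# Hypothetical helper: fixes the leading arguments of fn, evaluating
# any LazyValue each time the resulting lambda is called.
def partial(fn, *spices)
  lambda do |*args|
    fixed = spices.map { |s| s.is_a?(LazyValue) ? s.value : s }
    fn.call(*fixed, *args)
  end
end

counter = 0
log = lambda { |stamp, message| "[#{stamp}] #{message}" }
log_next = partial(log, LazyValue.new { counter += 1 })

puts log_next["First Message."]    # [1] First Message.
puts log_next["Second Message."]   # [2] Second Message.
```

Had the block been run when log_next was built, both messages would carry the same stamp.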
I'm not going to show the library here, since I've gone on long enough, but
On Thu, 2006-02-02 at 22:37 +0900, Ruby Quiz wrote:
> The library already supports a bunch of "spices", like HOLE above, but
> you can also add your own:
> Notice how the LazySpice isn't evaluated until the time of the call. That lazy
> execution makes sure our message is stamped with the time it was actually
> logged.
This is very cool, thanks for the great write-up. However, there is a small bug (in curry.rb, not this code) that can cause problems with this, e.g.:
a = [1,3,5]
b = [2,4,6]
l = lambda do |ary,aa,ba|
  ary + [aa,ba]
end.curry(Curry::HOLE,Curry::HOLE,TestSpice.new { b.shift })
This only shows up when special spices are at the end of the argument list, and happens because there are no args_remain left by the time they're seen, and I took a short-cut in the implementation. The attached patch (against the documented version from http://roscopeco.co.uk/code/ruby-quiz-entries/64/curry.rb) fixes things. There may be a slight impact on performance, though (if that matters).
I'm considering maybe packaging Ruby Murray up as a gem and releasing it on RubyForge. I'll add James' LazySpice, and would like any suggestions others may have (esp if anyone has something we could legitimately call "SportySpice" ;D)
Thanks again for the quiz. Cool entries everyone :)
On Fri, 2006-02-03 at 00:03 +0900, Ross Bamford wrote:
> l = lambda do |ary,aa,ba|
>   ary + [aa,ba]
> end.curry(Curry::HOLE,Curry::HOLE,TestSpice.new { b.shift })
(Oops, TestSpice is a simplified LazySpice I used in the tests. Substitute LazySpice there.)
In article <20060202133728.KIME613.centrmmtao03.cox....@localhost.localdomain>, Ruby Quiz <ja...@grayproductions.net> wrote:
> Wrap Up
> Please take some time to look into the other solutions I didn't cover here. All
> of them were interesting ideas and I hope their authors will consider packaging
> them up for all to use.
> Myself, and others, were worried that this would not be a popular quiz. It far
> exceeded my expectations though and I owe a big thank you to all who made that
> happen! You are all so clever it makes even me look good.
It would be cool to have a monthly Port-a-library exercise that might run in parallel to the Ruby Quizzes. Lots of libraries are too large to port in a weekend. It might also be interesting to have a 'hit-list' of libraries to port each month based on input from ruby-talk and other sources.
> It would be cool to have a monthly Port-a-library exercise that might run in
> parallel to the Ruby Quizzes. Lots of libraries are too large to port in a
> weekend.
Interesting thought. If you could divide up the bigger libraries, maybe the whole group could help do one at a time.
> it might also be interesting to have a 'hit-list' of libraries to port
> each month based on input from ruby-talk and other sources.
Ooo, I really like that idea.
Sounds like a project you just have to start, to me. :)
Ruby Quiz <ja...@grayproductions.net> wrote:
> The great thing about porting a library is that you get to examine another
> programmer's ideas. If you're lucky, that may teach you a new trick or two.
> I'll use my experience as an example.
I also love the way that code seems to vanish when you port it to ruby :)
> -----Original Message-----
> From: James Edward Gray II [mailto:ja...@grayproductions.net]
> Sent: Thursday, February 02, 2006 6:16 PM
> To: ruby-talk ML
> Subject: Re: [SUMMARY] Port a Library (#64)
> > It would be cool to have a monthly Port-a-library exercise that might run in
> > parallel to the Ruby Quizzes. Lots of libraries are too large to port in a
> > weekend.
> Interesting thought. If you could divide up the bigger libraries,
> maybe the whole group could help do one at a time.
> > it might also be interesting to have a 'hit-list' of libraries to port
> > each month based on input from ruby-talk and other sources.
> Ooo, I really like that idea.
> Sounds like a project you just have to start, to me. :)