I've got a routine that works fine at building an array of upper-case strings extracted from a string:
  aNewList = []
  s = StringScanner.new sNewList
  upper = /[A-Z]+/
  not_upper= Regexp.new( upper.source.sub( /\[/, '[^' ) )
  while not s.eos?
    case
    when s.skip(upper); aNewList << s.matched
    else s.skip(not_upper)
    end
  end
But the not_upper Regexp definition is really a kludge. It somewhat camouflages what is really /[^A-Z]+/
I'd like to DRY it by expressing it as something like !upper. I need something like !~ we use normally with string searches.
<RichardDummyMailbox58...@uscomputergurus.com> wrote:
> upper = /[A-Z]+/
> not_upper= Regexp.new( upper.source.sub( /\[/, '[^' ) )
[snip]
> But the not_upper Regexp definition is really a kludge. It somewhat camouflages what is really /[^A-Z]+/
> I'd like to DRY it by expressing it as something like !upper. I need something like !~ we use normally with string searches.
This is somewhat better, but still not real obvious:
  not_upper=/(?:.(?!#{upper}))+/ #untested, tho
Myself, I'd just write not_upper=/[^A-Z]/.... for something this short, is it really worth trying all that hard to be DRY?
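For what it's worth, the lookahead idea can be checked quickly. The sketch below is not the untested pattern from the post: it assumes the intent was to put the negative lookahead in front of the dot, so that each consumed character sits at a position where `upper` does not match. The /m flag and the `upper_runs` helper name are additions made here for illustration.

```ruby
require 'strscan'

upper = /[A-Z]+/
# Complement derived from `upper` itself: consume any character at which
# `upper` would NOT match. (?!...) is a negative lookahead; /m lets "."
# match newlines as well, so the loop below cannot stall on a line break.
not_upper = /(?:(?!#{upper}).)+/m

# Hypothetical helper: collect every run matched by `upper`.
def upper_runs(text, upper, not_upper)
  found = []
  s = StringScanner.new(text)
  until s.eos?
    if s.skip(upper)
      found << s.matched
    else
      s.skip(not_upper)
    end
  end
  found
end

symbols = upper_runs("TMxxx CSCO\nCOL", upper, not_upper)
p symbols
```

Whether this reads any better than a literal /[^A-Z]+/ is a matter of taste; it only pays off if `upper` is likely to change.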
<RichardDummyMailbox58...@uscomputergurus.com> wrote:
> I've got a routine that works fine at building an array of upper-case strings extracted from a string:
> aNewList = []
> s = StringScanner.new sNewList
> upper = /[A-Z]+/
> not_upper= Regexp.new( upper.source.sub( /\[/, '[^' ) )
> while not s.eos?
>   case
>   when s.skip(upper); aNewList << s.matched
>   else s.skip(not_upper)
>   end
> end
OTOH, you can rewrite it like this, and not have to even mention the complement of the match you're interested in:
  aNewList = []
  s = StringScanner.new sNewList
  upper = /[A-Z]+/
  aNewList<< s.matched while s.skip_until(upper)
(Not tested real thoroughly, corner cases may break.)
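A runnable version of the same loop, as a quick check. The sample string and the snake_case names are made up here; the StringScanner calls are the ones used above.

```ruby
require 'strscan'

s_new_list = "TMxxx CSCO COL, intc"   # made-up sample input
s = StringScanner.new(s_new_list)
upper = /[A-Z]+/

a_new_list = []
# skip_until returns nil once no further match exists, which ends the loop.
# Trailing text that never matches (", intc") is simply left unconsumed.
a_new_list << s.matched while s.skip_until(upper)
p a_new_list
```

One corner case worth knowing: the scanner is not at end-of-string when the loop finishes, so code that later relies on `s.eos?` would need to account for that.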
On Nov 12, 7:37 pm, Caleb Clausen <vikk...@gmail.com> wrote:
> On 11/12/09, James Edward Gray II <ja...@graysoftinc.com> wrote:
> > Well, you don't really need a StringScanner for this simple task. Your code really just rebuilds String#scan():
> > a_new_list = s_new_list.scan(/[A-Z]+/)
> ooh! that's even better.
You're right. I didn't NEED to DRY that simple thing. I'm just trying to improve my coding generally, especially to write things that don't break easily when the inevitable changes are made.
But cutting out 90% of the code, wow! That's DRY!!
Thank you very much for your ideas. I haven't tested it yet, but it looks right to me.
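A quick way to test the one-liner, using a made-up sample string. The second half illustrates a known String#scan behaviour that matters if the pattern later grows capture groups.

```ruby
s_new_list = "TMxxx CSCO COL INTC"   # made-up sample input
a_new_list = s_new_list.scan(/[A-Z]+/)
p a_new_list

# Caveat: once the pattern contains capture groups, scan returns the
# captured groups for each match instead of the whole matched text.
pairs = "AB CD".scan(/([A-Z])([A-Z])/)
p pairs
```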
> On Nov 12, 2009, at 5:00 PM, RichardOnRails wrote:
> > I've got a routine that works fine at building an array of upper-case strings extracted from a string:
> > aNewList = []
> > s = StringScanner.new sNewList
> > upper = /[A-Z]+/
> > not_upper= Regexp.new( upper.source.sub( /\[/, '[^' ) )
> > while not s.eos?
> >   case
> >   when s.skip(upper); aNewList << s.matched
> >   else s.skip(not_upper)
> >   end
> > end
> > But the not_upper Regexp definition is really a kludge. It somewhat camouflages what is really /[^A-Z]+/
> > I'd like to DRY it by expressing it as something like !upper. I need something like !~ we use normally with string searches.
> > Any ideas?
> Well, you don't really need a StringScanner for this simple task. Your code really just rebuilds String#scan():
> a_new_list = s_new_list.scan(/[A-Z]+/)
> Note that I've also switched your variable naming style to the snake_case that we Rubyists prefer.
> Hope that helps.
> James Edward Gray II
Hi James,
As I said to Caleb, cutting my 10-liner down to 1 is extreme DRYing!! Thanks for that.
As far as underscoring vs. Camel-case goes, I know Rubyists' preference, but I bow to Shakespeare's notion that "a rose by any other name is just as sweet." I spent a couple decades writing/maintaining Windows applications for clients using C and C++, so I've a fondness for Polish notation (at least that's what I think it was called.) Typing extra hyphens vs pressing the shift key lets me write code faster, and the a/s/h prefix for arrays/strings/hashes helps me avoid a lot of interpreter complaints. And fellow programmers of almost any stripe know what I mean. Finally, I'm a retired curmudgeon, and you know how we old folks are :-)
Seriously, your insight was very helpful and will help me avoid a bunch of wasteful code.
With your insights, I was able to cut down 18 lines of somewhat obscure code to 6 lines that I find very readable. That's such an improvement on the quality of the code.
Though I expect you guys are tired of this thread, I included the new and old code below, along with results that both of them produce.
Again, thank you very much for your insights.
Best wishes, Richard
# Accept a new list as a string; extract an array of contiguous upper-case
# letters as stock symbols, ignoring any duplicates (Test data)
# Delete any symbol in the current list that occurs here
sNewList = %{TMxxx CSCO COL INTC BRCM FDX AA CAT BUR FSLR MSFT', PNC HPQ CSCO AMAT ORCL FCX ABX PVTB XHB CSCO TM FDX}
#===============
# New technique
#===============
aRawNewList = sNewList.scan(/[A-Z]+/)
aNewList = Set.new(aRawNewList ).to_a.sort
nDeleted = 0
aNewList.each { |sym| hCurrentList.delete sym and nDeleted += 1 if hCurrentList[sym] }
show_array( aNewList, 10, "New List (unique:%d, dups:%d, deleted:%s)" %
  [aNewList.size, aRawNewList.size - aNewList.size, nDeleted] , true)
#============================
# Old technique; No longer used
#============================
aNewList = []
s = StringScanner.new sNewList
upper = /[A-Z]+/
non_upper= Regexp.new( upper.source.sub( /\[/, '[^' ) )
nNewSyms = nCurrSymsDeleted = 0
while not s.eos?
  case
  when s.skip(upper)
    nNewSyms+=1
    aNewList << s.matched unless aNewList.include? s.matched
    ( hCurrentList.delete s.matched and nCurrSymsDeleted += 1) if hCurrentList[s.matched]
  else
    s.skip(non_upper)
  end
end
show_array( aNewList.sort, 10, "New List (%d unique; %d dups; %d curr. deleted)" %
  [aNewList.size, nNewSyms - aNewList.size, nCurrSymsDeleted] )
#=======
# Output
#=======
===== New List (unique:19, dups:4, deleted:3) =====
AA ABX AMAT BRCM BUR CAT COL CSCO FCX FDX FSLR HPQ INTC MSFT ORCL PNC PVTB TM XHB
===== =====
> As far as underscoring vs. Camel-case goes, I know Rubyists' preference, but I bow to Shakespeare's notion that "a rose by any other name is just as sweet." I spent a couple decades writing/maintaining Windows applications for clients using C and C++, so I've a fondness for Polish notation (at least that's what I think it was called.)
> Typing extra hyphens vs pressing the shift key lets me write code faster, and the a/s/h prefix for arrays/strings/hashes helps me avoid a lot of interpreter complaints. And fellow programmers of almost any stripe know what I mean.
There's always something to be said for conventions. The issue with your notation is that it seems to be far less used among Ruby programmers than the snake case. Snake case for variables and methods also has the added advantage that classes and modules stand out immediately.
Side note: with modern IDEs I believe there is not much reason to use Hungarian Notation any more. I personally find it more difficult to spot certain variables when all variables of the same type start with the same letter. For me, HN actually _reduces_ readability.
> Finally, I'm a retired curmudgeon, and you know how we old folks are :-)
> With your insights, I was able to cut down 18 lines of somewhat obscure code to 6 lines that I find very readable. That's such an improvement on the quality of the code.
I believe you can go further. For example, these three lines:
> On Nov 13, 2:34 am, RichardOnRails <RichardDummyMailbox58...@USComputerGurus.com> wrote:
>> Hey Caleb & James,
>> With your insights, I was able to cut down 18 lines of somewhat obscure code to 6 lines that I find very readable. That's such an improvement on the quality of the code.
> I believe you can go further. For example, these three lines:
Basically the question is which of the two is larger. But if you do it this way round (i.e. iterate the Hash and check for existence in the new list), then the new list should definitely be a Set.
Here's my suggestion
require 'set'
# dummy base
current = {"CSCO" => 1, "COL" => 2, "INTC" => 3, "BRCM" => 4, "FOO" => 99}
# user input
input = %{TMxxx CSCO COL INTC BRCM FDX AA CAT BUR FSLR MSFT PNC HPQ CSCO AMAT ORCL FCX ABX PVTB XHB CSCO TM FDX}
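The rest of this suggestion is not preserved in the thread. As a rough sketch only, and not the original code, here is one way to continue from `current` and `input`, following the idea above of iterating the Hash and testing membership in a Set. The variable names `raw`, `symbols`, `before` and `deleted` are invented for the example.

```ruby
require 'set'

# Sample data copied from the post above.
current = {"CSCO" => 1, "COL" => 2, "INTC" => 3, "BRCM" => 4, "FOO" => 99}
input = %{TMxxx CSCO COL INTC BRCM FDX AA CAT BUR FSLR MSFT PNC HPQ CSCO AMAT ORCL FCX ABX PVTB XHB CSCO TM FDX}

raw     = input.scan(/[A-Z]+/)   # every upper-case run, duplicates included
symbols = raw.to_set             # a Set gives fast include? checks

# Iterate the Hash and drop every entry whose key occurs in the new list.
before = current.size
current.delete_if { |sym, _| symbols.include?(sym) }
deleted = before - current.size

puts "unique:%d, dups:%d, deleted:%d" % [symbols.size, raw.size - symbols.size, deleted]
```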
> As far as underscoring vs. Camel-case goes, I know Rubyists' preference, but I bow to Shakespeare's notion that "a rose by any other name is just as sweet."
It doesn't work that way in programming. Good naming practices are an important part of readable code. This is particularly so in a language like Ruby, in which "literate" interfaces are common.
> I spent a couple decades writing/maintaining Windows applications for clients using C and C++, so I've a fondness for Polish notation (at least that's what I think it was called.)
Polish Notation is Łukasiewicz-style prefix notation, rather like what's used in Lisp. You mean Hungarian Notation.
But in any case, *you've been had*. Hungarian Notation as developed by Charles Simonyi is extremely useful in non-OO code (I've used it in PHP with great success). Hungarian Notation as the term is usually understood is a very stupid thing indeed, which has unfortunately been foisted by Microsoft on huge numbers of Windows programmers who really should know better. :) It is (at best) marginally useful in statically typed languages like C, and downright misleading in dynamically typed languages like Ruby.
The difference is that Simonyi's original concept encodes information *outside the scope* of the variable's type (which, after all, the interpreter or compiler already knows about). For example, in a mapping system, you might have kmDistance and ftCorrection. It's entirely clear from those names that kmDistance + ftCorrection would be adding kilometers and feet without a conversion, and thus it's immediately clear that that operation is wrong.
OTOH, legions of misled Windows developers would simply call those two variables intDistance and intCorrection, incorporating no new useful information and making the names harder to read.
Systems Hungarian, BTW, is bad enough in C, where you should be able to refer to your variable declarations. If your functions are so long that you can't refer easily to declarations, then you need to refactor to shorter methods for overall readability anyway -- methods should be short. Systems Hungarian has no use at all in Ruby, since although objects are typed, variables are not, so it's perfectly possible to do
  intValue = 1
  # later
  intValue = {:foo => 'bar'}
Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
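A minimal sketch of what that could look like. The `Kilometer` and `Foot` classes, the `to_kilometer` method and the choice to convert rather than raise for known units are all assumptions made for illustration, not an existing library.

```ruby
# Hypothetical unit classes, for illustration only.
class Foot
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def to_kilometer
    Kilometer.new(value * 0.0003048)   # 1 ft is 0.3048 m
  end
end

class Kilometer
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def to_kilometer
    self
  end

  # Convert anything that knows how to become kilometers; reject the rest.
  def +(other)
    unless other.respond_to?(:to_kilometer)
      raise TypeError, "can't add #{other.class} to Kilometer"
    end
    Kilometer.new(value + other.to_kilometer.value)
  end
end

distance   = Kilometer.new(2)
correction = Foot.new(1000)
total      = distance + correction   # converted, not silently mixed
puts total.value
```

With this in place, `distance + 5` raises a TypeError instead of quietly adding a bare number, which is the check the prefix was trying to provide by eye.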
In summary, then, Hungarian Notation of either sort is inappropriate in Ruby. Drop the habit.
> Typing extra hyphens vs pressing the shift key lets me write code faster, and the a/s/h prefix for arrays/strings/hashes helps me avoid a lot of interpreter complaints.
If you care about removing characters from variable names, start with removing the Hungarian warts. As I explained above, they serve no useful purpose in Ruby at all. And I have to say, I don't find wordsRunTogether as easy to read as words_with_underscores -- the underscores look more like spaces and delineate the words better to my eye. WouldYouRatherReadThisClauseHere, or would_you_rather_read_this_clause_here?
In any case, "snake_case" is the prevailing style in Ruby, and virtually every Ruby library uses it (including the standard library and Rails) -- your code will look strange if you don't follow suit. The examples in Programming Ruby tend to use camelCase, but that's more of a flaw in the book than an indicator of Ruby practice.
> And fellow programmers of almost any stripe know what I mean. Finally, I'm a retired curmudgeon, and you know how we old folks are :-)
Age is not an excuse. If you're going to learn a language, take the time to learn the idioms and the "spirit" of the language, not just the bare essentials of syntax. I've seen far too many people try to write C, Java, or PHP in Ruby -- avoid the temptation!
> Seriously, your insight was very helpful and will help me avoid a bunch of wasteful code.
On Nov 13, 2009, at 10:11 PM, Marnen Laibow-Koser wrote:
> RichardOnRails wrote:
> [...]
>> As far as underscoring vs. Camel-case goes, I know Rubyists' preference, but I bow to Shakespeare's notion that "a rose by any other name is just as sweet."
> It doesn't work that way in programming. Good naming practices are an important part of readable code.
As the saying goes, "When in Rome, do as the Romans do." You're speaking our language now and you want to learn to speak it like us, even with our slang. That allows you to communicate with us better so we can learn from each other.
> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it. A class, which is what people traditionally take for the type, is just one piece of an object's identity.
>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it.
Do you make much use of singleton mixins or singleton methods in your code? I know I don't.
> A class, which is what people traditionally take for the type, is just one piece of an object's identity.
You're right. But with a proper class system, my point about not needing Apps Hungarian in Ruby still stands, I think. Do you disagree?
On 14/11/2009, at 15:21, James Edward Gray II <ja...@graysoftinc.com> wrote:
>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it. A class, which is what people traditionally take for the type, is just one piece of an object's identity.
I would still look immediately to the class of the object in order to find out what it's supposed to do. From there, the class definition will probably list its module inclusions prominently.
As a vim user, with very limited interactive debugging, my primary exploration technique will usually consist of at most a couple of 'obj.methods.grep' calls followed by grepping ~/gems which seems to emphasize the actual reading of the source for object identity info.
Python's integrated documentation would be really welcome in this case, I think. :)
I'm curious what you think the most correct way is to discover object identity.
David Turnbull wrote:
> On 14/11/2009, at 15:21, James Edward Gray II <ja...@graysoftinc.com> wrote:
>> class, which is what people traditionally take for the type, is just one piece of an object's identity.
> I would still look immediately to the class of the object in order to find out what it's supposed to do.
I would too. James is correct that it isn't the whole story, but it's the best place to start.
> From there, the class definition will probably list its module inclusions prominently.
> As a vim user, with very limited interactive debugging,
What? You can use ruby-debug interactively in a console session. I often do.
> my primary exploration technique will usually consist of at most a couple of 'obj.methods.grep' calls followed by grepping ~/gems which seems to emphasize the actual reading of the source for object identity info.
> Python's integrated documentation would be really welcome in this case, I think. :)
WTF? Aren't you familiar with RDoc? And didn't you know that running "gem server" will start a Web server with gem RDoc pages on port 8808?
> I'm curious what you think the most correct way is to discover object identity.
Object identity? Well, for that, you need object_id. That's something different than object type.
> On 14/11/2009, at 15:21, James Edward Gray II <ja...@graysoftinc.com> wrote:
>> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it. A class, which is what people traditionally take for the type, is just one piece of an object's identity.
> I would still look immediately to the class of the object in order to find out what it's supposed to do. From there, the class definition will probably list its module inclusions prominently.
A human looking to documentation to find out what an object of a particular class is supposed to *do* is one thing. But then there's the programmatic flipside, where one could code a method to select between different behaviors based on the class-type of a given argument-object.
  def foo(bar)
    if bar.is_a? Array
      do_array_thing(bar)
    elsif bar.is_a? String
      do_string_thing(bar)
    else
      ... # ?
    end
  end
I believe it's (variations on) the above that are viewed as unreasonably restrictive in ruby.
It's challenging, too, because even :respond_to? can be misleading.
I like Og (Object Graph), an Object Relational Mapping library in ruby providing high-level database access.
When Og is initialized, it searches ObjectSpace for classes like the above, and detects that they are intended to be Og-managed classes, and imbues them with certain basic features. (It also generates the SQL needed to create the database tables corresponding to such classes.)
An example is that, given nothing more than the above Address class declaration... I could now say:
result = Address.find_by_name_and_state("Bob Jones", "CA")
But..! The Address.find_by_name_and_state doesn't even exist until the time that it is called. Part of the magic with which an Og-managed class is imbued is some method_missing logic which looks for particular method signatures, like /find_by_(.*)/. At the moment such a method is called, its name is tested against the following, behind the scenes:
  def method_missing(sym, *args, &block)
    if match = /find_(all_by|by)_([_a-zA-Z]\w*)/.match(sym.to_s)
      return find_by_(match, args, &block)
    elsif match = /find_or_create_by_([_a-zA-Z]\w*)/.match(sym.to_s)
      return find_or_create_by_(match, args, &block)
    else
      super
    end
  end
(Note: In this case, it appears Og _always_ handles the request via method_missing. But I've seen other code in Og (or maybe Nitro) that did *define* the method when it was first called, such that on subsequent invocations the method would already exist.)
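A self-contained sketch of the define-on-first-call variant described in the note. The `Finder` class and its `RECORDS` data are hypothetical and this is not Og's or Nitro's implementation; only `method_missing`, `respond_to_missing?` and `define_singleton_method` are real Ruby hooks.

```ruby
# Hypothetical class for illustration; not Og's actual code.
class Finder
  RECORDS = [
    { name: "Bob Jones", state: "CA" },
    { name: "Ann Smith", state: "NY" }
  ]

  def self.method_missing(sym, *args, &block)
    if (match = /\Afind_by_(\w+)\z/.match(sym.to_s))
      keys = match[1].split("_and_").map(&:to_sym)
      # Define the finder now, so later calls bypass method_missing.
      define_singleton_method(sym) do |*values|
        RECORDS.find { |r| keys.zip(values).all? { |k, v| r[k] == v } }
      end
      public_send(sym, *args)
    else
      super
    end
  end

  # Without this hook, respond_to? answers false for finders that have
  # not been called yet, which is how respond_to? can mislead.
  def self.respond_to_missing?(sym, include_private = false)
    sym.to_s.start_with?("find_by_") || super
  end
end

before = Finder.singleton_methods.include?(:find_by_name_and_state)
result = Finder.find_by_name_and_state("Bob Jones", "CA")
after  = Finder.singleton_methods.include?(:find_by_name_and_state)
p before, result, after
```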
. . . Anyway, the point being, Ruby is pretty dynamic.
:)
> Python's integrated documentation would be really welcome in this case, I think. :)
I seem to recall mention awhile back on ruby-talk of a gem or module that integrated `ri` into `irb`, such that one could pull up the documentation from within irb. (I don't have any links for that, sorry.)
On Fri, Nov 13, 2009 at 11:39 PM, David Turnbull <dsturnb...@gmail.com> wrote:
> On 14/11/2009, at 15:21, James Edward Gray II <ja...@graysoftinc.com> wrote:
>>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
>> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it. A class, which is what people traditionally take for the type, is just one piece of an object's identity.
> I would still look immediately to the class of the object in order to find out what it's supposed to do. From there, the class definition will probably list its module inclusions prominently.
For a long time (at least 18 years) I've been an advocate of divorcing the notion of type from class in dynamically typed languages:
IMHO, viewing a variable as a 'role' to be filled with one or more objects is a powerful technique.
Alistair Cockburn recently told me that he still refers clients to the reference paper, which I wrote at IBM, when he and I were both there.
The view fits into a general approach to OO design which is called "responsibility based" or "role based" design. Rebecca Wirfs-Brock was one of the authors who published books on the approach.
This was before the static-typing crowd (starting with C++) took over the conventional wisdom as to what it meant to be OO, in turn leading to a proliferation of "methodologies" using static typing.
Those of us in the dynamic typing/roles/responsibility based community see this as an unfortunate parallel to Gresham's Law.
With the re-birth of interest in dynamically typed languages I think that the role based view is preferable.
> (Note: In this case, it appears Og _always_ handles the request via method_missing. But I've seen other code in Og (or maybe Nitro) that did *define* the method when it was first called, such that on subsequent invocations the method would already exist.)
ActiveRecord from Rails works this way. If you would like to see the code it starts around line 1830 of this file:
> I seem to recall mention awhile back on ruby-talk of a gem or module that integrated `ri` into `irb`, such that one could pull up the documentation from within irb. (I don't have any links for that, sorry.)
Here's what I have in my .irbrc file:
  def ri(*names)
    system(%{ri #{names.join(" ")}})
  end
On Nov 13, 2009, at 10:27 PM, Marnen Laibow-Koser wrote:
> James Edward Gray II wrote:
> [...]
>>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
>> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it.
> Do you make much use of singleton mixins or singleton methods in your code? I know I don't.
I have been doing a lot more of mixing modules into individual objects, yes. I have been more than pleased with the results too. I think it's something we should all try to do more of. I gave a speech about this at LSRC this year which should show up here someday:
I recently ran across some code that had extensions to a core system. Each extension would reopen the core classes and edit away. Unfortunately, they had to duplicate a lot of the core code to make little changes to it. I rewrote the code to allow extensions to register modules with the core classes. Then when those classes produced objects, they would mix in any registered modules. This simple change eliminated almost all of the duplication, because the modules were in the singleton class *in front of* the methods they were modifying. They could read the arguments and see if they needed to step in with their modified behavior, or just hand off to super().
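A small sketch of that registration pattern, under stated assumptions: the `Report` class, its `register` and `build` methods and the `HtmlRendering` module are invented names, not the code being described.

```ruby
# Hypothetical names throughout; a sketch of the registration idea only.
class Report
  def self.extensions
    @extensions ||= []
  end

  def self.register(mod)
    extensions << mod
  end

  # Build an object, then mix every registered module into that one object.
  def self.build(*args)
    obj = new(*args)
    extensions.each { |mod| obj.extend(mod) }
    obj
  end

  def initialize(title)
    @title = title
  end

  def render(format = :text)
    "#{@title} (#{format})"
  end
end

module HtmlRendering
  # Sits in front of Report#render: steps in for :html, otherwise
  # hands off to the original method via super.
  def render(format = :text)
    return "<h1>#{@title}</h1>" if format == :html
    super
  end
end

Report.register(HtmlRendering)
report = Report.build("Sales")
puts report.render
puts report.render(:html)
```

Objects created with plain `Report.new` are untouched, so the extension never has to copy or reopen the core `render` method.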
I showed another example in my talk where I was trying to create a one instance configuration object. Originally I did it with a constant and some clever reopening of the singleton class, but that caused problems like not being able to easily document this object's API. I switched to just creating the one instance I needed and immediately mixing in a module that added the special functionality and it solved all the problems I had. You can document a module just fine. (The example is in my slides, if you want to see it: http://blog.grayproductions.net/articles/lone_star_rubyconf_slides.)
I think we should do more of this. For example, I think we could return an Array that mixes in a Paginated module instead of a PaginatedCollection object that inherits from Array. That feels more right to me. It's an Array and it has some extra functionality added in related to pagination. The uses go on and on.
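As a sketch of the Array-plus-module idea, assuming a made-up `Paginated` module with invented attribute names:

```ruby
# Hypothetical Paginated module; the attribute names are assumptions.
module Paginated
  attr_accessor :current_page, :per_page, :total_entries

  def total_pages
    (total_entries.to_f / per_page).ceil
  end
end

# Extend just this one Array with the pagination behaviour.
page = %w[AA ABX AMAT].extend(Paginated)
page.current_page  = 1
page.per_page      = 3
page.total_entries = 19

p page.class          # the object is still a plain Array
p page.total_pages
```

One trade-off to keep in mind: methods such as `map` return a fresh Array that does not carry the module, so the pagination data has to be re-attached if it is needed downstream.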
>> A class, which is what people traditionally take for the type, is just one piece of an object's identity.
> You're right. But with a proper class system, my point about not needing Apps Hungarian in Ruby still stands, I think. Do you disagree?
I was agreeing with you, yes. I was saying that adding an a_ or s_ to the beginning of a variable name, presumably to indicate Array or String, is a damaging practice, because that's not necessarily all you need to know about the object. I think it promotes the wrong kind of thinking about Ruby's types.
On Nov 13, 2009, at 10:39 PM, David Turnbull wrote:
> On 14/11/2009, at 15:21, James Edward Gray II <ja...@graysoftinc.com> wrote:
>>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
>> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it. A class, which is what people traditionally take for the type, is just one piece of an object's identity.
> I would still look immediately to the class of the object in order to find out what it's supposed to do. From there, the class definition will probably list its module inclusions prominently.
Sure, it's definitely part of the picture.
> As a vim user, with very limited interactive debugging, my primary exploration technique will usually consist of at most a couple of 'obj.methods.grep' calls followed by grepping ~/gems which seems to emphasize the actual reading of the source for object identity info.
Your use of grep() for methods catches a lot of things a class definition might not tell you.
> I'm curious what you think the most correct way is to discover object identity.
Well, if we just mix modules into objects as I recommended in my previous message, Ruby's type system just naturally handles all of the details.
>> o = Object.new
=> #<Object:0x10037f9f0>
>> module Magical
>>   def inspect
>>     "#<MagicalObject ##{object_id}>"
>>   end
>> end
=> nil
>> o.extend(Magical)
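The irb session above is cut off after the extend call. As a sketch of the kind of thing it demonstrates, here is the same setup as a script; the checks that follow are based on how Object#extend behaves, not on the missing part of the transcript.

```ruby
module Magical
  def inspect
    "#<MagicalObject ##{object_id}>"
  end
end

o = Object.new
o.extend(Magical)

p o.class                               # the class is still Object
p o.is_a?(Magical)                      # but the module is part of what the object is
p o.singleton_class.include?(Magical)   # it lives in the object's singleton class
puts o.inspect                          # and its inspect now wins over Object#inspect
```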
On Sat, 14 Nov 2009, Marnen Laibow-Koser wrote:
> James Edward Gray II wrote:
> [...]
>>> Even Apps Hungarian is not a great idea in OO code. Instead, just use the type system, so that distance would be a Kilometer object and correction would be a Foot object. Kilometer.+(foot) could then either raise an exception or invoke a conversion.
>> I would like to see us move away from considering classes to be types at all in Ruby. Who knows what modules an object has mixed into it and who knows what singleton methods are defined on it.
> Do you make much use of singleton mixins or singleton methods in your code? I know I don't.
I write class methods sometimes (I know those are singleton with an asterisk next to them, but still), and I think that extending core objects with modules is a frequently overlooked and very powerful alternative to reopening core classes and adding methods.
This kind of thing:
  class String
    def method_I_need_once_or_twice
    ...
is almost always overkill. It's sort of the core-functionality counterpart of using global variables. Extending an object is a much more precise operation -- and has the additional merit, I find, of really making you think about whether it's worth bothering to extend the object instead of working with what the object can already do.
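To make the contrast concrete, a minimal sketch with a made-up `Shouting` module: the method is added to one String object, and every other String stays as it was.

```ruby
# Hypothetical helper module, mixed into one String rather than all Strings.
module Shouting
  def shout
    upcase + "!"
  end
end

# String.new gives an unfrozen copy regardless of frozen-literal settings.
greeting = String.new("hello")
greeting.extend(Shouting)

puts greeting.shout
p "other".respond_to?(:shout)   # other Strings are unaffected
```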
David
--
The Compleat Rubyist: Ruby training with D. Black, G. Brown, J.McAnally
Jan 22-23, 2010, Tampa, FL
http://www.thecompleatrubyist.com