I'm building an HTML extractor on top of Nokogiri that applies a collection of CSS search strings (and more) to build a logical extraction of data, and I wanted to use something like this:
[1,2,3].send(:collect,Proc.new{|x| x.to_s + "!"})
This fails. Any ideas how I could work around this? How do you use Object#send (or similar) with a block?
On Mon, Jan 31, 2011 at 4:50 PM, Xavier Lange <xrla...@gmail.com> wrote:
> I'm building a HTML extractor on top of nokogiri which applies a
> collection of CSS search strings and more to build a logical
> extraction of data and I wanted to use something like this:
I'm building a collection of transformations that can be described as chained method invocations. How do you extract related entities from a Nokogiri document of an Amazon review page? By applying these transformations: https://github.com/derdewey/amzn-scraper/blob/master/amazon_review.rb . Find a common node to describe the entity, then extract each of the elements and return a handy hash.
Ben,
So that works when it's being passed directly to send but it won't work when passed in from a splatted array!
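A minimal sketch of the difference described above, using a plain array instead of a Nokogiri document. The variable names (`suffix`, `call`, `splat_error`) are made up for illustration:

```ruby
suffix = Proc.new { |x| x.to_s + "!" }

# Passed with &, the Proc becomes the block of the call.
direct = [1, 2, 3].send(:collect, &suffix)   # => ["1!", "2!", "3!"]

# Splatted out of an array, the Proc is just one more positional argument,
# and Array#collect does not accept any positional arguments.
call = [:collect, suffix]
splat_error = nil
begin
  [1, 2, 3].send(*call)
rescue ArgumentError => e
  splat_error = e
end
```

The splat only ever produces positional arguments, so the Proc never reaches the block slot of the call.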
On Mon, Jan 31, 2011 at 7:06 PM, Adam Grant <adam.jgr...@gmail.com> wrote:
> Any reason why you aren't calling "collect" directly from the Array?
> _
> Adam
That doesn't work because you can't put a block in an array:
>> [:collect, &Proc.new{|x| x.to_s + '!'}]
SyntaxError: compile error
Instead of storing the call and args in an array, you might want to consider a hash, so you can label the block and handle it as a special case. There's some ambiguity in just shoving it in at the end of the args (is it a user argument or a handler?).
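One way the hash idea could look, as a rough sketch rather than anything taken from the scraper itself. The constant `STEPS`, the helper `run_steps`, and the keys `:method`, `:args` and `:block` are all hypothetical names chosen for this example:

```ruby
# Hypothetical step format: each step labels the method name, its positional
# arguments, and (optionally) its block, so the block is never mistaken for
# an ordinary argument.
STEPS = [
  { :method => :map,  :block => lambda { |x| x.to_s + "!" } },
  { :method => :join, :args  => [", "] }
]

# Apply each step to the result of the previous one.
def run_steps(target, steps)
  steps.inject(target) do |obj, step|
    args  = step.fetch(:args, [])
    block = step[:block]              # nil when the step has no block
    obj.send(step[:method], *args, &block)
  end
end

result = run_steps([1, 2, 3], STEPS)  # => "1!, 2!, 3!"
```

Passing `&nil` is the same as passing no block, so steps without a `:block` entry need no special handling.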
Yeah, it doesn't matter whether you could curry it or not: it needs to be passed as a block. If you don't store them separately, there's no way to detect that it *should* be separated, so *args can't work. That's why method_missing takes them separately: blocks are considered their own part of the call, and you only get one.
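To illustrate the method_missing point, here is a small forwarding wrapper. It is only a sketch; the class name `Recorder` and its `calls` log are invented for this example:

```ruby
# Hypothetical wrapper that records every call before forwarding it.
class Recorder
  attr_reader :calls

  def initialize(target)
    @target = target
    @calls  = []
  end

  # Ruby hands over the positional arguments and the block separately,
  # so both can be stored and forwarded without any guessing.
  def method_missing(name, *args, &block)
    return super unless @target.respond_to?(name)
    @calls << [name, args, block]
    @target.send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name, include_private) || super
  end
end

rec    = Recorder.new([1, 2, 3])
mapped = rec.collect { |x| x.to_s + "!" }   # => ["1!", "2!", "3!"]
```

After the call, `rec.calls.first` holds `:collect`, an empty argument list, and the Proc for the block, which is the same three-part shape the stored extraction steps would need.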
On Mon, Jan 31, 2011 at 7:43 PM, Jordan Fowler <m...@jordanfowler.com> wrote:
> Oh wait, I missed the part about needing an anonymous block. Hmm...
> On Mon, Jan 31, 2011 at 7:41 PM, Jordan Fowler <m...@jordanfowler.com> wrote:
>> You can also use a lambda:
>> ruby-1.9.2-p136 :004 > [:collect, lambda { |x| x.to_s + "!" }]
>> => [:collect, #<Proc:0x000001009a29b8@(irb):4 (lambda)>]
As my knowledge of Object#send stands now: it can't recreate the full breadth of method invocations in ruby. Object#send should accept an optional parameter:
On Mon, Jan 31, 2011 at 9:11 PM, Xavier Lange <xrla...@gmail.com> wrote:
> As my knowledge of Object#send stands now: it can't recreate the full
> breadth of method invocations in ruby. Object#send should accept an
> optional parameter:
>
> That obviously wouldn't work because it's still ambiguous. Oh well.
> I'll just extend the nokogiri class! Thanks for playing everyone!
Not quite. Send can pass the block just fine. But the way you're storing your information doesn't allow you (as the person calling send) to split it out.
I was saying that the way you're storing what essentially amounts to bound method calls is ambiguous:
REVIEW_EXTRACTION = {
  :most_common_node => [[:css, "a + br + div > div + div > span > span > span"],
                        [:collect, &Proc.new{|x| x.parent_node.parent_node.parent_node.parent_node.parent_node}]],
[:collect, &...] could only express method(arg1, arg2, arg3, &myblock) if you strip off the first and last item, and set args equal to the rest. You couldn't just splat everything after collect and expect it to work (since blocks aren't really a positional argument).
Does that make sense? You don't need to extend the class, you just need to tweak your data.
On Mon, Jan 31, 2011 at 9:19 PM, Mike O'Brien <mcob...@yahoo.com> wrote: > Hey man, what are you up to these days???
> Are you still with powerset? How's it working for the man? Are you still > in san jose?
> Mike...., you might want to contact Kevin directly and not via the ML ;) > (and super scoop, no he doesn't work for Powerset/Microsoft anymore ;))
> - Matt