belated follow-up to May meeting


Andy Eggers

Jun 7, 2007, 3:08:29 PM
to New Haven Ruby Brigade
I'm sorry I won't be able to come this Sunday, but I just wanted to
say "hi" to those who were at the May meeting I attended.
Particularly, I wanted to say thanks for helping me get my head around
some scoping issues as we looked at the camping source.

I mentioned a project I was doing about newspaper coverage of Katrina.
(I was using Ruby to automate keyword searches.) If you're interested,
you can see the poster we made of the results at
http://www.people.fas.harvard.edu/~aeggers/249poster.pdf
The bottom line is that papers that endorsed Bush in 2004 were less
likely to use words related to incompetence and failure, as well as
words related to race and poverty, in their coverage of Katrina soon
after the disaster. The discrepancy disappeared by the second weekend,
as Bush-endorsing papers increased their focus on these themes.

Anyway, I hope to make it to another nh.rb meeting soon.

Andy

Gregory Brown

Jun 30, 2007, 1:19:10 AM
to New-Haven-R...@googlegroups.com, aeg...@gmail.com
On 6/7/07, Andy Eggers <aeg...@gmail.com> wrote:
>
> I'm sorry I won't be able to come this Sunday, but I just wanted to
> say "hi" to those who were at the May meeting I attended.
> Particularly, I wanted to say thanks for helping me get my head around
> some scoping issues as we looked at the camping source.

Hi Andy. Sorry for the incredibly late reply, but it was great to see you.

> I mentioned a project I was doing about newspaper coverage of Katrina.
> (I was using Ruby to automate keyword searches.) If you're interested,
> you can see the poster we made of the results at
> http://www.people.fas.harvard.edu/~aeggers/249poster.pdf
> The bottom line is that papers that endorsed Bush in 2004 were less
> likely to use words related to incompetence and failure, as well as
> words related to race and poverty, in their coverage of Katrina soon
> after the disaster. The discrepancy disappeared by the second weekend,
> as Bush-endorsing papers increased their focus on these themes.

This is pretty interesting stuff. Is any of the software publicly
releasable? It'd be nice to see how all of that looks under the hood.

Andy Eggers

Jun 30, 2007, 11:59:20 PM
to New Haven Ruby Brigade
Thanks for the note, Greg.

For this project I automated keyword searches using FireWatir, with
some Hpricot for parsing HTML. I find it easier to write scrapers
using FireWatir or Watir (both of which drive browsers) rather than,
say, WWW::Mechanize because you can write the scripts more
interactively and also you can let the browser handle cookies. In
general I would recommend Watir over FireWatir because the former is a
better-developed project. But there are probably better ways -- anyone
have input on this?

We did all of the stats in R.

Andy


Dan Bernier

Jul 1, 2007, 9:09:16 AM
to New-Haven-R...@googlegroups.com
I've done a fair amount of scripting in Watir, building up a
sort-of-DSL for my company's flagship product. Scraping in Watir is
cool, but it's all done by exposing the IE DOM through the OLE
interface, which is very slow if your pages are decent-sized. To
combat that, I'm also using Hpricot: basically, on each page load,
take the IE.document.innerHTML, pass it to Hpricot, and have Hpricot
do all your read-only stuff.
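
Here is a rough sketch of that snapshot idea, not the actual code from
work. FakeBrowser and PageSnapshot are made-up names, and FakeBrowser
only stands in for a Watir-driven IE so the example runs without a
browser. REXML is swapped in for Hpricot purely because it ships with
Ruby; unlike Hpricot it needs well-formed markup, so treat it as a
placeholder for whatever parser you prefer.

```ruby
require "rexml/document"

# Hypothetical stand-in for a browser driven over OLE. It just counts how
# many times we cross the (slow) browser boundary to fetch the page HTML.
class FakeBrowser
  attr_reader :dom_calls

  def initialize(html)
    @html = html
    @dom_calls = 0
  end

  # In a real script this would be the one slow call per page load.
  def html
    @dom_calls += 1
    @html
  end
end

# Grabs the HTML once, parses it locally, and answers every read-only
# question from the parsed copy instead of going back to the browser.
class PageSnapshot
  def initialize(browser)
    # Assumption: REXML replaces Hpricot here and needs well-formed markup.
    @doc = REXML::Document.new(browser.html)
  end

  def link_texts
    REXML::XPath.match(@doc, "//a").map { |a| a.text }
  end

  def hrefs
    REXML::XPath.match(@doc, "//a").map { |a| a.attributes["href"] }
  end
end

# Made-up sample page, for illustration only.
SAMPLE_HTML = <<~HTML
  <html><body>
    <a href="/story/1">First story</a>
    <a href="/story/2">Second story</a>
  </body></html>
HTML

browser = FakeBrowser.new(SAMPLE_HTML)
page    = PageSnapshot.new(browser)
titles  = page.link_texts
links   = page.hrefs
```

Both queries run against the local copy, so the browser is only asked
for its HTML once. Anything that changes the page (clicks, typing into
fields) still has to go through the browser, and you need a fresh
snapshot after each page load.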

Does Mechanize really not handle cookies for you? Since most of the
Watir code I did was metaprogrammed, I was thinking of throwing in an
"interactive" option, to generate either Watir code, or Mechanize code
(for build-time regression tests). Sounds like it might be more
involved than I thought...

And BTW, if anyone's having trouble introducing Ruby at work, I've
found that some Ruby code, driving Internet Explorer, filling in form
fields and stuff, is a great attention grabber. =)


--
For building invisible machines...
http://invisibleblocks.wordpress.com

Gregory Brown

Jul 1, 2007, 9:59:27 PM
to New Haven Ruby Brigade

On Jul 1, 9:09 am, "Dan Bernier" <danbern...@gmail.com> wrote:

> Does Mechanize really not handle cookies for you? Since most of the
> Watir code I did was metaprogrammed, I was thinking of throwing in an
> "interactive" option, to generate either Watir code, or Mechanize code
> (for build-time regression tests). Sounds like it might be more
> involved than I thought...

Mechanize does automatically handle cookies, but I suppose with the
browser, you'd have a more 'natural' way of handling them. But if you
need to, say, maintain cookie-based authentication, Mech will do that
for you.
