Parsing XML file with no style info with Hpricot

Allan Last

Mar 7, 2010, 7:10:42 AM

to rubyonra...@googlegroups.com

Hello,

I've been trying for hours to parse an XML file using Hpricot. Usually
it's not a problem. Here's my simple code:

#This works and outputs the proper xml data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

#This does not work, and I'm scratching my head
@url1 =
'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

The gd2.mlb.com XML file does not have any style information according
to Firefox. I can read it using Oxygen. Can somebody provide me with a
hint on how to parse the mlb.com XML? Thanks!

-A
--
Posted via http://www.ruby-forum.com/.

Allan Last

Mar 7, 2010, 12:24:02 PM

to rubyonra...@googlegroups.com

Any idea how to parse this XML?

-A

Hassan Schroeder

Mar 7, 2010, 12:35:18 PM

to rubyonra...@googlegroups.com

On Sun, Mar 7, 2010 at 4:10 AM, Allan Last <li...@ruby-forum.com> wrote:

> #This does not work, and I'm scratching my head

And I'm scratching mine trying to guess what you mean by "does not
work" ...

--
Hassan Schroeder ------------------------ hassan.s...@gmail.com
twitter: @hassan

Allan Last

Mar 7, 2010, 12:39:32 PM

to rubyonra...@googlegroups.com

Hpricot is not parsing the MLB XML file. I think the reason is that the
file is not in a standard XML format.

If you give my code a quick try, you'll notice that it will read other
XML files, but not the MLB XML.

#This works and outputs the proper xml data
@url1 = 'http://www.sportingnews.com/stories/sportingnews/MLB/rss.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

#This does not work, and I'm scratching my head

@url1 =
'http://gd2.mlb.com/components/game/mlb/year_2010/month_03/day_06/gid_2010_03_06_anamlb_oakmlb_1/boxscore.xml'
@page1 = Hpricot(open(@url1))
<%= @page1 %>

Hassan Schroeder

Mar 7, 2010, 12:43:21 PM

to rubyonra...@googlegroups.com

On Sun, Mar 7, 2010 at 9:39 AM, Allan Last <li...@ruby-forum.com> wrote:

> If you give my code a quick try, you'll notice that it will read other
> XML files, but not the MLB XML.

Actually, I already did, and it seems to work just fine. Hence my own
head-scratching. :-)

So, again, maybe you can say *exactly* what you expect to happen
and how that differs from what you're seeing.

Allan Last

Mar 7, 2010, 1:18:55 PM

to rubyonra...@googlegroups.com

Hi Hassan,

This picture:
http://picasaweb.google.com/lh/photo/Qf4DFta9p5ERoCRb6Lbd2Q?feat=directlink

This is the parsed output of the sportingnews XML feed. It is displayed
in my view with <%= @page1 %>.

This picture:
http://picasaweb.google.com/lh/photo/xLVr8_U-x12rJnADs_qcEw?feat=directlink

The blank space is what is displayed in the view with <%= @page1 %> when
using the MLB XML file.

I'm expecting the XML information seen here on Firefox:
http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=directlink

to be displayed when I parse the MLB file. Hpricot is not parsing this
file.

-A

Frederick Cheung

Mar 7, 2010, 2:55:54 PM

to Ruby on Rails: Talk

On Mar 7, 6:18 pm, Allan Last <li...@ruby-forum.com> wrote:
> I'm expecting the XML information seen here on Firefox: http://picasaweb.google.com/lh/photo/X7VFocR3L4S4Pl_2jvDzVQ?feat=dire...
>
> to be displayed when I parse the MLB file. Hpricot is not parsing this
> file.
>

Have you tried viewing the source of the page generated by your view?
I suspect Hpricot is parsing the file, but just blatting it into the
view like that produces invalid HTML, which your browser is not
rendering.
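
A minimal sketch of one way around that, assuming the goal is simply to
show the raw XML in the page: escape the markup before it reaches the
browser. SAMPLE_XML and xml_for_display are made-up names for
illustration, and the sample document is not the real MLB file.

```ruby
require 'erb'

# Made-up stand-in for the string the view would emit for @page1;
# the element and attribute names are illustrative only.
SAMPLE_XML = '<boxscore><team name="A"/></boxscore>'

# Escape markup characters so a browser displays the tags as text
# instead of treating them as unknown HTML elements.
def xml_for_display(xml)
  ERB::Util.html_escape(xml.to_s)
end

puts xml_for_display(SAMPLE_XML)
```

In an ERB view the equivalent would be something like
<%= h @page1.to_s %>, since h is the short alias for html_escape.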

Fred

Hassan Schroeder

Mar 7, 2010, 3:54:50 PM

to rubyonra...@googlegroups.com

On Sun, Mar 7, 2010 at 10:18 AM, Allan Last <li...@ruby-forum.com> wrote:

> I'm expecting the XML information seen here on Firefox:

> to be displayed when I parse the MLB file. Hpricot is not parsing this
> file.

Sure it is -- use irb to examine what's in @page1.

As Frederick already suggested, you apparently have a view problem,
not an Hpricot parsing problem.
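
A rough sketch of that kind of check, using a hypothetical helper named
describe_parsed. A plain string stands in for the Hpricot document here
so the snippet runs without the gem or network access; in irb you would
pass @page1 instead.

```ruby
# Hypothetical helper: summarize a parsed document (or any object) the way
# one might eyeball it in irb. It relies only on #class and #to_s.
def describe_parsed(doc)
  text = doc.to_s
  {
    :class_name => doc.class.name,
    :length     => text.length,
    :preview    => text[0, 40]
  }
end

# Made-up sample markup, not the real MLB file.
stand_in = '<boxscore><team name="A"/></boxscore>'
summary = describe_parsed(stand_in)
puts summary.inspect
```

A non-zero length would suggest the document was fetched and parsed,
which points at the view rather than at Hpricot.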

Allan Last

Mar 8, 2010, 5:13:15 AM

to rubyonra...@googlegroups.com

Thanks everybody. I saw the info in the page source and figured it out.

-A
