Parsing HTML is a little tricky. You can't use an XML parser on it
unless the page happens to be XHTML. You should be able to use TagSoup
from Gosu; just put its jar on your classpath. This prints all the text
content from http://google.com:
uses org.xml.sax.helpers.DefaultHandler
uses java.net.URL

// TagSoup's JAXP factory returns a SAX parser that tolerates malformed HTML
var factory = new org.ccil.cowan.tagsoup.jaxp.SAXFactoryImpl()
var is = new URL( "http://google.com" ).openStream()
factory.newSAXParser().parse( is, new DefaultHandler() {
  // ch is a shared buffer; only the slice [start, start + length) is valid
  override function characters( ch : char[], start : int, length : int ) {
    print( new String( ch, start, length ) )
  }
} )
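If you want the text back as a value rather than printed, the same handler can append into a buffer. Here is a minimal sketch in plain Java (the SAX calls are the same ones Gosu uses). It assumes well-formed XHTML input so that the JDK's built-in parser is enough; for real-world HTML you would swap in the TagSoup factory from above. The class name TextCollector and the method extractText are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of TagSoup or Gosu.
class TextCollector {

    // Returns the concatenated character data of a well-formed XML/XHTML string.
    static String extractText(String xhtml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event;
                // the rest of ch is the parser's internal buffer.
                text.append(ch, start, length);
            }
        };
        try {
            // Assumption: the stock JDK parser. For tag soup, use
            // new org.ccil.cowan.tagsoup.jaxp.SAXFactoryImpl() here instead.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.newSAXParser().parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse input", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(extractText("<html><body><p>Hello, </p><p>Gosu</p></body></html>"));
    }
}
```

The parser may deliver one run of text across several characters() calls, so appending to a buffer is safer than treating each call as a complete string.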
Dana
On Oct 12, 11:39 pm, Andrew Myers <am2...@gmail.com> wrote:
> Will see if I can share it. In that earlier email I meant "because I'm parsing *HTML*"...
>
> Sent from my mobile
>
> On 13/10/2011, at 5:23 PM, Carson Gross <carsongr...@gmail.com> wrote:
>
> > If you've got an XSD for your content, that's definitely the way to go. Maybe Dlank can chime in with advice as well.
>
> > Also, if it is feasible, setting the project up on Github would let us help out in case stuff gets straight crazy, like at the end of this video:
>
> > http://www.youtube.com/watch?v=WXtpNm_a4Us
>
> > Cheers,
> > Carson
>
> > On Wed, Oct 12, 2011 at 11:13 PM, Andrew Myers <am2...@gmail.com> wrote:
> > Thanks all.
>
> > My parsing previously has been quite "ad hoc". Because I'm parsing XML, it's been something like an XPath expression to get the value from, say, the 1st cell in the 3rd row of the 2nd table on the page.
>
> > So I'm not sure how, or if, an XSD fits into this scenario.
>
> > When the kids are asleep later I'll have a try and see how I go anyway.
>
> > Regards,
> > Andrew
>
> > Sent from my mobile
>
> > On 13/10/2011, at 1:27 PM, Peter Rexer <pre...@alum.mit.edu> wrote:
>
> >> An example that I've put together is a project on Github for integrating Jira with AgileZen. See https://github.com/prexer/Jira2AgileZen. As I show in that project, you can just drop a few XSD files somewhere in your classpath and then easily access XML documents that conform to those XSDs in Gosu. That project is also useful as a project template, in that the Gosu editor has the classpath arguments all set up, so it behaves pretty well. Jira has SOAP/RPC APIs, so I used Axis to create a jar file to access them (since SOAP APIs are pretty complex beasts), and for the AgileZen REST APIs I created a few XSDs.
>
> >> Hope that helps.
>
> >> On Wednesday, October 12, 2011, Andrew wrote:
> >> Hi,
>
> >> Is anyone able to point me towards an example of using Gosu to parse
> >> XML or XHTML?
>
> >> Currently I have a few projects that use TagSoup and Groovy to parse
> >> HTML, but I am starting up a new project and would like to see if I
> >> can do it using Gosu this time.
>
> >> Andrew.
>
> >> --
> >> You received this message because you are subscribed to the Google Groups "gosu-lang" group.
> >> To post to this group, send email to gosu...@googlegroups.com.
> >> To unsubscribe from this group, send email to gosu-lang+...@googlegroups.com.
> >> For more options, visit this group at http://groups.google.com/group/gosu-lang?hl=en.