How can I use another HTML parser?

Demián Andrés Rodriguez

Mar 12, 2011, 7:44:14 PM
to zomb...@googlegroups.com
I've been trying to use Zombie for the last 3 weeks, but unfortunately every URL I tested failed with nasty HTML parsing errors.
I reported at least 5 issues related to the html5 parser and jsdom, but nothing has been fixed yet, and it seems those modules are not in active development...
Is there any way to use another HTML parser?

Demián Andrés Rodriguez

Mar 12, 2011, 7:44:27 PM
to zomb...@googlegroups.com

Damian Janowski

Mar 14, 2011, 4:46:55 PM
to zomb...@googlegroups.com

AFAIK jsdom is in active development.

In any case, Zombie is not an automation tool; it's meant for writing
tests for an application you write, not someone else's.

Demián Andrés Rodriguez

Mar 14, 2011, 5:44:30 PM
to zomb...@googlegroups.com
Well, I could have been the author of the page that breaks the html5 parser. I don't see any difference: if I need to test an application written by me, I need an automated tool...
I may have seen a jsdom option to use an older HTML parser. Is it possible to change that option in Zombie?

Damian Janowski

Mar 14, 2011, 5:49:13 PM
to zomb...@googlegroups.com, Demián Andrés Rodriguez
On Mon, Mar 14, 2011 at 6:44 PM, Demián Andrés Rodriguez
<demi...@gmail.com> wrote:
> Well, I could have been the author of the page that breaks the html5
> parser. I don't see any difference: if I need to test an application written
> by me, I need an automated tool...

If you write the application, you can fix the HTML.

If the page is valid HTML and it still breaks the parser, report the issue.

Demián Andrés Rodriguez

Mar 14, 2011, 8:24:39 PM
to zomb...@googlegroups.com, Demián Andrés Rodriguez
OK, but that doesn't make any sense. The html5 parser should behave the way any browser does. If I visit a page that contains invalid HTML, should my browser crash? The parser has to be good enough to deal with any kind of error...

What I don't get is why Zombie has to be used only to test your own applications. What I need right now is to emulate a user who visits a page that requires JavaScript to work properly, execute some functions, and parse the resulting HTML to extract some info. How am I supposed to do that without Zombie? I think this library is great, but I would never use it to test applications; I would use it primarily to emulate a browser. Is that illegal? I don't get it...
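
For reference, the workflow described here (visit a page, let its JavaScript run, call a function, pull some text out of the result) could be sketched roughly as below. This is only a sketch under assumptions: `extractAfterScript` is a hypothetical helper, not part of Zombie, and it assumes a browser object with `visit(url)` returning a promise plus `evaluate(code)` and `text(selector)`, which is how Zombie's `Browser` behaves in recent releases as far as I know. It does nothing about the parser errors discussed in this thread.

```javascript
// Sketch only: `browser` is assumed to expose visit(url) returning a promise,
// evaluate(code), and text(selector). extractAfterScript is a hypothetical
// helper written for illustration; it is not part of Zombie.
async function extractAfterScript(browser, url, script, selector) {
  // Load the page and wait for the visit to settle.
  await browser.visit(url);

  // Optionally run a snippet in the page's context,
  // for example a call to one of the page's own functions.
  if (script) {
    browser.evaluate(script);
  }

  // Read the text content of the element(s) matching the selector.
  return browser.text(selector).trim();
}
```

With Zombie installed, a call might look like `extractAfterScript(new (require('zombie'))(), 'http://localhost:3000/', 'loadResults()', '#results')`, where the URL, the function name and the selector are placeholders. Passing the browser in as an argument keeps the helper independent of any particular Zombie version.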

Assaf Arkin

Mar 14, 2011, 9:45:55 PM
to zomb...@googlegroups.com, Demián Andrés Rodriguez

On Monday, March 14, 2011 at 5:24 PM, Demián Andrés Rodriguez wrote:

> OK, but that doesn't make any sense. The html5 parser should behave the way any browser does. If I visit a page that contains invalid HTML, should my browser crash? The parser has to be good enough to deal with any kind of error...
>
> What I don't get is why Zombie has to be used only to test your own applications. What I need right now is to emulate a user who visits a page that requires JavaScript to work properly, execute some functions, and parse the resulting HTML to extract some info. How am I supposed to do that without Zombie? I think this library is great, but I would never use it to test applications; I would use it primarily to emulate a browser. Is that illegal? I don't get it...

Zombie.js is an open source project released under a permissive license. There's no field of use restriction. I'm not going to tell you what you can or can't do with Zombie. In fact, you get full access to the source code, test suite, documentation, build scripts, everything you need to modify and enhance it to your heart's content.

I really like these two attributes of open source — you decide what and how to use the code, and if you don't like how it works, you can always change that.


That said, I feel I should set some expectations:

One. Zombie was conceived and developed for testing Web applications. It's great that you can use it for other purposes, kick ass that it's good for more than one thing. But all the design decisions are biased towards testing. Reason is ...

Two. The people who contribute their time and energy to improve Zombie are passionate about testing. Naturally, we invest our time and attention in making Zombie the best test tool possible. As is the case with open source, that which is interesting to contributors gets fixed. So ...

Three. Until you or anyone else starts investing in fixing issues, adding features, answering questions, writing documentation, etc. that relate to Web automation, the scope will remain limited to what existing contributors are interested in contributing to. That is why ...

Four. Zombie does not advertise itself as a tool for Web automation. I'd hate for it to promise something it can't deliver. In the meantime ...

Five. It's only broken if it doesn't do what it was designed to do.

Demián Andrés Rodriguez

Mar 14, 2011, 10:50:25 PM
to zomb...@googlegroups.com, Demián Andrés Rodriguez
If Zombie does not advertise itself as a web automation tool, what do you call testing a web page? Isn't that an automated process? A user does not click the link; Zombie does it for me...
Well, the fact is that there is nothing wrong with Zombie; the problem is the html5 parser, which cannot handle HTML errors.
I'll try to debug and fix the errors I found when I have time.