Converting html to java

Felix Leipold

Apr 16, 2009, 10:41:39 AM
to hypirinha-users
Hi there,
 
As an exercise I wrote an HTML -> piranha Java code converter.
It's called missionary [1]; it takes HTML on stdin and writes Java code to stdout.
Initially I thought this might be a starting point for converting an existing page to use hypirinha. When testing, I realized that most pages out there are not only invalid XHTML, but not even well-formed...
So I put the HtmlCleaner library in front of the generator. It still doesn't necessarily produce valid code, but it's a starting point.
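
A minimal sketch of the generator step, for illustration only. It is not the missionary source: it assumes the input is already well-formed (so the HtmlCleaner step is left out and the JDK's XML parser is used instead), it ignores attributes, and the emitted call names (`text(...)` and one call per tag name) are hypothetical placeholders rather than the real hypirinha API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class MarkupToJavaSketch {

    /** Parses well-formed markup and returns a nested, Java-like builder expression. */
    public static String convert(String wellFormedMarkup) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        wellFormedMarkup.getBytes(StandardCharsets.UTF_8)));
        StringBuilder out = new StringBuilder();
        emit(doc.getDocumentElement(), out);
        return out.toString();
    }

    /** Emits one call per element, with its children as arguments. */
    private static void emit(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            // Hypothetical call name; the real builder API may differ.
            out.append("text(").append(quote(node.getNodeValue())).append(")");
            return;
        }
        Element element = (Element) node;
        out.append(element.getTagName()).append("(");
        NodeList children = element.getChildNodes();
        boolean first = true;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            // Comments, CDATA sections and the like are skipped in this sketch.
            if (type != Node.ELEMENT_NODE && type != Node.TEXT_NODE) {
                continue;
            }
            if (!first) {
                out.append(", ");
            }
            emit(child, out);
            first = false;
        }
        out.append(")");
    }

    /** Turns raw text into a Java string literal. */
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t") + "\"";
    }
}
```

With this sketch, `convert("<div><p>Hello</p><br/></div>")` returns `div(p(text("Hello")), br())`. Real pages would first need to go through a cleaner, as they are rarely well-formed.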
The interesting thing I realised is that the IDE gives you a whole lot of leverage. If you use IntelliJ's extract method refactoring in a smart way (i.e. extracting variables for things you expect to be parameters), it will actually find all occurrences of a certain pattern.
 
Cheers,
 
Felix
 
 
[1] http://wuetender-junger-mann.de/temp/missionary.jar (executable jar).

Alistair Jones

Apr 19, 2009, 6:28:31 PM
to hypirin...@googlegroups.com
Hi Felix,

This is really cool!

I've been playing around with the code so that it automatically
in-lines variables where possible - quite easy to do when you have the
DOM available. I'm also experimenting with a very basic servlet that
gets a specified URL and converts the content. I'm aiming to deploy
it to google app engine as a handy tool - could actually be pretty
useful.

Is there somewhere we should commit this code? Would you mind if I
created an SVN directory called "java-utils" under trunk, as a sibling
of "java"?
Or should we ignore google's antiquated use of SVN and use github like
the ruby kids?

-Alistair

Alistair Jones

Apr 19, 2009, 7:56:08 PM
to hypirin...@googlegroups.com
It turned out to be very easy to deploy to google app engine.

Try it out here:
http://hypirinha.appspot.com/html2hypirinha

-Alistair

Felix Leipold

Apr 26, 2009, 4:23:08 PM
to hypirin...@googlegroups.com
Hi Alistair,

That's really nice. If you want to check the code in, Google Code is surely the easiest way.
I like your optimisation. In theory the generated builder could be broken down into methods or even classes based on the DOM. So you could have rules like: all divs go in their own method unless they are wrapper divs. But actually I think this goes a bit too far.

Has Robert Rees contacted you for the geek night event he is planning in June?
It seems to be a good place to promote hypirinha...

Cheers,
Felix

Alistair Jones

Apr 29, 2009, 7:03:45 PM
to hypirin...@googlegroups.com
On Sun, Apr 26, 2009 at 9:23 PM, Felix Leipold
<felix....@googlemail.com> wrote:
> If you want to check the code in, Google Code is surely
> the easiest way.

Cool. I haven't done it yet though - need more time with an internet
connection.

> I like your optimisation. In theory the generated builder could be broken
> down into methods or even classes based on the DOM. So you could have rules
> like: all divs go in their own method unless they are wrapper divs. But
> actually I think this goes a bit too far.

Yes, it's difficult to know when to stop. Other things I thought of were:
1) condense whitespace in text nodes, because insignificant whitespace
gets rendered as "\t\t\t\t\t\t" etc.
2) improve the inline-variable-when-only-one-child rule to ignore
child text nodes that are only whitespace. It needs a bit of
tree-pruning.
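
A rough sketch of how both ideas might look on a plain `org.w3c.dom` tree. This is not the checked-in code; the class and method names are made up for illustration, and it treats all whitespace as insignificant, which is wrong inside elements such as `<pre>` and can also eat a meaningful space between inline elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class TreeTidySketch {

    /** Helper for trying the sketch out; expects well-formed markup. */
    public static Document parse(String wellFormedMarkup) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        wellFormedMarkup.getBytes(StandardCharsets.UTF_8)));
    }

    /** Idea 1: collapse every run of whitespace into a single space. */
    public static String condenseWhitespace(String text) {
        return text.replaceAll("\\s+", " ");
    }

    /**
     * Idea 2: prune the tree. Whitespace-only text nodes are removed,
     * other text nodes are condensed, and elements are visited recursively.
     */
    public static void prune(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            // Remember the sibling first, since the child may be removed.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE) {
                String value = child.getNodeValue();
                if (value.trim().isEmpty()) {
                    node.removeChild(child);
                } else {
                    child.setNodeValue(condenseWhitespace(value));
                }
            } else {
                prune(child);
            }
            child = next;
        }
    }

    /** After pruning, the only-one-child check no longer trips over whitespace. */
    public static boolean hasSingleChild(Node node) {
        return node.getChildNodes().getLength() == 1;
    }
}
```

Running `prune` before the inlining pass means a `div` whose only real content is one `p` surrounded by indentation counts as having a single child.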

> Has Robert Rees contacted you for the geek night event he is planning in
> June?
> It seems to be a good place to promote hypirinha...

Sounds great. I've signed up for a 5 minute slot.

cheers,

-Alistair

Alistair Jones

May 31, 2009, 5:52:07 AM
to hypirin...@googlegroups.com
Hi Felix,

I finally got round to checking in the missionary code you wrote. It's here:
http://hypirinha.googlecode.com/svn/trunk/java-extras/html2hypirinha

And deployed here:
http://hypirinha.appspot.com/html2hypirinha

I added you as a project owner to the google code project.

The minimal set of unit tests is running in cruise here:
http://build.hypirinha.org/cruise/tab/pipeline/history/html2hypirinha

I'll send you cruise login details off-list.

I'm afraid I slipped back into the bad habit of writing a build script
in bash (this time with the help of my new friend sed). If you fancy
changing it to use buildobjects, please go ahead, I'd be fascinated to
see how it goes.

-Alistair
