[racket] Reacquainting myself

Hendrik Boom

Jun 4, 2011, 4:18:07 PM

to us...@racket-lang.org

New to Racket, old to Scheme and Lisp (but I've been away from actively
using them for decades, and much has changed).

I'm refamiliarising myself, taking a small project that I could write
in several other languages in an hour, but doing it as a learning project
instead of a quickie implementation, to find out what's available
nowadays.

The project is to take some overly dictatorial HTML and liberate it.
You know the kind of thing: take out all the excess <br>'s and &nbsp;'s,
figure out the real structure they're hiding, and reemit it using <p>,
<h5>, and other proper tags.

Now I can do all this from scratch, reading characters from ports and
writing code to lex it and parse it and recognise idioms and such ...
but that's so nose-to-the-grindstone.

Isn't the whole point of these new Scheme systems that we work at a
more conceptual level?

So I ask. What tools and libraries are already there to make this kind
of task easier? Or more elaborate tasks of this kind -- because I will
run into them later. I look on the web and find myself lost in beginner
documentation. I'm looking for links to heavier stuff.

-- hendrik

_________________________________________________
For list-related administrative tasks:
http://lists.racket-lang.org/listinfo/users

Eli Barzilay

Jun 4, 2011, 5:12:10 PM

to Hendrik Boom, us...@racket-lang.org

50 minutes ago, Hendrik Boom wrote:
>
> So I ask. What tools and libraries are already there to make this
> kind of task easier? Or more elaborate tasks of this kind --
> because I will run into them later. I look on the web and find
> myself lost in beginner documentation. I'm looking for links to
> heavier stuff.

It sounds like a good setup would be:

* Use the `html' library to parse the files into xexprs (which are
a simple sexpr representation for html).

* Tweak the result with lots of uses of `match', which is usually very
convenient for these kinds of things. (Possibly going through
writing your own match patterns, if needed.)

* Spit out a new file using the html library.
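
The three steps above might look roughly like the following in current Racket. This is only a sketch under assumptions: the helper names `html-port->xexprs`, `br?`, `tidy-children`, and `tidy` are made up for illustration, and the cleanup rule (drop every &nbsp; entity, collapse each run of <br> into one) is a simplistic stand-in for whatever structure recovery the real input needs. The parsing helper relies on `read-html-as-xml` from the `html' library and on `xml->xexpr' from the `xml' library.

```racket
#lang racket
;; Import only the names used, to avoid clashes between the two libraries.
(require (only-in html read-html-as-xml)
         (only-in xml xml->xexpr xexpr->string))

;; Step 1: parse HTML from an input port into a list of xexprs.
(define (html-port->xexprs in)
  (map xml->xexpr (read-html-as-xml in)))

;; Is this xexpr a <br> element (with or without attributes)?
(define (br? x)
  (match x
    [(list 'br _ ...) #t]
    [_ #f]))

;; Step 2: tidy a list of child xexprs.
(define (tidy-children kids)
  (match kids
    ['() '()]
    ;; An &nbsp; entity appears as the symbol nbsp or the code point 160.
    [(cons (or 'nbsp 160) more) (tidy-children more)]
    ;; Two <br> in a row: keep the first, drop the second, and look again.
    [(list (? br? b) (? br?) more ...) (tidy-children (cons b more))]
    [(cons kid more) (cons (tidy kid) (tidy-children more))]))

;; Tidy one xexpr, recurring into elements and leaving other content alone.
(define (tidy x)
  (match x
    ;; Element with an attribute list: (tag ((name "value") ...) child ...)
    [(list (? symbol? tag)
           (and attrs (list (list (? symbol?) (? string?)) ...))
           kids ...)
     `(,tag ,attrs ,@(tidy-children kids))]
    ;; Element without an attribute list: (tag child ...)
    [(list (? symbol? tag) kids ...)
     `(,tag ,@(tidy-children kids))]
    ;; Strings, entities, and anything else pass through unchanged.
    [_ x]))
```

Step 3 is then a matter of calling `xexpr->string' on each tidied xexpr and writing the result to an output port. Real structure recovery (deciding which runs of <br> mean a paragraph break, for example) would add more `match' clauses to `tidy-children'.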

--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!

Noel Welsh

Jun 4, 2011, 5:25:19 PM

to Hendrik Boom, us...@racket-lang.org

In addition to Eli's suggestions...

Parsing:

http://planet.racket-lang.org/display.ss?package=htmlprag.plt&owner=neil

Pattern matching (don't know if this still works):

http://planet.racket-lang.org/display.ss?package=sxml-match.plt&owner=jim

N.

On Sat, Jun 4, 2011 at 9:18 PM, Hendrik Boom <hen...@topoi.pooq.com> wrote:
> New to Racket, old to Scheme and Lisp (but I've been away from actively
> using them for decades, and much has changed).

Hendrik Boom

Jun 4, 2011, 9:44:18 PM

to Eli Barzilay, us...@racket-lang.org

On Sat, Jun 04, 2011 at 05:12:10PM -0400, Eli Barzilay wrote:
> 50 minutes ago, Hendrik Boom wrote:
> >
> > So I ask. What tools and libraries are already there to make this
> > kind of task easier? Or more elaborate tasks of this kind --
> > because I will run into them later. I look on the web and find
> > myself lost in beginner documentation. I'm looking for links to
> > heavier stuff.
>
> It sounds like a good setup would be:
>
> * Use the `html' library to parse the files into xexprs (which are
> a simple sexpr representation for html).

Thanks. This gave me the search terms I needed to find:

section 17.2 of
http://pre.racket-lang.org/docs/html/reference/collects.html

which describes how libraries work, and

http://download.plt-scheme.org/doc/html/

which contains a link to some docs for the html library, at

http://download.plt-scheme.org/doc/html/html/index.html

Are these the official locations for this information?

I'm still on DrScheme, which is what's in the stable version of Debian,
which is currently on my laptop. Presumably this will be replaced by
DrRacket when I upgrade to testing.

>
> * Tweak the result with lots of uses of `match', which is usually very
> convenient for these kinds of things. (Possibly going through
> writing your own match patterns, if needed.)

There are examples of this in http://download.plt-scheme.org/doc/html/html/index.html

Neil Van Dyke

Jun 5, 2011, 1:24:16 PM

to us...@racket-lang.org

Noel Welsh wrote at 06/04/2011 05:25 PM:
> Pattern matching (don't know if this still works):
>
> http://planet.racket-lang.org/display.ss?package=sxml-match.plt&owner=jim
>

Jim Bender's "sxml-match" works great, and is a useful tool that I think
anyone doing XML processing in Racket should keep handy in their toolbox.

--
http://www.neilvandyke.org/
