I'm refamiliarising myself, taking a small project that I could write
in several other languages in an hour, but doing it as a learning project
instead of a quickie implementation, finding out what's available
nowadays.
The project is to take some overly dictatorial HTML and liberate it.
You know the kind of thing: take out all the excess <br>'s and &nbsp;'s,
figure out the real structure they're hiding, and re-emit it using <p>,
<h5>, and other proper tags.
Now I can do all this from scratch, reading characters from ports and
writing code to lex it and parse it and recognise idioms and such ...
but that's so nose-to-the-grindstone.
Isn't the whole point of these new Scheme systems that we work at a
more conceptual level?
So I ask. What tools and libraries are already there to make this kind
of task easier? Or more elaborate tasks of this kind -- because I will
run into them later. I look on the web and find myself lost in beginner
documentation. I'm looking for links to heavier stuff.
-- hendrik
_________________________________________________
For list-related administrative tasks:
http://lists.racket-lang.org/listinfo/users
It sounds like a good setup would be:
* Use the `html' library to parse the files into xexprs (which are
a simple sexpr representation for html).
* Tweak the result with lots of uses of `match', which is usually very
convenient for these kinds of things. (Possibly going through
writing your own match patterns, if needed.)
* Spit out a new file using the html library.
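
A rough sketch of those three steps, under some assumptions: the helper
names `html-string->xexprs' and `clean' are made up for illustration, and
the "drop every <br>" rule only stands in for whatever clean-up is really
wanted. It relies on `read-html-as-xml' from the html library, and on
`xml->xexpr' and `xexpr->string' from the xml library.

```racket
#lang racket
;; Sketch only: parse HTML to xexprs, rewrite with `match', write it back out.
;; only-in is used because html also exports one struct per HTML element,
;; and some of those names could clash with racket's own bindings.
(require (only-in html read-html-as-xml)
         (only-in xml xml->xexpr xexpr->string))

;; Hypothetical helper: parse a string of HTML into a list of xexprs.
(define (html-string->xexprs str)
  (map xml->xexpr (read-html-as-xml (open-input-string str))))

;; Hypothetical clean-up rule: drop every <br> and keep everything else.
;; clean : xexpr -> (listof xexpr), so a rule can delete or splice nodes.
(define (clean x)
  (match x
    ;; <br>, with or without an attribute list: remove it
    [(list 'br _ ...) '()]
    ;; element with an attribute list: keep attributes, clean the children
    [(list (? symbol? tag) (and attrs (list (list _ _) ...)) kids ...)
     (list `(,tag ,attrs ,@(append-map clean kids)))]
    ;; element without an attribute list
    [(list (? symbol? tag) kids ...)
     (list `(,tag ,@(append-map clean kids)))]
    ;; strings, entities and anything else pass through unchanged
    [other (list other)]))

;; Try the rule on a literal xexpr of the shape xml->xexpr produces.
(define doc '(body () (p () "First line" (br ()) "second line")))
(define cleaned (append-map clean (list doc)))
(for-each (lambda (x) (displayln (xexpr->string x))) cleaned)
```

Returning a list from `clean' is a design choice: it lets one clause
remove a node and, later, lets another clause replace one node with
several. For real input, the result of `html-string->xexprs' would take
the place of `(list doc)'.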
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!
Parsing:
http://planet.racket-lang.org/display.ss?package=htmlprag.plt&owner=neil
Pattern matching (don't know if this still works):
http://planet.racket-lang.org/display.ss?package=sxml-match.plt&owner=jim
N.
On Sat, Jun 4, 2011 at 9:18 PM, Hendrik Boom <hen...@topoi.pooq.com> wrote:
> New to Racket, old to Scheme and Lisp (but I've been away from actively
> using them for decades, and much has changed).
Thanks. This gave me the search terms I needed to find:
section 17.2 of
http://pre.racket-lang.org/docs/html/reference/collects.html
which describes how libraries work, and
http://download.plt-scheme.org/doc/html/
which contains a link to some docs for the html library, at
http://download.plt-scheme.org/doc/html/html/index.html
Are these the official locations for this information?
I'm still on DrScheme, which is what's in the stable version of Debian,
which is currently on my laptop. Presumably this will be replaced by
DrRacket when I upgrade to testing.
>
> * Tweak the result with lots of uses of `match', which is usually very
> convenient for these kinds of things. (Possibly going through
> writing your own match patterns, if needed.)
There are examples of this in http://download.plt-scheme.org/doc/html/html/index.html
Jim Bender's "sxml-match" works great, and is a useful tool that I think
anyone doing XML processing in Racket should keep handy in their toolbox.
--
http://www.neilvandyke.org/