[Caml-list] About the O'Reilly book on the web

Francois Colonna

Nov 25, 2006, 1:42:10 PM
to caml...@yquem.inria.fr
Hello

in the version of the O'Reilly book on the web
http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora105.html#toc134

Chapter 11 about Str Library page 293

the following example of a regular expression is given:

let english_date_format = Str.regexp "[0-9]+\.[0-9]+\.[0-9]+" ;;


1 - if you know how to compile against a library and try to compile
this line, you will get a warning:

Warning X: illegal backslash escape in string.

Is that a typo? Who is in charge of this kind of correction?

2 - if you do not know how to compile against a library, you will not be able
to compile and link this line at all.

It would be nice to warn the reader about this difficulty (for example with a special icon)
and to let him find (quickly) a description of the rather
difficult compilation procedure needed in this case

Thanks

François Colonna

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Sebastien Ferre

Nov 27, 2006, 5:13:12 AM
to caml...@yquem.inria.fr
Hello,

Francois Colonna wrote:
> Hello
>
> in the version of the O'Reilly book on the web
> http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora105.html#toc134
>
> Chapter 11 about Str Library page 293
>
> the following example of a regular expression is given:
>
> let english_date_format = Str.regexp "[0-9]+\.[0-9]+\.[0-9]+" ;;

There should be a double backslash: \\.
Indeed, \ is a meta-character in regular expressions, but also
in ordinary strings.
In general, all backslashes in regular expressions must be
doubled when they are written as Caml strings. For instance, the
same happens with groups:

the regular expression : \([0-9]+\),\1

is represented by

let re = Str.regexp "\\([0-9]+\\),\\1"
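
A minimal sketch of both points, assuming the str library is linked in (for example with ocamlfind's -package str -linkpkg). The names repeated_number and matches are made up for the illustration; english_date_format is the name used in the book:

```ocaml
(* Minimal sketch; assumes the str library is linked into the program. *)

(* The date regexp from the book, with each regexp backslash doubled. *)
let english_date_format = Str.regexp "[0-9]+\\.[0-9]+\\.[0-9]+"

(* A group followed by a back-reference to that group. *)
let repeated_number = Str.regexp "\\([0-9]+\\),\\1"

(* [matches re s] is true when [re] matches a prefix of [s]. *)
let matches re s = Str.string_match re s 0

let () =
  Printf.printf "%b %b %b\n"
    (matches english_date_format "25.11.2006")
    (matches repeated_number "12,12")
    (matches repeated_number "12,13")
```

With the doubled backslashes the compiler no longer warns, and the string that reaches Str contains exactly one backslash per escape.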

Hope it helps,
Sebastien

Philippe Wang

Nov 28, 2006, 4:06:57 PM
to caml...@inria.fr
Hello,

If you look closer, you can see that the book covers version 2.04.

http://caml.inria.fr/pub/docs/oreilly-book/html/book-ora009.html

With OCaml 2.04, you do not get those warnings because they had not
been introduced yet.

Objective Caml version 2.04

# "[0-9]+\.[0-9]+\.[0-9]+";;
- : string = "[0-9]+\\.[0-9]+\\.[0-9]+"

You can probably avoid the warnings by backslashing your backslashes...

Still, I believe the OCaml team should find another way to express
regular expressions, because if \. and \\. both mean \\. then it is a
very bad idea...
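
For what it is worth, later OCaml releases (4.02 onwards) added quoted string literals, in which a backslash is an ordinary character. A minimal sketch, assuming such a compiler; the names quoted and escaped are only illustrative:

```ocaml
(* Quoted string literal: backslashes are taken literally (OCaml 4.02 or later). *)
let quoted = {|[0-9]+\.[0-9]+\.[0-9]+|}

(* Ordinary string literal: each regexp backslash has to be doubled. *)
let escaped = "[0-9]+\\.[0-9]+\\.[0-9]+"

(* Both literals denote the same string, so either can be given to Str.regexp. *)
let () = assert (quoted = escaped)
let () = print_endline quoted
```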

Well, enjoy OCaml :-)

--
Philippe Wang
ma...@philippewang.info


PS : I hardly understand what I'm supposed to do to post in this
mailing-list, so you have probably not received this one before, but
maybe you already have...
(I hope this time it'll work...)


Sebastien Ferre wrote:

Till Varoquaux

Nov 28, 2006, 5:37:09 PM
to Philippe Wang
hello!
On 11/28/06, Philippe Wang <li...@philippewang.info> wrote:
> Hello,
..

> # "[0-9]+\.[0-9]+\.[0-9]+";;
> - : string = "[0-9]+\\.[0-9]+\\.[0-9]+"
>
> You can probably avoid warnings by backslashing your backslashes...
>
> Still I believe the OCaml Team should find another way to express
> regular expressions, because if \. and \\. both mean \\. then it is a
> very bad idea...
Although I agree, the problem seems a little more complicated:
we are used to a more or less standard regexp syntax in which special
characters are escaped with \. This obviously clashes with the escaping
of characters in strings when we pass strings to the function that builds
the regular expressions....
I would recommend treating all warnings as errors:
-warn-error A
to avoid such conflicts.

As far as I'm concerned, the problem goes deeper:
regular expressions are neither syntax-checked nor type-checked
when they are specified through strings. Some languages integrate them
as first-class values, which allows these checks. Another
solution would be to build them using an OCaml recursive sum type.
Although this would solve the syntax problem, it would make regexps very
tedious to write. A library offering both options can be found at:
http://www.lri.fr/~marche/regexp/
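
To make the sum-type idea concrete, here is a rough sketch with a small backtracking matcher. It is not the API of the library above: the type re and the functions accept, matches and plus are invented for the illustration.

```ocaml
(* A regular expression as a recursive sum type: no string escaping at all. *)
type re =
  | Char of char            (* one given character *)
  | Range of char * char    (* any character between the two bounds *)
  | Seq of re * re          (* first regexp, then the second *)
  | Alt of re * re          (* either regexp *)
  | Star of re              (* zero or more repetitions *)

(* [accept r s i k] tries to match [r] at position [i] of [s] and passes
   the position reached to the continuation [k]; it backtracks on failure. *)
let rec accept r s i k =
  match r with
  | Char c -> i < String.length s && s.[i] = c && k (i + 1)
  | Range (lo, hi) ->
    i < String.length s && lo <= s.[i] && s.[i] <= hi && k (i + 1)
  | Seq (a, b) -> accept a s i (fun j -> accept b s j k)
  | Alt (a, b) -> accept a s i k || accept b s i k
  | Star a ->
    (* [j > i] makes sure every repetition consumes input, so this terminates. *)
    k i || accept a s i (fun j -> j > i && accept (Star a) s j k)

(* [matches r s] is true when [r] matches the whole of [s]. *)
let matches r s = accept r s 0 (fun j -> j = String.length s)

(* One or more repetitions, built from the constructors above. *)
let plus r = Seq (r, Star r)

(* The date format of the thread: digits, a dot, digits, a dot, digits. *)
let digits = plus (Range ('0', '9'))
let date = Seq (digits, Seq (Char '.', Seq (digits, Seq (Char '.', digits))))

let () = Printf.printf "%b\n" (matches date "25.11.2006")
```

The compiler checks the structure of date, but the definition is clearly longer than the one-line string version, which is the tedium mentioned above.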

Ideally one would want to precompile regular expressions from strings
to actual constructed types using a preprocessor (e.g. camlp4). It
seems Francois Potier was one of the first to try such an approach:
[http://caml.inria.fr/pub/ml-archives/caml-list/2001/07/30b327c7c4b0fa5ace86dbf258e2c5d1.en.html]
I'm pretty sure this has been done in other libraries (regexp-pp for
instance). Actual type-checking might prove a little harder to get
working.
Cheers,
Till
P.S.: I confirm: I did receive your mail ;-)...

Martin Jambon

Nov 28, 2006, 5:50:45 PM
to Till Varoquaux
On Tue, 28 Nov 2006, Till Varoquaux wrote:

> hello!
> On 11/28/06, Philippe Wang <li...@philippewang.info> wrote:
> > Hello,

> ...

You should definitely have a look at micmatch. It's backslash free!

main page: http://martin.jambon.free.fr/micmatch.html
tutorial: http://martin.jambon.free.fr/micmatch-howto.html
reference: http://martin.jambon.free.fr/micmatch-manual.html


Martin

--
Martin Jambon, PhD
http://martin.jambon.free.fr

Philippe Wang

Nov 28, 2006, 6:11:49 PM
to Till Varoquaux
Hello,

> Although I do agree the problem seems a little more complicated:
> we are used to a more or less standard regexp syntax where special
> chars can be escaped by \, this obviously clashes with escaping
> characters in string if we pass strings to the function defining the
> regular expressions....
> I would recommend treating all warnings as errors:
> -warn-error A
> to avoid such conflicts.

I am not used to using regexps with Caml... (although I sometimes write
scripts in OCaml instead of Bash or Perl, or even PHP...)
But systematically writing a double backslash for a single regexp
backslash and a quadruple backslash for a backslashed backslash... Well,
I wouldn't do it!


> As far as I'm concerned I find the problem to be more complicated:
> regular expressions are neither syntax-checked nor type-checked
> when specified through strings. Some languages integrate them
> as first class values thus allowing these verifications.

Last semester, some friends and I wrote a Caml compiler (one that keeps
type information at runtime), and one idea (which we did not
implement for lack of time) was first-class regexps, a bit
like in Perl, mixed with a match-with-like syntax... (If
only we had as much time as we want or need...)

It could be something "usable", like :

matchr (* the matchr keyword is an example *)
"some string"
with
| "PLOP 42 - \([0-9]\)" -> Int (int_of_string $1)
| "PLOP 43 - \(.*\)" -> $1

Then you can close it as in Coq with an "end" keyword, or just keep
the OCaml syntax... (implicit closing, whatever)

Well, of course, one drawback could be that $ is a character for infix
operators (but whatever, it's not so important)
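
Without any new syntax, something close to this can be approximated with the existing Str library. The sketch below is only an assumption about how one might do it: matchr and classify are hypothetical functions, the patterns are matched against a prefix of the string, and the str library must be linked.

```ocaml
(* Rough emulation of the proposed construct; assumes the str library is linked. *)

(* [matchr s cases] tries each (pattern, action) pair in order. The action
   of the first pattern matching a prefix of [s] receives a function that
   plays the role of $1, $2, ... and its result is returned in [Some]. *)
let matchr s cases =
  let rec try_cases = function
    | [] -> None
    | (pattern, action) :: rest ->
      if Str.string_match (Str.regexp pattern) s 0
      then Some (action (fun n -> Str.matched_group n s))
      else try_cases rest
  in
  try_cases cases

(* The two cases of the example; both branches return a string here. *)
let classify s =
  matchr s [
    "PLOP 42 - \\([0-9]\\)", (fun group -> "int:" ^ group 1);
    "PLOP 43 - \\(.*\\)", (fun group -> "str:" ^ group 1);
  ]

let () =
  match classify "PLOP 42 - 7" with
  | Some r -> print_endline r
  | None -> print_endline "no match"
```

The backslashes still have to be doubled, which is exactly what a dedicated syntax would avoid.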

Then the question is still "What do we want OCaml to be?"
Do we want to make regexps easy to use with OCaml ? And are we ready to
make OCaml bigger ("just") for that ?

> Another
> solution would be to build them using an Ocaml recursive sum type.
> Although this would solve the syntax problem it would make regexp very
> tedious to write. A library offering both options can be found at:
> http://www.lri.fr/~marche/regexp/

It looks really ... "not funny" I would say!


> Ideally one would want to precompile regular expression from strings
> to actual constructed types using a preprocessor (e.g. camlp4). It
> seems Francois Potier was one of the first to try such an approach:
> [http://caml.inria.fr/pub/ml-archives/caml-list/2001/07/30b327c7c4b0fa5ace86dbf258e2c5d1.en.html]
>
> I'm pretty sure this has been done in other libraries (regexp-pp for
> instance). Actual type-checking might prove a little harder to get
> working.

Actually I don't really see the typing problems...
If everything returned through $n has type string, then there is no
problem... We can also easily detect that $4 does not exist for a regexp
such as "\(PLOP[4-2]\) 42.*" :-p
(or decide that $4 would be an empty string, but that would seem a bit dirty)
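
Such a check could be sketched as follows. This is a simplification: the hypothetical functions count_groups and group_exists just count the group openers in a Str-style pattern and do not treat character classes specially.

```ocaml
(* [count_groups pattern] counts the groups of a Str-style pattern, that is
   the occurrences of a backslash followed by an opening parenthesis.
   Any other backslash escape is skipped as a two-character unit. *)
let count_groups pattern =
  let n = String.length pattern in
  let rec go i acc =
    if i >= n - 1 then acc
    else if pattern.[i] = '\\' then
      go (i + 2) (if pattern.[i + 1] = '(' then acc + 1 else acc)
    else go (i + 1) acc
  in
  go 0 0

(* [group_exists pattern k] tells whether $k refers to a group of [pattern]. *)
let group_exists pattern k = k >= 1 && k <= count_groups pattern

let () =
  let pattern = "\\(PLOP[4-2]\\) 42.*" in
  Printf.printf "%d %b %b\n"
    (count_groups pattern) (group_exists pattern 1) (group_exists pattern 4)
```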

Anyways, it would probably be more "a good thing" than "a bad thing"...

Philippe

> P.S.: I confirm: I did receive your mail ;-)...

PS: I still haven't understood how this works...
Because the mail that went through is the one that was sent with the
wrong address o_O
Maybe I'll understand some day soon...
(so I'm trying again with the "wrong address" since that seems to work
better)

Philippe Wang

Nov 28, 2006, 7:23:07 PM
to Martin Jambon

Martin Jambon wrote:

> You should definitely have a look at micmatch. It's backslash free!
>
> main page: http://martin.jambon.free.fr/micmatch.html
> tutorial: http://martin.jambon.free.fr/micmatch-howto.html
> reference: http://martin.jambon.free.fr/micmatch-manual.html

But isn't that making things "different again"?


I know so many people that do not want to use OCaml just because its
syntax is (very) "different"...

(don't tell me about the alternative syntax, which is - to me - an
"horror"...)

--
Philippe Wang
ma...@philippewang.info

Martin Jambon

Nov 28, 2006, 8:50:41 PM
to Philippe Wang
On Wed, 29 Nov 2006, Philippe Wang wrote:

> Martin Jambon wrote:
>
>> You should definitely have a look at micmatch. It's backslash free!
>>
>> main page: http://martin.jambon.free.fr/micmatch.html
>> tutorial: http://martin.jambon.free.fr/micmatch-howto.html
>> reference: http://martin.jambon.free.fr/micmatch-manual.html
>
> But isn't that making things "different again"?

You can't make things better without making them different. OCaml is about
being better, and so is the syntax I chose for regexps. It is fully
compatible with the syntax used by ocamllex, and I must say ocamllex
regexps are incredibly easy to learn and to use. I never had any problem
with them. In comparison Str or PCRE regexps are truly horrible.

> I know so many people that do not want to use OCaml just because its syntax
> is (very) "different"...

I don't think that those people would be more satisfied with another
syntax anyway, because OCaml would still be different! It's just that
average people are afraid of anything that is different from what they
already know.

> (don't tell me about the alternative syntax, which is - to me - an
> "horror"...)

OK. Here is what I propose to whoever thinks OCaml's syntax is not
good: give me a *complete* description of the syntax that you want and
I'll implement it.

Philippe Wang

Nov 29, 2006, 10:31:16 AM
to Martin Jambon
Hello,

> You can't make things better without making them different. OCaml is
> about being better, and so is the syntax I chose for regexps. It is fully
> compatible with the syntax used by ocamllex, and I must say ocamllex
> regexps are incredibly easy to learn and to use. I never had any problem
> with them. In comparison Str or PCRE regexps are truly horrible.

Maybe if regexps were made first-class values (at least in the
syntax, whatever is done behind the scenes...), as in Perl, they would be
easy to use. Well, I will take a closer look at micmatch; maybe
it is actually really easy to use.

I will think about it when I have time for that.

> I don't think that those people would be more satisfied with another
> syntax anyway, because OCaml would still be different! It's just that
> average people are afraid of anything that is different from what they
> already know.

Indeed, you're probably right. Still, I hope not.


> OK. Here is what I propose to whoever thinks OCaml's syntax is not good:
> give me a *complete* description of the syntax that you want and I'll
> implement it.

That's really hard :-D

I like the OCaml syntax very much (I must be crazy :-D)
(but definitely not the one of "Camlp4 Chapter 6 : The Revised syntax")


--
Philippe Wang

Philippe Wang

Nov 29, 2006, 1:12:43 PM
to brogoff
brogoff wrote:

> That would be a more interesting comment if you gave some reasons
> as to why you believe that. I prefer the Revised syntax, for reasons
> of overall consistency and because it removes a few gotchas, but for
> various nontechnical reasons (tiny user community, questions about the
> future of CamlP4 and the level of support for it, etc.) would not
> switch over.

Maybe it's because I know the standard syntax quite well.
Or maybe because there are some things that are too weird in the revised
syntax, like lists stuff.

Like this:

OCaml           Revised
x::y::z::t      [x::[y::[z::t]]]
x::y::z::t      [x; y; z :: t]

=> It's too weird for me.

The reversed notation for types: I don't like it either.
(maybe just because I'm not used to it)

In the declaration of a concrete type, brackets must enclose the constructor
declarations:

OCaml                    Revised
type t = A of i | B;;    type t = [ A of i | B ];

Why is it so much better to add brackets? To me they are useless...
Do they really make things clearer for some people?

Well, I am not going to list everything I like and everything I don't.
Of course there are things that are potentially "better", like parentheses
around tuples. But I prefer not having to put them systematically.
There are good ideas in the revised syntax, but it doesn't fit my tastes 8-)

--
Philippe Wang

Till Varoquaux

Nov 29, 2006, 4:27:04 PM
to Jon Harrop
On 11/29/06, Jon Harrop <j...@ffconsultancy.com> wrote:

> On Wednesday 29 November 2006 17:25, brogoff wrote:
> > questions about the future of CamlP4
>
> I thought the upcoming, revamped camlp4 was one of the hotly anticipated new
> features scheduled for OCaml 4?
Scheduled for ocaml 3.10...
>
> --
> Dr Jon D Harrop, Flying Frog Consultancy Ltd.
> Objective CAML for Scientists
> http://www.ffconsultancy.com/products/ocaml_for_scientists

skaller

Nov 29, 2006, 9:34:07 PM
to Philippe Wang
On Wed, 2006-11-29 at 19:10 +0100, Philippe Wang wrote:
> brogoff wrote:
>
> > That would be a more interesting comment if you gave some reasons
> > as to why you believe that. I prefer the Revised syntax, for reasons
> > of overall consistency and because it removes a few gotchas, but for
> > various nontechnical reasons (tiny user community, questions about the
> > future of CamlP4 and the level of support for it, etc.) would not
> > switch over.
>
> Maybe it's because I know the standard syntax quite well.
> Or maybe because there are some things that are too weird in the revised
> syntax, like lists stuff.

What might actually be interesting and useful is standard conforming
Standard MetaLanguage (SML) syntax, or a good subset of it.

I wonder how far that could go? Is there anything in SML that
you can't do in Ocaml with similar enough syntax that Camlp4
could cope with it?

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net

Tom

Nov 30, 2006, 1:23:57 PM
to skaller
>
>
> I wonder how far that could go? Is there anything in SML that
> you can't do in Ocaml with similar enough syntax that Camlp4
> could cope with it?
>

For me, personally, the question is not whether it can be done, but whether
I want it or not!

I am used to OCaml and don't want to switch to the unfamiliar, somewhat
strange syntax of SML. The webpage
http://www.ps.uni-sb.de/~rossberg/SMLvsOcaml.html gives
a (seemingly) thorough comparison between SML and OCaml (both syntax
and language features are compared). The following are only some of the
things I would have a hard time coping with:

characters written as #"J" instead of 'J',
fn x => e instead of fun x -> e,
case ... of instead of match ... with,
different declarations for values and functions (val and fun; in OCaml only let),
datatype instead of type, plus eqtypes,
strange multiple-value definitions.
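
For reference, the OCaml side of a few of these differences can be sketched as follows, with the SML spelling in comments. The names initial, add_one, describe and shape are made up for the illustration.

```ocaml
(* SML: val initial = #"J" *)
let initial = 'J'

(* SML: val add_one = fn x => x + 1 *)
let add_one = fun x -> x + 1

(* SML: fun describe n = case n of 0 => "zero" | _ => "nonzero" *)
let describe n = match n with 0 -> "zero" | _ -> "nonzero"

(* SML: datatype shape = Circle of real | Square of real *)
type shape = Circle of float | Square of float

let () = Printf.printf "%c %d %s\n" initial (add_one 1) (describe 0)
```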

These are only minor differences, but if one is accustomed to one flavour, one
would suffer when forcibly introduced to another.

By the way, there are also some strange syntax structures introduced by
camlp4 that I don't like...

- Tom

skaller

Nov 30, 2006, 10:38:35 PM
to Tom
On Thu, 2006-11-30 at 19:20 +0100, Tom wrote:
>
> I wonder how far that could go? Is there anything in SML that
> you can't do in Ocaml with similar enough syntax that Camlp4
> could cope with it?
>
> For me, personally, the question is not whether it can be done, but
> whether I want it or not!

The point is to reuse existing SML code, not to write your
new code in SML, though it may be useful to do that
sometimes too.

For example, if you want high performance you might want
the option of using the whole-program analyser MLton
for the final product, but use OCaml for development.

I actually have a vague interest in that. At least in part,
being able to use a *standardised* syntax, good or not,
may offer some advantages.

Tom

Dec 1, 2006, 1:50:11 AM
to skaller
Hm.... but what about the libraries and the semantic incompatibilities?

- Tom

piggybox

Dec 1, 2006, 2:00:48 AM
to
Is there any FP language that implements direct regexp pattern matching? I
know of none and I don't understand why not. The case/switch control
structure in Perl or Ruby is less powerful than the pattern matching in
FPLs (no guards or variable binding), yet one great thing is that it
can match regexps directly.
