[Caml-list] Patch to 3.10.0 compiler enabling simple spell-checking


Edgar Friendly

Oct 29, 2007, 5:12:04 PM
to Caml List
One random little feature of GNAT that comes in handy for me is its
habit of, when I misspell an identifier, giving me a possible correction
in its compile error message. Spending some time with the 3.10.0
sources, I have created a "second draft" patch creating this
functionality in my favored language.

Example:
========

# /home/thelema/Projects/ocaml-custom/bin/ocamlc -o coml -I +lablgtk2 lablgtk.cma gtkInit.cmo coml.ml
File "coml.ml", line 61, characters 16-25:
Unbound value is_arcive, possible misspelling of is_archive

Impacts:
========

Efficiency in the case of finding a mistake should be quite good,
although this shouldn't matter too much since the compiler quits pretty
early in compilation when it finds an unbound identifier.

In the case of no unbound identifiers, the cost is an extra try/with
block around the standard lookup. I haven't made any benchmarks, though.

I expect this code to cause few long-term maintenance issues: the
major source of code changes was adding a "* string list" to a number of
exceptions to carry the list of possible correct spellings to the point
where the compiler prints them. These exceptions are still usable as
before with an empty list in this spot.
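
As a rough illustration of the two points above (the try/with around the
standard lookup, and the extra "* string list" on the exceptions), here
is a minimal standalone sketch. It is not taken from the patch: the
environment is a plain string map rather than the compiler's own tables,
and lookup, looks_similar and report are hypothetical names invented for
the example. The similarity test is a deliberately crude placeholder.

```ocaml
(* Hypothetical, standalone sketch; the real patch modifies the
   compiler's own Unbound_* exceptions. *)
module StringMap = Map.Make (String)

(* The extra "string list" carries candidate spellings; [] means none,
   which keeps the exception usable exactly as before. *)
exception Unbound_value of string * string list

(* Placeholder similarity test (an assumption of this sketch): same
   first character and lengths differing by at most one. *)
let looks_similar a b =
  a <> "" && b <> "" && a.[0] = b.[0]
  && abs (String.length a - String.length b) <= 1

let lookup name env =
  (* In the common case the only extra cost is this try/with. *)
  try StringMap.find name env
  with Not_found ->
    (* Only on failure: walk the table once more to collect candidates. *)
    let candidates =
      StringMap.fold
        (fun key _ acc -> if looks_similar name key then key :: acc else acc)
        env []
    in
    raise (Unbound_value (name, List.rev candidates))

(* Turn the exception into a message at the point where it is printed. *)
let report = function
  | Unbound_value (name, []) -> Printf.sprintf "Unbound value %s" name
  | Unbound_value (name, first :: _) ->
      Printf.sprintf "Unbound value %s, possible misspelling of %s" name first
  | exn -> Printexc.to_string exn
```

Code that raises or matches the exception without caring about
suggestions simply passes or ignores the empty list.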

It's possible the code has created opportunities for uncaught exceptions
in the compiler, as I only checked for instances of "Not_found" in a few
files -- those which deal with the Unbound_* exceptions. Someone who
knows the internals better might find other places where the
"Found_nearly" exception, which carries the possible corrections, could
escape.


Dedicated to:
Yaron Minsky and the team at Jane Street

E.

ocaml-spelling.patch

Till Varoquaux

Oct 29, 2007, 5:34:50 PM
to Edgar Friendly, Caml List
Cool!

Haven't looked at the patch yet but this seems like a neat feature
(might be a little too much of a gadget but hey, I love gadgets).

I am curious. Why is this dedicated to Jane Street? Since I am
probably the worst at typing out here (and even though they bought me
a shiny shiny keyboard) I will take this patch as a personal
intention ;-).

Till

> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
>


--
http://till-varoquaux.blogspot.com/

Julien Moutinho

Oct 29, 2007, 7:32:49 PM
to Caml List
On Mon, Oct 29, 2007 at 04:11:12PM -0500, Edgar Friendly wrote:
> Spending some time with the 3.10.0 sources, I have created
> a "second draft" patch creating this functionality
> in my favored language.

I'm sorry but could it be that you have posted an incomplete patch?

For instance typing/typetexp.ml should be modified, because
it defines [Unbound_type_constructor of Longident.t]
which is used in b/typing/typecore.ml
as a [of Longident.t * string list]

Besides [find_name_with_nearly] is defined in b/typing/ident.ml
but is never used anywhere.

Also, could you post a patch against today's release310 branch please?

Regards,
Julien.

Yitzhak Mandelbaum

Oct 29, 2007, 8:18:10 PM
to Caml List
Very cool! Do you think there's any way you could separate it from
the compiler, like Lerner et al.'s SEMINAL work, which separates
type error messages from the compiler? Separation could help ensure
that this idea (and any similar ones) doesn't accidentally introduce bugs
into the compiler, and make it much easier for you to maintain. A
very simple hack might be to wrap ocamlc in a script that parses
such error messages and then tokenizes the source file, looking for
similar strings?
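
To make the suggested hack concrete, here is a minimal sketch of the
matching step only, under several assumptions: the wrapper has already
captured the compiler's error line and the source text as strings, the
error line starts with "Unbound value " as in the example earlier in
this thread, and similarity means Levenshtein edit distance. All
function names are hypothetical, and the tokenizer is deliberately
crude.

```ocaml
(* Levenshtein edit distance, computed with two rows of the usual table. *)
let edit_distance a b =
  let la = String.length a and lb = String.length b in
  let prev = Array.init (lb + 1) (fun j -> j) in
  let curr = Array.make (lb + 1) 0 in
  for i = 1 to la do
    curr.(0) <- i;
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      curr.(j) <-
        min (min (prev.(j) + 1) (curr.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit curr 0 prev 0 (lb + 1)
  done;
  prev.(lb)

let is_ident_char = function
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '_' | '\'' -> true
  | _ -> false

(* Crude tokenizer: maximal runs of identifier characters, deduplicated.
   It knows nothing about comments, strings, scope or namespaces. *)
let identifiers source =
  let buf = Buffer.create 16 and acc = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      acc := Buffer.contents buf :: !acc;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c -> if is_ident_char c then Buffer.add_char buf c else flush ())
    source;
  flush ();
  List.sort_uniq compare !acc

(* Extract the name from an error line; the prefix is an assumption. *)
let unbound_name line =
  let prefix = "Unbound value " in
  let n = String.length prefix in
  if String.length line > n && String.sub line 0 n = prefix
  then Some (String.sub line n (String.length line - n))
  else None

(* Identifiers in the source within max_distance edits of the unbound name. *)
let suggestions ~max_distance source line =
  match unbound_name line with
  | None -> []
  | Some name ->
      List.filter
        (fun id -> id <> name && edit_distance id name <= max_distance)
        (identifiers source)
```

A wrapper script would call suggestions on each error line it sees. Note
that this only finds identifiers spelled out in the same file; names
defined in other modules are invisible to it.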

Cheers,
Yitzhak

-----------------------------
Yitzhak Mandelbaum

Edgar Friendly

Oct 30, 2007, 1:52:47 AM
to Caml List
Julien Moutinho wrote:
> On Mon, Oct 29, 2007 at 04:11:12PM -0500, Edgar Friendly wrote:
>> Spending some time with the 3.10.0 sources, I have created
>> a "second draft" patch creating this functionality
>> in my favored language.
>
> I'm sorry but could it be that you have posted an incomplete patch?

I did. Here's a "third draft" which should include all the necessary
bits to patch against 3.10.0. There are still plenty of rough edges to
smooth out in the patch, but it should suffice for people to have
something working.

E.


ocaml-spelling-3.10.0.patch

Edgar Friendly

Oct 30, 2007, 1:54:03 AM
to Caml List
Yitzhak Mandelbaum wrote:
> Very cool! Do you think there's any way you could separate it from the
> compiler, like Lerner et al.'s SEMINAL work, which separates type error
> messages from the compiler? Separation could help ensure that this idea
> (and any similar ones) doesn't accidentally introduce bugs into the
> compiler, and make it much easier for you to maintain. A very simple
> hack might be to wrap ocamlc in a script that parses such error
> messages and then tokenizes the source file, looking for similar strings?
>
> Cheers,
> Yitzhak
>
Separating it from the compiler would keep it from interfering with the
compiler's activities, but it would add some difficulties:

Parsing OCaml - I guess I could steal the parser out of the current
source code, and use the internal parse tree, but it'd be a lot more
difficult than what I've done so far.

Namespaces - OCaml has a ton of namespaces. At the moment, the patch
doesn't find mistyped module names, but it does distinguish the
following: type parameters, type constructors, classes, row variables,
values, constructors, labels and instance variables.

Visibility/scope - A simple script would need a lot of added complexity
to keep track of where each identifier is visible. Maybe easy, maybe not.

Parsing the output of the OCaml compiler - ocamlc has no i18n that would
make this extra difficult, but the error messages don't follow any spec
and could change at any time.


On the plus side, if the simpler hack were built into an IDE, it could
embed a list of corrections into a right-click menu, like a spell
checker. It could do this outside the current cycle of edit -> compile
-> edit.

I don't have interest in doing this - my ideas of a super-ocaml-IDE seem
too big for me to program ATM.

E.

Sébastien Hinderer

Oct 30, 2007, 4:15:52 AM
to caml...@yquem.inria.fr, Caml List
> One random little feature of GNAT that comes in handy for me is its
> habit of, when I misspell an identifier, giving me a possible correction
> in its compile error message. Spending some time with the 3.10.0
> sources, I have created a "second draft" patch creating this
> functionality in my favored language.

Sounds great! Just out of curiosity: does the patch take into account
typing information to restrict the proposals to identifiers with a
compatible type?
If it does not, is it because it would be too difficult to gather all the
necessary information?

Cheers, and congratulations!
Sébastien.

Edgar Friendly

Oct 30, 2007, 11:50:44 AM
to caml...@yquem.inria.fr, Caml List
Sébastien Hinderer wrote:
> Sounds great! Just out of curiosity: does the patch take into account
> typing information to restrict the proposals to identifiers with a
> compatible type?
> If it does not, is it because it would be too difficult to gather all the
> necessary information?
>
> Cheers, and congratulations!
> Sébastien.

It does. It works by wrapping the tree lookup for an identifier
(separate trees are kept for each "compatible type" of identifier), and
if that fails, traversing the tree once more to find corrections.
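
A minimal sketch of that idea, with hypothetical names and a simplified
environment rather than the compiler's real data structures: one table
per kind of identifier, a normal lookup first, and a second traversal of
the same table only when the lookup fails. The "at most one edit"
similarity test and the result type used in place of an exception are
assumptions of this sketch, not a description of the patch.

```ocaml
module StringMap = Map.Make (String)

(* True when b is a within one insertion, deletion or substitution of a. *)
let within_one_edit a b =
  let la = String.length a and lb = String.length b in
  if abs (la - lb) > 1 then false
  else begin
    (* Skip the common prefix. *)
    let i = ref 0 in
    while !i < la && !i < lb && a.[!i] = b.[!i] do incr i done;
    (* Skip the common suffix that does not overlap the prefix. *)
    let j = ref 0 in
    while !j < la - !i && !j < lb - !i
          && a.[la - 1 - !j] = b.[lb - 1 - !j] do incr j done;
    (* At most one character may remain unmatched on each side. *)
    la - !i - !j <= 1 && lb - !i - !j <= 1
  end

(* One table per namespace, so a misspelled value is never "corrected"
   to a type constructor. Record and field names are hypothetical. *)
type env = {
  values : string StringMap.t;  (* value name -> printed type *)
  types : int StringMap.t;      (* type constructor name -> arity *)
}

type 'a lookup_result =
  | Found of 'a
  | Nearly of string list  (* unbound, with candidate spellings *)

let find_with_nearly name table =
  match StringMap.find_opt name table with
  | Some data -> Found data
  | None ->
      (* Second traversal, paid for only when the first lookup fails. *)
      let near =
        StringMap.fold
          (fun key _ acc ->
            if within_one_edit name key then key :: acc else acc)
          table []
      in
      Nearly (List.rev near)

let find_value name env = find_with_nearly name env.values
let find_type name env = find_with_nearly name env.types
```

Because each lookup consults only its own table, a misspelled type name
gets suggestions from the type table alone, and likewise for values.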

E.

Sébastien Hinderer

Oct 30, 2007, 11:59:10 AM
to caml...@yquem.inria.fr, Caml List
Edgar Friendly :

> Sébastien Hinderer wrote:
> > Sounds great! Just out of curiosity: does the patch take into account
> > typing information to restrict the proposals to identifiers with a
> > compatible type?
> > If it does not, is it because it would be too difficult to gather all the
> > necessary information?
> It does. It works by wrapping the tree lookup for an identifier
> (separate trees are kept for each "compatible type" of identifier), and
> if that fails, traversing the tree once more to find corrections.

Wow... Definitely great.
Sébastien.
