French spelling checker of vim does not have words with ligature oe


Dominique Pelle

Jan 27, 2008, 3:22:24 AM
to vim...@googlegroups.com
Hi,

Here is a systematic problem I see with the spelling
checker of Vim (7.1.241) in French. I'm using the
'fr.utf-8.spl' file.

None of the words in French which use the ligature form of
oe (œ) are recognized using Vim's French dictionary.
Here are some frequent French words using ligature oe:

bœuf, chœur, œuvre, sœur

Vim recognizes only their alternative spelling
using 2 letters oe:

boeuf, choeur, oeuvre, soeur

Ideally, the spelling using ligature oe should be
the preferred spelling. Alternative spelling using
two letter oe could perhaps be marked as RARE
in the dictionary.

Using ":spelldump", I can see that the dictionary does
not contain any word at all with ligature oe.
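
One way to double-check this outside Vim is to save the ":spelldump"
buffer to a text file and scan it for ligatures. The Python sketch below
is only an illustration: the helper name words_with_ligatures and the
sample entries are made up, and it assumes the dump holds one entry per
line, with header lines starting with '#' or '/'.

```python
LIGATURES = ("\u0153", "\u0152", "\u00e6", "\u00c6")  # oe, OE, ae, AE ligatures

def words_with_ligatures(lines):
    """Return the dump entries that contain an oe or ae ligature."""
    found = []
    for line in lines:
        entry = line.strip()
        # Assumption: header/comment lines start with '#' or '/'.
        if not entry or entry.startswith(("#", "/")):
            continue
        if any(lig in entry for lig in LIGATURES):
            found.append(entry)
    return found

# Made-up sample standing in for a saved dump.
sample = ["# file: fr.utf-8.spl", "boeuf", "choeur", "s\u0153ur", "c\u00e6cum"]
print(words_with_ligatures(sample))
```

An empty result on the real dump would match what is described above.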

The OpenOffice and Firefox spell checkers only recognize
words with the ligature, not with the two letters oe (the
opposite of Vim's behavior).

aspell and gmail spell checkers recognize both spellings.

-- Dominique

Patrick Texier

Jan 27, 2008, 5:10:46 AM
to vim...@googlegroups.com
On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:

> None of the words in French which use the ligature form of
> oe (œ) are recognized using Vim's French dictionary.
> Here are some frequent French words using ligature oe:
>
> bœuf, chœur, œuvre, sœur

ligature ae (æ), Latin1 code E6, is missing too:

cæcum, et cætera (etc).

--
Patrick Texier

vim:syntax=mail:ai:ts=4:et:tw=72

Dominique Pelle

Jan 27, 2008, 7:20:41 AM
to vim...@googlegroups.com
On Jan 27, 2008 11:10 AM, Patrick Texier <p.te...@genindre.org> wrote:

> On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:
>
> > None of the words in French which use the ligature form of
> > oe (œ) are recognized using Vim's French dictionary.
> > Here are some frequent French words using ligature oe:
> >
> > bœuf, chœur, œuvre, sœur
>
> ligature ae (æ), Latin1 code E6, is missing too:
>
> cæcum, et cætera (etc).


I just reread ":help spell-mkspell" which explains how to
build a dictionary of vim:

===
You can create a Vim spell file from the .aff and .dic files that Myspell
uses. Myspell is used by OpenOffice.org and Mozilla. You should be able to
find them here:
http://wiki.services.openoffice.org/wiki/Dictionaries
===

So I downloaded the following French dictionary from above location:
French (France) Classique+1990 2007-12-20

The zip file contains among other things a README_fr_FR.txt file which says:

===
##### VERSION 2.0.2 - novembre 2007 ###########################################
...
* Corrections des mots avec ligatures ('oe', 'ae').

===

Translation in English:

Version 2.0.2 - November 2007
...
* Correction of the words with ligatures ('oe', 'ae').


I recreated my French dictionary of Vim from this updated file
and now vim properly recognizes words with ligature oe (œ)
(cœur, sœur, etc). It no longer accepts alternative spelling
(coeur, soeur...) but that is OK. This newly rebuilt French
dictionary also contains words with ligature ae (æ) such
as: æquo, cætera, ...

-- Dominique

Bram Moolenaar

Jan 27, 2008, 7:23:36 AM
to Patrick Texier, vim...@googlegroups.com

Patrick Texier wrote:

> On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:
>
> > None of the words in French which use the ligature form of
> > oe (½) are recognized using Vim's French dictionary.
> > Here are some frequent French words using ligature oe:
> >
> > b½uf, ch½ur, ½uvre, s½ur
>
> ligature ae (æ), Latin1 code E6, is missing too:
>
> cæcum, et cætera (etc).

The 0xE6 code exists in latin1, but for me 0xBD is 1/2.
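
The two readings of that byte can be reproduced with a small Python
sketch. It is an illustration only, not part of the original exchange;
the helper name decode_both is made up, and it relies solely on Python's
standard codecs.

```python
def decode_both(byte):
    """Decode one byte value as Latin-1 and as ISO-8859-15."""
    raw = bytes([byte])
    return raw.decode("latin-1"), raw.decode("iso8859-15")

for byte in (0xBD, 0xE6):
    latin1, latin9 = decode_both(byte)
    print(f"0x{byte:02X}: latin1={latin1} iso-8859-15={latin9}")
```

Byte 0xBD comes out as the one-half sign under Latin-1 and as the
lowercase oe ligature under ISO-8859-15, while 0xE6 is the ae ligature
in both.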

The spell files come from here:
http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
I don't see a remark about who to contact for improvements. Hmm, it
appears to come from ispell, made by Christophe Pythoud.

It's possible to make a patch specifically for Vim, but it's probably
better to change the OpenOffice files.

--
ARTHUR: ... and I am your king ....
OLD WOMAN: Ooooh! I didn't know we had a king. I thought we were an
autonomous collective ...
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

/// Bram Moolenaar -- Br...@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///

Dominique Pelle

Jan 27, 2008, 8:03:42 AM
to vim...@googlegroups.com
On Jan 27, 2008 1:23 PM, Bram Moolenaar <Br...@moolenaar.net> wrote:
>
>
> Patrick Texier wrote:
>
> > On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:
> >
> > > None of the words in French which use the ligature form of
> > > oe (½) are recognized using Vim's French dictionary.
> > > Here are some frequent French words using ligature oe:
> > >
> > > b½uf, ch½ur, ½uvre, s½ur
> >
> > ligature ae (æ), Latin1 code E6, is missing too:
> >
> > cæcum, et cætera (etc).
>
> The 0xE6 code exists in latin1, but for me 0xBD is 1/2.
>
> The spell files come from here:
> http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
> I don't see a remark about who to contact for improvements. Hmm, it
> appears to come from ispell, made by Christophe Pythoud.
>
> It's possible to make a patch specifically for Vim, but it's probably
> better to change the OpenOffice files.

Bram:

We wrote at almost the same time. As I said in my previous
email, the latest French dictionary from OpenOffice actually
fixes the ligature oe and ae. Latest update is from Dec 2007:
(French (France) Classique+1990 2007-12-20) but actual
fix for ligature was done in November according to README
file.

The files from OpenOffice are encoded in ISO8859-15
rather than latin1 (so they have both ligature oe and ae).

http://en.wikipedia.org/wiki/ISO8859-15

It works fine when I use it to create the utf-8 dictionary
of vim.
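
As a rough illustration of why the source charset matters (this is not a
description of what ":mkspell" does internally), here is a Python sketch
that re-encodes ISO-8859-15 bytes as UTF-8. The function name
latin9_to_utf8 and the three-word sample are made up for the example.

```python
def latin9_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-15 bytes as UTF-8."""
    return data.decode("iso8859-15").encode("utf-8")

# Made-up word list: 0xBD is the oe ligature in ISO-8859-15, 0xE6 is ae.
sample = b"c\xbdur\ns\xbdur\nc\xe6cum\n"
print(latin9_to_utf8(sample).decode("utf-8"))
```

Decoding the same bytes as latin1 instead would turn every 0xBD into a
one-half sign, which is the garbling visible in the quoted text above.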

-- Dominique

Bram Moolenaar

Jan 27, 2008, 10:43:53 AM
to Dominique Pelle, vim...@googlegroups.com

Dominique Pelle wrote:

> On Jan 27, 2008 1:23 PM, Bram Moolenaar <Br...@moolenaar.net> wrote:
> >
> >
> > Patrick Texier wrote:
> >
> > > On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:
> > >
> > > > None of the words in French which use the ligature form of
> > > > oe (½) are recognized using Vim's French dictionary.
> > > > Here are some frequent French words using ligature oe:
> > > >
> > > > b½uf, ch½ur, ½uvre, s½ur
> > >
> > > ligature ae (æ), Latin1 code E6, is missing too:
> > >
> > > cæcum, et cætera (etc).
> >
> > The 0xE6 code exists in latin1, but for me 0xBD is 1/2.
> >
> > The spell files come from here:
> > http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
> > I don't see a remark about who to contact for improvements. Hmm, it
> > appears to come from ispell, made by Christophe Pythoud.
> >
> > It's possible to make a patch specifically for Vim, but it's probably
> > better to change the OpenOffice files.
>
> Bram:
>
> We wrote at almost the same time. As I said in my previous
> email, the latest French dictionary from OpenOffice actually
> fixes the ligature oe and ae. Latest update is from Dec 2007:
> (French (France) Classique+1990 2007-12-20) but actual
> fix for ligature was done in November according to README
> file.
>
> The files from OpenOffice are encoded in ISO8859-15
> rather than latin1 (so they have both ligature oe and ae).
>
> http://en.wikipedia.org/wiki/ISO8859-15
>
> It works fine when I use it to create the utf-8 dictionary
> of vim.

Ah, so they updated the file. I'll see if I can generate new FR spell
files.

Vim doesn't support ISO-8859-15, it's handled as if it was latin1. I
think it works for most people if we use the spell file as if it was
latin1. It should work correctly for utf-8.
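
To get a feel for how much is at stake when ISO-8859-15 is treated as
latin1, the Python sketch below lists the byte values that the two
charsets decode differently. It is only an illustration based on
Python's standard codecs; the helper name differing_bytes is made up.

```python
def differing_bytes():
    """Byte values that decode differently in Latin-1 and ISO-8859-15."""
    diffs = {}
    for value in range(0x100):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")
        latin9 = raw.decode("iso8859-15")
        if latin1 != latin9:
            diffs[value] = (latin1, latin9)
    return diffs

for value, (latin1, latin9) in differing_bytes().items():
    print(f"0x{value:02X}: latin1={latin1} iso-8859-15={latin9}")
```

Only eight positions differ, and three of them (0xBC, 0xBD, 0xBE) are
the French letters discussed in this thread.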

--
"The future's already arrived - it's just not evenly distributed yet."
-- William Gibson

Tony Mechelynck

Jan 27, 2008, 3:01:28 PM
to vim...@googlegroups.com, Patrick Texier, Bram Moolenaar
Bram Moolenaar wrote:
>
> Patrick Texier wrote:
>
>> On Sun, 27 Jan 2008 09:22:24 +0100, Dominique Pelle wrote:
>>
>>> None of the words in French which use the ligature form of
>>> oe (œ) are recognized using Vim's French dictionary.
>>> Here are some frequent French words using ligature oe:
>>>
>>> bœuf, chœur, œuvre, sœur
>>
>> ligature ae (æ), Latin1 code E6, is missing too:
>>
>> cæcum, et cætera (etc).
>
> The 0xE6 code exists in latin1, but for me 0xBD is 1/2.
>
> The spell files come from here:
> http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries
> I don't see a remark about who to contact for improvements. Hmm, it
> appears to come from ispell, made by Christophe Pythoud.
>
> It's possible to make a patch specifically for Vim, but it's probably
> better to change the OpenOffice files.
>

Warning: the present message is in UTF-8.

The oe ligature, present in a few very common French words such as œuf (egg),
bœuf (ox, beef), sœur (sister), cœur (heart), and a few slightly less common
ones such as chœur (choir), vœu (a wish), does indeed not exist in Latin1,
neither in lowercase nor in capitals: these characters (plus the uppercase Y
with diaeresis, which is almost never used), are the few French letters which
cannot be represented in Latin1. The corresponding Unicode codepoints are Œ =
U+0152 (OE, capitals), œ = U+0153 (oe, lowercase). These characters do exist
in Windows-1252, the charset of Patrick's original post. The ae ligature, rare
in French but common in Danish, exists in Latin1 as Æ = 0xC6 (capitals), æ =
0xE6 (lowercase).
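
For anyone who wants to check these code points, here is a short Python
sketch. It is an illustration only: the helper name encodable is made
up, and the byte values it prints come from Python's standard codecs.

```python
def encodable(char, codec):
    """Return the encoded byte value, or None if the codec lacks the character."""
    try:
        return char.encode(codec)[0]
    except UnicodeEncodeError:
        return None

CODECS = ("latin-1", "cp1252", "iso8859-15")

# oe, OE, Y with diaeresis, ae, AE (written as escapes to avoid charset trouble)
for char in "\u0153\u0152\u0178\u00e6\u00c6":
    row = {codec: encodable(char, codec) for codec in CODECS}
    print(f"U+{ord(char):04X} {char} {row}")
```

A value of None means the charset has no slot for that character.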

Best regards,
Tony.
--
The question is, why are politicians so eager to be president? What is
it about the job that makes it worth revealing, on national television,
that you have the ethical standards of a slime-coated piece of
industrial waste?
-- Dave Barry, "On Presidential Politics"
