The Catalan language has a ligature consisting of one
"l" character, followed by a middle dot ("·"), followed
by another "l". See here for more details:
http://en.wikipedia.org/wiki/L·l#Catalan
Is there a way to make emacs aware of this, so that it
doesn't treat a word containing "l·l" as two separate
words?
Thanks.
PS. Please CC me, if you reply to this.
--
Ernest
> Is there a way to make emacs aware of this, so that it
> doesn't treat a word containing "l·l" as two separate
> words?
How about using ŀ? It's LATIN SMALL LETTER L WITH MIDDLE DOT at
U+0140. The problem is that · only between two l becomes a word
constituent and in so many other cases it's a multiplication sign, a
comma, a name separator, some kind of bullet sign...
--
Greetings
Pete
The human animal differs from the lesser primates in his passion for
lists of "Ten Best."
– H. Allen Smith
> Is there a way to make emacs aware of this, so that it
> doesn't treat a word containing "l·l" as two separate
> words?
You could use dynamic syntax-tables via font-lock.
(add-hook 'text-mode-hook
          (lambda ()
            ;; make the syntax scanner honour `syntax-table' text properties
            (set (make-local-variable 'parse-sexp-lookup-properties) t)
            ;; get font-lock started
            (unless font-lock-defaults
              (setq font-lock-defaults '(nil t)))
            (add-to-list
             (make-local-variable 'font-lock-syntactic-keywords)
             ;; let ! between 2*a have word syntax
             '("a\\(!\\)a" 1 "w"))))
Replace `a' and `!' with your characters and it'll work,
hopefully.
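
For the Catalan case, the substitution might look like the sketch below. It is the same approach as the example above with `l' and `·' filled in; matching both `l' and `L' is an assumption on my part, so that upper-case L·L is covered as well.

```elisp
(add-hook 'text-mode-hook
          (lambda ()
            (set (make-local-variable 'parse-sexp-lookup-properties) t)
            ;; Get font-lock started even if the mode defines no keywords.
            (unless font-lock-defaults
              (setq font-lock-defaults '(nil t)))
            ;; Give "·" word syntax only when it sits between two l's.
            ;; The [lL] classes are an assumption, not part of the original.
            (add-to-list
             (make-local-variable 'font-lock-syntactic-keywords)
             '("[lL]\\(·\\)[lL]" 1 "w"))))
```

A middle dot anywhere else keeps its normal syntax, because only the first subgroup of a match gets the "w" property.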
-ap
18/10/09 @ 21:24 (+0200), thus spake Peter Dyballa:
> How about using ŀ? It's LATIN SMALL LETTER L WITH MIDDLE DOT at
> U+0140. The problem is that · only between two l becomes a word
> constituent and in so many other cases it's a multiplication sign, a
> comma, a name separator, some kind of bullet sign...
That seems the way to go, yes. Unfortunately, everybody still
uses the middle dot; spell-checkers, for example, think ŀ is
a misspelling.
Cheers.
--
Ernest
It does what I wanted. :)
Thanks!
Ernest
> How about using ŀ? It's LATIN SMALL LETTER L WITH MIDDLE DOT at
> U+0140. The problem is that · only between two l becomes a word
> constituent and in so many other cases it's a multiplication sign, a
> comma, a name separator, some kind of bullet sign...
It may be misused, but U+00B7 is MIDDLE DOT (punctuation). BULLET is
U+2022 and the mathematical DOT OPERATOR is U+22C5. It surely doesn't
really matter in this context, though. A lot of character syntaxes have
long been wrong in Emacs anyhow.
> Is there a way to make emacs aware of this, so that it
> doesn't treat a word containing "l·l" as two separate
> words?
[You're probably not really interested in word boundaries, just word
constituents. For an illustration of the difference, see variable
`word-combining-categories' and what capitalized-words-mode does in
Emacs 23.]
You should define a Catalan language environment to be used in ca_ES
locales. (I'm surprised I didn't do it, as there's a relevant input
method.) It should set the base syntax of · to word, and set a suitable
default input method. The existing one, `catalan-prefix', should
presumably bind `~.' to `·', as in latin-prefix; it doesn't currently,
and maybe needs other fixes.
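
Until the input method itself is fixed, the missing binding could probably be added from an init file. This is an untested sketch: it assumes the `catalan-prefix' package lives in the library "quail/latin-pre", and it only adds the `~.' rule, none of the other fixes.

```elisp
;; Add "~." => "·" to catalan-prefix once its Quail package is loaded.
;; The library name "quail/latin-pre" is an assumption; check where
;; catalan-prefix is defined in your installation.
(eval-after-load "quail/latin-pre"
  '(quail-defrule "~." ?· "catalan-prefix"))
```

The `eval-after-load' wrapper is there because `quail-defrule' signals an error if the named package has not been loaded yet.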
The environment would be something like this (untested), which is
probably better than trying to use categories. [The default Latin-1
character set is overridden in, say, ca_ES.UTF-8.]
(push '("ca" . "Catalan") locale-language-names)
(set-language-info-alist
 "Catalan" '((tutorial . "TUTORIAL.es") ; maybe...
             (charset iso-8859-1)
             (coding-system iso-latin-1 iso-latin-9)
             (coding-priority iso-latin-1)
             (input-method . "catalan-prefix")
             (nonascii-translation . iso-8859-1)
             (unibyte-display . iso-latin-1)
             (setup-function
              . (lambda ()
                  (modify-syntax-entry ?· "w" (standard-syntax-table))))
             (exit-function
              . (lambda ()
                  (modify-syntax-entry ?· "_" (standard-syntax-table))))
             ;; Fixme:
             ;; (sample-text . "Spanish (Español) ¡Hola!")
             (documentation . "\
This language environment uses the Latin-1 character set, sets
the default input method to \"catalan-prefix\", and sets the
syntax of `·' to word. It selects the Spanish tutorial, in the
absence of a Catalan translation."))
 '("European"))
You could make a bug report if you have more luck than me with reports
about stuff I worked on.
Well, it's a pretty odd way to do it. If you really only want to use
the ligature in Text mode -- and not programming language comments, for
instance -- just amend `text-mode-syntax-table'.
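
A minimal sketch of that amendment, for an init file. Note the trade-off: unlike the font-lock version, it makes every `·' a word constituent in Text mode, not only the one between two l's.

```elisp
;; Treat "·" as a word constituent in Text mode and in modes that
;; use or inherit from `text-mode-syntax-table'.
(modify-syntax-entry ?· "w" text-mode-syntax-table)
```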
Thanks a lot. Have you got any idea of where this should be
put in order to be loaded automatically at start-up?
I tried in init.el, and in a file in the "language" directory
in /usr/share/emacs/23.1/lisp/ to no avail.
It says that there's "no match", when I try to set the language
environment to Catalan interactively.
> You could make a bug report if you have more luck than me with reports
> about stuff I worked on.
I will try, once I get it to work :)
Cheers,
Ernest
1. C-x C-f ~/.emacs
2. M-x find-library RET default.el
3. M-x find-library RET site-start.el
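
Whichever of those files you use, the definition has to be evaluated before the environment can be selected; "no match" probably means it never was. As far as I know, a new file dropped into the "language" directory is not loaded automatically. A minimal sketch, assuming the `set-language-info-alist' form from earlier in the thread sits above it in ~/.emacs:

```elisp
;; Select the environment only if the "Catalan" definition was
;; actually registered, so startup does not fail when it is missing.
(when (assoc-string "Catalan" language-info-alist t)
  (set-language-environment "Catalan"))
```

If the `when' test fails, evaluating the definition by hand with C-x C-e should show whether it signals an error.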
> I tried in init.el, and in a file in the "language" directory
> in /usr/share/emacs/23.1/lisp/ to no avail.
> It says that there's "no match", when I try to set the language
> environment to Catalan interactively.
>
>> You could make a bug report if you have more luck than me with reports
>> about stuff I worked on.
>
> I will try, once I get it to work :)
>
> Cheers,
>
> Ernest
--
Kevin Rodgers
Denver, Colorado, USA