Keystroke sequence to output precomposed character instead of combining diacritic

Kevin C

Nov 9, 2023, 4:00:12 PM
to Ukelele Users
Hi,

I've successfully used a keyboard layout that I created with Ukelele for a few years. I created it to follow a similar layout which I had created for my team to use on Windows several years before. So, first, thanks for this wonderful app!

The layout I created has two frequently used diacritics: the acute accent and the combining macron below.

Since there are no precomposed vowels that use the combining macron below, we eventually decided to use combining diacritics for both the acute and macron below. For a short time, the Windows version used a dead key to produce precomposed characters with the acute accent. But it is/was not possible to have the Microsoft Keyboard Layout Creator use a dead key to produce the two code points for the combining macron below. However, it was too cumbersome/confusing to type them differently, so we made the switch for both to be output as combining diacritics.

While more and more programs/apps support combining diacritics, there are still quite a few that do not really handle normalization well (i.e. precomposed acute characters are not always treated the same as the equivalent versions using combining acute).
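As an aside for readers of this thread (not part of the original message): the mismatch described above can be seen in a minimal Python sketch using the standard `unicodedata` module. The helper name `code_points` is made up for illustration.

```python
import unicodedata

precomposed = "\u00e1"   # one code point
combining = "a\u0301"    # "a" followed by COMBINING ACUTE ACCENT


def code_points(text):
    """List each code point in text with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(code_points(precomposed))
print(code_points(combining))

# The two strings look the same on screen but are different sequences.
print(precomposed == combining)  # False

# NFC composes where possible; NFD decomposes.
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == combining)  # True
```

An app that compares or indexes text without a normalization step like this will treat the two forms as different words.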

Recently I've wondered if I could use Ukelele to change my layout to address these problems.

From what I understand, unlike with Windows, I could easily use Ukelele to create a layout that would use a dead key to output the two code points needed for the combining macron below. As best as I can tell, however, it would still require the dead key to be typed first.
 
However, as I continue to have to use the equivalent Windows keyboard layout in certain situations (both in a VM and on other computers), I think it would be too confusing to switch back and forth (typing the diacritics first on Mac, and afterward on Windows).

Is it possible that I could modify the layout I created with Ukelele to output the precomposed acute characters while still maintaining the existing keystrokes where the diacritics are typed afterward?

From reading the documentation, I'm not finding anywhere that would suggest it would be possible. Is that correct?

I apologize for the long email to describe the context. But I hope what I'm describing is clear.

Thanks for your help!

Gé van Gasteren

Nov 9, 2023, 4:29:31 PM
to ukelel...@googlegroups.com
Hi Kevin,

After a dead key, the system waits for the next keystroke, and after a standard key, it doesn’t.
Therefore, you can’t have a key function as both a dead key and as a standard key, if that’s what you want to do.

Otherwise, it’s perfectly possible to have one key combination output two Unicode characters (e.g. a base character + a combining diacritic) and another key combination output just the combining diacritic.

The only possible concern is whether the app you are using handles normalization properly, so that you don’t end up with text that has precomposed characters in some places and the two parts separately in others.


Kevin C

Nov 9, 2023, 5:13:28 PM
to Ukelele Users
Hi Gé,

Thanks for your help and the info. 

To clarify, I don't want any individual key to function as both a dead key and a standard key. :)

I'm mostly trying to avoid normalization issues. The few apps that I still need to use on Windows actually handle normalization pretty well. Several of the Mac apps I use also handle normalization well, but there are several that do not.

My current implementation has the following (for all of the vowels):

a + ´ = á (U+0061 LATIN SMALL LETTER A  U+0301 COMBINING ACUTE ACCENT)

Meaning typing a and then ´ produces the two code points U+0061 LATIN SMALL LETTER A  and U+0301 COMBINING ACUTE ACCENT

I don't actually need to ever have the acute on its own.

The current layout also has:

 a + ` = a̱ ( U+0061 LATIN SMALL LETTER A  U+0331 COMBINING MACRON BELOW)

I understand how I could use Ukelele to set up two independent dead keys, so that both accents would be typed first, to achieve the desired output on Mac:

 ´ + a = á (U+00E1 LATIN SMALL LETTER A WITH ACUTE)
` + a = a̱ ( U+0061 LATIN SMALL LETTER A  U+0331 COMBINING MACRON BELOW)
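As an aside for readers (not part of the original message): the asymmetry between the two outputs above, one code point for the acute but two for the macron below, can be checked with a small Python sketch. It assumes only the standard `unicodedata` module; `has_precomposed_form` is a made-up helper name.

```python
import unicodedata

ACUTE = "\u0301"          # COMBINING ACUTE ACCENT
MACRON_BELOW = "\u0331"   # COMBINING MACRON BELOW


def has_precomposed_form(base, mark):
    """True if NFC composes base + mark into a single code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


for vowel in "aeiou":
    print(
        vowel,
        "acute:", has_precomposed_form(vowel, ACUTE),
        "macron below:", has_precomposed_form(vowel, MACRON_BELOW),
    )
```

For the five vowels, the acute column is true and the macron-below column is false, which is why the second sequence has to stay as two code points.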

However, while this would fix my normalization issues on Mac, it would mean having different keystrokes on Windows, because with my custom MSKLC layout, there is no way to type  ` + a = a̱ with a dead key.

So that's why I was wondering if there is a way to do the opposite with Ukelele on Mac (for the other vowels as well), so that:

a + ´ = á (U+00E1 LATIN SMALL LETTER A WITH ACUTE)

Meaning typing a and then ´ would produce the precomposed U+00E1 LATIN SMALL LETTER A WITH ACUTE

Again, I don't actually need to ever have the acute or macron below separate from the vowel. I'm just trying to get around some of the normalization issues that arise.

But if I'm understanding correctly, this might not be possible with Ukelele, correct?

Thanks again!

Gé van Gasteren

Nov 9, 2023, 6:15:17 PM
to ukelel...@googlegroups.com
Thanks for the details.

In fact, you do want to use the "a" both as a dead key and as a standard key, depending on what comes after.

And you’re right, this isn’t really how dead keys usually work. But you may like this somewhat quirky solution:

You could define the "a" as a dead key, with "a" as its terminator, and only the output for the sequence "a" " ́" defined.
Then the "a" would almost function like a standard key, except that after typing it, an underline will appear or you see it highlighted – to show that the Mac is waiting for the next keystroke.
If that next keystroke is not an acute, the "a" and the next keystroke are then shown, because "a" was defined as the "terminator", i.e. the default character shown when the sequence with the second keystroke is undefined.

Because this is more or less uncharted territory, I can’t guarantee that you won’t run into problems.
E.g. when you need to type a password ending in "a", you have to type a space after it to make the "a" really appear.
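
As an aside for readers (not part of the original message): the behavior described above, including the caveat about a trailing "a", can be modeled with a toy state machine in Python. This is a simplified sketch, not how macOS or Ukelele implement dead keys. The class name `DeadKeyModel` and the way an undefined sequence falls back to a fresh keystroke are assumptions made for illustration.

```python
class DeadKeyModel:
    """Toy model of dead keys with terminators; not how macOS implements them."""

    def __init__(self, dead_keys):
        # dead_keys maps a key to (terminator, {following_key: output}).
        self.dead_keys = dead_keys
        self.pending = None  # dead key waiting for its next keystroke, if any

    def press(self, key):
        """Return the text that this keystroke makes appear."""
        if self.pending is None:
            if key in self.dead_keys:
                self.pending = key  # nothing is committed yet
                return ""
            return key
        terminator, sequences = self.dead_keys[self.pending]
        self.pending = None
        if key in sequences:
            return sequences[key]
        # Undefined sequence: emit the terminator, then treat the key as a
        # fresh keystroke (an assumption of this model).
        return terminator + self.press(key)


ACUTE_KEY = "\u00b4"  # stand-in label for the acute key
layout = DeadKeyModel({"a": ("a", {ACUTE_KEY: "\u00e1"})})

keys = ["a", ACUTE_KEY, "b", "a", "t", "l", "a"]
typed = "".join(layout.press(key) for key in keys)
print(typed)           # the final "a" has not appeared yet
print(layout.pending)  # it is still waiting for the next keystroke
```

In this model the last "a" only appears once another key, such as a space, is pressed, which matches the password caveat.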

I attach a random keyboard layout with this change made for the a +  ́  so you can get an idea if you like it, without having to work for it :-)
(It’s zipped, as I can’t attach the file as is.)

testing dead-key a.bundle.zip

John Brownie

Nov 10, 2023, 1:45:42 AM
to ukelel...@googlegroups.com
If you just want to add a combining diacritic, you could make the ´ key (which key would that be, since it’s not a standard key?) produce U+0301, and ` produce U+0331 in all cases. That removes the need for a dead key and makes it more like Windows. The downside would be if you need those keys to produce other output when typed after other letters.

If you really want to have a keyboard layout that is the same for Windows and Mac, then you might consider using Keyman (https://keyman.com/) instead, as it is a cross-platform solution that can do things that Ukelele or MSKLC can’t. Its downside is that you need to have the Keyman app running to use your keyboard layout, and there may be situations where things work in unexpected ways. It is also a little more complex to set up.

John

Kevin C

Nov 10, 2023, 9:10:08 AM
to Ukelele Users

> You could define the "a" as a dead key, with "a" as its terminator, and only the output for the sequence "a" " ́" defined.

That idea had occurred to me. Needing to type something after it might be the main setback. :) But since you suggested it also, I might try it out. It would be interesting to see which would end up being the most productive.

Thanks again for your help!

Kevin C

Nov 10, 2023, 9:17:54 AM
to Ukelele Users
Hi John,

Thanks for your help.

> If you just want to add a combining diacritic, you could make the ´ key (which key would that be, since it’s not a standard key?) produce U+0301, and ` produce U+0331 in all cases.

This is what the current layout that I've used for several years does. It's worked quite well apart from some normalization issues with certain frequently used apps. In the end, I might just keep living with the normalization annoyances, as I have for several years. :)

I guess I should revisit Keyman, at least on Windows. When I originally created the layout for Windows, Keyman wasn't free (and thus wasn't an option for many members of the team).

The layout is based on the Spanish layout, as many others on my team have physical Spanish keyboards. On my US physical keyboard, the keys are ' (apostrophe) and [.

Thanks again for your help!

Kevin

Gé van Gasteren

Nov 10, 2023, 10:05:43 AM
to ukelel...@googlegroups.com
>> You could define the "a" as a dead key, with "a" as its terminator, and only the output for the sequence "a" " ́" defined.
>That idea had occurred to me. Needing to type something after it might be the main set back. :) But since you suggested it also, I might try it out.  It would be interesting to see which would end up being the most productive.
Have you tried it with the layout I attached?
To be honest, my feeling is that its quirky behavior will become annoying in the long run, especially if you do this with all vowel characters.

Another approach completely: It seems that the Windows keyboard layout file format does support multiple-character output for dead-key sequences, even though MSKLC doesn’t.
So you might want to look into this, including other ways to edit those files.
I’ve tried KbdEdit, but alas, at first glance it seems to have the same limitation.

Or, at the end of the day, you might want to keep everything as it is, post-keying combining diacritical marks, trusting that the problems with normalization will be taken care of in time :-)


Kevin C

Nov 10, 2023, 11:04:51 AM
to Ukelele Users
> Have you tried it with the layout I attached? 
> To be honest, my feeling is that its quirky behavior will become annoying in the long run, especially if you do this with all vowel characters.

I haven't tried it yet, but I understand the concept. My initial thought is that I will have the same feeling about the quirky behavior. :)

Thanks for sharing the link to KbdEdit. That looks like a good replacement for MSKLC.

> I’ve tried KbdEdit, but alas, at first glance it seems to have the same limitation.

I glanced quickly through their documentation, and it looks like the dead key limitation to one code point is Windows internal:

> Dead key limitations

> Due to an internal Windows limitation, dead characters are restricted to operating only against single BMP (<=FFFF) Unicode characters.

> This restriction applies to all components of a dead character transformation: the "from" and "to" characters, as well as the dead character itself.

> This has two important practical consequences:

> • Dead keys cannot produce a ligature.
> • Dead keys cannot produce nor transform non-BMP characters (>FFFF, i.e. 5 or 6 hex-digits).
>   This somewhat counter-intuitive restriction is a consequence of non-BMP characters' internal representation as ligatures of surrogate pairs.

quoted from http://kbdedit.com/manual/dead_character_properties.html
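
As an aside for readers (not part of the original message): the restriction quoted above can be expressed as a simple check. The sketch below is plain Python and makes no calls into Windows or KbdEdit. The function names are made up, and the rule it encodes is only the one stated in the quoted documentation.

```python
def utf16_units(text):
    """Number of UTF-16 code units needed to store text."""
    return len(text.encode("utf-16-le")) // 2


def fits_dead_key_output(text):
    """True if text is exactly one BMP code point (<= U+FFFF)."""
    return len(text) == 1 and ord(text) <= 0xFFFF


samples = {
    "a with acute, precomposed": "\u00e1",
    "a + combining macron below": "a\u0331",
    "non-BMP character U+1D400": "\U0001D400",
}

for label, text in samples.items():
    print(label, fits_dead_key_output(text), utf16_units(text))
```

Only the precomposed á passes. The macron-below sequence is two code points, and the non-BMP character needs two UTF-16 code units (a surrogate pair), so both fall outside what the quoted limitation allows.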

I remember trying to edit the actual layout files outside of MSKLC years ago and running into the same issue when trying to output more than one code point.

If I understand KbdEdit's documentation correctly, you can use the program to create custom ligatures, which would give the desired output. However, it looks like these can only be assigned to keyboard shortcuts rather than dead keys.

> Or, at the end of the day, you might want to keep everything as it is, post-keying combining diacritical marks, trusting that the problems with normalization will be taken care of in time :-)

I imagine this is the way I will go. Though I still might give Keyman another try for Windows, as John suggested.

The normalization issues on Mac usually boil down to apps not always handling the following well:

1. Including Spanish in the same documents as the other language I'm working on (some apps' spell checkers only work with the precomposed forms for Spanish).
2. Search/find not matching on both forms.
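
As an aside for readers (not part of the original message): both problems in the list above come down to an app skipping a normalization step. A minimal Python sketch of that step follows. The helper names and the sample text are made up for illustration.

```python
import unicodedata


def to_precomposed(text):
    """Apply NFC: compose where a precomposed form exists, leave the rest."""
    return unicodedata.normalize("NFC", text)


def contains(haystack, needle):
    """Substring search that ignores the precomposed/combining difference."""
    return to_precomposed(needle) in to_precomposed(haystack)


# "mas" typed with a combining acute on the a, then a word with a + macron below.
document = "ma\u0301s ka\u0331"
query = "m\u00e1s"  # the same Spanish word typed with the precomposed form

print(query in document)          # False: plain search misses it
print(contains(document, query))  # True: both sides normalized first
print([f"U+{ord(ch):04X}" for ch in to_precomposed(document)])
```

After NFC the Spanish word uses U+00E1, while the macron-below sequence is left as two code points because no precomposed form exists for it.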

Maybe one day it will all work everywhere. :)

Thanks again for your help!


Gé van Gasteren

Nov 10, 2023, 1:51:14 PM
to ukelel...@googlegroups.com
OK, thanks for the heads-up about KbdEdit/Windows.

Keyman can be interesting in a multi-user environment, where people use all kinds of devices.
You create/modify keyboard layouts in Keyman Developer, which runs only on Windows – but that is not a problem for you.
This Developer then exports/generates the files for the various operating systems.

A last thought – unhindered by any knowledge of your target script: what if you replaced the macron below by a dot below?
In printed text, a dot should be good enough to read easily, and in handwriting you can still use a macron.