Nikud


E L

Oct 15, 2012, 3:43:48 AM
to open...@googlegroups.com
Hello,

First of all, welcome to everyone who joined. I hope this group will be able to increase the cooperation between different
OpenSource/OpenContent Jewish projects.

If you know more people who are interested in joining please feel free to invite them.

I think that a subject common to almost every project is the handling of nikud.
It has become customary lately to mark sheva na, kamatz katan and dagesh hazak differently in order to make the reading more accurate.
While kamatz katan finally seems to have its own Unicode character, the other two are still missing.
What I was trying to do so far is:

- Talk to the Culmus people about adding special signs for all 3, using either the Unicode character or the Private Use Area, which seems to be the Unicode
way of adding extra characters.

- Write a Python script that discovers and marks them (WIP)

- Work with the Wikipedia people on integrating it into MediaWiki, but this is more project-specific.

I would be happy to hear your opinion and suggestions about the topic. Is adding a new font really the right way to go about it?
Are there other characters which are missing?
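
For anyone who wants to check which of these marks already have their own code point, here is a minimal Python sketch using only the standard `unicodedata` module. The `MARKS` table and the `describe` helper are names made up for this illustration.

```python
import unicodedata

# Points mentioned above, keyed by an informal label.
MARKS = {
    "qamats": "\u05B8",
    "qamats qatan": "\u05C7",
    "sheva": "\u05B0",
    "dagesh": "\u05BC",
}


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for label, ch in MARKS.items():
    print(f"{label:13} {describe(ch)}")
```

Sheva and dagesh each map to a single code point, so sheva na / sheva nach and dagesh kal / dagesh hazak cannot be told apart at the character level.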

Ely

Efraim Feinstein

Oct 16, 2012, 3:23:16 AM
to open...@googlegroups.com
Hi,

Thank you for setting up the list.

On 10/15/2012 12:43 AM, E L wrote:
>
> I think that a subject which is common to almost every project, is the
> manner of nikud.

Definitely.

> It has become customary lately to mark sheva na, kamatz katan and dagesh
> hazak differently in order to make the reading more accurate.

> While kamatz katan seems to finally have its own unicode char the
> other two are still missing.
> What I was trying to do so far is:
>
> - Talk to the culmus people about adding special signs for all 3,
> using either the unicode char or private space, which seems to be the
> unicode
> way for adding extra chars.

Most (all?) of the Culmus fonts already support qamats qatan.

The *only* correct way to add characters at this point is to use the
Unicode Private Use Area. Any other way breaks the Unicode standard.

The best solution is to propose the characters to the Unicode Technical
Committee. Note that the sheva na character was proposed and rejected
back in 2002/3 (or thereabouts). With enough people saying "we need it",
I think it could get through, as its necessity was one of the sticking
points.
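
A minimal sketch of what using the Private Use Area could look like in Python. The two code points below are hypothetical placeholders, not an agreed assignment; any real use would need the projects and the font to agree on the values.

```python
import unicodedata

# Hypothetical PUA assignments, chosen only for this illustration.
PUA_SHEVA_NA = "\uE000"
PUA_DAGESH_HAZAK = "\uE001"


def is_private_use(ch):
    """True if ch is a private-use code point (general category 'Co')."""
    return unicodedata.category(ch) == "Co"


def private_use_chars(text):
    """Return (index, 'U+XXXX') for every private-use character in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if is_private_use(ch)]


sample = "\u05D1" + PUA_SHEVA_NA  # bet followed by the hypothetical sheva na
print(private_use_chars(sample))
```

A check like `private_use_chars` is useful before sharing a text, since a reader without the matching font will not see the intended marks.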

>
> - Write a Python script that discovers and marks them (WIP)

We made some attempts at this. It is doable *to some extent*, but also
reveals some issues:
(1) ambiguities: basically, if you know *one* of qamats qatan or sheva
na, you know the other. If you know neither,
the answer is ambiguous without understanding context. If you know
either, dagesh kal/dagesh hazak are easily distinguishable automatically.
(2) differences in custom: eg, is the sheva merachef (medial sheva)
pronounced as a sheva na (eg, Chabad) or a sheva nach (almost everyone
else?).
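
The ambiguity in (1) can be sketched in Python. This is a simplified set of textbook rules, not the script discussed in this thread; the function names, the `qatan_encoded` flag and the long/short vowel tables are assumptions made for the illustration, and customs differ on several of these rules.

```python
SHEVA, DAGESH = "\u05B0", "\u05BC"
QAMATS, QAMATS_QATAN = "\u05B8", "\u05C7"
LONG = {"\u05B5", "\u05B9", "\u05BA"}  # tsere, holam, holam haser for vav
SHORT = {"\u05B4", "\u05B6", "\u05B7", "\u05BB", QAMATS_QATAN}  # hiriq, segol, patah, qubuts


def clusters(word):
    """Split a pointed word into [letter, set_of_marks] pairs."""
    out = []
    for ch in word:
        if "\u05D0" <= ch <= "\u05EA":
            out.append([ch, set()])
        elif out:
            out[-1][1].add(ch)
    return out


def classify_sheva(word, i, qatan_encoded=False):
    """Classify the sheva on letter i as 'na', 'nach' or 'ambiguous'.

    qatan_encoded: True if the text is known to use U+05C7 for every
    qamats qatan, so that a plain U+05B8 can be trusted to be qamats gadol.
    """
    cl = clusters(word)
    marks = cl[i][1]
    if SHEVA not in marks:
        raise ValueError("letter has no sheva")
    if i == 0:
        return "na"            # word-initial sheva
    if i == len(cl) - 1:
        return "nach"          # word-final sheva
    if DAGESH in marks:
        return "na"            # sheva under a letter with dagesh
    prev = cl[i - 1][1]
    if SHEVA in prev:
        return "na"            # second of two consecutive shevas
    if SHEVA in cl[i + 1][1]:
        return "nach"          # first of two consecutive shevas
    if prev & LONG:
        return "na"
    if prev & SHORT:
        return "nach"
    if QAMATS in prev:
        return "na" if qatan_encoded else "ambiguous"
    return "ambiguous"


word = "\u05DC\u05B4\u05E9\u05C1\u05B0\u05DE\u05B8\u05E8\u05B0\u05DA\u05B8"  # לִשְׁמָרְךָ
print(classify_sheva(word, 1))  # sheva under the shin
print(classify_sheva(word, 3))  # sheva under the resh, after a plain qamats
```

The second call is the interesting one: without knowing whether the qamats is gadol or qatan, these rules cannot decide the sheva that follows it.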

> - Work with wikipedia people about integrating it into mediawiki, but
> this is more project specific.

Not sure what you mean here. Most of Hebrew Wikipedia is without vowels.
(MediaWiki, Wikisource, and Wikipedia are different projects) A bigger
issue might be that even biblical keyboards won't support PUA characters.

> I would be happy to hear your opinion and suggestions about the topic.
> Is adding a new font really the right way to go about it?
> Are there other characters which are missing?

There's one more Unicode issue: HOLAM HASER FOR VAV, which is an actual
Unicode 5.0 character -- the holam dot above a vav when it is a holam
haser and not a holam maleh. Its existence is entirely a typographic
issue. Holam haser for vav should be typeset to the left of the vav,
holam maleh should be typeset above the vav. The two can be told apart
programmatically even when the text does not distinguish them.
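
A sketch of how that distinction might be made in Python, assuming the usual encoding where holam maleh is a vav carrying U+05B9 after a consonant that has no vowel of its own. The function name and the rule as written are an illustration, not tested against a full corpus.

```python
import unicodedata

HOLAM, HOLAM_HASER_FOR_VAV, VAV = "\u05B9", "\u05BA", "\u05D5"
# Sheva through qubuts (U+05B0..U+05BB), plus qamats qatan.
VOWEL_OR_SHEVA = {chr(c) for c in range(0x05B0, 0x05BC)} | {"\u05C7"}


def mark_holam_haser(text):
    """Replace U+05B9 on a consonantal vav with U+05BA.

    A vav is treated as consonantal when the preceding letter in the same
    word already carries a vowel or a sheva.
    """
    out = []
    prev_marks, cur_letter, cur_marks = set(), None, set()
    for ch in text:
        if "\u05D0" <= ch <= "\u05EA":          # Hebrew letter: start a new cluster
            prev_marks, cur_letter, cur_marks = cur_marks, ch, set()
        elif unicodedata.combining(ch):          # point or accent on the current letter
            if ch == HOLAM and cur_letter == VAV and prev_marks & VOWEL_OR_SHEVA:
                ch = HOLAM_HASER_FOR_VAV
            cur_marks.add(ch)
        else:                                    # space or punctuation: word boundary
            cur_letter, cur_marks = None, set()
        out.append(ch)
    return "".join(out)


avon = "\u05E2\u05B8\u05D5\u05B9\u05DF"            # עָוֹן typed with the ordinary holam
shalom = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"  # שָׁלוֹם, holam maleh
print(mark_holam_haser(avon) != avon)      # True: the holam was re-encoded
print(mark_holam_haser(shalom) == shalom)  # True: left alone
```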

My 2c,


--
---
Efraim Feinstein
Lead Developer
Open Siddur Project
http://opensiddur.net
http://wiki.jewishliturgy.org

E L

Oct 16, 2012, 3:46:34 AM
to Efraim Feinstein, open...@googlegroups.com
On Tue, Oct 16, 2012 at 9:23 AM, Efraim Feinstein <efraim.f...@gmail.com> wrote:

While kamatz katan seems to finally have its own unicode char the other two are still missing.
What I was trying to do so far is:

- Talk to the culmus people about adding special signs for all 3, using either the unicode char or private space, which seems to be the unicode
way for adding extra chars.

It has become customary lately to mark sheva na, kamatz katan and dagesh hazak differently in order to make the reading more accurate.
Most (all?) of the Culmus fonts already support qamats qatan.


Yes, the issue with Kamatz Katan is that it has no official shape. Some make it a bit longer, some make it bolder. The question is which form
is most appropriate for screen and for print.

The *only* correct way to add characters at this point is to use the Unicode Private Use Area. Any other way breaks the Unicode standard.

agreed.
 
The best solution is to propose the characters to the Unicode Technical Committee. Note that the sheva na character was proposed and rejected back in 2002/3 (or thereabouts). With enough people saying "we need it", I think it could get through, as its necessity was one of the sticking points.

True, but the only way to get more people to say we need it is to get them used to it :-)
 


- Write a Python script that discovers and marks them (WIP)

We made some attempts at this. It is doable *to some extent*, but also reveals some issues:
(1) ambiguities: basically, if you know *one* of qamats qatan or sheva na, you know the other. If you know neither,
the answer is ambiguous without understanding context. If you know either, dagesh kal/dagesh hazak are easily distinguishable automatically.
I used very basic rules. Can you give me an example of something the script might miss?
 
(2) differences in custom: eg, is the sheva merachef (medial sheva) pronounced as a sheva na (eg, Chabad) or a sheva nach (almost everyone else?).

Yes, I'm going according to Rav Mazuz's custom, but I guess the script can take the custom as a parameter.
I guess it's not that hard to support all of them; the question is how do you mark it? Should we add a different private-use char for sheva merachef?
Or a different char for the kamatz katan cases where Harav Breuer and Harav Mazuz differ?

 
- Work with wikipedia people about integrating it into mediawiki, but this is more project specific.
Not sure what you mean here. Most of Hebrew Wikipedia is without vowels. (MediaWiki, Wikisource, and Wikipedia are different projects) A bigger issue might be that even biblical keyboards won't support PUA characters.

Wikipedia uses nikud in places where it explains how to pronounce a word. Also, the lack of a free nakdan makes it really tedious to add nikud in Hebrew. Hopefully with the
right tools it will become more common.


I would be happy to hear your opinion and suggestions about the topic. Is adding a new font really the right way to go about it?
Are there other characters which are missing?

There's one more Unicode issue: HOLAM HASER FOR VAV, which is an actual Unicode 5.0 character -- the holam dot above a vav when it is a holam haser and not a holam maleh. Its existence is entirely a typographic issue. Holam haser for vav should be typeset to the left of the vav, holam maleh should be typeset above the vav. There is no programmatic ambiguity to distinguishing the two if they are not distinguished.

Actually, I'll ask Maxim if he wants to join the discussion :-)
My 2c,






Efraim Feinstein

Oct 16, 2012, 1:12:50 PM
to E L, open...@googlegroups.com
Hi,


On 10/16/2012 12:46 AM, E L wrote:

Yes, the issue with Kamatz Katan is that it has no official shape. Some make it a bit longer, some make it

No character has an "official shape!" Unicode defines code points, not glyphs. The glyph representation is entirely up to the font designer (as it should be).


 
The best solution is to propose the characters to the Unicode Technical Committee. Note that the sheva na character was proposed and rejected back in 2002/3 (or thereabouts). With enough people saying "we need it", I think it could get through, as its necessity was one of the sticking points.

True, but the only way to get more people to say we need it is to get them used to it :-)

The UTC can also be convinced by published books. There weren't many in '02/'03. There are a lot more now. Also, users actually saying "we need it" on the Unicode email lists during the discussion would help.





- Write a Python script that discovers and marks them (WIP)

We made some attempts at this. It is doable *to some extent*, but also reveals some issues:
(1) ambiguities: basically, if you know *one* of qamats qatan or sheva na, you know the other. If you know neither,
the answer is ambiguous without understanding context. If you know either, dagesh kal/dagesh hazak are easily distinguishable automatically.
I used very basic rules. Can you give me an example of something the script might miss?

The true, correct, and unhelpful answer: not without source code. :-)
Try: לִשְׁמָרְךָ, צִדְּקָתְךָ
(first examples I could think of)


 
(2) differences in custom: eg, is the sheva merachef (medial sheva) pronounced as a sheva na (eg, Chabad) or a sheva nach (almost everyone else?).

Yes, I'm going according to the rav mazuz custom, but I guess the script can take it as a parameter.
I guess it's not that hard to support all of them, the question is how do you mark it? Should we add a different private char for Shva merachef?

Not really. According to most opinions, there is no such thing as a shva merachef. It's either a sheva na or a sheva nach.


Or a different char for the kamatz katan cases where Harav Breuer and Harav Mazuz differ?

Too many characters just gets confusing to type and read. I don't think adding characters is the answer.


 
- Work with wikipedia people about integrating it into mediawiki, but this is more project specific.
Not sure what you mean here. Most of Hebrew Wikipedia is without vowels. (MediaWiki, Wikisource, and Wikipedia are different projects) A bigger issue might be that even biblical keyboards won't support PUA characters.

Wikipedia uses nikud in places where it explains how to pronounce a word. Also the lack of free nakdan makes it really tedious to add nikud in Hebrew. Hopefully with the
right tools it will become more common.

Auto-pointing words in context is hard, but might be doable to good accuracy with machine learning; out of context is *really* hard (particularly nouns): Try this and you'll see what I mean: עבד
:-)



There's one more Unicode issue: HOLAM HASER FOR VAV, which is an actual Unicode 5.0 character -- the holam dot above a vav when it is a holam haser and not a holam maleh. Its existence is entirely a typographic issue. Holam haser for vav should be typeset to the left of the vav, holam maleh should be typeset above the vav. There is no programmatic ambiguity to distinguishing the two if they are not distinguished.

BTW, this character is already in Culmus. (Culmus/Culmus Ancient Scripts has been really good at keeping up with Unicode AFAICT).

Does that make it 4c,

Dovi Jacobs

Oct 16, 2012, 3:36:01 PM
to Efraim Feinstein, E L, open...@googlegroups.com
Hi, without responding to all the points that have been made, here are a few comments:

*Any automatic script for grammatical issues like this would be immensely helpful and valuable. However, there are always exceptions to rules, unusual theories or customs, as well as divergent opinions in certain local contexts, so the ability to override the script and make a manual decision is absolutely necessary.
*I use the Unicode character for holam-haser-for-vav. A complete, objective list of all its occurrences in Tanakh is available here.
*In the edition of the Tanakh that I am working on, I have indicated qamaz qatan since the unicode character is available. I used a simple template for tagging the places where qamaz qatan is different according to the two main customs and grammatical approaches, which theoretically allows for the user to choose which custom he wants to view.
*I haven't dealt with shewa both because the unicode doesn't exist and because it would be exceedingly tedious to do without an automatic function.
*Regarding Unicode acceptance, Ephraim is correct that the literal *flood* of published books that distinguish na/nah, dagesh kal/hazak, etc. has come over the past decade. I highly doubt the UTC is aware of this, but if brought to their attention it should make an impact. The WLC people are entirely unaware of this (Israel is not their scene), and it is they who more than any others made the original requests for Unicode characters for biblical Hebrew.
*Add to the wish-list: Mapiq, silluq, alternative short meteg (has a variety of uses; Rav Breuer put it to work in all three of his editions).
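
A rough Python sketch of the "tag the variant and let the user choose" idea. The `{{variant|...|...}}` syntax is invented for this example and is not the template actually used in the edition described above; the placeholder readings stand in for two differently pointed forms of the same word.

```python
import re

# Hypothetical inline template: {{variant|reading_a|reading_b}}
VARIANT = re.compile(r"\{\{variant\|([^|{}]*)\|([^|{}]*)\}\}")


def render(text, custom):
    """Return text with each variant template resolved for custom 'a' or 'b'."""
    group = {"a": 1, "b": 2}[custom]
    return VARIANT.sub(lambda m: m.group(group), text)


tagged = "word1 {{variant|READING_A|READING_B}} word3"
print(render(tagged, "a"))
print(render(tagged, "b"))
```

Because the choice is recorded in the text itself, an automatic script can fill in a default and an editor can still override it by hand at any single location.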



From: Efraim Feinstein <efraim.f...@gmail.com>
To: E L <nak...@gmail.com>
Cc: open...@googlegroups.com
Sent: Tuesday, October 16, 2012 7:12 PM
Subject: Re: Nikud

--
 
 


E L

Oct 16, 2012, 4:49:32 PM
to Efraim Feinstein, open...@googlegroups.com
On Tue, Oct 16, 2012 at 7:12 PM, Efraim Feinstein <efraim.f...@gmail.com> wrote:
Hi,


On 10/16/2012 12:46 AM, E L wrote:

Yes, the issue with Kamatz Katan is that it has no official shape. Some make it a bit longer, some make it

No character has an "official shape!" Unicode defines code points, not glyphs. The glyph representation is entirely up to the font designer (as it should be).


But in a way we are wearing the designer's hat. If every application chooses a font that represents those characters
differently, it will be very confusing for people.
 
The best solution is to propose the characters to the Unicode Technical Committee. Note that the sheva na character was proposed and rejected back in 2002/3 (or thereabouts). With enough people saying "we need it", I think it could get through, as its necessity was one of the sticking points.

True, but the only way to get more people to say we need it is to get them used to it :-)

The UTC can also be convinced by published books. There weren't many in '02/'03. There are a lot more now. Also, users actually saying "we need it" on the Unicode email lists during the discussion would help.





- Write a Python script that discovers and marks them (WIP)

We made some attempts at this. It is doable *to some extent*, but also reveals some issues:
(1) ambiguities: basically, if you know *one* of qamats qatan or sheva na, you know the other. If you know neither,
the answer is ambiguous without understanding context. If you know either, dagesh kal/dagesh hazak are easily distinguishable automatically.
I used very basic rules. Can you give me an example of something the script might miss?

The true, correct, and unhelpful answer: not without source code. :-)
Try: לִשְׁמָרְךָ, צִדְּקָתְךָ
(first examples I could think of)

My code is still very basic; I'll keep those as test cases :)

 
(2) differences in custom: eg, is the sheva merachef (medial sheva) pronounced as a sheva na (eg, Chabad) or a sheva nach (almost everyone else?).

Yes, I'm going according to the rav mazuz custom, but I guess the script can take it as a parameter.
I guess it's not that hard to support all of them, the question is how do you mark it? Should we add a different private char for Shva merachef?

Not really. According to most opinions, there is no such thing as a shva merachef. It's either a sheva na or a sheva nach.


Or a different char for the kamatz katan cases where Harav Breuer and Harav Mazuz differ?

Too many characters just gets confusing to type and read. I don't think adding characters is the answer.


We do need to find some solution, though, or it will grow into a very annoying problem when sharing texts.
 
- Work with wikipedia people about integrating it into mediawiki, but this is more project specific.
Not sure what you mean here. Most of Hebrew Wikipedia is without vowels. (MediaWiki, Wikisource, and Wikipedia are different projects) A bigger issue might be that even biblical keyboards won't support PUA characters.

Wikipedia uses nikud in places where it explains how to pronounce a word. Also the lack of free nakdan makes it really tedious to add nikud in Hebrew. Hopefully with the
right tools it will become more common.

Auto-pointing words in context is hard, but might be doable to good accuracy with machine learning; out of context is *really* hard (particularly nouns): Try this and you'll see what I mean: עבד
:-)

I know, though there are some papers that reach 95% accuracy in Hebrew. But even a basic nakdan that just offers
nikud suggestions would already save a lot of work, e.g. you click on a word and see a list of its possible pointings.
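
A minimal Python sketch of such a suggestion list, assuming a word list of already-pointed forms is available. The three sample forms and the helper names are only for illustration; a real nakdan would need a proper lexicon.

```python
import unicodedata
from collections import defaultdict


def strip_nikud(word):
    """Drop all combining marks (points and accents), keeping the letters."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


def build_index(pointed_words):
    """Map each unpointed spelling to the pointed forms seen for it."""
    index = defaultdict(list)
    for w in pointed_words:
        key = strip_nikud(w)
        if w not in index[key]:
            index[key].append(w)
    return index


# Three possible pointings of the same three letters (ayin, bet, dalet).
eved = "\u05E2\u05B6\u05D1\u05B6\u05D3"   # עֶבֶד
avad = "\u05E2\u05B8\u05D1\u05B7\u05D3"   # עָבַד
oved = "\u05E2\u05B9\u05D1\u05B5\u05D3"   # עֹבֵד

index = build_index([eved, avad, oved])
for suggestion in index[strip_nikud(eved)]:
    print(suggestion)
```

The lookup only lists candidates; choosing between them is the part that needs context.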

Efraim Feinstein

Oct 16, 2012, 6:02:24 PM
to E L, open...@googlegroups.com
On 10/16/2012 01:49 PM, E L wrote:
No character has an "official shape!" Unicode defines code points, not glyphs. The glyph representation is entirely up to the font designer (as it should be).

Not really -- as a text provider, I don't consider this my job at all, the same way the developer of a word processor doesn't care what TTFs are available on a system. Every provider might make a choice of *default*, but, beyond that, it's the user's choice.




We do need to find some solution, though, or it will grow into a very annoying problem when sharing texts.

Open Siddur proposes to use (no texts actually have this yet) a system similar to Dovi's: put the variant in markup.
Auto-pointing words in context is hard, but might be doable to good accuracy with machine learning; out of context is *really* hard (particularly nouns): Try this and you'll see what I mean: עבד
:-)

I know, though there are some papers that reach 95% accuracy in Hebrew. But even a basic nakdan that just offers
nikud suggestions would already save a lot of work, e.g. you click on a word and see a list of its possible pointings.

I'd presume that means 95% accuracy *with context* (which still means 1 in 20 points is wrong). Without context, I'd be very skeptical of that kind of claim. Also, it's far easier to get high accuracy when the spelling is consistent with modern Hebrew conventions.
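
To put the 95% figure in proportion, here is a small back-of-the-envelope calculation in Python. It assumes errors are independent and that the accuracy is per item (per point or per word, whichever the paper reports); both are simplifying assumptions of this sketch.

```python
def expected_errors(n_items, accuracy):
    """Expected number of wrong items out of n_items."""
    return n_items * (1.0 - accuracy)


def prob_all_correct(n_items, accuracy):
    """Probability that every one of n_items is right, assuming independence."""
    return accuracy ** n_items


print(round(expected_errors(100, 0.95), 2))   # about 5 errors per 100 items
print(round(prob_all_correct(20, 0.95), 3))   # chance that a run of 20 items has no error
```

Under these assumptions a run of 20 items comes out fully correct only about a third of the time, so manual proofreading stays necessary.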

6c(?),

E L

Oct 17, 2012, 1:09:49 AM
to Efraim Feinstein, open...@googlegroups.com
On Wed, Oct 17, 2012 at 12:02 AM, Efraim Feinstein <efraim.f...@gmail.com> wrote:
On 10/16/2012 01:49 PM, E L wrote:
No character has an "official shape!" Unicode defines code points, not glyphs. The glyph representation is entirely up to the font designer (as it should be).

Not really -- as a text provider, I don't consider this my job at all, the same way the developer of a word processor doesn't care what TTFs are available on a system. Every provider might make a choice of *default*, but, beyond that, it's the user's choice.

I agree, but someone needs to choose which choices to give the user :)
But never mind that, the Culmus people usually do a much better job than us at making fonts :)



We do need to find some solution, though, or it will grow into a very annoying problem when sharing texts.

Open Siddur (proposes to -- no texts actually have this yet) use a system similar to Dovi's -- put the variant in markup.
Markup is a huge problem when collaborating: it's hard to standardize, hard to parse and hard to type. It's
better just to add another char and let it show according to the font.
Auto-pointing words in context is hard, but might be doable to good accuracy with machine learning; out of context is *really* hard (particularly nouns): Try this and you'll see what I mean: עבד
:-)

I know, though there are some papers that reach 95% accuracy in Hebrew. But even a basic nakdan that just offers
nikud suggestions would already save a lot of work, e.g. you click on a word and see a list of its possible pointings.

I'd presume that means 95% accuracy *with context* (also, it means 1/20 points is wrong) -- Without context, I'd be very skeptical of that kind of claim. Also, it's far easier to get high accuracy when the spelling is consistent with modern Hebrew conventions.

Yes, of course with context. I wasn't even talking about without context. But in our case context is not the problem..
6c(?),

what's 6c?

Ely


 
 

Maxim Iorsh

Oct 24, 2012, 4:58:03 AM
to open...@googlegroups.com, Efraim Feinstein, ely...@cs.huji.ac.il
Hello all,

I generally take a neutral position about Unicode additions, mostly because I'm not competent enough to decide whether shva na, dagesh hazak etc. represent separate glyphs, and how they should be implemented. You can take a look at http://unicode.org/mail-arch/unicode-ml/y2004-m05/0234.html, a discussion which successfully led to the inclusion of qamats qatan. In general, people on the Unicode mailing list can have very valuable insights about this topic.

Each font in the Culmus Project is usually maintained by its original creator. Thus, Yoram is responsible for fonts with taamim, and I can perform additions in David/Frank/Miriam. I don't see any problem adding the desired glyphs to the Private Use Area, and the fonts can even support simple substitutions, such as "always put dagesh hazak in Beged-Kefet" etc.

However, since fonts without taamim are oriented towards modern Hebrew, any substitution rule for special nikud will have to be optional and thus accessible only with special OpenType-aware software (for now, not supported by Open Office, and the browser support is limited - see https://developer.mozilla.org/en-US/docs/CSS/font-feature-settings).

Please also note that any text file which directly references the Private Use Area will not be stable unless distributed with a specific version of the font. I'm not going to keep glyphs in the PUA once they attain a proper Unicode position.
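
A sketch of how a text might be migrated once a private-use code point is retired in favour of an official one. The mapping below is hypothetical (U+E000 standing in for a glyph that later became U+05C7); the real values would depend on the font version the text was written against.

```python
import unicodedata

# Hypothetical: PUA code point used by an old font version -> official code point.
PUA_TO_OFFICIAL = {0xE000: 0x05C7}


def migrate(text):
    """Replace retired PUA code points with their official equivalents."""
    return text.translate(PUA_TO_OFFICIAL)


def leftover_pua(text):
    """List the private-use code points still present in text."""
    return sorted({f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Co"})


old = "\u05DB\uE000\u05DC\uE001"   # kaf, retired PUA mark, lamed, another PUA mark
new = migrate(old)
print(leftover_pua(new))           # whatever is still font-dependent after migration
```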

Regarding the visual appearance, in my fonts qamats qatan is usually narrower than qamats gadol, and has a longer tail. I think this distortion is sufficient to attract the reader's attention. Making it bolder seems too flashy and distracting to me. However, this is my personal opinion. Everyone can modify any Culmus font for his own purposes and according to his preferences, in conformance with the GNU GPL.

E L

Oct 24, 2012, 1:08:42 PM
to Dovi Jacobs, Efraim Feinstein, open...@googlegroups.com
Hi
What's a silluq?

Ely

E L

Oct 24, 2012, 1:15:22 PM
to Maxim Iorsh, open...@googlegroups.com, Efraim Feinstein
Hi Maxim,

Thanks for joining the discussion.
I think that dagesh hazak and shva na are the two main things which are missing.
I saw that Dovi was talking about 3 other signs, but AFAIK mapik is drawn the same as a dagesh.
And I don't know what a siluq is, but I don't think they will agree to add another makaf.

Efraim, would you be willing to coordinate the request for those 2 signs?
I don't mind helping with getting more information or taking pictures from books.

Ely


--
 
 

Efraim Feinstein

Oct 24, 2012, 1:41:06 PM
to E L, Maxim Iorsh, open...@googlegroups.com
Hi,

I don't have any experience with making a request with the Unicode
Technical Committee. It sounds like Maxim might be more familiar with
the process(?)

For mapiq/dagesh or dagesh hazak/dagesh kal, it's not entirely clear to
me that a different character is needed (though I could be convinced
easily :-) ). They're both resolvable unambiguously using the current
system.
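
One way the dagesh case might be resolved from the existing pointing, sketched in Python. The rules are a simplification (a dagesh in a begedkefet letter is taken as hazak when the nearest preceding pointed letter has a vowel, kal otherwise) and the function names are invented for this example. Sandhi across word boundaries and some word-final cases are not handled.

```python
DAGESH, SHEVA, VAV, HE = "\u05BC", "\u05B0", "\u05D5", "\u05D4"
VOWELS = {chr(c) for c in range(0x05B1, 0x05BC)} | {"\u05C7"}  # hataf segol..qubuts, qamats qatan
BEGEDKEFET = set("\u05D1\u05D2\u05D3\u05DB\u05E4\u05EA")       # bet gimel dalet kaf pe tav


def clusters(word):
    """Split a pointed word into [letter, set_of_marks] pairs."""
    out = []
    for ch in word:
        if "\u05D0" <= ch <= "\u05EA":
            out.append([ch, set()])
        elif out:
            out[-1][1].add(ch)
    return out


def is_shuruk(letter, marks):
    """Vav with a dot and no other point is read as the vowel shuruk."""
    return letter == VAV and DAGESH in marks and not (marks & VOWELS) and SHEVA not in marks


def classify_dagesh(word):
    """Return (index, letter, kind) for every letter carrying U+05BC."""
    cl = clusters(word)
    result = []
    for i, (letter, marks) in enumerate(cl):
        if DAGESH not in marks:
            continue
        if is_shuruk(letter, marks):
            kind = "shuruk"
        elif letter == HE:
            kind = "mappiq"
        elif letter not in BEGEDKEFET:
            kind = "hazak"
        else:
            kind = "kal"  # default: word-initial or after a sheva
            for prev_letter, prev_marks in reversed(cl[:i]):
                if prev_marks & VOWELS or is_shuruk(prev_letter, prev_marks):
                    kind = "hazak"
                    break
                if SHEVA in prev_marks:
                    break
        result.append((i, letter, kind))
    return result


print(classify_dagesh("\u05DE\u05B4\u05D3\u05B0\u05D1\u05BC\u05B8\u05E8"))  # מִדְבָּר
print(classify_dagesh("\u05D4\u05B7\u05D1\u05BC\u05B7\u05D9\u05B4\u05EA"))  # הַבַּיִת
```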

Silluq is currently overloaded to meteg. Is there any case of ambiguity
between them that cannot be resolved? (I'm not sure about this one!)

Short meteg/left-vs-right meteg, etc.: It's a rare occurrence and
there's not much differentiation in the literature, so it might be
difficult to get a new character.

On the other hand, sheva na/sheva nach is not unambiguously resolvable,
so I think there might be the best case for the sheva request.

Maybe someone with experience could answer these questions: Is each
character request handled independently? Would requesting too much at
once negatively impact the chances of any of the characters being added?
If so, I'd like to prioritize the sheva na character. Also, does anyone
know the process? Would it help if we all get on the UTC's mailing lists?

Admittedly, I haven't looked at this in a while, but it looked like there was
some dispute among the UTC members as to what the purpose of Unicode
characters is. Some of them think that if characters always look alike
in glyph form, they should be the same character (Convincing evidence
for them would be a good number of scanned books that do differentiate).
Others think that the character merely has to be semantically different
and useful (Convincing evidence would be enough people saying "we need
this"). The stated purpose is the latter, but apparently, it's not a
universally held belief.

Dovi Jacobs

Oct 25, 2012, 3:01:53 AM
to Efraim Feinstein, E L, Maxim Iorsh, open...@googlegroups.com
Hi everyone, responses to some points and questions that have been raised:

1. A silluq is the true sign for sof pasuq, and is indicated in the stressed syllable in the final word of the pasuq. It looks like a meteg but it is not one. Adding two dots after the pasuq is a custom that the manuscripts don't always keep; the silluq is actually more important than the better known "sof pasuq" sign. In terms of Unicode this is not urgent, because the final "meteg" in the verse is always really the silluq, but it would still be nice.

2. It is true that mappiq and dagesh appear the same. Dagesh hazaq in "heh" would default to mappiq. So this is also not urgent, but would be nice to have someday.

3. Ephraim noted: "Short meteg/left-vs-right meteg". These are two entirely different issues. In terms of left-vs-right meteg, the people at WLC are very careful about this because they want to convey every orthographic anomaly in the Leningrad Codex (even though the distinction has no value whatsoever). They have also found an adequate way to accomplish this for their needs, by designing a font that can show the meteg before or after the vowel depending on whether it is entered before or after the vowel. But I don't think this distinction is relevant to any of us, nor does it appear in any Jewish edition (only in the WLC/BHQ type literature).

4. However, short/long meteg IS extremely relevant, and I will explain why. First of all it is widely used in some of the most important Tanakh editions of the past 30 years, namely Mosad Harav Kook, Horev, and Keter Yerushalayim (the three Breuer editions). It is not rare, but employed tens of thousands of times in these three editions. So there is a huge differentiation in the literature.

Furthermore, through my experience working on a Tanakh edition I've learned (as others have before me) that metagim/ge'ayot are by far the most difficult and widespread choice that has to be made in any edition because the variations between the manuscripts and printed editions are so huge (as opposed to differences in letters, vowels, and cantillation notes, which are relatively rare compared to metagim). The purpose of long versus short metagim is to indicate whether the meteg occurs in the source text, or whether it has been added by the editor based on other considerations. This is how Breuer used them, and it is extremely relevant for any kind of edition we might contemplate doing. Without this the documentation becomes extraordinarily difficult or impossible. Alternatively: Different metagim have different functions, and these can be indicated by using two different signs. For either of these reasons, this is definitely a distinction that is sorely needed in order to prevent losing information when entering the text of Tanakh in a digital format.

5. Regarding "qamaz qatan": I agree with Maxim that, based on what has already become the custom in many printed editions, it should be narrower and have a longer tail. That said, I think it is important to stress that the tail should be *significantly* longer so that the distinction is absolutely clear to someone reading from the screen. In many of the fonts I've looked at there is a distinction, but it still isn't easy to distinguish the two.
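
Point 1 suggests a simple rule that can be sketched in Python: since the final U+05BD (HEBREW POINT METEG) in a verse is really the silluq, a script can separate the two without a dedicated character. The function names are invented for this illustration, and the input is assumed to be exactly one verse.

```python
METEG = "\u05BD"  # also used to encode silluq


def find_silluq(verse):
    """Index of the last U+05BD in the verse, or None if there is none."""
    idx = verse.rfind(METEG)
    return idx if idx != -1 else None


def find_metegs(verse):
    """Indices of every U+05BD that is not the silluq."""
    silluq = find_silluq(verse)
    return [i for i, ch in enumerate(verse) if ch == METEG and i != silluq]


# Synthetic two-word "verse": alef+meteg, bet, space, gimel+meteg, dalet.
verse = "\u05D0\u05BD\u05D1 \u05D2\u05BD\u05D3"
print(find_metegs(verse), find_silluq(verse))
```

The same trick does not help with long versus short meteg from point 4, because there the distinction carries editorial information that cannot be recomputed from position.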



From: Efraim Feinstein <efraim.f...@gmail.com>
To: E L <nak...@gmail.com>
Cc: Maxim Iorsh <maxim...@gmail.com>; open...@googlegroups.com
Sent: Wednesday, October 24, 2012 7:41 PM
Subject: Re: Nikud
--



E L

Oct 25, 2012, 3:30:17 AM
to Dovi Jacobs, Efraim Feinstein, Maxim Iorsh, open...@googlegroups.com
On Thu, Oct 25, 2012 at 9:01 AM, Dovi Jacobs <dovij...@yahoo.com> wrote:
Hi everyone, responses to some points and questions that have been raised:

1. A silluq is the true sign for sof pasuq, and is indicated in the stressed syllable in the final word of the pasuq. It looks like a meteg but it is not one. Adding two dots after the pasuq is a custom that the manuscripts don't always keep; the silluq is actually more important than the better known "sof pasuq" sign. In terms of Unicode this is not urgent, because the final "meteg" in the verse is always really the silluq, but it would still be nice.

2. It is true that mappiq and dagesh appear the same. Dagesh hazaq in "heh" would default to mappiq. So this is also not urgent, but would be nice to have someday.

3. Ephraim noted: "Short meteg/left-vs-right meteg". These are two entirely different issues. In terms of left-vs-right meteg, the people at WLC are very careful about this because they want to convey every orthographic anomaly in the Leningrad Codex (even though the distinction has no value whatsoever). They have also found an adequate way to accomplish this for their needs, by designing a font that can show the meteg before or after the vowel depending on whether it is entered before or after the vowel. But I don't think this distinction is relevant to any of us, nor does it appear in any Jewish edition (only in the WLC/BHQ type literature).

4. However, short/long meteg IS extremely relevant, and I will explain why. First of all it is widely used in some of the most important Tanakh editions of the past 30 years, namely Mosad Harav Kook, Horev, and Keter Yerushalayim (the three Breuer editions). It is not rare, but employed tens of thousands of times in these three editions. So there is a huge differentiation in the literature.

Furthermore, through my experience working on a Tanakh edition I've learned (as others have before me) that metagim/ge'ayot are by far the most difficult and widespread choice that has to be made in any edition because the variations between the manuscripts and printed editions are so huge (as opposed to differences in letters, vowels, and cantillation notes, which are relatively rare compared to metagim). The purpose of long versus short metagim is to indicate whether the meteg occurs in the source text, or whether it has been added by the editor based on other considerations. This is how Breuer used them, and it is extremely relevant for any kind of edition we might contemplate doing. Without this the documentation becomes extraordinarily difficult or impossible. Alternatively: Different metagim have different functions, and these can be indicated by using two different signs. For either of these reasons, this is definitely a distinction that is sorely needed in order to prevent losing information when entering the text of Tanakh in a digital format.

We should think of a way to explain it to the Unicode people. But I agree with Efraim, we should start with shva na, then I think dagesh chazak, and then the meteg.
We need to see what they are looking for and write a formal letter with examples. I can take pictures from books if it will help to convince them.
 
5. Regarding "qamaz qatan": I agree with Maxim that, based on the what has already become the custom in many printed editions, it should be narrower and have a longer tail. That said, I think it is important to stress that the tail should be *significantly* longer so that the distinction is absolutely clear to someone reading from the screen. In many of the fonts I've looked at there is a distinction, but it still isn't easy to distinguish the two.

Well, that's a font issue. If someone wants it in a different way one can always make a new font:)

Ely

E L

Oct 25, 2012, 3:36:23 AM
to Efraim Feinstein, Maxim Iorsh, open...@googlegroups.com
Hi,

In the Sephardic tradition dagesh kal and dagesh chazak are pronounced differently. Also, sometimes the meaning of the word differs according to the type of the dagesh.
A known example is Mashiv haruach: with dagesh kal it means bringing back the wind, and with dagesh chazak (in the shin) it means blowing the wind.

Btw, to all the font experts: it seems the current way of adding teamim is putting them on the same level as the nikud.
This gets very confusing when reading. Is there a way to place the teamim above or below the nikud level, like it is done in printed Bibles?

Ely
--



Dovi Jacobs

Oct 25, 2012, 3:58:24 AM
to E L, Efraim Feinstein, Maxim Iorsh, open...@googlegroups.com
We should think of a way to explain it to the unicode people. But I agree with efraim, we should start with shva na, I think then dagesh chazak and then the meteg.

We need to see what they are looking for and write a formal letter with examples. I can take pictures from books if it will help to convince them.
I would suggest an application for all three at once. Why wait?

The main thing that will help convince them is evidence of orthographic differences in established editions. So Ely, if you could take pictures from books that would be an amazing way to help!

Baruj Nahón

Jun 10, 2015, 6:52:43 AM
to open...@googlegroups.com
This thread is some years old, but I don't know whether there is anything new on this question. I've certainly seen shva na and kamatz katan implemented in some fonts (or in Davka Writer), but I haven't seen anything about dagesh hazak. Any news?

I am working on a family siddur project. We follow the Moroccan nusach, but there's no published Moroccan siddur that has everything we need, so I thought about making our own. It will be a hard task because I have kabbalists in my family, and also dikduk perfectionists who won't let me write anything without addressing the "famous" questions of Kamatz Katan, Dagesh Kal / Dagesh Hazak, Shva Na’ / Shva Nach etc. (ok, I'm one of those, too). I also have people who want halachot and minhag written in the siddur, and people who need translation and transliteration. Hard, huh?

Luckily, these days there's lots of help online, and I think it's doable. I also have experience editing and self-publishing books.