all-in-one super TTF file


Peter Weber

Nov 13, 2014, 11:13:53 AM
to noto...@googlegroups.com
Hello
I have been looking for a complete font like GNU Unifont for a long time,
but I never found a nice one. Noto looks like a good solution,
if someone compiles an all-in-one super TTF file!
All characters from 0x0000 to 0xFFFF would be a good start.

Best Regards

Waldir Pimenta

Nov 14, 2014, 12:04:25 PM
to noto...@googlegroups.com
Hi, Peter.

I agree it would be great. This has been raised before (issue #13) and the comments there point to technical difficulties with this idea. Particularly, to quote comment #7:
> All modern font formats, such as TrueType and OpenType, limit the number of glyphs to 64K. Unicode currently includes well over 100K characters, meaning that multiple font resources are a necessity.

It's interesting to see the take of GNU Unifont on this (from https://savannah.gnu.org/projects/unifont, emphasis mine):
> The Unicode Basic Multilingual Plane (BMP) covers the first 65,536 (or 2^16) Unicode code points. Initially Unicode was a 16-bit encoding, allowing 2^16 = 65,536 code points. Today Unicode has grown beyond that early limitation. Its initial 16-bit range is now known as the Basic Multilingual Plane (BMP), or Plane 0. The BMP contains most of the world's scripts that are in current use. The Unicode encoding space now covers 17 such planes of 65,536 code points each. (...) Because of the limitations of TrueType, an individual font has a practical limitation of 65,536 code points.
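
The plane arithmetic in that quote can be checked with a few lines of Python. This is only an illustrative sketch; the function names `plane_of` and `in_bmp` are made up for this example and do not come from Unifont or Noto.

```python
# Sketch: locate a code point within Unicode's 17 planes.
PLANE_SIZE = 0x10000   # 65,536 code points per plane
PLANE_COUNT = 17       # planes 0 through 16


def plane_of(code_point):
    """Return the plane number (0 = BMP) of a Unicode code point."""
    if not 0 <= code_point < PLANE_SIZE * PLANE_COUNT:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Each plane spans 2^16 code points, so the plane is the value above bit 15.
    return code_point >> 16


def in_bmp(code_point):
    """True if the code point lies in 0x0000..0xFFFF (Plane 0)."""
    return plane_of(code_point) == 0


print(PLANE_SIZE * PLANE_COUNT)  # 1114112 code points in total
print(plane_of(ord("A")))        # 0
print(plane_of(0x1F600))         # 1 (an emoji, outside the BMP)
```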

And from http://unifoundry.com/:
> As of 20 June 2008 (...), GNU Unifont had a glyph for every printable code point in the Unicode BMP. That version covered Unicode 5.1, which was the current version at the time.

The BMP is, indeed, the range 0x0000 to 0xFFFF which you indicate in your email. I don't see why Noto couldn't do the same.

--Waldir


Waldir Pimenta

Nov 14, 2014, 12:16:54 PM
to noto...@googlegroups.com
For completeness' sake, I'll also quote http://unifoundry.com/unifont.html:
> The latest release of GNU Unifont [contains] glyphs for every printable code point in the Unicode 7.0 Basic Multilingual Plane (BMP).

(Unicode 7.0, released in June 2014, is currently the latest release of the Unicode standard.)

--Waldir

Roozbeh Pournader

Nov 14, 2014, 12:46:06 PM
to noto...@googlegroups.com
As others have mentioned, there are technical limitations on the number of glyphs in a font. (Also note that the number of glyphs needed to support a given number of characters can be several times larger than the character count. For example, our recently released Nastaliq Urdu font has 1140 glyphs in order to support 265 characters.) In addition, our Noto CJK fonts are already at the 65536-glyph limit.
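
To make the glyph budget concrete, here is a rough back-of-the-envelope sketch. The limit of 65,535 is an assumption based on the glyph count being stored in a 16-bit field; the post above rounds it to 65536, so treat the exact figure as adjustable. The helper names are hypothetical.

```python
# Sketch: glyph-budget arithmetic for a merged font.
# Assumption: the glyph count is a 16-bit value, so at most 0xFFFF glyphs fit.
GLYPH_LIMIT = 0xFFFF


def glyphs_per_character(glyph_count, character_count):
    """Average number of glyphs a font spends per supported character."""
    return glyph_count / character_count


def fits_in_one_font(glyph_counts, limit=GLYPH_LIMIT):
    """True if fonts with these glyph counts could share one glyph table.

    This ignores glyphs that could be shared between fonts, so it is a
    pessimistic estimate.
    """
    return sum(glyph_counts) <= limit


# Figures from the post: 1140 glyphs for 265 characters.
print(round(glyphs_per_character(1140, 265), 1))  # 4.3

# A font that is already at the limit leaves no room for anything else.
print(fits_in_one_font([GLYPH_LIMIT, 1140]))  # False
```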

That said, you can take the Noto fonts and build your own superfont. There is a sample script for doing that at https://code.google.com/p/noto/source/browse/nototools/merge_noto.py, which you can modify to include the fonts you want in your superfont.
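
For readers who want to try this without merge_noto.py, a minimal sketch using the `Merger` class from fontTools might look like the following. It is not the logic of merge_noto.py; the file names are only illustrative, `select_fonts` is a hypothetical helper, and skipping the `.otf` CJK files is an assumption made here to keep the merge to TrueType-outline fonts.

```python
from pathlib import Path


def select_fonts(paths, exclude=("CJK",)):
    """Keep only .ttf files whose names contain none of the excluded tokens."""
    selected = []
    for path in paths:
        name = Path(path).name
        if Path(path).suffix.lower() != ".ttf":
            continue  # assumption: merge only TrueType-outline fonts
        if any(token in name for token in exclude):
            continue
        selected.append(str(path))
    return selected


def merge_fonts(paths, output_path):
    """Merge the given font files into one font and save it."""
    # Third-party dependency (pip install fonttools), imported lazily.
    from fontTools.merge import Merger

    merged = Merger().merge(list(paths))  # returns a TTFont
    merged.save(output_path)


# Illustrative file names; adjust to the fonts you actually downloaded.
candidates = [
    "NotoSans-Regular.ttf",
    "NotoSansHebrew-Regular.ttf",
    "NotoSansCJKjp-Regular.otf",
]
print(select_fonts(candidates))
# merge_fonts(select_fonts(candidates), "MySuperFont.ttf")
```

The merged result is still subject to the glyph-count limit discussed above, so the selection step is where the trade-off has to be made.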


Peter Weber

Nov 15, 2014, 6:38:43 AM
to noto...@googlegroups.com
I have already found the merge_noto.py script, but there are still some problems with the merge function from fontTools and duplicated glyphs.
I am also missing all the CJK glyphs, and I cannot get them from the CJK font files. Somehow those CJK files are broken. Try to open them in
the FontForge editor and you get only the Latin letters A-Z.
Some functions are also missing, for example restricting the range to 0x0000 - 0xFFFF and not merging glyphs without code points.
Why are there unused glyphs in the fonts, like the "init, medi, fina" characters in KufiArabic? FontForge places all those glyphs at the end of the file.

P.S. GNU Unifont stays in the UTF-16 range and doesn't implement multibyte characters. We now have the same problem as with UTF-8 characters.
Look, for example, at how Windows displays a UTF-32 character: it actually uses two UTF-16 code units. So a good start would be a complete
UTF-16 font set, 0x0000 to 0xFFFF (Plane 0).
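
The "two UTF-16 units" behaviour mentioned here is the standard surrogate-pair encoding for code points above 0xFFFF. A small sketch (function names chosen for this example only):

```python
def surrogate_pair(code_point):
    """Split a code point above the BMP into its UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def utf16_code_units(text):
    """Return the UTF-16 code units Python produces for a string."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


print([hex(u) for u in surrogate_pair(0x1F600)])          # ['0xd83d', '0xde00']
print([hex(u) for u in utf16_code_units("\U0001F600")])   # ['0xd83d', '0xde00']
print([hex(u) for u in utf16_code_units("A")])            # ['0x41']
```

Characters inside the BMP take a single 16-bit unit, which is why a Plane 0 font never has to deal with surrogates.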

P.P.S. From the programmer's viewpoint: you have a grid and want to display all characters from Plane 0. With a complete font you don't need extra logic to switch fonts to display all characters.
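
The "extra logic to switch fonts" is essentially a fallback lookup like the sketch below. The font names and coverage ranges are invented for illustration; a real program would read coverage from each font's character map.

```python
def pick_font(code_point, fonts):
    """Return the name of the first font that covers the code point.

    `fonts` is a priority-ordered list of (name, coverage) pairs, where
    coverage is any container supporting `in` (a set or a range).
    Returns None when no font covers it, i.e. the grid would show tofu.
    """
    for name, coverage in fonts:
        if code_point in coverage:
            return name
    return None


# Hypothetical fonts with made-up coverage.
fonts = [
    ("LatinFont", range(0x0020, 0x007F)),
    ("HebrewFont", range(0x05D0, 0x05EB)),
]

print(pick_font(ord("A"), fonts))   # LatinFont
print(pick_font(0x05D0, fonts))     # HebrewFont
print(pick_font(0x4E2D, fonts))     # None
```

With a single font covering all of Plane 0, the list has one entry and the lookup disappears.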

P.P.P.S. From the user's viewpoint: you have a program which interfaces to a library database with multilingual books. You don't want to see tofu on your screen for the transliterations.

Roozbeh Pournader

Nov 18, 2014, 5:18:32 PM
to noto...@googlegroups.com


On Nov 15, 2014 3:38 AM, "Peter Weber" <fips.a...@gmail.com> wrote:
>
> I have already found the merge_noto.py script, but there are still some problems with the merge function from fonttools and duplicated glyphs.

Would you please elaborate on what's broken with duplicated glyphs?

> I am also missing all the CJK glyphs, and I cannot get them from the CJK font files. Somehow those CJK files are broken. Try to open them in
> the FontForge editor and you get only the Latin letters A-Z.

That appears to be a bug in FontForge.

> Some functions are also missing, for example restricting the range to 0x0000 - 0xFFFF and not merging glyphs without code points.

Subsetting can be done using fontTools too. Please note that a lot of important characters (especially emoji) are outside the BMP range now.
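
A minimal sketch of a BMP-only subset with the fontTools subsetter follows. The helper names and file names are hypothetical, and default `Options()` are assumed; check the options against your own needs (for example, which layout features to keep).

```python
def bmp_only(code_points):
    """Return the sorted, de-duplicated code points that lie in 0x0000..0xFFFF."""
    return sorted(cp for cp in set(code_points) if cp <= 0xFFFF)


def subset_to_bmp(input_path, output_path):
    """Write a copy of the font that keeps only its BMP-mapped characters."""
    # Third-party dependency (pip install fonttools), imported lazily.
    from fontTools.subset import Options, Subsetter
    from fontTools.ttLib import TTFont

    font = TTFont(input_path)
    cmap = font.getBestCmap() or {}       # code point -> glyph name
    subsetter = Subsetter(options=Options())
    subsetter.populate(unicodes=bmp_only(cmap.keys()))
    subsetter.subset(font)                # modifies the font in place
    font.save(output_path)


print([hex(cp) for cp in bmp_only([0x41, 0x1F600, 0xFFFF, 0x10000, 0x41])])
# ['0x41', '0xffff']
# subset_to_bmp("MySuperFont.ttf", "MySuperFont-BMP.ttf")
```

Glyphs without a code point are only kept if the subsetter finds them reachable from the requested characters, for example through substitution rules.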

> Why are there unused glyphs in the fonts, like the "init, medi, fina" characters in KufiArabic? FontForge places all those glyphs at the end of the file.

They are not unused. They are triggered by GSUB rules that make sure Arabic is displayed correctly. You'd need them to display Arabic correctly.
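
As a rough illustration of why those glyphs are needed, the toy model below picks the positional form of each Arabic letter from its neighbours. It is a heavily simplified sketch, not how a font or shaping engine implements it: the joining table covers only five letters, and in a real font the selection is carried out by the GSUB rules mentioned above.

```python
# Toy model of Arabic positional forms (isol / init / medi / fina).
DUAL, RIGHT = "D", "R"   # joins on both sides / joins only to the preceding letter

# Assumption: a tiny hand-written joining table, for illustration only.
JOINING = {
    "\u0628": DUAL,    # BEH
    "\u0644": DUAL,    # LAM
    "\u0645": DUAL,    # MEEM
    "\u0627": RIGHT,   # ALEF
    "\u062F": RIGHT,   # DAL
}


def positional_forms(word):
    """Return the form name each letter of `word` would take."""
    forms = []
    for i, ch in enumerate(word):
        prev_ch = word[i - 1] if i > 0 else None
        next_ch = word[i + 1] if i + 1 < len(word) else None
        # A letter connects backwards only if the previous letter joins forward.
        joins_prev = ch in JOINING and JOINING.get(prev_ch) == DUAL
        # A letter connects forwards only if it is dual-joining itself.
        joins_next = JOINING.get(ch) == DUAL and next_ch in JOINING
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


print(positional_forms("\u0628\u0627\u0628"))  # ['init', 'fina', 'isol']
print(positional_forms("\u0644\u0645\u0644"))  # ['init', 'medi', 'fina']
```

One character can therefore need up to four glyphs, none of which has its own code point in the text.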

Peter Weber

Nov 20, 2014, 9:42:22 AM
to noto...@googlegroups.com
Sorry, I had to test a lot to come to the conclusion: the problem is FontForge on the Mac.
I can transform the OTF on Windows but not on the Mac. I still have some questions about why
the glyphs do not show directly and why I have to flatten and reencode the files.
This is also the problem on the Mac: I can flatten the file, but then I cannot
get the characters in the right order. Most things are all right, but in the range
before U+FFFF a lot is not in the right place.

Is it possible to create the CJK files directly in TTF format? Merging with the 
other Noto files would be much easier!  

Best Regards 

Roozbeh Pournader

Nov 20, 2014, 4:32:41 PM
to noto...@googlegroups.com
On Thu, Nov 20, 2014 at 6:42 AM, Peter Weber <fips.a...@gmail.com> wrote:
> Is it possible to create the CJK files directly in TTF format? Merging with the
> other Noto files would be much easier!

I know :-)

It's possible, but it's not a priority for us at the moment. But as you found, various tools can do the conversion, with varying levels of quality.