Latin + Indic Font merging


विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 14, 2026, 1:00:14 AM
to sanskrit-programmers

Can someone merge all Noto Sans Indic and Latin fonts into a single functioning font?

Motivation: managing font transitions in LaTeX files autogenerated from Markdown can be hairy. For example:

\setmainfont{Noto Sans}[Script=Latin,Renderer=HarfBuzz]

\newfontfamily\devanagarifont{Noto Sans Devanagari}[
  Script=Devanagari,
  Renderer=HarfBuzz,
  Scale=1.3,
  ItalicFont={Noto Sans Devanagari},
  ItalicFeatures={FakeSlant=0.2},
  BoldItalicFont={Noto Sans Devanagari},
  BoldItalicFeatures={FakeSlant=0.2, Weight=Bold}
]

\setTransitionsForDevanagari{\devanagarifont}{\rmfamily}
\setTransitionsForDevanagariExtended{\devanagarifont}{\rmfamily}
\setTransitionsForDevanagariExtendedA{\devanagarifont}{\rmfamily}
\setTransitionsForVedicExtensions{\devanagarifont}{\rmfamily}
\setTransitionsFor{DevanagariDanDa}{\devanagarifont}{\rmfamily}


% Define a font that contains the arrow symbol (Noto Sans Symbols is best)
\newfontfamily\symbolfont{Noto Sans Symbols}[Renderer=HarfBuzz]

% FIX: Use newunicodechar for arrows instead of ucharclasses transitions
\usepackage{newunicodechar}
\newunicodechar{→}{{\symbolfont →}}
\newunicodechar{←}{{\symbolfont ←}}
\newunicodechar{↔}{{\symbolfont ↔}}

--
Vishvas /विश्वासः

Krishna Thapa

Feb 18, 2026, 12:56:45 AM
to sanskrit-programmers
I rebuilt the merged fonts (thanks to LLMs!)

Gist with the fix: merge_noto_sans_indic.py

It produces two working fonts (Regular + Bold): 8,911 glyphs, 3,792 codepoints, covering Latin, Cyrillic, Greek, Devanagari, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu. All GSUB shaping tables are preserved, so conjuncts and ligatures work correctly.
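
For readers who want a feel for what such a merge involves, here is a minimal sketch using the Merger class from fontTools. It is not the gist's script, whose internals are not shown in this thread. The function names, the script list, and the assumption that the source files follow the NotoSans<Script>-<Weight>.ttf naming pattern are all illustrative.

```python
from pathlib import Path

# Indic scripts named in the message above (illustrative list).
SCRIPTS = ["Devanagari", "Bengali", "Gujarati", "Gurmukhi", "Kannada",
           "Malayalam", "Oriya", "Tamil", "Telugu"]


def source_font_paths(noto_dir, weight="Regular", scripts=SCRIPTS):
    """Return the Latin base font first, then one font file per script.

    Assumes files are named like NotoSansDevanagari-Regular.ttf.
    """
    base = Path(noto_dir)
    names = ["NotoSans"] + ["NotoSans" + script for script in scripts]
    return [str(base / f"{name}-{weight}.ttf") for name in names]


def merge_fonts(noto_dir, out_path, weight="Regular"):
    """Merge the source fonts for one weight into a single font file."""
    # Imported here so the path helper above works without fontTools installed.
    from fontTools.merge import Merger

    merged = Merger().merge(source_font_paths(noto_dir, weight))
    merged.save(out_path)
```

Usage would be something like merge_fonts("./noto-source-fonts", "NotoSansLatIndic-Regular.ttf"), once per weight. Whether conjuncts shape correctly in the result should be checked by eye, since the layout tables are the hard part of any merge.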

LaTeX setup simplifies from ~30 lines to:

\usepackage{fontspec}
\setmainfont{NotoSansLatIndic}[
  Path = ./fonts/,
  Extension = .ttf,
  UprightFont = *-Regular,
  BoldFont = *-Bold,
  Renderer = HarfBuzz
]


If you want the pre-built font files directly (since the gist can't host binaries), you can generate them in under a minute:

bash download_noto_fonts.sh
uv run merge_noto_sans_indic.py --noto-dir ./noto-source-fonts

Or I can share them via Google Drive if that's easier.

- Krishna

विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 18, 2026, 3:15:18 AM
to sanskrit-p...@googlegroups.com
With this LaTeX, I get the output below:

image.png

Valiant attempt. I have saved it to GitHub, though (please contribute fixes there directly).

--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sanskrit-programmers/7b5c4f64-5376-4ee7-8121-74b59dc09947n%40googlegroups.com.

Shreevatsa R

Feb 18, 2026, 11:31:05 AM
to sanskrit-p...@googlegroups.com
1. Have you tried using Typst instead of LaTeX? (In my view it has many bugs and deficiencies for good typesetting especially in Indic languages, but a lot of people seem very happy with it as an alternative to LaTeX, so I thought I should suggest it anyway.)

2. If your LaTeX files are autogenerated from MD, why do you care about how "hairy" they look? You need never look at the LaTeX source code anyway, or you could move all this preamble into a separate file and \include it.

3. Ideally IMO for good typesetting, it is best to manually specify what font must be used where, so again if your LaTeX files are autogenerated from MD, then you might as well have an intermediate script that recognizes runs of characters and specifies the fonts for them (e.g. use \textenglish for runs of English text), instead of doing it automatically inside LaTeX itself as you do above (using ucharclasses and \setTransitionsForDevanagari etc). (Using some of my own old answers on TeX StackExchange to refresh my memory :) link, link, link, link.

4. There seem to be several dozen fonts that cover both Devanagari and Latin script, see e.g. here (or more narrowly here). I think if you're not particular about Noto Sans there may be several alternatives (and I think a Noto Sans that includes both may already be included there? Have you tried downloading recently?)
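
Point 3 could be sketched roughly as follows: a small Python pass that finds runs of Devanagari characters and wraps each run in a font-switching group, leaving every character of the text itself untouched. The function name is hypothetical, and \devanagarifont is assumed to be defined in the preamble (for example with Scale=1.3, as in the first message). The sketch does not attempt LaTeX escaping, which is assumed to be handled elsewhere (for example by pandoc).

```python
import re

# Unicode blocks targeted by the ucharclasses transitions in the first message:
#   U+0900-U+097F    Devanagari (includes the dandas)
#   U+A8E0-U+A8FF    Devanagari Extended
#   U+1CD0-U+1CFF    Vedic Extensions
#   U+11B00-U+11B5F  Devanagari Extended-A
DEV = "\u0900-\u097F\uA8E0-\uA8FF\u1CD0-\u1CFF\U00011B00-\U00011B5F"

# A "word" starts with a Devanagari character and may contain ZWNJ/ZWJ.
WORD = f"[{DEV}][{DEV}\u200C\u200D]*"

# A run is one or more such words separated only by spaces or tabs,
# so a phrase is wrapped once rather than word by word.
DEVANAGARI_RUN = re.compile(f"{WORD}(?:[ \\t]+{WORD})*")


def tag_devanagari_runs(text, command="devanagarifont"):
    """Wrap each Devanagari run as {\\command run}; other text is unchanged."""
    return DEVANAGARI_RUN.sub(
        lambda m: "{\\" + command + " " + m.group(0) + "}", text
    )
```

Because the pass only adds wrappers around existing runs, removing the wrappers again gives back the input, which makes it easy to check that no text was altered.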


विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 20, 2026, 5:13:56 AM
to sanskrit-p...@googlegroups.com
On Wed, 18 Feb 2026 at 22:01, Shreevatsa R <shree...@gmail.com> wrote:
> 1. Have you tried using Typst instead of LaTeX? (In my view it has many bugs and deficiencies for good typesetting especially in Indic languages, but a lot of people seem very happy with it as an alternative to LaTeX, so I thought I should suggest it anyway.)

Hm, I haven't yet. So far I am just using pandoc to produce EPUB and then Calibre to produce PDF (which has its own bugs); at least that route has fewer font and overflow problems.

 

> 2. If your LaTeX files are autogenerated from MD, why do you care about how "hairy" they look? You need never look at the LaTeX source code anyway, or you could move all this preamble into a separate file and \include it.

If it worked, I would care less, but I couldn't get the transitions right :-D


> 3. Ideally IMO for good typesetting, it is best to manually specify what font must be used where, so again if your LaTeX files are autogenerated from MD, then you might as well have an intermediate script that recognizes runs of characters and specifies the fonts for them (e.g. use \textenglish for runs of English text), instead of doing it automatically inside LaTeX itself as you do above (using ucharclasses and \setTransitionsForDevanagari etc). (Using some of my own old answers on TeX StackExchange to refresh my memory :) link, link, link, link.

I had tried it a bit, because I want the devanAgarI font size to be 30% larger than the Latin one, but I found the corner cases frustrating.


 
> 4. There seem to be several dozen fonts that cover both Devanagari and Latin script, see e.g. here (or more narrowly here). I think if you're not particular about Noto Sans there may be several alternatives (and I think a Noto Sans that includes both may already be included there? Have you tried downloading recently?)

Ah, Tiro Devanagari seems OK, at least for devanAgarI (though I still need to specify a separate font for symbols like →). That filter is excellent, thanks!

 

Bhasha IME

Feb 20, 2026, 5:16:39 AM
to sanskrit-p...@googlegroups.com
Has anyone had any success using ConTeXt with Bharatiya scripts?

Anunad Singh

Feb 20, 2026, 10:29:18 AM
to sanskrit-p...@googlegroups.com
I cannot tell whether we are mixing up two separate concerns here: fonts and LaTeX.

Coming to LaTeX, ConTeXt, etc., I feel that we should hand the pain over to AI text generators. For example, in Google Colab, I asked Gemini: "Write Pythagorean theorem in Devanagari. Take a, b, c as ka, kha, ga."

Its response: "Certainly! Here is the Pythagorean theorem written with Devanagari characters for 'a', 'b', and 'c' (using क, ख, and ग respectively): $$ \text{क}^2 + \text{ख}^2 = \text{ग}^2 $$"

and it displays nicely in the 'text cell': image.png

विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 20, 2026, 10:46:44 AM
to sanskrit-p...@googlegroups.com
On Fri, 20 Feb 2026 at 20:59, Anunad Singh <anu...@gmail.com> wrote:
> I cannot tell whether we are mixing up two separate concerns here: fonts and LaTeX.
>
> Coming to LaTeX, ConTeXt, etc., I feel that we should hand the pain over to AI text generators. For example, in Google Colab, I asked Gemini: "Write Pythagorean theorem in Devanagari. Take a, b, c as ka, kha, ga."

Can you figure out a way to provide this md to some AI (it should be free to use) and get back a beautiful 2-column PDF with minimal margins, page numbers, and a table of contents, subject to:

- a critical condition (hard for LLMs): DON'T ALTER ANY TEXT,
- converting annotations like +++(फलवत् कर्म)+++ to italicized text at 80% size, aligned a bit lower than the rest, as in this file,
- devanAgarI being 30% larger in size.

 

Anunad Singh

Feb 20, 2026, 12:43:44 PM
to sanskrit-p...@googlegroups.com
This task is not well suited to an AI text generator, because (1) the text is too big, and (2) AI text generators are notorious for dropping some text or other, as if it had never been in the input.

The conversion you want would be better accomplished using (1) specialised tools such as pandoc and (2) find-and-replace with regular expressions.

-- anunAda

Anunad Singh

Feb 21, 2026, 12:29:43 AM
to sanskrit-p...@googlegroups.com
I want to be more specific about the problem you have described and add the following. This problem should not be handed to generative AI. Instead, do the following:
1) Make the minor markup changes (italics, changing the size of some types of text, etc.) using regular-expression find-and-replace.
2) Convert the resulting text to HTML (there are many free online tools).
3) Open it in LibreOffice and do the following:
a) convert it into two columns
b) add page numbering
4) Finally, convert it to PDF.
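
Step 1 might look like the sketch below for the +++(...)+++ annotations mentioned earlier in the thread. The inline-CSS span is only one hypothetical choice of target markup, the names are illustrative, and whether a given HTML importer honours font-size and vertical-align has not been checked here.

```python
import re

# Matches +++( ... )+++ and captures the annotation text. The lazy match
# stops at the first ")+++", so parentheses inside an annotation survive.
ANNOTATION = re.compile(r"\+\+\+\((.*?)\)\+\+\+", re.DOTALL)

# Hypothetical target markup: 80% size, italic, nudged below the baseline.
OPEN = '<span style="font-size:80%; font-style:italic; vertical-align:-0.2em">'
CLOSE = "</span>"


def mark_annotations(md_text):
    """Replace each +++(text)+++ with a styled span around the same text."""
    # A function is used as the replacement so the annotation text is
    # inserted literally, with no backslash or group-reference processing.
    return ANNOTATION.sub(lambda m: OPEN + m.group(1) + CLOSE, md_text)


def annotation_texts(md_text):
    """List the annotation texts, useful for checking nothing was dropped."""
    return ANNOTATION.findall(md_text)
```

Comparing annotation_texts(source) with the span contents in the output is a cheap way to confirm that the substitution changed only the markers and not the text.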

-- अनुनाद 

विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 21, 2026, 12:35:49 AM
to sanskrit-p...@googlegroups.com
On Sat, 21 Feb 2026 at 10:59, Anunad Singh <anu...@gmail.com> wrote:
> I want to be more specific about the problem you have described and add the following. This problem should not be handed to generative AI. Instead, do the following:
> 1) Make the minor markup changes (italics, changing the size of some types of text, etc.) using regular-expression find-and-replace.
> 2) Convert the resulting text to HTML (there are many free online tools).
> 3) Open it in LibreOffice and do the following:
> a) convert it into two columns
> b) add page numbering
> 4) Finally, convert it to PDF.


The following (automated via the command line + Python) seems better than that:

Anunad Singh

Feb 21, 2026, 12:41:09 AM
to sanskrit-p...@googlegroups.com
Just after sending my previous reply, it occurred to me that AI can be used effectively here after all: we can submit part of the text to an AI and ask it to suggest suitable regular expressions for the required changes, then apply those changes in a suitable text editor, such as Geany.

-- अनुनाद 

Anunad Singh

Feb 21, 2026, 1:04:48 AM
to sanskrit-p...@googlegroups.com
Python, with its army and armour of lakhs of packages, can do almost anything!

-- अनुनाद 
