
Multilingual Document in Latex


Simon

Dec 29, 2008, 9:19:33 PM
Good day everyone. I have a question about how to write multilingual text
in LaTeX (I am using TeX). The example below shows what I am trying to do.
Second, what software is suitable for writing and compiling it? I could
not get it to work with WinEdt or TeXnicCenter. Third, is there
any way to enter Unicode escapes in LaTeX, like \u0000 in Java?

=====

\documentclass[12pt]{article}
\usepackage{CJK}
\begin{document}

%for English
Google. Advanced Search · Preferences · Language Tools · Advertising
Programs - Business Solutions - About Google.

%for chinese
\begin{CJK*}{Bg5}{bsmi}
\CJKtilde
\noindent 繁體中文入口網站
\end{CJK*}

%for Arabic?
البحث في ويب البحث في الصفحات العربية. البرنامج الإعلاني‏ - كل ما تحب
معرفته عن

%for Hindi?
निर्देशिका की सामग्री

%for Japan
多言語対応サーチエンジンの日本版。ウェブ、イメージおよびニュース検索、Usenet 掲示板。

%and other languages?

\end{document}

Hongyi

Dec 29, 2008, 11:37:06 PM
On Dec 30, 10:19 am, Simon <choonchin...@gmail.com> wrote:
> Good day everyone. I have a question about how to write multilingual text
> in LaTeX (I am using TeX).

This is a fairly complex issue. TeX itself is not Unicode-based at its
core. Currently, you can use CJKutf8, XeLaTeX, or LuaTeX to meet your
requirement.
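
As a minimal sketch of the CJKutf8 route mentioned above, the Chinese part of the question might look like this under pdfLaTeX. It assumes the source file is saved as UTF-8 and that the Arphic bsmi font family from the CJK bundle is installed; the sample text is taken from the original example.

```latex
\documentclass[12pt]{article}
\usepackage{CJKutf8}  % UTF-8 front end to the CJK package

\begin{document}

% English needs no special environment
Google. Advanced Search, Preferences, Language Tools.

% Traditional Chinese, typed directly as UTF-8
% (assumption: the bsmi font family is available)
\begin{CJK*}{UTF8}{bsmi}
\noindent 繁體中文入口網站
\end{CJK*}

\end{document}
```

This only covers the CJK scripts. Arabic and Hindi are not handled by the CJK package, which is why XeLaTeX or LuaTeX is the more general choice.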

> The example below shows what I am trying to do.
> Second, what software is suitable for writing and compiling it?

GNU Emacs 23 is enough.

> I could not get it to work with WinEdt or TeXnicCenter.

Don't use WinEdt for that; it is not a Unicode-based editor.

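> Third, is there
> any way to enter Unicode escapes in LaTeX, like \u0000 in Java?

The CJK package supports the \CJKchar command for this. For example, the
Chinese character 中 can be entered as follows, first by its GB-encoded
bytes D6 D0 and then by its Unicode code point U+4E2D: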

\CJKchar{"0D6}{"0D0}

or

\CJKchar[UTF8]{"04E}{"02D}


Your example contains errors. I'll send you some material I use, adapted
to your example above.

HTH,
Regards,

Oliver Corff

Dec 30, 2008, 4:46:13 AM
Simon <choonc...@gmail.com> wrote:
: Good day everyone. I have a question about how to write multilingual text
: in LaTeX (I am using TeX). The example below shows what I am trying to do.

Use LaTeX rather than plain TeX, because TeX itself is not aware of
languages (in a strict sense). The fantastic language support found on CTAN
is generally geared towards LaTeX (with notable exceptions).

Depending on the languages you want to include in your document, you
have to make some fundamental decisions. Looking at your example, I see
that you declare Big5 as the encoding for Chinese. You can do that, but you
will run into problems when mixing it with other languages, and the commonly
accepted answer here is Unicode (ISO 10646). If you want to combine
only Chinese and Arabic, you can use Big5 for Chinese and a
transliteration-based Arabic system like ArabTeX. Any language that
can be expressed in a romanization can be processed conveniently even in
LaTeX installations that are not Unicode-aware. However, writing in a
transliteration is sometimes awkward when the natural writing
system is more convenient. So, again, the answer is: choose Unicode.

: Second, what software is suitable for writing and compiling it?
: I could not get it to work with WinEdt or TeXnicCenter.

I have no experience with either WinEdt or TeXnicCenter, but any editor
that can handle UTF-8-encoded Unicode material, combined with a user
interface that handles the fonts, can display and edit such text. Sometimes
support for particular languages is missing at the font or GUI level,
but the editor of choice (I prefer vim, others will recommend Emacs)
can still handle your data.

: Third, is there
: any way to enter Unicode escapes in LaTeX, like \u0000 in Java?

In principle, the CJK package allows you to enter individual code points,
but the truly generic way these days is XeTeX (and XeLaTeX), which can
understand and express these codes natively.
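
To illustrate what entering a code point natively can look like, here is a minimal sketch for XeLaTeX. The macro name \cjkfont is hypothetical, and the SimSun font is an assumption; any installed font that contains U+4E2D (中) should do.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: SimSun is installed; \cjkfont is a hypothetical macro name.
\newfontfamily\cjkfont{SimSun}

\begin{document}
{\cjkfont
  \char"4E2D      % TeX primitive taking a hexadecimal code point
  \symbol{"4E2D}  % LaTeX wrapper around \char
  ^^^^4e2d        % four-caret notation, read as the character itself
}
\end{document}
```

All three lines are meant to print the same character, which is the closest analogue to Java's \u4E2D escape.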

If you indicate which languages you want to include and what the purpose
of your documents is, there is a good chance that you will get more
substantial recommendations and solutions.

For now, I suggest having a look at XeTeX and the languages branch of
CTAN.

Oliver.

--
Dr. Oliver Corff e-mail: co...@zedat.fu-berlin.de

Simon

Jan 6, 2009, 4:25:38 AM
Thanks for the information. I think the following is the solution; I
used Texmaker and XeTeX.

===

\documentclass{book}

%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% additional packages based on http://en.wikipedia.org/wiki/XeTeX
% use 'cm-default' to restore the default LaTeX font
\usepackage[cm-default]{fontspec}
\usepackage{xunicode}
% use 'no-sscript' to eliminate the footnote font error
\usepackage[no-sscript]{xltxtra}
%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% change `pdftex' option to `xetex'
\usepackage[xetex,colorlinks=true,linkcolor=black,citecolor=blue]{hyperref}
%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\mainmatter

%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Install additional fonts:
% * download Code2000.ttf at http://www.code2000.net/code2000_page.htm
% * install Code2000.ttf to C:\Windows\Fonts\

\newfontfamily\cjk{SimSun}% define a shortcut macro for repeated use.
% assume SimSun.ttf exists in C:\Windows\Fonts\
% if not, change 'SimSun' to 'Code2000'.
\chapter{Unicode}
\flushleft Chinese traditional: {\cjk 繁體中文入口網站} \linebreak

\flushleft Chinese simplified: {\cjk 简体中文入口网站} \linebreak

\flushleft Japanese: {\fontspec{Arial Unicode MS} 多言語対応サーチエンジンの日本版。}
\linebreak

% Script option: see fontspec.pdf
\flushleft Arabic: {\fontspec[Script=Arabic]{Code2000}%
BBCArabic.com | الصفحة
الرئيسيةHomeNewsSportRadioTVWeatherLanguagesنصوص فقطآخر تحديث: الإثنين
11 أغسطس 2008 10:20 GMTأحدث الأخباراصابة 3 من رجال الشرطة في تفجير
جديد بالجزائر الارقام القياسية في السباحة تتهاوى في اولمبياد بكين
صحيفة: قاتل المطربة اللبنانية شخصية مصرية معروفة ميدفيديف: "روسيا أنهت
الجزء الرئيسي من عملياتها في أوستيا" الرئيس} \linebreak

\flushleft Pashto: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
BBCPashto.com | خبرونه | هلمنديان: د بيارغونې کارونه دې دولت وکړي11:54
گرينويچ 2008 ,21 مۍعبدالصمد روحانيله کابلههلمنديان: د بيارغونې کارونه
دې دولت وکړيدسوېلي ولايت هلمند ډېرى اوسېدونكي وايي چې كه نړيوال
درغنيزو كارونو لپاره پيسې د انجوګانو پرځاى دولت ته ورنكړي نوښايي په
دغه ولايت كې په بيارغونه كې كوم پرمختګ را نشي .ددغه ولايت د ډېرو سيمو
خلك وايي چې يو خو له لويه سره انجوګاني كا} \linebreak

\flushleft Persian: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
BBCPersian.com18:41 گرينويچ - دوشنبه 31 دسامبر 2007 - 10 دی 1386محمد
باریکانیروزنامه نگار در بیروتلبنان؛ هراس از ناامنی در آستانه فصل
سردشدت یافتن اختلاف میان طرفهای سیاسی درگیر در بحران کنونی
ریاست‌جمهوری لبنان منجر به افزایش نگرانی‌ها میان مردم این کشور نسبت به
خطر بازگشت مجدد ترور، ناامنی و در نهایت جنگ داخلی به لبنان شده
است.تلاش‌های داخلی و بین المللی برای حل بحران} \linebreak

\flushleft Hindi: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
निर्देशिका की सामग्री निर्देशिका की सामग्री } \linebreak
%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\flushleft Nepali: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
BBCNepali.com | पहिलो पृष्ठ | संविधान संशोधनका चुनौती14 जुन, 2007
16:30 GMT सम्मका समाचारहरुसुशील शर्माबीबीसी नेपाली सेवासंविधान
संशोधनका चुनौतीलागु गरिएको पाँच महिना भित्रैमा नेपालको अन्तरिम
संविधानको दोश्रो संशोधन संसदले पारित गरि सक्दानसक्दै संविधानमा तेस्रो
संशोधनका चर्चाहरु चल्न थालिसकेका छन्।राजालाई संविधानसभा चुनावअगाडि नै
अन्तरिम संसदले हटाउन सक्ने र मंसिर महिनामा संविधानसभाको चुनाव गर्ने
कुरामा सत्ताधारी आठ दलबीच सहमति भइसकेकाले दोस्रो संशोधन प्रस्ताव पारित
हुनु एउटा } \linebreak
%UNICODE%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\flushleft Russian: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
Би-би-си | Россия | XXI век: вооружен и
опасенHomeNewsSportRadioTVWeatherLanguagesБез графикиО сайтеОбратная
связь< ГлавнаяВ миреРоссияЭкономикаНаука и
техникаЛюдиКультураСпортБритания-----------------АналитикаВаше
мнениеМир в кадреLearn EnglishРадиоВидеоПартнерыgБИ-БИ-СИ НА ДРУГИХ
ЯЗЫКАХОбновлено: пятница, 25 апреля 2008 г., 08:20 GMT 12:20
MCKОтправить по почте Версия для печати XXI век: вооружен и
опасенНаступивший век не будет спокойным и безопасным, а главным
очагом нестабильности станет Азиатско-Тихоокеанский регион. Так
считает ведущий эксперт российского} \linebreak

\flushleft Macedonian: {\fontspec[Mapping=tex-text]{Arial Unicode MS}
BBCMacedonian.com | Вести | Среда, 6 август06. Август 2008 - Објавено
на 09:23 GMTСреда, 6 августДНЕВНИК на насловната објавува „Државата
без државен врв".Во текстот се вели дека највисоките државни
функциониери, претседателот на државата и Собранието во исто време се
на } \linebreak

\end{document}
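
The Arabic line in the listing above selects Script=Arabic, while the Hindi and Nepali lines only use Mapping=tex-text. A possible refinement, sketched here under assumptions, is to request Devanagari shaping in the same way so that conjuncts and vowel signs are formed by the font's OpenType rules. The font name and the macro name \devafont are assumptions for illustration; substitute any installed OpenType font that covers Devanagari.

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: "Noto Sans Devanagari" is installed; \devafont is a
% hypothetical macro name, defined once and reused like \cjk above.
\newfontfamily\devafont[Script=Devanagari]{Noto Sans Devanagari}

\begin{document}
Hindi: {\devafont निर्देशिका की सामग्री}
\end{document}
```

Defining one font family per script with \newfontfamily also avoids repeating \fontspec with the same options on every line.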
