Debian Dictionary

Jutta Wrage

Jan 26, 2005, 5:10:13 AM
Hi!

On the Debian pages, there is a dictionary [1] which is nearly empty.
The only translations I could find there appear to be Russian.
Everything found in [2] is more than three years old.

A newer attempt was made by Joe Oppegard on this mailing list [3].

I had not seen Joe's message when I began my own attempt to collect a
multilingual dict, which is now included in the Debian Women
Project [4] and on my homepage [5].

The dictionaries are created from simple text files and can be edited
by nearly everyone. Unlike Joe's dict, the source files contain no SGML
markup.
There are currently three sorts of dicts:

- The acronym dict
- Translations from and to English
- Monolingual dictionaries with explanations of words or phrases

The dicts are tied together by links from the meanings of the acronyms
to translations, and from translations to glossary entries.

HTML files and files for dictd are created when running makdictutf8.pl.
The program can create pages either in UTF-8 only or in the encodings
specified in the config files for each language.
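
To give an idea of what "files for dictd" means, here is a minimal
sketch in Python (not the Perl code of the program itself). It assumes
the usual dictd layout: a plain text database plus an index with one
line per headword holding the headword, the byte offset and the byte
length, the two numbers written in dictd's base64 digits. The function
names and the article layout are made up for illustration, and a real
index needs the ordering that dictd expects, which plain sorted() only
approximates.

```python
# Digits used by dictd for offsets and lengths in the index file.
B64_DIGITS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64_number(n):
    """Encode a non-negative integer, most significant digit first."""
    if n == 0:
        return "A"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 64)
        digits.append(B64_DIGITS[remainder])
    return "".join(reversed(digits))


def build_dictd_files(entries):
    """Return (database_bytes, index_text) for a headword -> text mapping.

    Hypothetical helper: the article layout (headword line followed by
    an indented explanation) is an assumption for this sketch.
    """
    database = bytearray()
    index_lines = []
    for headword in sorted(entries):
        article = f"{headword}\n   {entries[headword]}\n".encode("utf-8")
        offset = b64_number(len(database))
        length = b64_number(len(article))
        index_lines.append(f"{headword}\t{offset}\t{length}")
        database += article
    return bytes(database), "\n".join(index_lines) + "\n"


if __name__ == "__main__":
    db, index = build_dictd_files({
        "NMU": "Non Maintainer Upload",
        "BTS": "Bug Tracking System",
    })
    print(index, end="")
```

Offsets and lengths are counted in bytes of the UTF-8 encoded database,
not in characters, which matters as soon as entries contain non-ASCII
letters.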

To get the alphabetical order right, a special sorting is implemented
in the Perl program. For this, three files had to be copied from the
Perl 5.8 package into the local Perl 5.6.1 tree:
/usr/local/lib/perl/5.6.1/Unicode/Collate.pm
/usr/local/lib/perl/5.6.1/Unicode/UCD.pm
/usr/local/lib/perl/5.6.1/Unicode/Collate/keys.txt
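
Why a special sorting is needed at all can be shown with a small Python
sketch. This is not what Unicode::Collate does (that module implements
the full Unicode Collation Algorithm); it is only a rough approximation
that strips combining marks and ignores case, so that accented
headwords end up next to their unaccented neighbours. The function
names are hypothetical.

```python
import unicodedata


def collation_key(word):
    """Approximate sort key: base letters first, original as tie-breaker."""
    decomposed = unicodedata.normalize("NFKD", word)
    # Drop combining marks such as the diaeresis in a decomposed umlaut.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)


def sort_headwords(words):
    """Sort headwords in a roughly alphabetical, accent-insensitive order."""
    return sorted(words, key=collation_key)


if __name__ == "__main__":
    words = ["Zebra", "Äpfel", "apt", "Öl", "dpkg"]
    print(sorted(words))          # code point order
    print(sort_headwords(words))  # approximated alphabetical order
```

A plain code point sort puts all capitalised words first and all
umlauts last; the key function above avoids both effects, although it
knows nothing about language-specific rules.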

I discussed the dict with one of the Debian Developers and thought it
would be a good idea to include it in the Debian doc web pages, maybe
as a project at Alioth.

Main features of the dicts and makedictutf8.pl:
- The dicts may carry languages not currently supported on the Debian
home page.
- The dicts use a rather simple source file format, so nearly everyone
can edit them, even without knowing about gettext and similar tools.
- Files for a dictd server are created.

You can read more in the about.html linked from the dictionary index
page. Source files are in the subdirectory "source" in [4] and [5].
Entries can be submitted by mail to the dict admin or through a web
form (only available at www.witch.westfalen.de).

A dict server carrying the Debian dicts is available at [6]. Please be
aware that the dict protocol uses UTF-8, so you may need a client that
supports UTF-8.
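
As a small illustration of the UTF-8 point, here is a Python sketch of
how a client could build a DEFINE request as described in RFC 2229. The
function name is made up for this example, and nothing here is taken
from a particular client; "*" as the database name asks the server to
search all databases.

```python
def build_define_command(word, database="*"):
    """Return the bytes of a DICT protocol DEFINE request for one word.

    Hypothetical helper for illustration. The word is quoted so that
    phrases with spaces work, and the whole line is encoded as UTF-8.
    """
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return f'DEFINE {database} "{escaped}"\r\n'.encode("utf-8")


if __name__ == "__main__":
    # A German headword with an umlaut takes two bytes for the "Ü".
    print(build_define_command("Übersetzung"))
```

A client that sends the word in Latin-1 instead would put a single
byte on the wire for the umlaut, and the server could not match the
headword.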

Please feel free to comment, ask or suggest.

If you want to contribute content (translations or other entries)
please contact me directly or use the web form linked from the about
page.

Jutta

[1] http://www.debian.org/doc/manuals/dictionary/
[2] http://cvs.debian.org/ddp/manuals.sgml/dictionary/?cvsroot=debian-doc
[3] http://lists.debian.org/debian-doc/2004/05/msg00092.html
[4] http://women.alioth.debian.org/dicts/
[5] http://www.witch.westfalen.de/debian/
[6] dict: la-sorciere.de

--
http://www.witch.westfalen.de
http://witch.muensterland.org


Helmut Wollmersdorfer

Jan 26, 2005, 9:20:10 AM
Jutta Wrage wrote:

First, thanks for the nice overview.

> On the Debian pages, there is a dictionary [1] which is rather empty.
> While searching, the only translations I found obviously are russian.
> Everything found in [2] is more than three years old.

There are other dicts and glossaries as well - e.g.
http://www.debian.org/devel/join/newmaint#Glossary

> There currently are three sorts of dicts:

> - The acronym dict
> - Translations from and to English languages
> - Monolingual Dictionaries with explanations of words or phrases.
> - the dicts are bound together with links from meanings of the acronyms
> to translations and from translations to glossary entries.

In a more general view a specific document lives in a hierarchy of
contexts. E.g. the Debian Installer Manual uses words which are part of

- common language, e.g. Webster's for English (or Duden for German)
- computer language, e.g. The Free On-line Dictionary of Computing
- Unix/Linux language, e.g.
http://www.tldp.org/LDP/Linux-Dictionary/html/
- Debian language
- Debian Maintainer language

During a wording review of a document I check against a hierarchy of
dicts. This means that every word not found in an upper context needs a
definition to be understandable.
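
The check can be sketched in a few lines of Python. This is only an
illustration of the idea, not a tool I actually use; the function name
and the tiny word lists are invented for the example.

```python
def classify_words(document_words, contexts):
    """Map each word to the most general context that defines it.

    `contexts` is a list of (name, headwords) pairs, ordered from the
    most general dictionary to the most specific one. Words found in no
    context map to None and therefore still need a definition.
    """
    normalized = [
        (name, {headword.casefold() for headword in headwords})
        for name, headwords in contexts
    ]
    result = {}
    for word in document_words:
        result[word] = None
        for name, headwords in normalized:
            if word.casefold() in headwords:
                result[word] = name
                break
    return result


if __name__ == "__main__":
    contexts = [
        ("common", {"install", "system"}),
        ("computing", {"kernel"}),
        ("debian", {"BTS"}),
    ]
    found = classify_words(["install", "kernel", "BTS", "NMU"], contexts)
    print([word for word, where in found.items() if where is None])
```

Here "NMU" is reported because none of the listed dictionaries
contains it.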

> Please feel free to comment, ask or suggest.

Here are my ideas for long-term goals:

1) Agree on a uniform format, administration and infrastructure for
dicts or glossaries within the Debian Project.
E.g. DocBook XML supports glossaries, AFAIK.

2) Develop or collect utilities for conversion into different formats
like dictd, HTML, TeX, plain text etc.

3) Keep the dicts or glossaries of Debian documents in separate files.
This makes it possible to include such a dictionary in the specific
document, or to include it in other dictionaries as well.

1) to 3) are not technically difficult. It's more a problem of
coordination, standardization, discipline and, not least, hard work
for the doc-writers and doc-reviewers.

Helmut Wollmersdorfer

Jutta Wrage

Jan 26, 2005, 7:20:10 PM

On Wednesday, 26.01.05 at 14:50, Helmut Wollmersdorfer wrote:

> - common language, e.g. Webster's for English (or Duden for German)
> - computer language, e.g. The Free On-line Dictionary of Computing

These are dicts available for a quick look, for sure.

> - Unix/Linux language, e.g.
> http://www.tldp.org/LDP/Linux-Dictionary/html/

Here it takes more time to find the definition. Not everyone is a
living address book. And not everyone is connected to the internet
when searching for the meaning of a word or phrase.

> - Debian language
> - Debian Maintainer language

That is what the dicts are for. Debian language is not
self-explanatory.

> During a wording review of a document I check against a hierarchy of
> dicts. This means that every word not found in an upper context needs
> a definition to be understandable.

What do you mean by "in an upper context" here?

> 1) Agree on uniform format, administration and infrastructure for dicts
> or glossaries within the Debian Project
> E.g. docbook-xml supports AFAIK glossaries.

DocBook XML is something that can be produced by a program. I am sure
that I can add something like that to the makdict program. But if you
do not have lots of people willing to collect dict entries and
translations, you lose if you tell them: oh yes, you can collect, but
you have to deliver an XML file. That is why I chose that special
format with two colons as a divider.
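
To show how little a contributor has to know, here is a rough Python
sketch of reading such a line. It is not the code of the Perl program,
and the details are assumptions for illustration only: one entry per
line, the headword first, further fields separated by "::", and "#"
starting a comment. The real rules are described in the about page.

```python
def parse_source_line(line):
    """Split one source line into (headword, list of further fields).

    Hypothetical reader for a "::"-separated line. Returns None for
    blank lines and for comment lines starting with "#" (the comment
    syntax is an assumption of this sketch).
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = [field.strip() for field in line.split("::")]
    if len(fields) < 2 or not fields[0]:
        raise ValueError(f"not a dict entry: {line!r}")
    return fields[0], fields[1:]


if __name__ == "__main__":
    print(parse_source_line("BTS :: Bug Tracking System"))
    print(parse_source_line("package :: Paket :: paquet"))
```

Anyone who can type a word, two colons and a translation in a text
editor can contribute such a line; no markup has to be learned.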

> 2) Develop or collect utilities for conversion in different formats
> like dictd, HTML, TeX, plain text etc.

Good idea. But should the dicts wait until someone has written and
documented those utilities?
The current dict pages seem to be from 1999 or thereabouts.
Maybe there is a reason that the Debian pages there still show no
content, at least for me?

> 3) Have dicts or glossaries of debian documents in a separate file.
> This makes it possible to include such a dictionary in the specific
> document, or include it to other dictionaries as well.

That is something I do not understand. Separate files for each
document? All the words that I collected, and that others and I
translated, came from various sources: discussions on IRC, mailing
lists, the Debian web pages, and elsewhere.

> 1) to 3) are not technically difficult. It's more a problem of
> coordination, standardization, discipline, and at least hard work for
> the doc-writers and doc-reviewers.

That is surely wrong if you want many people without much technical
knowledge to collect dict entries and do not have one person (or
several) with lots of time to put everything into the right format.

My idea was to make the maintenance as simple as possible. And my
experience so far shows that I would have got little more than my own
translations if I had not chosen the simple source format.

For sure, I look at things differently than a DD. I am looking at it
as a user who does not want to search the web for every word, acronym
or phrase that is not clearly known.

greetings

Jutta

Helmut Wollmersdorfer

Jan 30, 2005, 6:50:05 PM
Jutta Wrage wrote:
>
> On Wednesday, 26.01.05 at 14:50, Helmut Wollmersdorfer wrote:

>> - Unix/Linux language, e.g.
>> http://www.tldp.org/LDP/Linux-Dictionary/html/

> Here it needs some more time to find the definition.

That's only a technical problem. After conversion into dictd it should
be very fast.

> What do you mean with "in an upper" context here?

"more general" would be the better word. The special context "Debian
Maintainer" is part of the more general context "Debian". E.g. an
acronym like BTS (Bug Tracking System) is IMHO part of general Debian
language, and should be included in a general Debian-dict. An acronym
like NMU (Non Maintainer Upload) is part of Debian-Maintainer language.

> My idea was making the maintenance as simple as possible. And my current
> experience shows, that I would have got not much more than my own
> translations, if I had not taken the simple source format.

Making things as simple as possible is always a good idea.

> For sure, I look at things different than a DD. I am looking at it as a
> user, who does not want to search through the web on every not clearly
> known word, acronym or phrase.

I am not a DD. I am experienced in writing and reviewing documentation.
From this point of view I tried to explain my method of creating
dict-entries. The condition "not clearly known word" can be defined by a
formal rule. The goal is the same as your goal: making words understandable.

Helmut Wollmersdorfer
