I'm aware that this topic has been discussed before, but I couldn't
find exactly what I'm looking for in the archives.
There are several tools for maintaining a lexicon and generating
inflexions according to rules, and then there's software like
Langmaker, which, besides being non-free and unmaintained, comes close
to what I want but, I think, not quite. I'm judging from what I've
heard and from some web apps that supposedly do something similar.
They just give you a random list of words, not the full list of
possible words matching the regular expression or BNF you enter.
So, what I need is a program/script which can:
1) Generate the FULL list of possible words according to a
user-defined specification, like an EBNF grammar or regular
expressions. For instance, "Word := C V C | V C" with additional
restrictions, like "the first letter can't be 'h'".
2) Let me define another list of "dummy names" for the words in the
lexicon, so that I can build example sentences like "#article-1
#noun-dog #verb-barks" without using actual words of the conlang,
which may not even be assigned yet.
3) Let me build a third list of translations into another language or
languages.
4) Let me assign conlang words to their dummy names, and dummy names
to their translations.
5) Tell me at any time which words are still available (unassigned
to dummy names) and vice versa; which dummy names have no
translation yet, and vice versa; and which translations have no
conlang word, and vice versa.
6) Output the results in a suitable table format (word - dummy name -
translation) to be inserted in text or HTML documents.
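As a rough sketch of point 1, and only under assumptions: the consonant and vowel inventories below are made up (the post gives none), and the names CLASSES, PATTERNS, expand and all_words are hypothetical. It expands each alternative of "Word := C V C | V C" into every combination and then drops the words that start with "h". It is not a general EBNF or regular-expression engine, just exhaustive enumeration for fixed patterns.

```ruby
# Hypothetical inventories; replace them with the conlang's real ones.
CLASSES = {
  "C" => %w[h k t],
  "V" => %w[a i]
}.freeze

# Each alternative of "Word := C V C | V C" as a sequence of class names.
PATTERNS = ["C V C", "V C"].freeze

# Expand one pattern into every string it can produce.
def expand(pattern, classes)
  slots = pattern.split.map { |name| classes.fetch(name) }
  first, *rest = slots
  first.product(*rest).map(&:join)
end

# Expand all patterns, then drop every word the given block rejects.
def all_words(patterns, classes)
  words = patterns.flat_map { |pattern| expand(pattern, classes) }.uniq
  words.reject { |word| yield(word) }
end

# Restriction from the example: the first letter can't be "h".
words = all_words(PATTERNS, CLASSES) { |word| word.start_with?("h") }
puts words
```

With these made-up inventories the list has 18 words: 12 of the form C V C and 6 of the form V C. Further restrictions can be added inside the block, since it only has to return true for words that must be rejected.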
Most of this can be done with a database, like Kexi (similar to MS
Access) or OOBase. There are even more flexible options, like
PowerLoom. My problem is how to feed the database with an
automatically generated lexicon (point 1). Perhaps a script could
generate a CSV text file of words from a BNF description, which could
then be imported into the database.
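A minimal sketch of the CSV idea, assuming the word list already exists as an array. The column names (word, dummy_name, translation) and the helper names csv_field and words_to_csv are hypothetical, and the two empty columns are simply placeholders to be filled in later in the database. Fields are quoted only when they contain a comma, a double quote or a line break, which is the usual CSV convention.

```ruby
# Quote a field only when it contains a comma, a double quote or a line break.
def csv_field(value)
  text = value.to_s
  return text unless text.match?(/[",\r\n]/)
  '"' + text.gsub('"', '""') + '"'
end

# Build CSV text: one header row, then one row per word.
# The dummy name and translation columns are left empty for later.
def words_to_csv(words, header = %w[word dummy_name translation])
  rows = [header] + words.map { |word| [word, "", ""] }
  rows.map { |row| row.map { |field| csv_field(field) }.join(",") }.join("\n") + "\n"
end

# Print the CSV; redirect the output to a file before importing it.
puts words_to_csv(%w[ba ban bas])
```

Whether Kexi or OOBase expects a header row, or a different separator, depends on the import dialog, so that part would need checking against the tool actually used.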
All ideas are welcome :)
Regards,
If I needed to generate a CSV list of words according to a formula, I
would write a BASIC program to do it, probably using the freeware
Chipmunk BASIC interpreter.
I realize BASIC is extremely unfashionable -- nobody will be impressed
if you admit to using it -- but it's free, it's similar to English
(thus easily learned and user-friendly), and it has a lot of
text-handling functions.
You could do it with C or Java too, I suspect.
--
-30-
REM * this program generates all possible syllables *
REM * first we create arrays to hold our strings *
DIM a$(50)
DIM b$(50)
DIM c$(50)
REM * put permitted initial consonants into a$ array *
Initials:
READ y$
IF y$ = "xxx" THEN GOTO Vowels
counta = counta + 1
a$(counta) = y$
GOTO Initials
REM * put permitted vowels into b$ array *
Vowels:
READ y$
IF y$ = "xxx" THEN GOTO Finals
countb = countb + 1
b$(countb) = y$
GOTO Vowels
REM * put permitted finals into c$ array *
Finals:
READ y$
IF y$ = "xxx" THEN GOTO Wrap
countc = countc + 1
c$(countc) = y$
GOTO Finals
REM * create the syllables and output them to a textfile *
Wrap:
OPEN "syllables.txt" FOR OUTPUT AS #1
FOR i = 1 TO counta
  FOR j = 1 TO countb
    FOR k = 1 TO countc
      sa$ = a$(i)
      sb$ = b$(j)
      sc$ = c$(k)
      IF sc$ = "zero" THEN sc$ = ""
      syllable$ = sa$ + sb$ + sc$
      PRINT #1, syllable$
    NEXT k
  NEXT j
NEXT i
CLOSE #1
PRINT "Finished. It was a pleasure to serve you."
END
REM * below is where the initials, medials, and finals are stored *
REM * xxx marks the end of each list *
DATA b, ch, d, xxx
DATA a, i, u, xxx
DATA zero, n, s, xxx
- - - - - that's the end of the program - - - - -
- - - - - below is what the output file looks like - - - - -
ba
ban
bas
bi
bin
bis
bu
bun
bus
cha
chan
chas
chi
chin
chis
chu
chun
chus
da
dan
das
di
din
dis
du
dun
dus
Thanks, Rick and Anonymous, for your prompt reply and tips!
This script looks interesting. I'm a bit more familiar with Java than
with BASIC, but I've found your script very useful as a guide. Now I
have to find out how to combine this with a DB like Kexi or OOBase, or
some other tool, to get the features I want. On the other hand, it
would be nice to do this with some linguistic IDE, like GATE.
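Short of a real database, the bookkeeping from points 2 to 6 could be prototyped in a few lines. This is only a sketch under assumptions: the class name Lexicon, its method names and the sample data are all hypothetical, and everything lives in memory rather than in Kexi or OOBase. It keeps two links (word to dummy name, dummy name to translation) and reports what is still unassigned on either side.

```ruby
# Track two links: conlang word -> dummy name, dummy name -> translation.
class Lexicon
  def initialize(words, dummy_names, translations)
    @words = words
    @dummy_names = dummy_names
    @translations = translations
    @name_of = {}         # conlang word => dummy name
    @translation_of = {}  # dummy name => translation
  end

  def assign_word(word, dummy_name)
    @name_of[word] = dummy_name
  end

  def assign_translation(dummy_name, translation)
    @translation_of[dummy_name] = translation
  end

  # Words not yet given to any dummy name.
  def free_words
    @words - @name_of.keys
  end

  # Dummy names that have no conlang word yet.
  def names_without_word
    @dummy_names - @name_of.values
  end

  # Dummy names that have no translation yet.
  def names_without_translation
    @dummy_names - @translation_of.keys
  end

  # Translations not yet linked to any dummy name.
  def unused_translations
    @translations - @translation_of.values
  end

  # One row per dummy name: [word, dummy name, translation]; nil = missing.
  def rows
    @dummy_names.map do |name|
      [@name_of.key(name), name, @translation_of[name]]
    end
  end
end

lex = Lexicon.new(%w[ba ban bas],
                  ['#article-1', '#noun-dog', '#verb-barks'],
                  %w[the dog barks])
lex.assign_word("ba", '#noun-dog')
lex.assign_translation('#noun-dog', "dog")

# Plain-text table: word - dummy name - translation, "-" for missing cells.
lex.rows.each do |row|
  puts row.map { |cell| (cell || "-").ljust(14) }.join.rstrip
end
```

After the two assignments above, free_words returns "ban" and "bas", and only the '#noun-dog' row has all three columns filled. The same two links map naturally onto two database tables if the data is moved to a DB later.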
Regards,
**************Java***************
// this program generates all possible syllables
import java.io.PrintWriter;
import java.io.IOException;
public final class Syllables {
    private static final String[] initials = {"b", "ch", "d"};
    private static final String[] medials = {"a", "i", "u"};
    private static final String[] finals = {"", "n", "s"};

    public static void main(final String[] args) throws IOException {
        final PrintWriter out = new PrintWriter("syllables.txt");
        for (final String anInitial : initials)
            for (final String aMedial : medials)
                for (final String aFinal : finals)
                    out.println(anInitial + aMedial + aFinal);
        out.close();
        System.out.println("Finished. It was a pleasure to serve you.");
    }
}
--
John W. Kennedy
"Only an idiot fights a war on two fronts. Only the heir to the throne
of the kingdom of idiots would fight a war on twelve fronts"
-- J. Michael Straczynski. "Babylon 5", "Ceremonies of Light and Dark"
> **************Ruby***************
> # this program generates all possible syllables
> initials = %w/b ch d/
> medials = %w/a i u/
(etc)
Cool! Those are elegant translations.
I've just tried the Ruby version with KDevelop and it works great,
thanks! :)
To generate all syllables, I wrote a small JavaScript program
embedded in an HTML page.
I'll try to attach the code here, but the system may strip it out
(web code inside web code!)
******************************************************************************************************************
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html>
<head>
<title>SILABUS 0.0</title>
<meta name="autoro" content="Daniel Macouin">
<script language="JavaScript">
<!-- JavaScript
function silabus(){
  // read the three comma-separated lists from the form fields
  var liste1 = document.form.L1.value.split(",");
  var liste2 = document.form.L2.value.split(",");
  var liste3 = document.form.L3.value.split(",");
  var silabaire = [];
  // loops: for each item of list 2, combine every item of list 1
  // with every item of list 3
  for (var v = 0; v < liste2.length; v++) {
    for (var i = 0; i < liste1.length; i++) {
      for (var j = 0; j < liste3.length; j++) {
        silabaire.push(liste1[i] + liste2[v] + liste3[j]);
      }
    }
  }
  // show the syllables as one comma-separated string
  document.form.silabaire.value = silabaire.join(",");
}
// - JavaScript - -->
</script></head>
<body bgcolor="#CCFFCC" text="black" link="blue" vlink="purple"
alink="red">
<div align="center"><table border="0">
<tr>
<td colspan="3" bgcolor="maroon"><h1 align="center"><span
style="background-color:maroon;"><font
color="olive">SILABUS</font></span></h1></td>
<td bgcolor="maroon"><p align="center"><font
color="#906D65">Lenadi
MOUCINA .2008. libera programo</font></td>
</tr>
<tr>
<td width="883" colspan="3"><div align="center"><table
border="10" cellpadding="15"
cellspacing="0" width="75%" bgcolor="#7FA256"
style="border-width:8; border-style:outset;">
<tr>
<td><form name="form" method="get">
<h1 align="center"><span style="background-color:maroon;"><font
color="yellow"> 1 </font></span><font
color="yellow"> <input type="text"
name="L1" value="x,xy,gh,rt"
size="30"></font></h1>
<h1 align="center"><span style="background-color:maroon;"><font
color="yellow"> 2 </font></span><font
color="yellow"> <input type="text"
name="L2" value="a,i"
size="30"></font></h1>
<h1 align="center"><span style="background-color:maroon;"><font
color="yellow"> 3 </font></span><font
color="yellow"> <input type="text"
name="L3" value="k,t,rt"
size="30"></font></h1>
<h1 align="center"><font
color="yellow"> </font><input
type="button" name="envoi" value="...... >>>>>"
onclick="silabus() ;" style="font-style:normal; font-weight:bolder; font-size:x-large; color:yellow; background-color:maroon; text-decoration:none;">
: [1] [2] [3]</h1></td>
</tr>
</table></div></td>
<td rowspan="2"><p align="center"><textarea name="silabaire"
rows="25"
cols="35" wrap="virtual"></textarea></td>
</tr>
<tr>
<td width="33%"></form>
<h2><span style="background-color:maroon;"><font
color="white"> b,c,d,f,g,h,j,k,l,m,n,p,q,r,s,t,v,x</font></span></h2></td>
<td width="33%"><h2><span style="background-color:maroon;"><font color="white">a,e,i,o,u</font></span></h2></td>
<td width="33%"><h2><span style="background-color:maroon;"><font color="white">y,w</font></span></h2></td>
</tr>
</table></div>
<p> </p>
</body>
</html>
***************************
That's all!
You can see the page here: http://danielmacouin.chez-alice.fr/silabus.htm