
Accented letters


aliba

Aug 7, 2007, 8:37:24 AM
Hello,

How can I code accented letters of foreign languages - e.g., in
French, "à", "é", "ô", "ù", etc. - in Poplog Prolog?

The same question applies to the apostrophe sign (ASCII code 39).

So far, I have been unable to scan these special characters in
Poplog Prolog, although they are recognized without difficulty by
tokenizers implemented in other Edinburgh-compatible Prologs.

Thank you in advance for any help or advice.

Aaron Sloman

Aug 7, 2007, 4:23:51 PM

aliba <lin...@bluewin.ch> wrote:

> Date: Tue, 07 Aug 2007 05:37:24 -0700
>
> How can I code accented letters of foreign languages - e.g., in
> French, "=E0", "=E9", "=F4", "=F9", etc. - in Poplog Prolog?
>
> The same question applies to the apostrophe sign (ASCII code 39).
>
> So far, I have been unable to scan these special characters,
> which are recognized without difficulty by tokenizers implemented in
> other Edinburgh compatible Prologs, in Poplog Prolog.

I thought this was going to be tricky, but it turns out that Poplog
Prolog uses the same conventions for expressing ascii codes in
strings and atoms as Poplog Pop11 does, as described in the Pop-11
help file:

HELP ASCII

All the special codes start with backslash "\". Some follow unix
conventions, e.g.

ASCII   Typed as   Represents
  32    `\s`       the code for a space
  13    `\r`       the code for a carriage return
  10    `\n`       the code for a line feed
   8    `\b`       the code for a back space
   9    `\t`       the code for a tab
  27    `\e`       the code for an escape
  92    `\\`       the code for the backslash itself

In a string (or Prolog atom or string) the string quote character
(apostrophe) is represented by

\'

Control characters:

`\^a` or `\^A` is CTRL-A (ASCII 1)
`\^b` or `\^B` is CTRL-B (ASCII 2)
etc.

A further convention allowed in strings is the representation of an
ASCII code by an integer between the brackets in \(..). E.g.

'\(65)\(66)' =>

is equivalent to

'AB' =>


Other characters can be represented the same way, e.g. in Poplog
Prolog

?-write(' \(224) \(233) \(244) \(249) \' \(39) ').

produces

à é ô ù ' ' yes

I don't know whether the first four characters will survive posting
via the news system, but they should correspond to:

=E0 =E9 =F4 =F9
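
If the backslash forms are awkward to type, the same characters can
probably also be built from their codes with name/2. A minimal
sketch, assuming name/2 in Poplog Prolog behaves as in other
Edinburgh-style Prologs and that the terminal shows ISO Latin-1
codes; show_accents is a made-up name used only for illustration:

```prolog
/* show_accents is a hypothetical helper, not part of Poplog. */
/* It builds a three-letter atom from character codes and prints it. */
show_accents :-
    name(Word, [233, 116, 233]),   /* 233 = e-acute, 116 = t */
    write(Word),
    nl.
```

If those assumptions hold, ?- show_accents. should print the word
with both accented letters intact.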

Aaron
http://www.cs.bham.ac.uk/~axs/

Chris Dollin

Aug 8, 2007, 1:46:28 AM
Aaron Sloman wrote:

>
> aliba <lin...@bluewin.ch> wrote:
>
>> Date: Tue, 07 Aug 2007 05:37:24 -0700
>
>> How can I code accented letters of foreign languages - e.g., in
>> French, "=E0", "=E9", "=F4", "=F9", etc. - in Poplog Prolog?
>>
>> The same question applies to the apostrophe sign (ASCII code 39).
>>
>> So far, I have been unable to scan these special characters,
>> which are recognized without difficulty by tokenizers implemented in
>> other Edinburgh compatible Prologs, in Poplog Prolog.
>
> I thought this was going to be tricky, but it turns out that Poplog
> Prolog uses the same conventions for expressing ascii codes in
> strings and atoms as Poplog Pop11 does, as described in the Pop-11
> help file:
>
> HELP ASCII
>
> All the special codes start with backslash "\". Some follow unix
> conventions, e.g.

I think Aliba wants to type those characters in /raw/, not with
funny encodings, so they can have functor names and atoms like
`zàbé` or `côùl` [1].

Presumably the Prolog tokeniser is configurable like the Pop one
and those characters can be given the same class as the ordinary
letters?

[1] Heavens, I've no idea.

--
Classy Hedgehog
"A facility for quotation covers the absence of original thought." /Gaudy Night/

Aaron Sloman

Aug 8, 2007, 7:10:05 AM

Chris Dollin <e...@electrichedgehog.net> writes:

> Date: Wed, 08 Aug 2007 05:46:28 GMT
>
> Aaron Sloman wrote:
> > ....


> > I thought this was going to be tricky, but it turns out that Poplog
> > Prolog uses the same conventions for expressing ascii codes in
> > strings and atoms as Poplog Pop11 does, as described in the Pop-11
> > help file:
> >
> > HELP ASCII
> >
> > All the special codes start with backslash "\". Some follow unix
> > conventions, e.g.

[CD]


> I think Aliba wants to type those characters in /raw/, not with
> funny encodings, so they can have functor names and atoms like
> `zàbé` or `côùl` [1].
>
> Presumably the Prolog tokeniser is configurable like the Pop one
> and those characters can be given the same class as the ordinary
> letters?

I've just checked and those things are accepted by Poplog Prolog
between atom quotes ('atom') and string quotes ("a string") if you
have a way of inserting them. In Ved I can insert them using

ENTER ic <ascii code>

e.g.

ENTER ic 224

produces this: à

It seems that the Prolog tokeniser also accepts them outside quotes,
e.g. I've just tried this in Ved and it worked (to my surprise):

happy(freà).

?-happy(X).

X = freà ?
yes

===

So this leaves me wondering what did not work that prompted the
original enquiry?

If it was a request for something like this input format:

>> French, "=E0", "=E9", "=F4", "=F9", etc. - in Poplog Prolog?

Then in poplog prolog Hex codes can be used thus:

"\(16:E0)", "\(16:E9)", "\(16:F4)", "\(16:F9)", etc.

Admittedly a bit more tedious. I don't know what other prologs
allow.

Aaron

aliba

Aug 11, 2007, 4:38:29 AM
On 8 Aug, 13:10, Aaron Sloman <A.Slo...@cs.bham.ac.uk> wrote:
> Chris Dollin <e...@electrichedgehog.net> writes:
> > Date: Wed, 08 Aug 2007 05:46:28 GMT
>
> > Aaron Sloman wrote:
> > > ....
> > > I thought this was going to be tricky, but it turns out that Poplog
> > > Prolog uses the same conventions for expressing ascii codes in
> > > strings and atoms as Poplog Pop11 does, as described in the Pop-11
> > > help file:
>
> > > HELP ASCII
>
> > > All the special codes start with backslash "\". Some follow unix
> > > conventions, e.g.
>
> [CD]
>
> > I think Aliba wants to type those characters in /raw/, not with
> > funny encodings, so they can have functor names and atoms like
> > `zàbé` or `côùl` [1].
>
> > Presumably the Prolog tokeniser is configurable like the Pop one
> > and those characters can be given the same class as the ordinary
> > letters?
>
> I've just checked and those things are accepted by poplog prolog
> between atom quotes ('atom') and string quotes("a string") if you
> have a way of inserting them. (In ved I can insert them using
> ENTER ic <ascii code>
>
> e.g.
> ENTER ic 224
>
> produces this: à

>
> It seems that the Prolog tokeniser also accepts them outside quotes,
> e.g. I've just tried this in Ved and it worked (to my surprise):
>
> happy(freà).
>
> ?-happy(X).
>
> X = freà ?

> yes
>
> ===
>
> So this leaves me wondering what did not work that prompted the
> original enquiry?
>
> If it was a request for something like this input format:
>
> >> French, "=E0", "=E9", "=F4", "=F9", etc. - in Poplog Prolog?
>
> Then in poplog prolog Hex codes can be used thus:
>
> "\(16:E0)", "\(16:E9)", "\(16:F4)", "\(16:F9)", etc.
>
> Admittedly a bit more tedious. I don't know what other prologs
> allow.
>
> Aaron

Thank you very much for your explanations, and my apologies for
replying late to your messages. I believe that the problem of
recognizing foreign accented letters is not with the Poplog Prolog
tokenizer but with the input file of the program I am trying to run.
Here is the code of this input file:

[code]

/* INPUT - A METHOD OF READING SENTENCES ENTERED */
/* */
/* nb. name2 */

enter_sentence(Without_punc) :-
    enter_phrase(With_punc),
    last_word(Punc),
    append(Without_punc,[Punc],With_punc).

enter_phrase([First_word|Rest_sentence]) :-
    nl, write('Type your sentence '),
    /* bell, */
    nl, nl,
    write(' ==>'),
    get0(First_char),
    get_word(First_char,First_word,Next_char),
    rest_sentence(First_word,Next_char,Rest_sentence).

/* a punctuation mark is a word on its own */
get_word(First_char,Word,Next_char) :-
    punctuation(First_char),
    !,
    name2(Word,[First_char]),
    get0(Next_char).

/* a word starts with any character accepted by low_case */
get_word(First_char,Word,Char_after) :-
    low_case(First_char,Low_char),
    !,
    get0(Next_char),
    rest_word(Next_char,Rest_chars,Char_after),
    name2(Word,[Low_char|Rest_chars]).

/* any other character is skipped */
get_word(First_char,Word,Char_after) :-
    get0(Next_first_char),
    get_word(Next_first_char,Word,Char_after).

rest_sentence(Word,_,[]) :-
    last_word(Word),
    !.

rest_sentence(Word,Next_char,[Next_word|Rest_words]) :-
    get_word(Next_char,Next_word,Char_after),
    rest_sentence(Next_word,Char_after,Rest_words).

/* an apostrophe is kept and ends the word */
rest_word(Apostrophe,[Apostrophe],Char_after) :-
    apostrophe(Apostrophe),
    !,
    get0(Char_after).

rest_word(Char,[Low_char|Rest_chars],Char_after) :-
    low_case(Char,Low_char),
    !,
    get0(Next_char),
    rest_word(Next_char,Rest_chars,Char_after).

/* any other character ends the word and is passed on */
rest_word(Char_after,[],Char_after).

/* set up relations here */

punctuation(33). /* (!) */
punctuation(63). /* (?) */
punctuation(46). /* (.) */
apostrophe(39). /* (') */

low_case(C,C) :- /* small letters */
    (C>96),
    (C<123).

low_case(C,Lower_case) :- /* capital letters */
    (C>64),
    (C<91),
    (Lower_case is (C + 32)).

low_case(C,C) :- /* digits */
    (C>47),
    (C<58).

low_case(39,39). /* (') */

low_case(45,45). /* (-) */

last_word('.').
last_word('!').
last_word('?').
last_word(".").
last_word("!").
last_word("?").

/* LIST MANIPULATING PROCEDURES HERE */

append([],List,List).
append(List,[],List).
append([H1|T1],List2,[H1|T3]) :- !,
    append(T1,List2,T3).

head([H|T],H).
tail([H|T],T).

[code]

This file is part of a program which no longer seems to be
developed. As I am by no means an expert Prolog programmer, I failed
to adjust the above code to handle accented characters so that they
can be read on the screen (as if they were output with "write(X)").
Moreover, I do realize that accented letters are not a trivial problem
in any Prolog.

As can also be noticed, the code above contains a procedure "name2"
which differs from "name" and thus prevents the code from being
ported to other, standard Prologs.

Therefore, any suggestion that would help me solve these two
problems will be welcome.

Thank you in advance.


Aaron Sloman

Aug 11, 2007, 5:36:20 AM

aliba <lin...@bluewin.ch> writes:

[For some reason the message was posted twice]

> Date: Sat, 11 Aug 2007 01:38:29 -0700

> .... snip .....

> ......

As far as I can tell, you seem to be assuming that name2 is defined
in Poplog prolog and is causing a problem.

However, I have searched the Poplog prolog files and cannot find any
such predicate.

It is not defined in the portion of program you posted. So it must
be defined somewhere else in the program. If not, you will get
errors when the program runs, independently of whether you include
'foreign' characters or not.

> .....

> This file is part of a program which no longer seems to be
> developed. As I am by no means an expert Prolog programmer I failed
> to adjust the above code to handle accented characters so that they
> can be read on the screen (as if they were output with "write(X)").
> Moreover, I do realize that accented letters are not a trivial problem
> in any Prolog.

It looks to me as if you need the help of prolog experts, and
therefore it may be a good idea to post your problem simultaneously
to comp.lang.pop and comp.lang.prolog -- any news poster should
allow you to do that simultaneously.

However you should make the whole program available somewhere so
that anyone trying to help you can look at the definitions of all
the prolog predicates you are using. Just showing examples where
name2 is used, without showing its definition, will not provide
enough information.

You should also give examples of input for which the program works
and input for which it does not work, with the full error messages
or faulty output included.
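
One way to collect such examples is to print the codes that get0
actually delivers for a typed line, since the posted code works on
codes rather than characters. A rough sketch, assuming get0/1 returns
10 at the end of a line; show_codes is a hypothetical name, and the
end-of-file code (-1 or 26) differs between Prologs:

```prolog
/* show_codes is a hypothetical diagnostic, not part of the program. */
/* It reads one line from the current input and prints each code. */
show_codes :-
    get0(C),
    show_codes(C).

show_codes(10) :- !, nl.    /* line feed ends the line */
show_codes(-1) :- !, nl.    /* end of file in some Prologs */
show_codes(26) :- !, nl.    /* end of file in others (assumption) */
show_codes(C) :-
    write(C), write(' '),
    get0(Next),
    show_codes(Next).
```

If an e-acute shows up as the single code 233, the input is ISO
Latin-1. If it shows up as two codes such as 195 169, the terminal is
sending UTF-8 and the word-reading code sees two separate characters.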

Ideally if you could put all this in a tar or zip file on a web page
somewhere and simply post the url, it will not need to be sent all
round the world, and can be fetched by anyone wishing to help.

> As can also be noticed, the code above contains a procedure "name2"
> which differs from "name" and thus prevents the code from being
> ported to other, standard Prologs.

If name2 is defined somewhere in one of the other files the
definition should work in any prolog. If it is not defined it will
not work in poplog prolog and probably not in any other prolog.

So it is probably not a problem specific to Poplog prolog, unless
there is some information I have missed. If development work on the
program stopped, it may have been left with bugs that you will have
to track down and fix!
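
One thing that can be seen in the part you posted: get_word and
rest_word only keep characters that low_case accepts, and low_case
covers a-z, A-Z, digits, apostrophe and hyphen, so an accented letter
is simply skipped. A minimal sketch of two extra clauses, assuming
the input arrives as single ISO Latin-1 codes (an assumption, nothing
in the post confirms it):

```prolog
/* Sketch only: assumes ISO Latin-1 character codes. */

/* accented small letters: 223-255, except 247 (division sign) */
low_case(C,C) :-
    C > 222,
    C < 256,
    C =\= 247.

/* accented capitals: 192-222, except 215 (multiplication sign) */
low_case(C,Lower_case) :-
    C > 191,
    C < 223,
    C =\= 215,
    Lower_case is C + 32.
```

These would go next to the existing low_case clauses. Whether name2
then copes with the wider range of codes depends on its definition,
which was not posted.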

I hope that helps you narrow the problem down.

Aaron
http://www.cs.bham.ac.uk/~axs/

aliba

Aug 11, 2007, 11:01:26 AM

Thank you for your reply. Of course "name2" is not a problem inherent
to Poplog Prolog. In any case, I will follow your suggestion of
putting the entire program on a web page.

Many thanks again for your help.
