Using quoted text in Prolog facts

Ronald Modesitt

Oct 26, 2017, 1:09:28 PM
I would like recommendations on using quoted text in a Prolog knowledge base. As a first project, other than tutorial projects, I am building a knowledge base of the British Monarchy. I am copying text from websites that provide information about the Monarch and his/her children, dates of reign, birth, death, consort, etc. Two examples from the same source follow:

Prince Albert, son of Ernst I, Duke of Saxe-Coburg and Gotha and Louise of Saxe-Gotha-Altenburg.

Ernst I, Duke of Saxe-Coburg and Gotha, son of Duke Francis Frederick of Saxe-Coburg-Saalfield and Countess Augusta Reuss zu Ebersdorf und Lobenstein.

The first example has the fields: Name, Father, Title, Mother.
The second example has the fields: Name, Title, Father, Father's Title, Mother.

Importing into Excel/LibreOffice allows me to separate the fields by replacing the sequence ", son of " with "," to isolate the father, and " and " with "," to isolate the mother. However, the Name field becomes "Prince Albert" or "Ernst I" which are not simple atoms. Any queries of the knowledge base must use the quotes.

One could substitute prince_albert for "Prince Albert" in the Name field; however, queries of the Title field would then look like duke_of_saxe-coburg_and_gotha for "Duke of Saxe-Coburg and Gotha", which is not a friendly approach.

My question is how do experienced Prolog programmers solve these problems?

burs...@gmail.com

Oct 26, 2017, 1:13:30 PM
You can use single quotes like:
'Duke of Saxe-Coburg and Gotha'

Then it's an atom.
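
A minimal sketch of how such quoted atoms could sit in a knowledge base. The predicate names person_title/2 and person_father/2 are hypothetical, chosen only for illustration; the names and the title come from the examples in the question.

```prolog
% person_title(Name, Title) and person_father(Child, Father) are
% hypothetical predicates, used here only for illustration.
% Single-quoted text is an ordinary atom, even with spaces and capitals.
person_title('Ernst I', 'Duke of Saxe-Coburg and Gotha').
person_father('Prince Albert', 'Ernst I').

% Example queries:
% ?- atom('Duke of Saxe-Coburg and Gotha').
% true.
%
% ?- person_father('Prince Albert', Father), person_title(Father, Title).
% Father = 'Ernst I',
% Title = 'Duke of Saxe-Coburg and Gotha'.
```

The quotes are only needed when writing the atom in source code or in a query; answers print them the same way.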

Ronald Modesitt

Oct 26, 2017, 1:31:32 PM
Forgive my rookie mistake regarding double and single quotes - I meant single quotes. Is there a better way of writing these atoms?

Julio Di Egidio

Oct 26, 2017, 1:54:21 PM
On Thursday, October 26, 2017 at 7:09:28 PM UTC+2, Ronald Modesitt wrote:
>
> I would like recommendations on using quoted text in a Prolog knowledge
> base.
<snip>

Here is how the SWI-Prolog documentation puts it:

<http://www.swi-prolog.org/pldoc/man?section=text-representation>
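
As a rough illustration of the single versus double quote distinction, assuming SWI-Prolog 7 or later with its default flags (other systems, and ISO mode, read double-quoted text differently). The helper to_atom/2 is a hypothetical name.

```prolog
% Assumes SWI-Prolog 7+ with the default double_quotes flag, where
% 'text' is an atom and "text" is a string object.

% to_atom(+Text, -Atom): hypothetical helper that converts text
% (for example a double-quoted string) into an atom.
to_atom(Text, Atom) :-
    atom_string(Atom, Text).

% Example queries:
% ?- atom('Prince Albert').
% true.
%
% ?- atom("Prince Albert").
% false.
%
% ?- string("Prince Albert").
% true.
%
% ?- to_atom("Prince Albert", A).
% A = 'Prince Albert'.
```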

Julio

Markus Triska

Oct 26, 2017, 2:20:54 PM
Hi Ronald,

Ronald Modesitt <ronald.mo...@gmail.com> writes:

> One could use the substitution prince_albert for "Prince Albert" in
> the Name field, however, queries of the Title field would look like
> duke_of_saxe-coburg_and_gotha, for "Duke of Saxe-Coburg and Gotha" -
> not a friendly approach.
>
> My question is how do experienced Prolog programmers solve these
> problems?

One important principle to get the most out of Prolog is to convert such
informal strings to *structured* data at the earliest opportunity.

See for example how Richard O'Keefe put it at:

http://www.cs.otago.ac.nz/staffpriv/ok/pllib.htm

For example, in your case, you could as a first step use:

name_title('Albert', duke_of('Saxe-Coburg and Gotha'))

And then generalize this for example to:

name_titles('Albert', [duke_of(['Saxe-Coburg and Gotha'])])

To allow for multiple titles and even multiple provinces that you could
represent as *lists* of such atoms.

For example:

name_titles('Ernst', [duke_of(['Saxe-Coburg and Gotha',
                               'Jülich, Cleves and Berg',
                               'Angria',
                               'Westphalia']),
                      landgrave_in(['Thuringia']),
                      ...
                     ])
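
To show how such a structured representation could be queried, here is a minimal sketch. It uses only the 'Albert' fact from above; duke_of_province/2 is a hypothetical helper name.

```prolog
:- use_module(library(lists)).   % member/2

name_titles('Albert', [duke_of(['Saxe-Coburg and Gotha'])]).

% duke_of_province(?Name, ?Province): hypothetical helper.
% True if Name holds a duke_of/1 title that covers Province.
duke_of_province(Name, Province) :-
    name_titles(Name, Titles),
    member(duke_of(Provinces), Titles),
    member(Province, Provinces).

% Example queries:
% ?- duke_of_province(Who, 'Saxe-Coburg and Gotha').
% Who = 'Albert'
%
% ?- duke_of_province('Albert', Province).
% Province = 'Saxe-Coburg and Gotha'
```

Because the title is a term rather than one long atom, the same facts answer both "who is duke of X?" and "what is Y duke of?" without any string processing.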

Having to quote such uppercase atoms is not really a big deal in
practice: It may now appear somewhat impractical to you because there
are so few other complexities as you are only starting with the project.

If it still matters to you later, you can make it slightly simpler:

name_titles(ernst, [duke_of(['Saxe-Coburg and Gotha',
                             'Jülich, Cleves and Berg',
                             angria,
                             westphalia]),
                    landgrave_in([thuringia]),
                    ...
                   ])

and define a separate relation (if necessary) that describes how to
capitalize all these atoms correctly.
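
One possible shape for such a relation, as a sketch only. Both atom_display/2 and display_text/2 are hypothetical names, and the table is filled in by hand for the atoms used above.

```prolog
% atom_display(Atom, DisplayText): hypothetical relation mapping the
% lowercase atoms used in facts to their capitalized display form.
atom_display(ernst,      'Ernst').
atom_display(angria,     'Angria').
atom_display(westphalia, 'Westphalia').
atom_display(thuringia,  'Thuringia').

% display_text(+Atom, -Text): hypothetical helper.
% Atoms without an entry (for example the quoted ones) are assumed
% to be written in display form already and map to themselves.
display_text(Atom, Text) :-
    (   atom_display(Atom, Text0)
    ->  Text = Text0
    ;   Text = Atom
    ).

% Example queries:
% ?- display_text(angria, T).
% T = 'Angria'.
%
% ?- display_text('Saxe-Coburg and Gotha', T).
% T = 'Saxe-Coburg and Gotha'.
```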

Note especially that you will still have to quote for example
'saxe-coburg', because saxe-coburg without quotes is no longer an atom:

?- write_canonical(saxe-coburg).
%@ -(saxe,coburg)
%@ true.
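
The practical consequence, as a small sketch (province/1 is a hypothetical predicate used only for illustration):

```prolog
% province/1 is a hypothetical predicate for illustration.
% The quoted form is a single atom.
province('saxe-coburg').

% Example queries:
% ?- province('saxe-coburg').
% true.
%
% Without quotes the argument is the compound term -(saxe, coburg),
% which does not unify with the atom stored in the fact:
% ?- province(saxe-coburg).
% false.
%
% ?- saxe-coburg = Left-Right.
% Left = saxe,
% Right = coburg.
```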

Great question by the way! The choice of data representation is very
important in practice, and will influence your programs a lot.

All the best,
Markus

--
comp.lang.prolog FAQ: http://www.logic.at/prolog/faq/
The Power of Prolog: https://www.metalevel.at/prolog

burs...@gmail.com

Oct 26, 2017, 3:37:56 PM
For user-oriented search you could use a pattern matcher;
here is an example with the Jekejeke Prolog pattern matcher.

The "database"(*) is:

partner(i149,'Grand Duke','Vladimir Romanov',male,1847,f11).
partner(i150,'Grand Duke','Alexis Romanov',male,1850,f11).
partner(i151,'Grand Duke','Serge Alexandrovich Romanov',male,1857,f11).
partner(i152,'Grand Duke','Paul Alexandrovich Romanov',male,1860,f11).
partner(i153,'Grand Duke','George Alexandrovich Romanov',male,1871,f9).
partner(i154,'Grand Duchess','Xenia Romanov',female,1875,f9).
partner(i155,'Grand Duke','Michael Alexandrovich Romanov',male,1878,f9).
partner(i156,'Grand Duchess','Olga Alexandrovna Romanov',female,1882,f9).
partner(i157,'Grand Duchess','Marie Pavlovna',female,1854,f1270).
partner(i158,'Grand Duke','Cyril Vladimirovitch Romanov',male,1876,f47).
partner(i159,'Grand Duke','Boris Romanov',male,1877,f47).
partner(i160,'Grand Duke','Andrei Vladimirovich Romanov',male,1879,f47).

Then you could run queries such as:

?- use_module(library(misc/text)).
% 1 consults and 0 unloads in 31 ms.
Yes

?- partner(K,_,N,_,_,_),
   pattern_match(N, 'andrei', [boundary(word), ignore_case(true)]).
K = i160,
N = 'Andrei Vladimirovich Romanov'

?- partner(K,_,N,_,_,_),
   pattern_match(N, 'vladi*', [boundary(word), ignore_case(true)]).
K = i149,
N = 'Vladimir Romanov' ;
K = i158,
N = 'Cyril Vladimirovitch Romanov' ;
K = i160,
N = 'Andrei Vladimirovich Romanov'

(*) The data is taken from here:
https://de.wikipedia.org/wiki/GEDCOM

Example pattern matchers for Prolog systems are:

SWI-Prolog:
http://www.swi-prolog.org/pldoc/doc_for?object=section%28%27packages/pcre.html%27%29#re_match/3

Jekejeke Prolog:
http://www.jekejeke.ch/idatab/doclet/prod/en/docs/05_run/10_docu/05_frequent/07_theories/21_misc/01_text.html
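
Without a pattern matching library, a much cruder case-insensitive substring search can be sketched with sub_atom/5. This assumes SWI-Prolog for downcase_atom/2; name_contains/2 is a hypothetical helper, and unlike the queries above it does not respect word boundaries. Only three of the facts are repeated to keep the sketch short.

```prolog
partner(i149,'Grand Duke','Vladimir Romanov',male,1847,f11).
partner(i150,'Grand Duke','Alexis Romanov',male,1850,f11).
partner(i160,'Grand Duke','Andrei Vladimirovich Romanov',male,1879,f47).

% name_contains(+Name, +Fragment): hypothetical helper.
% True if Fragment occurs anywhere in Name, ignoring case.
% downcase_atom/2 is SWI-Prolog specific; sub_atom/5 is ISO.
name_contains(Name, Fragment) :-
    downcase_atom(Name, LowerName),
    downcase_atom(Fragment, LowerFragment),
    % once/1: succeed a single time even if Fragment occurs repeatedly.
    once(sub_atom(LowerName, _, _, _, LowerFragment)).

% Example query:
% ?- partner(K,_,N,_,_,_), name_contains(N, vladi).
% K = i149,
% N = 'Vladimir Romanov' ;
% K = i160,
% N = 'Andrei Vladimirovich Romanov'
```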

If you have a larger database, you might need an inverted
search index and persistent storage. Some search indexes
can also support pattern matching.

For example an n-gram index can do it.

burs...@gmail.com

Oct 26, 2017, 3:54:10 PM
As an example, the real GEDCOM data model is much more
complex. It realizes ideas like those from Markus Triska:
for example, there is a structure for the personal name, etc.

http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55model1.gif

For an experiment I did, I boiled it
down to only two relations:

partner(PartnerID, Title, PersonName, Sex, BirthYear, FamilyID).
family(FamilyID, PartnerID, PartnerID).
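
A small sketch of a query over this schema, using three of the partner/6 facts from the earlier message. It assumes, purely for illustration, that people sharing a FamilyID belong to the same family; the exact meaning of that column is not spelled out here. same_family/2 is a hypothetical helper.

```prolog
partner(i149,'Grand Duke','Vladimir Romanov',male,1847,f11).
partner(i150,'Grand Duke','Alexis Romanov',male,1850,f11).
partner(i153,'Grand Duke','George Alexandrovich Romanov',male,1871,f9).

% same_family(?NameA, ?NameB): hypothetical helper.
% ASSUMPTION: two different people with the same FamilyID belong
% to the same family.
same_family(NameA, NameB) :-
    partner(IdA, _, NameA, _, _, Family),
    partner(IdB, _, NameB, _, _, Family),
    IdA \== IdB.

% Example query:
% ?- same_family('Vladimir Romanov', Who).
% Who = 'Alexis Romanov'
```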

Ronald Modesitt

Oct 26, 2017, 4:05:43 PM
WOW! What terrific responses. I can't thank all of you enough for the good advice and suggestions. As a beginner I'll have to study each of your recommendations thoroughly. Over the years I've programmed with C, C++, Python, Java and even some Assembly but Prolog is the most interesting, and the help is fantastic. Thank you all!