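Dennis Allsopp (Author of Genota and Genota Forms) was visiting Johannesburg,
so I dropped in to see him and we had an interesting chat.
Three topics relevant to the Genealogy Software Forum came up: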
1. The need for an event-based database program for genealogists, historians,
biographers etc.
2. Dennis's own programs, Genota and Genota Forms
3. Old utility programs for which there are no modern equivalents.
As a result, I've uploaded a few of these old utilities to the website of the
genealogy software forum, in the hope that they may inspire some enterprising
hackers to reverse engineer them for modern hardware and software. See:
http://groups.yahoo.com/group/gensoft/
There were utilities called Nameview and Namedrop that scanned BBS messages
for things like surnames of interest, and manipulated those messages to
collect them. They worked with Fido Technology Networks, but no one seems to
have written an equivalent that works with mailing lists, newsgroups, or web
forums.
There were utilities that took data from genealogy programs (mainly PAF 2.x)
and printed family group sheets on 3x5 or 4x6 cards.
That is FAR more useful than the stupid trick of software developers who
tried to make a computer screen look like a card index, which had all the
disadvantages and none of the advantages of the cards themselves.
But most of those utilities were written in DOS, and modern printers don't
work with DOS. So the utilities need to be rewritten to use modern programs
and modern hardware.
There was the Tiny Tafel Generator -- which not only developed but matched
Tiny Tafels. Trouble is, it was written in Turbo Pascal, which doesn't work on
fast machines. You need a slow processor for it to run, under 500 MHz, I
think.
There are a few more there -- back in the old days we may have had less than
we do now, but very often we could do more with the less we had.
So have a look at them, and see if you can reverse engineer them to produce
new versions.
And see the blog post at
http://hayesgreene.wordpress.com/2009/11/26/370/
for more on the event-based program we discussed.
--
Steve Hayes from Tshwane, South Africa
Web: http://hayesfam.bravehost.com/stevesig.htm
Blog: http://methodius.blogspot.com
E-mail - see web page, or parse: shayes at dunelm full stop org full stop uk
I'm not sure why one is supposed to need a slow processor for Turbo
Pascal programs.
Do you have the source for this? Following the links got me nowhere.
If I had the source I could have a look at getting this working in a
modern Pascal implementation.
--
Ian
Hotmail is for spammers. Real mail address is igoddard
at nildram co uk
>Steve Hayes wrote:
>> There was the Tiny Tafel Generator -- which not only developed but matched
>> Tiny Tafels. Trouble is, it was written in Turbo Pascal, which doesn't work on
>> fast machines. You need a slow processor for it to run, under 500 Mhz, I
>> think.
>
>I'm not sure why one is supposed to need a slow processor for Turbo
>Pascal programs.
Nor am I. All I know is that it ran on my 450 MHz machine, but it doesn't run
on my current one, and someone told me that the problem was the processor
speed.
>Do you have the source for this? Following the links got me nowhere.
>If I had the source I could have a look at getting this working in a
>modern Pascal implementation.
Unfortunately not. I suppose if one could get hold of the author, one might be
able to persuade him to make it "open source", but I'm not sure how to get
hold of him. His name was Christopher Long.
>On Thu, 26 Nov 2009 10:03:16 +0000, Ian Goddard <godd...@hotmail.co.uk>
>wrote:
>
>>Steve Hayes wrote:
>>> There was the Tiny Tafel Generator -- which not only developed but matched
>>> Tiny Tafels. Trouble is, it was written in Turbo Pascal, which doesn't work on
>>> fast machines. You need a slow processor for it to run, under 500 Mhz, I
>>> think.
>>
>>I'm not sure why one is supposed to need a slow processor for Turbo
>>Pascal programs.
>
>Nor am I. All I know is that it ran on my 450 Mhz machine, but it doesn't run
>on my current one, and someone told me that the problem was the processor
>speed.
I don't think so. Unless it is a game (i.e. something that uses loops to
count the time), I don't see why software would not run on a faster
machine. On a slower machine, it could be a means of blocking any
attempt to reverse engineer the code, but on a faster one?
However, it is possible the problem is not the speed, but the
processor itself. The slower processors are not the same CPUs
as the faster ones. An 8086 is slower than an 80386, which is
slower than an 80486 or a Pentium. An 80386 has more instructions
than the 8086, but it is possible the software is using some
side effect of an undefined instruction in the 8086 (you can
replace the 86 and 386 by other models). I remember the Apple
][ included in its code "apple" and some letters were undefined
instructions. So, it is possible that Turbo Pascal is using those
undefined instructions, which changed as the processor evolved.
On the other hand, it is likely you will get the same kind of
problem if you compare an Intel with an AMI or other brand because
the undefined instructions can react differently.
Denis
--
Denis Beauregard - généalogiste émérite (FQSG)
Les Français d'Amérique du Nord - www.francogene.com/genealogie--quebec/
French in North America before 1722 - www.francogene.com/quebec--genealogy/
Sur cédérom à 1770 - On CD-ROM to 1770
AFAIK, Turbo Pascal, at least in some versions, contains a delay loop to
help control screen painting. This could result in failure on "fast"
processors. I believe the situation varies between versions and also
some copies of some versions may have patches installed for the problem.
So, a particular copy may or may not work on today's processors.
Peter
Whatever the cause, whenever I try to run the Tiny Tafel Editor on a machine
with a fast processor it says:
Runtime error 200 at 2B65:0091.
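For what it's worth, the explanation usually given for runtime error 200 fits the delay-loop account above: the Crt unit in some Turbo/Borland Pascal versions times a counting loop against one 55 ms timer tick at startup and divides the count by 55, and on a fast enough processor the result no longer fits in 16 bits, so the division faults. The sketch below only imitates that arithmetic. It assumes the Tiny Tafel program uses the Crt unit, which nobody in this thread has confirmed, and the function name and loop counts are made up for illustration.

```python
# Imitation of the start-up calibration commonly blamed for
# "Runtime error 200" (assumption: the program links the Crt unit).
TICK_MS = 55          # length of one PC timer tick in milliseconds
MAX_16BIT = 0xFFFF    # largest quotient a 16-bit divide can return


def calibrate_delay(loops_per_tick: int) -> int:
    """Return loops per millisecond, failing the way a 16-bit divide would."""
    quotient = loops_per_tick // TICK_MS
    if quotient > MAX_16BIT:
        # On real hardware the CPU raises a divide fault here, which the
        # Pascal runtime reports as error 200 ("Division by zero").
        raise OverflowError("Runtime error 200")
    return quotient


# Hypothetical loop counts, chosen only to show both outcomes.
slow_machine = calibrate_delay(1_000_000)   # quotient fits in 16 bits
try:
    calibrate_delay(10_000_000)             # quotient too large
    fast_machine_ok = True
except OverflowError:
    fast_machine_ok = False
```

If that is the cause here, it would explain why the failure happens at startup regardless of what the program is asked to do.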
If the app is only available as a compiled binary a patch would only
help if it's to a library.
--
Ian
Hotmail is a spam-bin. Real mail address is igoddard
at nildram co uk
> Steve Hayes wrote:
>> There was the Tiny Tafel Generator -- which not only developed but
>> matched Tiny Tafels. Trouble is, it was written in Turbo Pascal, which
>> doesn't work on fast machines. You need a slow processor for it to run,
>> under 500 Mhz, I think.
>
> I'm not sure why one is supposed to need a slow processor for Turbo
> Pascal programs.
>
> Do you have the source for this? Following the links got me nowhere.
> If I had the source I could have a look at getting this working in a
> modern Pascal implementation.
>
From memory, long time ago: it has something to do with creating temporary
files on the fly. The names of the files were somehow derived from the
computer's clock. With the advent of faster systems it occurred that two
files created quickly one after another ended up with the same name, thus
trying to overwrite each other, or was it by getting an unexpected error
when the second file was created?
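A minimal sketch of the kind of collision described above, assuming a naming scheme that uses the clock reading in whole seconds. The scheme, the `TT` prefix and the function name are invented for illustration; nothing here is known about how the real program named its files.

```python
def temp_name(clock_seconds: int) -> str:
    """Build a temporary file name from a clock reading (hypothetical scheme)."""
    return f"TT{clock_seconds % 1_000_000:06d}.TMP"


first = temp_name(45_296)
second_slow = temp_name(45_297)   # slow machine: next file created a second later
second_fast = temp_name(45_296)   # fast machine: clock has not advanced yet
collision = first == second_fast  # both files would get the same name
```

On the slow machine the two names differ; on the fast one the second file would clash with the first.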
Herman Viaene
--
Veel mensen danken hun goed geweten aan hun slecht geheugen. (G. Bomans)
Lots of people owe their good conscience to their bad memory (G. Bomans)
This concept of "an event-based database program" bothers me on a couple
of grounds.
Firstly the notion that data must be this-based versus that-based versus
the other-based is an exceedingly limited view of databases. If you
think of the typical database-based package on which a business might be
run marketing may see it as customer-based, sales as order-based,
shipping as despatch-based and accounts as invoice-based and those are
only some of the options. They're all correct but only partially so as
they're all perspectives on the overall schema. The same applies to a
database for genealogists etc. (BTW one thing that does not bother me
is the idea that genealogists, historians, biographers etc. might use
the same package; that I heartily applaud.)
Secondly, if you give primacy to anything it must be evidence. The
"events" in the database are only reconstructions; the actual events
(and, for the greater part, the people who took part in them) are long
gone. What we record are our reconstructions of them based on the
evidence that's left. Apart from the current descendants the evidence
is the only thing that's still real.
Does this matter? Let me give you an example.
If you look up on IGI the baptism of Jonathan Goddard on 04 Apr 1779 at
Holmfirth, Yorks. you will find three hits for the event of which two
give his mother as Christiana. If you look at the actual evidence, the
register entry you will find it simply says "Jonathan son of John
Goddard of Holmfirth bapt". There is no mother's name. Where on earth
did Christiana come from?
If you look up the baptism of Christiana Goddard on 27 Mar 1778 you will
find three hits, one of which gives her spouse as John Goddard. Clearly
the "event" as described for Jonathan is a reconstruction which is
linkage-based.
However, if we go back to the evidence for 27 Mar 1778 we find the word
"Christiana" isn't there either, it actually says "Wife of John Goddard
of Holmfirth Ch". If you compare the form of this record with that for
Jonathan you will see that the "Ch", which has been expanded to
"Christiana" occupies the place where we'd expect to find a verb. And
"Ch" does indeed stand for a verb.
If we had what you might term an evidence-led database you'd find that
both records belong to a register of baptisms and churchings (the search
engine is your friend if that's an unfamiliar word). In fact the
statement that Jonathan's mother was Christiana depends entirely on a
misinterpretation of the evidence of a separate event. My own
reconstruction, based on a great many items of evidence, including tithe
maps and trade directories which aren't even event-based, gives the
mother's name as Mary - but YMMV.
The point is that if you only give me your "events" and/or your "links"
any errors in them will be opaque to me. If you give me your evidence
not only would I be able to check your reconstructions, I won't even
need them - I'll be able to work things out for myself.
I also feel the need for a package with that diverse user-base (should
we call it an historian's workbench?) but it must hold the evidence from
which the rest of the data is extracted.
--
Ian
The Hotmail address is my spam-bin. Real mail address is igoddard
at nildram co uk
I see a point of criticism. Once evidence is entered into a database,
it is no longer the evidence, but some representation of it. And then to
make use of that data, it is necessary to interpret it and draw
conclusions relating to people, events and so on. So yes evidence is
important, but it gets manipulated along the way and once you have
passed beyond it, it is not the most interesting part of the database
and may not have been very substantial in the first place.
In that sense I am quite happy with "event-based" as that may be
effectively how the database works.
Peter
>Steve Hayes wrote:
>> Dennis Allsopp (Author of Genota and Genota Forms) was visiting Johannesburg,
>> so I dropped in to see him and we had an interesting chat.
>>
>> Three topics relevant to the Genealogy Software Forum came up:
>>
>> 1. The need for an event-based database program for genealogists, historians,
>> biographers etc.
>%><
>
>This concept of "an event-based database program" bothers me on a couple
>of grounds.
>
>Firstly the notion that data must be this-based versus that-based versus
>the other-based is an exceedingly limited view of databases. If you
>think of the typical database-based package on which a business might be
>run marketing may see it as customer-based, sales as order-based,
>shipping as despatch-based and accounts as invoice-based and those are
>only some of the options. They're all correct but only partially so as
>they're all perspectives on the overall schema. The same applies to a
>database for genealogists etc. (BTW one thing that does not bother me
>is the idea that genealogists, historians, biographers etc. might use
>the same package; that I heartily applaud.)
Well that's OK.
The point is that most software for genealogy is "lineage linked", and for
genealogy that is probably the most useful program.
But for historians, including family historians, another kind of program could
be useful -- one which will allow events to be listed in chronological order,
and which would also list the people associated with each event.
>
>Secondly, if you give primacy to anything it must be evidence. The
>"events" in the database are only reconstructions; the actual events
>(and, for the greater part, the people who took part in them) are long
>gone. What we record are our reconstructions of them based on the
>evidence that's left. Apart from the current descendants the evidence
>is the only thing that's still real.
Evidence is raw data. There are plenty of note-taking programs available for
recording that, including, for genealogists, Dennis Allsopp's Genota and
Genota Forms.
What I feel the lack of is something that will help to turn the raw data into
information, to help one discern patterns in it.
As I said, I think there already are such programs. One that I use is askSam:
...which you're not going to get with a note-taking program. You need
to think of a number of steps.
The first is to collect items of evidence or, if you prefer, raw data.
This is not necessarily text - it could be images.
The next step is analysis - what names are in each original record
(retaining the spelling as found)? What events are in the record (a
record doesn't necessarily have to record a single event, or even any -
think tithe map)? What roles do the names have? Also what place names
are represented? How are these tied to events? All these analytical
results need to be related to each other & to the original material.
The next step is to start to work out what individuals are represented.
Note that in the previous step I said names, not individuals. The
same individual may have his name spelled in different ways and, of
course many individuals will have the same name. These reconstructed
individuals will need to be linked back to the names on which they're
based and through them to the material collected in the first step.
The final step (which will overlap with the previous one) is to
reconstruct the relationships which, of course, are linked back to
everything else including the roles extracted in the second step.
ISTM that present-day genealogical S/W takes the first two steps for
granted and supports the second two. Your suggestion of note-taking S/W
partially supports the first. I get the impression that what you're
looking for is, in fact part of the second step but I don't see how it
could be implemented in isolation from the first or what value it
represents in isolation from the remainder.
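The four steps can be sketched as a chain of record types in which each later step keeps a reference to the earlier one, so that any reconstructed relationship can be traced back to the wording of the evidence. This is only an illustration: the class and field names are assumptions, and the data is the Jonathan Goddard register entry quoted earlier in the thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Evidence:                 # step 1: the item as collected
    source: str
    text: str


@dataclass
class NameOccurrence:           # step 2: a name as spelled, with its role
    spelling: str
    role: str
    evidence: Evidence


@dataclass
class Individual:               # step 3: a reconstructed person
    label: str
    names: List[NameOccurrence] = field(default_factory=list)


@dataclass
class Relationship:             # step 4: a reconstructed relationship
    kind: str
    subject: Individual
    other: Individual


def evidence_behind(rel: Relationship) -> List[str]:
    """Collect the distinct evidence texts that a relationship rests on."""
    texts: List[str] = []
    for person in (rel.subject, rel.other):
        for occurrence in person.names:
            if occurrence.evidence.text not in texts:
                texts.append(occurrence.evidence.text)
    return texts


entry = Evidence("Holmfirth register, 04 Apr 1779",
                 "Jonathan son of John Goddard of Holmfirth bapt")
child = Individual("Jonathan Goddard", [NameOccurrence("Jonathan", "child", entry)])
father = Individual("John Goddard", [NameOccurrence("John Goddard", "father", entry)])
traced = evidence_behind(Relationship("son of", child, father))
```

Here `traced` holds the single register entry, which makes it obvious that no mother's name is supported by it.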
Firstly you don't manipulate the evidence. The post to which I replied
gave an instance of what can happen when evidence is manipulated.
Secondly, evidence remains crucial. I spent half my working life as a
scientist investigating past events. Good practice is to present both
evidence and conclusions keeping the two separate. This allows two
things, one is for the reader to decide whether they agree that the
evidence merits the conclusions and the second is to allow the reader to
compare the evidence with their own findings. Neither of these is
possible if the evidence isn't presented. The present practice is that
people will share GEDCOMs, publish family trees etc but in the absence
of evidence anyone who needs to make use of this has to repeat the
research - which becomes impossible if it's not clear what the source of
the evidence was. A database which includes that evidence or, as you
put it, a representation of it, would be able to export that evidence
for examination.
Let me give you another example. I'm looking for the marriage of Caleb
Howard to his wife Betty. It's not in the registers of either of the
two local parishes. The only thing that IGI can offer is a member
submission of a marriage in Mottram-in-Longdendale parish. It may be
correct but it's not, however, in the regular extracts of the Mottram
marriages. If the member submissions included a facility to include the
evidence on which they're based I could follow it up to verify it. As
things stand, however, it takes me no further forward.
>Steve Hayes wrote:
>> What I feel the lack of is something that will help to turn the raw data into
>> information, to help one discern patterns in it.
>
>...which you're not going to get with a note-taking program. You need
>to think of a number of steps.
That's what I said.
>The first is to collect items of evidence or, if you prefer, raw data.
>This is not necessarily text - it could be images.
>
>The next step is analysis - what names are in each original record
>(retaining the spelling as found)? What events are in the record (a
>record doesn't necessarily have to record a single event, or even any -
>think tithe map)? What roles do the names have? Also what place names
>are represented? How are these tied to events? All these analytical
>results need to be related to each other & to the original material.
>
>The next step is to start to work out what individuals are represented.
> Note that in the previous step I said names, not individuals. The
>same individual may have his name spelled in different ways and, of
>course many individuals will have the same name. These reconstructed
>individuals will need to be linked back to the names on which they're
>based and through them to the material collected in the first step.
>
>The final step (which will overlap with the previous one) is to
>reconstruct the relationships which, of course, are linked back to the
>everything else including the roles extracted in the second step.
>
>ISTM that present-day genealogical S/W takes the first two steps for
>granted and supports the second two. Your suggestion of note-taking S/W
>partially supports the first. I get the impression that what you're
>looking for is, in fact part of the second step but I don't see how it
>could be implemented in isolation from the first or what value it
>represents in isolation from the remainder.
What lineage-linked genealogy software does is show the relationships between
people in families, and some information about individual people in the
database.
What I am looking for is a research tool that will help one to organise
information for family history rather than just genealogy (and would also be
useful for biographers and general historians as well) to include relations
with people other than family members, and relations of people and groups or
organisations to events. One of the main purposes I envisage is helping one to
get the chronology of events in order, together with people and groups
associated with those events.
The *evidence* for the event can be noted in a note-taking program, so that is
not the most urgent need that I see. The program I envisage would make
provision for sources to be noted. Say, for example, the event is a plane
crash. Evidence might be newspaper reports and so on, and, as you say, images
-- a photo of the crash scene, for example. The program I envisage might also
be used as a tool for someone investigating the cause of the plane crash. A
family historian might treat it as a single event in which a family member was
killed or injured. An accident investigator might treat it as a series of
discrete events that led up to the crash, and even the crash itself could be a
series of discrete events -- first impact, tail section breaks off; second
impact, right wing breaks off and so on. All this is, as you say, a
reconstruction, but the reconstruction could be for different purposes and in
different degrees of detail -- in one case to analyse the causes of the crash,
and, in the case of the family historian, to explain the death of a member of
a family, or even several members.
>The next step is analysis - what names are in each original record
>(retaining the spelling as found)? What events are in the record (a
>record doesn't necessarily have to record a single event, or even any -
>think tithe map)? What roles do the names have? Also what place names
>are represented? How are these tied to events? All these analytical
>results need to be related to each other & to the original material.
>
>The next step is to start to work out what individuals are represented.
> Note that in the previous step I said names, not individuals. The
>same individual may have his name spelled in different ways and, of
>course many individuals will have the same name. These reconstructed
>individuals will need to be linked back to the names on which they're
>based and through them to the material collected in the first step.
<SNIP>
>ISTM that present-day genealogical S/W takes the first two steps for
>granted and supports the second two. Your suggestion of note-taking S/W
>partially supports the first. I get the impression that what you're
>looking for is, in fact part of the second step but I don't see how it
>could be implemented in isolation from the first or what value it
>represents in isolation from the remainder.
Someone asked a similar question in another forum, so I've given an example
that I hope will answer both questions:
On 30 Nov 2009 at 10:28, ray_murphy aus wrote:
> As you can see, I'm actually interested in the recording of events but I
> still cannot visualise many situations where event information could be used
> for genealogy or how it can be handled easily by the average user - so
> it looks like we need some examples to motivate us.
Here is such an example -- a report from the database whose structure
(field list) I posted a few messages back:
Events relating to Fred Green Page No 1
Search strategy: 09/12/01 09:56:33
GET KE/EV/EN/NO/PE/EP: fred w3 green
4-Apr-1829 Canada, Quebec, Montreal
BIRTH OF FRED GREEN, MONTREAL
Frederick Thomas, son of William Green, Deputy Assistant
Commissary General and of Margaret Gray his wife was born the
fourth day of April 1829 and baptized on the twenty-ninth
day of may following by me.... the sponsors are William
Goodall and Eliza Glasgoe, natural cousins to the infant, and
Deputy Assistant Commissary General Samuel Tubby
People: 1. Green, Fred [144]
2. Green, William John [140]
3. Gray, Margaret [141]
4. Goodall, William [10032]
5. Glasgow, Eliza
6. Tubby, Samuel
Sources: 1. Church Register, Montreal
Events relating to Fred Green Page No 2
4-Jan-1853 Orange River Sovereignty, Bloemfontein
CHARLES & FRED GREEN ARRIVED IN BLOEMFONTEIN FROM TRIP TO
LAKE NGAMI "Charles and Fred Green, brothers of the Resident,
came down from the interior about the 3rd or 4th of the
month. They had been unfortunate in their trip, had been to
the lake and some 120 miles to the westward of it, and just
as they had got into the midst of the elephants the fly
(tsetse) got among their horses and killed some 34 horses and
50 head of cattle. They only shot six or eight elephants.
They also lost 50 head of cattle to the Boors, who took them
from sechele where they had left them for their return
journey. Sechele, chief of the Baquainas, a tribe who live
some 450 miles from here, also came down with the Greens to
lodge a complaint against the Trans Vaal Boors for having
attacked him without cause, killed many of his people and
taken some two hundred women and nearly a thousand children
into slavery. A young Edwards, son of a missionary of the
same name, came down with him as interpreter. Sechele is one
of the finest blacks I ever saw, has a fine open countenance,
dresses very neatly and clean. Although he cannot speak
English he reads the Bible in his own language and is I
believe a good Christian. We have had great fun lately, the
Greens being very jolly fellows, particularly Fred Green. We
have had reunions of an evening at Craws', screeching and
howling to the masthead, and also some very good songs and
music. Charles Green, finding that Sechele was not likely to
get much done for him in the Colony, determined to take him
home. He accordingly opened a subscription for the purpose,
and very soon collected £ [blank] even in this small town, as
all the reports from beyond the Vaal confirm us in the belief
that slavery is carried on openly by the Boors there."
People: 1. Green, Fred [140]
2. Green, Charles [502]
3. Sechele
4. Edwards, Samuel Howard
Sources: 1. St John Diary, p. 45
2. cf Sillery 1954, "Sechele", p 116f
Events relating to Fred Green Page No 3
12-Feb-1853 Orange River Sovereignty, Bloemfontein
CHARLES & FRED GREEN DINE IN OFFICERS' MESS WITH ST JOHN.
AFTERWARDS MEET HENRY AND ARTHUR GREEN IN THE CLUB. HENRY IS
THE RESIDENT, ARTHUR GREEN IN THE COMMISSARIAT AND FRED IS
DESCRIBED AS SURVEYOR. "Played one game of billiards with
Charles Green, who dined with me at mess. Present, Major
Kyle, Captain Bates, Howard and Rowland, all 45th, Cameron,
Staff Assistant Surgeon, and myself, the members of the mess,
and Charles and Fred Green and Dawson, late 45th, guest. In
the evening Lowen the magistrate and De Smidt of the
Commissariat came up. About 9 Charles Green and I adjourned to
the club, where we met his brothers Henry Green, the
Resident, and Arthur Green in the Commissariat, and also Fred
the surveyor. I played one game with Charles Green and
adjourned to my house" (St John diary, p, 52). This is the
last mention of Charles Green in St John's diary -- perhaps
he accompanied Sechele back home, as St John had described
earlier, or perhaps went on to Cape Town, as described by
Tabler (1973:45).
People: 1. Green, Fred [144]
2. Green, Charles [502]
3. Green, Henry [480]
4. Green, Arthur [936]
5. Bates, Captain Robert
6. Kyle, Major Hallam d'Arcy
Sources: 1. St John Diary, p. 52
2-Mar-1853 Orange River Sovereignty, Bloemfontein
FRED GREEN GOES ON HUNTING EXPEDITION FROM BLOEMFONTEIN
WITH W. ST JOHN, JOHANNES DE SMIDT & WILLIAM DAWSON.
Fred Green set out with William St John, an officer of
the Royal Artillery, from Bloemfontein to A.H. Bain's
farm at Tempe, where they had dinner, and on to
Kwaggafontein, from where they set out on a fortnight's
hunting expedition.
People: 1. Green, Fred
2. St John, William Jones
3. Dawson, William
4. de Smidt, Johannes
Sources: 1. St John, Diary, p. 54
Events relating to Fred Green Page No 4
17-Mar-1853 Orange River Sovereignty, Bloemfontein
FRED GREEN & CO RETURN TO BLOEMFONTEIN FROM HUNTING
EXPEDITION
People: 1. Green, Fred
2. St John, William Jones
3. Dawson, William
4. de Smidt, Johannes
Sources: 1. St John, Diary, p. 63
22-Mar-1853 Orange River Sovereignty, Bloemfontein
FRED GREEN, ST JOHN, DICK ORPEN HAVE DINNER WITH A.H.
BAIN
People: 1. Green, Fred
2. St John, William Jones
3. Bain, Andrew Hudson
4. Orpen, Richard John Newenham
Sources: 1. St John Diary, p. 64
31-May-1853 Orange River Sovereignty, Bloemfontein
FRED GREEN DINED WITH WILLIAM ST JOHN IN BLOEMFONTEIN
MESS
Last mention of Fred Green in St John's diary. On 12 July St
John went on a trip to Harrismith, and returned on 13 August,
when the diary ends. Fred Green may have left Bloemfontein by
then.
People: 1. Green, Fred
Sources: 1. St John Diary, p. 79
21-Jun-1857 Damaraland
F. GREEN ATTENDED DUTCH SERVICE
Green's Hereros attended morning service conducted by
Hahn, and in the afternoon came with young Bonfield and
two wagon drivers to the Dutch service.
People: 1. Green, Fred [144]
Sources: 1. Hahn's diary
Now that is a flat file database, not relational, so the names of the
people associated with each event were typed separately into each record.
That is the kind of duplication that a relational database can avoid.
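As a rough sketch of how a relational layout avoids that duplication, each person can be stored once and tied to events through a join table, after which a chronological list for one person is a single query. The table and column names below are assumptions made for illustration, the ID numbers are the bracketed ones from the report, and the event summaries are abbreviated.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE event  (id INTEGER PRIMARY KEY, date TEXT NOT NULL,
                     place TEXT, summary TEXT);
-- One row per (event, person) pair, so a name is typed only once.
CREATE TABLE event_person (
    event_id  INTEGER REFERENCES event(id),
    person_id INTEGER REFERENCES person(id),
    PRIMARY KEY (event_id, person_id));
""")
con.executemany("INSERT INTO person VALUES (?, ?)",
                [(144, "Green, Fred"), (502, "Green, Charles")])
con.executemany("INSERT INTO event VALUES (?, ?, ?, ?)", [
    (1, "1853-02-12", "Bloemfontein", "Dine in officers' mess with St John"),
    (2, "1853-01-04", "Bloemfontein", "Arrive from trip to Lake Ngami"),
])
con.executemany("INSERT INTO event_person VALUES (?, ?)",
                [(1, 144), (1, 502), (2, 144), (2, 502)])


def events_for(person_id: int):
    """Return (date, summary) rows for one person, oldest first."""
    return con.execute(
        """SELECT e.date, e.summary
             FROM event e JOIN event_person ep ON ep.event_id = e.id
            WHERE ep.person_id = ?
            ORDER BY e.date""", (person_id,)).fetchall()


fred_events = events_for(144)
```

Dates are stored in ISO form (year-month-day) here so that plain text ordering gives chronological order.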
The list of events was selected by the search argument "fred w3 green",
which means any record that had "fred" within three words of "green".
I could also select all records referring to Bloemfontein, whether Fred
Green was involved or not.
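For anyone unfamiliar with proximity searching, the idea behind "fred w3 green" can be sketched in a few lines. This is not askSam's implementation; it assumes the match ignores case and punctuation and accepts the two words in either order, which may or may not be how askSam treats them.

```python
import re


def within(text: str, first: str, second: str, distance: int) -> bool:
    """True if `first` occurs within `distance` words of `second` in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_at = [i for i, w in enumerate(words) if w == first]
    second_at = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= distance for i in first_at for j in second_at)


hit_title = within("BIRTH OF FRED GREEN, MONTREAL", "fred", "green", 3)
hit_people = within("People: 1. Green, Fred [144]", "fred", "green", 3)
miss = within("Fred the surveyor dined with Major Kyle and later met Henry Green",
              "fred", "green", 3)
```

The first two strings come from the report above and both match; the third is an invented sentence in which the two words are too far apart.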
I rather think we're mis-communicating the same ideas to each other.
Let's take the last item of your example:
21-Jun-1857 Damaraland
F. GREEN ATTENDED DUTCH SERVICE
Green's Hereros attended morning service conducted by
Hahn, and in the afternoon came with young Bonfield and
two wagon drivers to the Dutch service.
People: 1. Green, Fred [144]
Sources: 1. Hahn's diary
I don't know whether the original is a machine readable text (e.g. a PDF
from Google books) or a physical text which could be scanned to make it
machine readable. In either case, however let's assume it's rendered
machine readable. In a relational database this could be stored in a
BLOB (Binary Large Object - a general data type which can hold text,
image, PDF, anything which can be held in a file, another database
even!). The BLOB would contain just this single entry. It's also
possible that we could have multiple BLOBs to hold images, transcripts,
maybe alternative transcripts if the hand-writing's unclear,
translations etc.
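A small sketch of what that might look like in practice, using SQLite because it ships with Python. The table layout, the `kind` labels and the item label are assumptions for illustration, and the image bytes are a placeholder rather than a real scan.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE rendering (
    id      INTEGER PRIMARY KEY,
    item    TEXT NOT NULL,   -- which single entry this is a rendering of
    kind    TEXT NOT NULL,   -- e.g. 'image', 'transcript', 'translation'
    content BLOB NOT NULL)""")

item = "Hahn's diary, 21-Jun-1857"
transcript = ("Green's Hereros attended morning service conducted by "
              "Hahn, and in the afternoon came with young Bonfield and "
              "two wagon drivers to the Dutch service.")

# Several renderings of the same entry can sit side by side.
con.execute("INSERT INTO rendering (item, kind, content) VALUES (?, ?, ?)",
            (item, "transcript", transcript.encode("utf-8")))
con.execute("INSERT INTO rendering (item, kind, content) VALUES (?, ?, ?)",
            (item, "image", b"placeholder bytes, not a real scan"))

kinds = [row[0] for row in con.execute(
    "SELECT kind FROM rendering WHERE item = ? ORDER BY kind", (item,))]
stored = con.execute(
    "SELECT content FROM rendering WHERE item = ? AND kind = 'transcript'",
    (item,)).fetchone()[0].decode("utf-8")
```

The transcript comes back byte for byte as it went in, which is the property that matters if names are to be extracted as spelled.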
We need to provide a source hierarchy to put it into context. Why do I
say a hierarchy? Well, where does Hahn's diary, in this context, come
from? If it's a physical diary in an archive then at the top of the
hierarchy you'd have the archive. Within the archive it might belong to
a collection of papers, the second level, and at the lowest level would
be the diary itself. The database row containing the BLOB would also
contain a pointer back to the diary. You could make a similarly
appropriate hierarchy for a published book or any other source.
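The hierarchy can be sketched as a table whose rows point at their parent, so that the context of any source is found by walking upwards. The table and column names are assumptions, and the archive and collection entries are placeholders, since the actual provenance of Hahn's diary isn't given here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE source (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    parent_id INTEGER REFERENCES source(id))  -- NULL at the top level
""")
con.executemany("INSERT INTO source VALUES (?, ?, ?)", [
    (1, "Archive (placeholder)", None),
    (2, "Collection of papers (placeholder)", 1),
    (3, "Hahn's diary", 2),
])


def source_path(source_id):
    """Walk from a source up to the top of its hierarchy; return top first."""
    path = []
    while source_id is not None:
        title, source_id = con.execute(
            "SELECT title, parent_id FROM source WHERE id = ?",
            (source_id,)).fetchone()
        path.append(title)
    return list(reversed(path))


diary_context = source_path(3)
```

A row holding a BLOB would then carry the `id` of the lowest-level source, and a published book would simply have a different chain above it.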
You have, in fact, a number of names in this text: F. Green, Hahn and
young Bonfield. (You also have a place name, Damaraland, and an
organisation name, Green's Hereros. Note that to me it looks as if
there may be a mis-spelling there and that Heros was meant but I'm not
in a position to make that call. However my rule about this would be
that names are extracted as spelled in the original.)
Extracting these names is a fairly straightforward operation. We then
have to decide that the F. Green in this item is the same as the Fred
Green in another. This might be straightforward in this context but
might not be in another. If we hark back to my original example some
time ago I have the problem of whether the John Goddard, father of
Jonathan, is the same John Goddard who was the father of another 8 sons
and daughters of John Goddard, whether he was the John Goddard who
married Mary Collier and whether he was the son of Jonathan Goddard or
of William Goddard, both sons born in 1753. Another less than clear
context is variations of spelling; I've got an ancestor whose first name
is rendered in various sources ranging from Amon to Hammond and the only
signature I can find, on his marriage, differs from all those by which
he's indexed; just to add to the joy his surname is one which appears
under many variations.
To deal with this problem I'm suggesting an entirely different category
of record. It represents, as best as we can, the real individuals we
believe existed. It will provide a canonical spelling (e.g. Fred Green)
and some sort of unique handle (e.g. in my case I might have John
Goddard of Booth House where "of Booth House" distinguishes him from all
the other John Goddards). Needless to say places and organisations
would be similarly dealt with. At a detailed level, however, each of
these categories has its own complications: personal names have
different structures in different cultures whilst places belong to
various hierarchies such as civil and ecclesiastical, hierarchies which
may have varied over time.
We now link the names to these individual records as they fit in. Your
F. Green name record, linked on the one hand to the Hahn diary extract,
would also be linked to the canonical Fred Green record. As you will
realise it's then perfectly straightforward, with an RDBMS to grab all
the BLOBs referring to a specific Fred Green. Given that we have
similar chains dealing with place names and organisations we can also
grab all the BLOBs referring to Green's Heros or the list of places,
such as Damaraland, which Fred was known to have visited. This last
doesn't require that we keep the evidence on-line but the previous two do.
In the previous paragraph I mentioned links. These would be a separate
category of record with a pointer to the name on the one hand and an
individual on the other. They could also contain an assessment of the
strength of the link. For instance the F. Green in a particular record
might be Fred Green but there's some doubt so it could have some marking
to represent "probably". We could even have negative links to represent
the fact that this F. Green is not Ferdinand Green.
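As a rough illustration of the link records just described, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (individual, name_record, name_link, strength) and the set of strength values are my own assumptions for the example, not part of any existing program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Canonical individuals: the real people we believe existed.
CREATE TABLE individual (
    id INTEGER PRIMARY KEY,
    canonical_name TEXT NOT NULL,
    handle TEXT                      -- e.g. 'of Booth House'
);
-- Names exactly as spelled in a piece of evidence.
CREATE TABLE name_record (
    id INTEGER PRIMARY KEY,
    as_written TEXT NOT NULL,
    found_in TEXT
);
-- Link records: strength is an assumed scale; 'not' is a negative link.
CREATE TABLE name_link (
    name_id INTEGER NOT NULL REFERENCES name_record(id),
    individual_id INTEGER NOT NULL REFERENCES individual(id),
    strength TEXT NOT NULL
        CHECK (strength IN ('certain', 'probably', 'possibly', 'not')),
    PRIMARY KEY (name_id, individual_id)
);
INSERT INTO individual (id, canonical_name) VALUES (1, 'Fred Green');
INSERT INTO individual (id, canonical_name) VALUES (2, 'Ferdinand Green');
INSERT INTO name_record VALUES (1, 'F. Green', 'Hahn diary extract');
INSERT INTO name_link VALUES (1, 1, 'probably');
INSERT INTO name_link VALUES (1, 2, 'not');
""")

def names_for(individual_id):
    """Return (name as written, strength) pairs, skipping negative links."""
    return conn.execute(
        """SELECT n.as_written, l.strength
           FROM name_link l JOIN name_record n ON n.id = l.name_id
           WHERE l.individual_id = ? AND l.strength <> 'not'
           ORDER BY n.id""",
        (individual_id,),
    ).fetchall()

fred = names_for(1)
ferdinand = names_for(2)
```

Here the one F. Green name record is "probably" Fred Green and explicitly not Ferdinand Green, so it is returned for the first and excluded for the second.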
Going back to the example there is another category of information to be
extracted, namely events, which you don't list alongside names and
sources. I think you're treating the whole diary entry as an event but
it should be obvious that the same event can be the subject of many
evidential records; just think how many such records deal with the
Battle of the Somme. So we'd then have another set of event names,
canonical events and links. This would enable you to grab all the BLOBs
referring to, say, the Dutch service of 21-Jun-1857.
Does this look as if these ideas share the same aims as yours?
Yes, I think we are talking about two rather different requirements or
specifications of a program to do different things.
I'm not looking for BLOBs, or deep source analysis, which seems to be your
chief concern.
In that particular case the source was a diary that has been published. The
original is German, and I'm not sufficiently au fait with 19th-century German
handwriting to want a copy of the original MS, though no doubt one could get
one made in the Windhoek Archives where I believe the original is kept.
>We need to provide a source hierarchy to put it into context. Why do I
>say a hierarchy? Well, where does Hahn's diary, in this context, come
>from? If it's a physical diary in an archive then at the top of the
>hierarchy you'd have the archive. Within the archive it might belong to
>a collection of papers, the second level, and at the lowest level would
>be the diary itself. The database row containing the BLOB would also
>contain a pointer back to the diary. You could make a similarly
>appropriate hierarchy for a published book or any other source.
For my purpose, the published version is sufficient. In most history books,
for example, printed sources are just listed with bibliographical references
-- date, author, title, place, publisher. The repository or library is not
normally mentioned, except perhaps in the case of very rare out-of-print
books.
>You have, in fact, a number of names in this text: F. Green, Hahn and
>young Bonfield. (You also have a place name, Damaraland, and an
>organisation name, Green's Hereros. Note that to me it looks as if
>there may be a mis-spelling there and that Heros was meant but I'm not
>in a position to make that call. However my rule about this would be
>that names are extracted as spelled in the original.)
>
>Extracting these names is a fairly straightforward operation. We then
>have to decide that the F. Green in this item is the same as the Fred
>Green in another. This might be straightforward in this context but
>might not be in another. If we hark back to my original example some
>time ago I have the problem of whether the John Goddard, father of
>Jonathan, is the same John Goddard who was the father of another 8 sons
>and daughters of John Goddard, whether he was the John Goddard who
>married Mary Collier and whether he was the son of Jonathan Goddard or
>of William Goddard, both sons born in 1753. Another less than clear
>context is variations of spelling; I've got an ancestor whose first name
>is rendered in various sources ranging from Amon to Hammond and the only
>signature I can find, on his marriage, differs from all those by which
>he's indexed; just to add to the joy his surname is one which appears
>under many variations.
For the kind of purpose I have in mind, a person record could have notes
recording variations, and in the case of a link of that person to a particular
event, there should also be provision for such problems to be noted, with
perhaps a certainty field, so that one could isolate uncertain identifications
for further research.
>To deal with this problem I'm suggesting an entirely different category
>of record. It represents, as best as we can, the real individuals we
>believe existed. It will provide a canonical spelling (e.g. Fred Green)
>and some sort of unique handle (e.g. in my case I might have John
>Goddard of Booth House where "of Booth House" distinguishes him from all
>the other John Goddards). Needless to say places and organisations
>would be similarly dealt with. At a detailed level, however, each of
>these categories has its own complications: personal names have
>different structures in different cultures whilst places belong to
>various hierarchies such as civil and ecclesiastical, hierarchies which
>may have varied over time.
Yes, that is what I have in mind. One could also have one or two fields for
User Ids, one of which could perhaps be the RIN of the person in a
lineage-linked program (and the person records could perhaps also be populated
by a GEDCOM import).
>We now link the names to these individual records as they fit in. Your
>F. Green name record, linked on the one hand to the Hahn diary extract,
>would also be linked to the canonical Fred Green record. As you will
>realise it's then perfectly straightforward, with an RDBMS to grab all
>the BLOBs referring to a specific Fred Green. Given that we have
>similar chains dealing with place names and organisations we can also
>grab all the BLOBs referring to Green's Heros or the list of places,
>such as Damaraland, which Fred was known to have visited. This last
>doesn't require that we keep the evidence on-line but the previous two do.
>
>In the previous paragraph I mentioned links. These would be a separate
>category of record with a pointer to the name on the one hand and an
>individual on the other. They could also contain an assessment of the
>strength of the link. For instance the F. Green in a particular record
>might be Fred Green but there's some doubt so it could have some marking
>to represent "probably". We could even have negative links to represent
>the fact that this F. Green is not Ferdinand Green.
I would see the links as being a link between person and event, in order that
one can have a many-to-many relationship between persons and events. The
person-event link field would contain the notes about the certainty of the
identification, and also something about the person's role in the event.
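A minimal sketch of such a person-event link, again with Python's sqlite3. The column names and the 0-3 certainty scale are assumptions for illustration, and the roles and certainty values in the sample rows are made up rather than taken from the diary.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    event_date TEXT
);
-- One row per (person, event) pair gives the many-to-many relationship.
CREATE TABLE person_event (
    person_id INTEGER NOT NULL REFERENCES person(id),
    event_id INTEGER NOT NULL REFERENCES event(id),
    role TEXT,
    certainty INTEGER NOT NULL DEFAULT 3
        CHECK (certainty BETWEEN 0 AND 3),   -- assumed scale, 3 = certain
    note TEXT,
    PRIMARY KEY (person_id, event_id)
);
""")
db.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "Fred Green"), (2, "Hahn"), (3, "young Bonfield")],
)
db.execute(
    "INSERT INTO event (id, description, event_date) "
    "VALUES (1, 'Dutch service', '1857-06-21')"
)
# Illustrative rows only: roles and certainty values are invented.
db.executemany(
    "INSERT INTO person_event VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "attended", 2, "appears only as F. Green"),
        (2, 1, "diarist", 3, None),
        (3, 1, "attended", 1, "no first name given"),
    ],
)

def people_at(event_id):
    """Everyone linked to an event, with their role."""
    return db.execute(
        """SELECT p.name, pe.role
           FROM person_event pe JOIN person p ON p.id = pe.person_id
           WHERE pe.event_id = ? ORDER BY p.id""",
        (event_id,),
    ).fetchall()

def uncertain():
    """Identifications below full certainty, weakest first."""
    return db.execute(
        """SELECT p.name, e.description, pe.certainty
           FROM person_event pe
           JOIN person p ON p.id = pe.person_id
           JOIN event e ON e.id = pe.event_id
           WHERE pe.certainty < 3 ORDER BY pe.certainty"""
    ).fetchall()

attendees = people_at(1)
to_check = uncertain()
```

Because certainty is a column of its own rather than something buried in the note, the uncertain identifications can be pulled out for further research with an ordinary query.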
>Going back to the example there is another category of information to be
>extracted, namely event which you don't list alongside names and
>sources. I think you're treating the whole diary entry as an event but
>it should be obvious that the same event can be the subject of many
>evidential records; just think how many such records deal with the
>Battle of the Somme. So we'd then have another set of event names,
>canonical events and links. This would enable you to grab all the BLOBs
>referring to, say the Dutch service of 21-Jun-1857.
Of course.
In the flat file example I gave, each source reference would have to be
entered separately in each record, just as each person did. But in a relational
database one could have many events referring to one source, and many sources
referring to one event.
>Does this look as if these ideas share the same aims as yours?
To a certain extent, except that I think you are more concerned with source
analysis, and I am more concerned with events and relationships. That doesn't
necessarily mean that the same program couldn't be designed to handle both,
but that might also make it quite complex. It seems that for you the BLOBs are
central, whereas I see the events as central, and BLOBs as a peripheral "nice
to have" thing.
One thing I have in mind is the user who wishes to share information
with other users, rather as GEDCOM is used (but better!). This is fine
if you're dealing with a published book. If it's an original MS that's
not going to help, you need to say where you got it. If it's a very old
text existing only in hand-written copies you need to specify what copy.
Also, if it's an original even if you give the repository detail that's
no use to those who can't get there. Being able to export an image or a
transcript would be the only thing of use.
Bear in mind also that anything other than a program written for your
own exclusive use will need to meet other people's purposes and not just
your own. My own approach is to think what's the general requirement of
which the present one is just a corner.
>> Extracting these names is a fairly straightforward operation. We then
>> have to decide that the F. Green in this item is the same as the Fred
>> Green in another.... Another less than clear
>> context is variations of spelling; I've got an ancestor whose first name
>> is rendered in various sources ranging from Amon to Hammond and the only
>> signature I can find, on his marriage, differs from all those by which
>> he's indexed; just to add to the joy his surname is one which appears
>> under many variations.
>
> For the kind of purpose I have in mind, a person record could have notes
> recording variations, and in the case of a link of that person to a particular
> event, there should also be provision for such problems to be noted, with
> perhaps a certainty field, so that one could isolate uncertain identifications
> for further research.
But if you're thinking in terms of an RDBMS implementation information
stored as notes is going to be difficult to query unless you used
something like Informix with the text-search datablade (if the last bit
is meaningless just make that difficult to query full stop).
>
>> To deal with this problem I'm suggesting an entirely different category
>> of record. It represents, as best as we can, the real individuals we
>> believe existed. It will provide a canonical spelling (e.g. Fred Green)
>> and some sort of unique handle (e.g. in my case I might have John
>> Goddard of Booth House where "of Booth House" distinguishes him from all
>> the other John Goddards). Needless to say places and organisations
>> would be similarly dealt with. At a detailed level, however, each of
>> these categories has its own complications: personal names have
>> different structures in different cultures whilst places belong to
>> various hierarchies such as civil and ecclesiastical, hierarchies which
>> may have varied over time.
>
> Yes, that is what I have in mind. One could also have one or two fields for
> User Ids, one of which could perhaps be the RIN of the person in a
> lineage-linked program (and the person records could perhaps also be populated
> by a GEDCOM import).
>
>> We now link the names to these individual records as they fit in.
<snip>
>>
>> In the previous paragraph I mentioned links. These would be a separate
>> category of record with a pointer to the name on the one hand and an
>> individual on the other. They could also contain an assessment of the
>> strength of the link. For instance the F. Green in a particular record
>> might be Fred Green but there's some doubt so it could have some marking
>> to represent "probably". We could even have negative links to represent
>> the fact that this F. Green is not Ferdinand Green.
>
> I would see the links as being a link between person and event, in order that
> one can have a many-to-many relationship between persons and events. The
> person-event link field would contain the notes about the certainty of the
> identification, and also something about the person's role in the event.
What I've suggested does just that. In fact the link table is the
classic way of implementing many-to-many links. That's why I suggested it!
>> Going back to the example there is another category of information to be
>> extracted, namely event which you don't list alongside names and
>> sources. I think you're treating the whole diary entry as an event but
>> it should be obvious that the same event can be the subject of many
>> evidential records; just think how many such records deal with the
>> Battle of the Somme. So we'd then have another set of event names,
>> canonical events and links. This would enable you to grab all the BLOBs
>> referring to, say the Dutch service of 21-Jun-1857.
>
> Of course.
>
> In the flat file example I gave, each source reference would have to be
> entered separately in each record, just as each person did. But in a relational
> database one could have many events referring to one source, and many sources
> referring to one event.
Back to the link table, then, because that's what you need to provide
many-to-many relationships.
>> Does this look as if these ideas share the same aims as yours?
>
> To a certain extent, except that I think you are more concerned with source
> analysis, and I am more concerned with events and relationships. That doesn't
> necessarily mean that the same program couldn't be designed to handle both,
> but that might also make it quite complex. It seems that for you the BLOBs are
> central, whereas I see the events as central, and BLOBs as a peripheral "nice
> to have" thing.
>
>
As I said above such a program would have to meet many different related
sets of requirements (or use cases to be trendy but inaccurate).
Also, as I said above, I think the facility to share by exporting and
importing data would be a major element of most people's requirements.
What happens if you don't have the original evidence to include?
Let me give another example. This is an example of failure to provide
original material taken from good, old-fashioned print publishing rather
than electronic:
Sir John Godard is of interest in the local history of C14th Yorkshire
and, through his daughters, of genealogical interest. He seems to have
lived his early life as a soldier before marrying a rich widow and
taking part in public life. He then became a knight of the shire in
Parliament, escheator, High Sheriff and justice. This phase is
documented in the various charges given to him by entries in court
rolls. These stop abruptly in 1392 when he is believed to have died.
We have IPMs and Wills relating to his widow & some of his children but
nothing on him. Evidence of his death and especially his Will would be
of interest except nobody seems to have found this. Except someone did.
A C19th reverend gentleman took an interest in people buried in
Dominican friaries as evidenced by their Wills and his findings were
published in the Antiquarian. These include an entry on Sir John. In a
Will proved early in 1392/3 Sir John expressed a wish to be buried in
the friary in Beverley. That's it. No transcript of the Will nor an
indication of where it was found. A proper source citation would have
been useful to those for whom whatever repository holds it is
accessible. But a transcript would have made it accessible to everyone.
What the reverend published were the sole facts of interest to
himself. As far as the rest of us are concerned most of what he found
in the Will remains as obscure as if he had never found it.
>Steve Hayes wrote:
>> For my purpose, the published version is sufficient. In most history books,
>> for example, printed sources are just listed with bibliographical references
>> -- date, author, title, place, publisher. The repository or library is not
>> normally mentioned, except perhaps in the case of very rare out-of-print
>> books.
>
>One thing I have in mind is the user who wishes to share information
>with other users, rather as GEDCOM is used (but better!). This is fine
>if you're dealing with a published book. If it's an original MS that's
>not going to help, you need to say where you got it. If it's a very old
>text existing only in hand-written copies you need to specify what copy.
>Also, if it's an original even if you give the repository detail that's
>no use to those who can't get there. Being able to export an image or a
>transcript would be the only thing of use.
>
>Bear in mind also that anything other than a program written for your
>own exclusive use will need to meet other people's purposes and not just
>your own. My own approach is to think what's the general requirement of
>which the present one is just a corner.
Indeed, and that's what I'm thinking of too.
Think, for example, of a biographer. Some sources may be original letters
written by, to, or about the subject of the biography. The writer may want to
store complete scans of the letters written by the subject, transcripts of
those written to the subject, and extracts from those written about the
subject. Some of these things can be stored as BLOBs, but the BLOB is not the
main and central item, and could be absent altogether (they consume a lot of
disk space).
As I see it, you are thinking primarily of a database to manage source
material, and I am thinking primarily of a database to interpret and order
material gathered from a variety of sources. So you start with the BLOB as the
main thing with everything else built around it. I start with the event, with
the BLOB (if present) as a twig:
Event --> Event-Source link --> Source --> Repository
There could be a BLOB attached to the event-source link (reproduction of a
baptism certificate or register page), or it could be attached to the source
itself, if MS, but if a printed book it would simply give bibliographical
details.
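To make that chain concrete, here is a small sketch in Python with sqlite3. All table and column names are assumptions, as are the sample rows ("Example Archive" is a placeholder, not a real repository). The image columns are the optional BLOBs, and repository_id is left NULL for a printed book that only needs bibliographical details.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE source (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    citation TEXT,                                   -- bibliographical details
    repository_id INTEGER REFERENCES repository(id), -- NULL for a printed book
    image BLOB                                       -- optional scan of an MS
);
CREATE TABLE event (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
-- Many events per source and many sources per event.
CREATE TABLE event_source (
    event_id INTEGER NOT NULL REFERENCES event(id),
    source_id INTEGER NOT NULL REFERENCES source(id),
    page_ref TEXT,
    image BLOB,                                      -- optional, e.g. a register page
    PRIMARY KEY (event_id, source_id)
);
INSERT INTO repository VALUES (1, 'Example Archive');
INSERT INTO source (id, title, repository_id) VALUES (1, 'Manuscript diary', 1);
INSERT INTO source (id, title, citation)
    VALUES (2, 'Published diary', 'author, title, place, publisher, date');
INSERT INTO event VALUES (1, 'Church service');
INSERT INTO event VALUES (2, 'Journey');
INSERT INTO event_source (event_id, source_id) VALUES (1, 1);
INSERT INTO event_source (event_id, source_id) VALUES (1, 2);
INSERT INTO event_source (event_id, source_id) VALUES (2, 1);
""")

def chain(event_id):
    """Follow Event -> Event-Source link -> Source -> Repository."""
    return con.execute(
        """SELECT s.title, r.name
           FROM event_source es
           JOIN source s ON s.id = es.source_id
           LEFT JOIN repository r ON r.id = s.repository_id
           WHERE es.event_id = ? ORDER BY s.id""",
        (event_id,),
    ).fetchall()

def events_in(source_id):
    """The reverse direction: every event a given source refers to."""
    return con.execute(
        """SELECT e.description
           FROM event_source es JOIN event e ON e.id = es.event_id
           WHERE es.source_id = ? ORDER BY e.id""",
        (source_id,),
    ).fetchall()

sources_for_service = chain(1)
events_in_manuscript = events_in(1)
```

The LEFT JOIN is what lets a printed source appear in the chain without a repository behind it.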
Incidentally, I've now managed to download a recent version of TMG, which
seems to work on my computer (earlier versions caused it to crash).
Some people said that it could do the kind of thing I was looking for, but it
seems to be only able to link two people to an event.
> Some people said that it could do the kind of thing I was looking
> for, but it seems to be only able to link two people to an event.
It should be basically unlimited. For most events, there are
'Principals' (limited to two) and 'Witnesses' - TMG-speak for "A person
associated in some way with an event -- not necessarily an
eyewitness" (which are unlimited). Some tags take only 'Witnesses'; the
History tag comes to mind. For reporting purposes, you can extensively
customize the sentence associated with a person/event.
--
Joe Makowiec
http://makowiec.org/
Email: http://makowiec.org/contact/?Joe
Usenet Improvement Project: http://twovoyagers.com/improve-usenet.org/
No problem, disk space is cheap. There are plenty of 1TB disks being
offered on eBay for under £100, ditto 500GB disks for laptops. And if
you're going to store it it will still occupy the space whether it's in
a file or a database BLOB.
> As I see it, you are thinking primarily of a database to manage source
> material, and I am thinking primarily of a database to interpret and order
> material gathered from a variety of sources.
No. Go back to what I said originally. If the database is properly
designed it doesn't have to be specifically event-based or linkage-based
and it doesn't have to be source-material-based either. These aspects
would simply be what RDBMS terminology calls projections of the
underlying data.
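In SQL terms one plausible reading of this is a set of views over the same tables. The sketch below (Python with sqlite3) assumes hypothetical table and view names and made-up sample rows; it defines an event-centred view and a source-centred view without duplicating any data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE event (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE source (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE person_event (person_id INTEGER, event_id INTEGER);
CREATE TABLE event_source (event_id INTEGER, source_id INTEGER);

-- Event-centred presentation: who took part in what.
CREATE VIEW by_event AS
    SELECT e.description AS event_desc, p.name AS person_name
    FROM event e
    JOIN person_event pe ON pe.event_id = e.id
    JOIN person p ON p.id = pe.person_id;

-- Source-centred presentation: what each source records.
CREATE VIEW by_source AS
    SELECT s.title AS source_title, e.description AS event_desc
    FROM source s
    JOIN event_source es ON es.source_id = s.id
    JOIN event e ON e.id = es.event_id;

-- Made-up sample rows.
INSERT INTO person VALUES (1, 'Fred Green');
INSERT INTO person VALUES (2, 'Hahn');
INSERT INTO event VALUES (1, 'Church service');
INSERT INTO source VALUES (1, 'Published diary');
INSERT INTO person_event VALUES (1, 1);
INSERT INTO person_event VALUES (2, 1);
INSERT INTO event_source VALUES (1, 1);
""")

event_rows = conn.execute(
    "SELECT event_desc, person_name FROM by_event ORDER BY person_name"
).fetchall()
source_rows = conn.execute(
    "SELECT source_title, event_desc FROM by_source"
).fetchall()
```

Both views read from the same rows, so an event-based user and a source-based user would be looking at one set of data from two directions.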
> So you start with the BLOB as the
> main thing with everything else built around it. I start with the event, with
> the BLOB (if present) as a twig
>
>
> Event --> Event-Source link --> Source --> Repository
>
> There could be a BLOB attached to the event-source link (reproduction of a
> baptism certificate or register page), or it could be attached to the source
> itself, if MS, but if a printed book it would simply give bibliographical
> details.
OK, you're simply doing what I'd do to look up data - following the
chain of links. But if the database embodies any referential integrity
measures you'd enter stuff in the other direction. You wouldn't be able
to enter a Source for which there was no Repository. So you'd enter the
Repository first and link all the Sources for that Repository as you
came by them. If your new Source mentions Events already in the
database you create new links for them. And if you want to wear the
hair-shirt and rummage for the original every time you want to read it
then just leave the Source's BLOB empty.
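A short sketch of that entry order, using Python's sqlite3 with foreign-key enforcement switched on (SQLite leaves it off by default). The table names, the add_source helper and the sample values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE repository (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE source (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    repository_id INTEGER NOT NULL REFERENCES repository(id),
    image BLOB                             -- may stay empty
);
""")

def add_source(title, repository_id, image=None):
    """Try to add a source; return False if the database rejects it."""
    try:
        with conn:                         # commits, or rolls back on error
            conn.execute(
                "INSERT INTO source (title, repository_id, image) "
                "VALUES (?, ?, ?)",
                (title, repository_id, image),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Entering the source first fails: repository 1 does not exist yet.
rejected = add_source("Manuscript diary", 1)

# Enter the repository first, then the source goes in, BLOB left empty.
with conn:
    conn.execute("INSERT INTO repository (id, name) VALUES (1, 'Example Archive')")
accepted = add_source("Manuscript diary", 1)

stored = conn.execute("SELECT title, image FROM source").fetchall()
```

The NOT NULL foreign key is what forces the Repository to exist before any Source that points to it, while the image column is free to remain NULL.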
>
> Incidentally, I've now managed to download a recent version of TMG, which
> seems to work on my computer (earlier versions caused it to crash).
>
> Some people said that it could do the kind of thing I was looking for, but it
> seems to be only able to link two people to an event.
I rather think you make my point. TMG doesn't do what you want. That
doesn't stop thousands of other people using it. But if it /also/ did
what you wanted they'd still be able to use it as before, others like
yourself would also be able to use it and some of the existing users
would find they could use it in a new way.