Genealogy software online groups


Ken Finnigan

Sep 30, 2024, 3:50:23 PM
to root...@googlegroups.com
Hi All,

I've been following this group for a while and I was curious if there are other online groups where software and genealogy are discussed? Or is this group the only one?

Thanks
Ken Finnigan

Marshall Lake

Oct 1, 2024, 1:48:15 PM
to Digest recipients

> Ken Finnigan <k...@kenfinnigan.me>: Sep 30 03:50PM -0400
>
> I've been following this group for a while and I was curious if there
> are other online groups where software and genealogy are discussed? Or
> is this group the only one?

Check <stackexchange.com>. It's a great area for discussions, and getting
questions answered on various topics.

Searching for "genealogy software development" (without the quotes) gets
many hits.

--
Marshall Lake -- marsha...@gmail.com -- http://www.mlake.net

Ken Finnigan

Oct 6, 2024, 9:37:36 PM
to root...@googlegroups.com
Hi Marshall,

Apologies, I wasn't very clear with my initial question.

I'm specifically looking for online groups such as this where those developing genealogy tools and software gather. This group is the only one I've found so far, but I was curious to know if those in this group have come across others?

Thanks

--
Ken Finnigan - ken@kenfinnigan.me - https://kenfinnigan.me/

--

---
You received this message because you are subscribed to the Google Groups "rootsdev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rootsdev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rootsdev/33fdb17-2bb0-d9df-c5ce-71e23184894f%40mlake.net.

Thomas Wetmore

Oct 7, 2024, 5:17:52 AM
to root...@googlegroups.com
Ken, 

FWIW, I know of no other.

Tom W.

Enno Borgsteede

Oct 7, 2024, 7:22:23 AM
to root...@googlegroups.com
Hi Ken,

> I'm specifically looking for online groups such as this where those
> developing genealogy tools and software gather. This group is the only
> one I've found so far, but I was curious to know if those in this
> group have come across others?

I can't speak for other members, and I don't know any other groups,
other than the ones on stack exchange, and groups specialized in a
particular program like Gramps, for which we have a forum on Discourse.

Are you looking for any specific subjects to discuss?

Regards,

Enno


lkes...@lkessler.com

Oct 7, 2024, 11:34:59 AM
to root...@googlegroups.com

Ken,

Good question.

There is this group that has been around for a while:

  • gedc...@genealogy.net is a maillist for the GEDCOM-L developers and others. These are mostly European developers and they usually post in German, but there is different thinking and different ideas here.

And you might want to check out the Genealogy Software Forums at www.GenealogySoftware.net. This is a new site started just a few months ago by Chad Osten. He’s got a lot of excellent content at the site and he’s hoping to get some discussion happening in his forums.

Louis Kessler

www.beholdgenealogy.com

Ken Finnigan

Oct 7, 2024, 2:26:38 PM
to root...@googlegroups.com
Thanks for the links Louis!

Ken Finnigan

Oct 7, 2024, 9:53:14 PM
to root...@googlegroups.com
Thanks Tom.

Hi Enno,

No specific topics at this time, though I am planning to get into developing genealogy software/tools over the coming 6 mths and wanted to verify if this was the best, or only, place to ask questions related to it.

Thanks
Ken


Enno Borgsteede

Oct 8, 2024, 10:22:19 AM
to root...@googlegroups.com
Hi Ken,

> No specific topics at this time, though I am planning to get into
> developing genealogy software/tools over the coming 6 mths and wanted
> to verify if this was the best, or only, place to ask questions
> related to it.

OK, thanks. I asked it, because IMO a lot depends on the sort of things
that you want to do, like whether you want to build on current big
'standards' like GEDCOM, and write tools for that, adopt another model,
or build your own. Would you be willing to expand concepts for evidence
based genealogy, where you start with the source, and not with a tree to
which you add sources later, like most people do now? Are you interested
in better citations, like I've been since I joined Better GEDCOM around
2010?

I'm working with Gramps, since about 2010, which has its own data model,
and its own export format, which we call Gramps XML, but which is still
conclusion based, and I don't like that, but don't have the energy to
change that myself. It's open source, and works in about 40 languages,
but still sort of dominated by English/American culture, and the data
model that's sort of pushed by GEDCOM.

I must add that I'm quite happy with mail lists, like this one, because
they give me the opportunity to concentrate on the text. Web forums drive
me away, because they're too messy, and require extra work, because I
have to visit them, one by one, where mail always arrives in one place.

Regards,

Enno


Ken Finnigan

Oct 11, 2024, 3:09:38 AM
to root...@googlegroups.com
Hi Enno,

Though whatever I build would integrate with GEDCOM, as best it could, right now I wouldn't see it being the core model.

I've read a lot of comments online over the last couple of years from family historians and genealogists talking of a "source first" approach. This is the approach I want to take with whatever I build, and likely utilizing graph models to define connections between sources and people. I'd also like to explore an idea which, if I recall, Tom has spoken about previously, that of "personas" where an individual can actually be made of different collections of source material that are aligned.

With citations, I'm definitely interested in better and more consistent citations. I've heard discussed previously a large problem is citations from different companies referencing the same underlying data set, making it difficult to uniquely identify a record.

Regards
Ken


paul...@gmail.com

Oct 11, 2024, 8:35:06 AM
to root...@googlegroups.com

Hehe Ken, nothing could be worse than using GEDCOM as a “core model”. Anyway…

Every source citation that references a "person" *by definition* contains a persona: a set of details describing some person (real or fictional) with some degree of "precision".

Properly done, citation content will highlight all personal identifying detail from the background of other narrative.

"At 2 a.m. on Saturday in Victoria Park, Sergeant Plod detained [a man] [in his 40s] who gave the [name Jo]."

That example is deliberately vague because other lines of evidence may later be adduced to modify, reinforce, or contradict this event narrative.

Each statement is enduring, whether or not ever linked to some "person" of interest to the researcher. Any link made should be qualified with a level of confidence. Links to multiple (alternative) persons are valid.

A collection of statements around one event should also be assessed as a whole, with overall confidence as to the uniqueness of person identity, plus justification for resolving uncertainties or contradictions.

It should be remembered that historical persons in general have no independent reality/existence and are no more than a construction, a set of references, i.e. personae.

BTW, a citation is the key entity (a.k.a. statement, assertion, piece of evidence, etc). It *references* some (possibly hierarchical) source of whatever provenance or reliability.

Therefore, I object to the term "source first" when "citation first" is better. "Evidence first" is better still, embracing the separation of evidence from how and where it was obtained.
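
A minimal Python sketch of the statements and confidence-qualified links described above. It is one possible reading, not a reference model: the class names, field names, person ids ("I1", "I2") and justifications are all hypothetical, and confidence is assumed to be a number between 0 and 1.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """One enduring piece of evidence; never edited after extraction."""
    citation_id: str   # hypothetical identifier of the citation
    text: str          # e.g. "in his 40s", "gave the name Jo"


@dataclass
class Link:
    """A qualified claim that a statement refers to a given person."""
    statement: Statement
    person_id: str
    confidence: float  # assumed scale: 0.0 (no belief) to 1.0 (certain)
    justification: str = ""


@dataclass
class EvidenceBase:
    statements: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def add_statement(self, citation_id, text):
        s = Statement(citation_id, text)
        self.statements.append(s)
        return s

    def link(self, statement, person_id, confidence, justification=""):
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.links.append(Link(statement, person_id, confidence, justification))

    def candidates(self, statement):
        """All alternative persons a statement is linked to, best first."""
        found = [l for l in self.links if l.statement == statement]
        return sorted(found, key=lambda l: l.confidence, reverse=True)

    def unlinked(self):
        """Statements that endure without being tied to any person."""
        linked = {l.statement for l in self.links}
        return [s for s in self.statements if s not in linked]


base = EvidenceBase()
name = base.add_statement("plod-report", "gave the name Jo")
age = base.add_statement("plod-report", "in his 40s")
base.link(name, "I1", 0.6, "hypothetical: lived near Victoria Park")
base.link(name, "I2", 0.3, "hypothetical: right age, different parish")
```

Here the "name Jo" statement points at two alternative persons with different confidence, while the "in his 40s" statement stays in the store without any link at all.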

 

All the best

Paul White

 


Enno Borgsteede

Oct 11, 2024, 8:59:07 AM
to root...@googlegroups.com
Hi Ken,

> Though whatever I build would integrate with GEDCOM, as best it could,
> right now I wouldn't see it being the core model.
OK, I understand. And at the same time I still wish that we can expand
on GEDCOM in such a way that other authors will adopt it. And that's
because in a sense, GEDCOM is a driving force, like USB, and SCART, if
you are old enough to know what that is.
> I've read a lot of comments online over the last couple of years from
> family historians and genealogists talking of a "source first"
> approach. This is the approach I want to take with whatever I build,
> and likely utilizing graph models to define connections between
> sources and people. I'd also like to explore an idea which, if I
> recall, Tom has spoken about previously, that of "personas" where an
> individual can actually be made of different collections of source
> material that are aligned.

I like that, and there are a few programs that come quite close.
FamilySearch is an example, where you can literally see the sources and
the individuals aligned. And Centurial is another. It's a .NET program
that was created by a fellow Dutch software engineer, but it's on hold,
closed source, and slow, and very bureaucratic.

There are also half-breeds like Clooz and Evidentia, and I call these
half-breeds, because they allow you to attach sources to personae inside
their own databases, but still export standard conclusion based GEDCOM
files, and there's no easy way to untangle the source data, which is
something that I demand as a genealogist. They're also closed source, so
I will never use them, ever.

Technically speaking personae are quite easy to implement using existing
person objects, as they exist in our current software, because these can
already be linked to the sources that they can be derived from. That
means that most of your graph is already there, and you can say that
GEDCOM is a graph model by itself. And since you can also already create
associations between persons, one way of implementing the persona
concept is as simple as associating a 'conclusion' or 'tree' person with
the persons derived from the sources. What I mean here is that persons
in different records can all be associated with one person in the tree
who is supposed to share the same identity. That person is an individual.
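
A minimal sketch of that association idea in Python, assuming a plain in-memory store. The class, method names and sample records are hypothetical; the only point is that personas are ordinary person records tied to a source, and a tree person merely points at them.

```python
from collections import defaultdict


class PersonStore:
    """Ordinary person records plus 'same identity' associations."""

    def __init__(self):
        self.persons = {}                    # id -> dict of details
        self.personas_of = defaultdict(set)  # tree person id -> persona ids

    def add_person(self, pid, name, source=None):
        # A persona is simply a person record derived from one source;
        # a tree (conclusion) person has no source of its own here.
        self.persons[pid] = {"name": name, "source": source}

    def associate(self, tree_pid, persona_pid):
        if self.persons[persona_pid]["source"] is None:
            raise ValueError("only source-derived persons can be personas")
        self.personas_of[tree_pid].add(persona_pid)

    def dissociate(self, tree_pid, persona_pid):
        # Untangling is just dropping the association; records stay intact.
        self.personas_of[tree_pid].discard(persona_pid)

    def sources_for(self, tree_pid):
        return sorted(
            self.persons[p]["source"] for p in self.personas_of[tree_pid]
        )


store = PersonStore()
store.add_person("I1", "Jan Jansen")
store.add_person("P1", "Jan Janssen", source="baptism register")
store.add_person("P2", "J. Jansen", source="marriage record")
store.associate("I1", "P1")
store.associate("I1", "P2")
```

Because the source-derived records are never altered, removing an association with `dissociate` is enough to untangle a wrong conclusion.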

> With citations, I'm definitely interested in better and more
> consistent citations. I've heard discussed previously a large problem
> is citations from different companies referencing the same underlying
> data set, making it difficult to uniquely identify a record.

That is one of the problems that I see indeed, and IMO, that can also be
solved with a simple and similar extension of the GEDCOM standard, where
you create associations between source objects, which should have all
the fields that we now have in the source and citation structures in the
standard.

With these associations, you can express the fact that a source is
part of a collection, as in most of our archives, which have two or
more layers of storage. You can also express another route, where a
film is a reproduction of one or more books, with frames showing one or
more pages of the original. And you can take that even further when you
create indexes from these films, like FamilySearch and other companies
do. These indexes can also be associated with their sources, which can
be those films, or the originals in the case of indexes created by the
archives, like here in The Netherlands, where one can often find new
color scans of these books too.

This means that with a simple trick, you can follow the fonds, and
provenance:

https://en.wikipedia.org/wiki/Fonds

GEDCOM X does that already, but there is no standard for citations yet.
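
A small Python sketch of source-to-source associations, assuming two kinds of link: "derived from" (index of a film, film of a book) and "part of" (book within a fonds). The class, attribute names and titles are hypothetical and do not follow GEDCOM or GEDCOM X.

```python
class Source:
    def __init__(self, title, kind, derived_from=None, part_of=None):
        self.title = title
        self.kind = kind                  # e.g. "index", "film", "book"
        self.derived_from = derived_from  # the source this one reproduces
        self.part_of = part_of            # containing collection or fonds


def _walk(start, attr):
    """Follow one kind of association until it runs out (cycle-safe)."""
    chain, seen = [], set()
    node = start
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        chain.append(node)
        node = getattr(node, attr)
    return chain


def provenance(source):
    """The source itself, then everything it was derived from."""
    return _walk(source, "derived_from")


def original(source):
    """The last link in the provenance chain."""
    return provenance(source)[-1]


def fonds_path(source):
    """The source itself, then each collection that contains it."""
    return _walk(source, "part_of")


fonds = Source("Parish archive", "fonds")
book = Source("Baptism register", "book", part_of=fonds)
film = Source("Microfilm of baptism register", "film", derived_from=book)
index_a = Source("Company A index", "index", derived_from=film)
index_b = Source("Company B index", "index", derived_from=book)
```

With these links in place, two citations from different companies can be recognised as pointing at the same underlying record when `original(index_a) is original(index_b)`.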

Regards,

Enno


Thomas Wetmore

Oct 11, 2024, 11:10:37 AM
to root...@googlegroups.com

On Oct 8, 2024, at 10:37 AM, Ken Finnigan <k...@kenfinnigan.me> wrote:

> Hi Enno,
>
> Though whatever I build would integrate with GEDCOM, as best it could, right now I wouldn't see it being the core model.

1. You can think of Gedcom in different ways. At one extreme is Gedcom5.5.1, where it is a standard for encoding lineage linked genealogical data. At the other extreme it is equivalent to XML or JSON or semantic net formats, a general purpose syntax that can encode any information.

2. You need a backing store for genealogical software. The usual choice is an SQL database. There are other options. My choice for many years was a B-Tree where the records were pure Gedcom. It was an excellent choice. Now I eschew persistent databases and use a Gedcom file as backing store. When software starts up it reads the file into hash tables. It takes less than a second to read my 22,000 records. Performance is instantaneous.

3. Source first or evidence based approaches are great, but it is hard to be disciplined enough to make it part of your computerized genealogy, and hard to find a program that supports it and the conclusion model together. A review. We want information about persons and families. We want to think of them as real people who existed. They are conclusion persons or individuals. In contrast a persona (you can call a persona a person if you call a conclusion person an individual) is the mention of a person in evidence found in some source. Software can also store the sources, evidence and personas. The job of research is deciding which personas refer to which conclusion person. What you do with that decision at a software level isn't very well thought out in my opinion.

4. Thinking of Gedcom in the second sense above, it is capable of being the database model for an evidence and conclusion system. It's just as easy to encode information in Gedcom as it is in XML for example. All in all, though, XML or JSON is probably the better choice, mostly because people would think you were crazy for using Gedcom in this context.
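
Points 1 and 2 above can be illustrated with a small Python sketch that reads GEDCOM line syntax ("level [@xref@] tag [value]") into a dictionary keyed by cross-reference id. This is not Tom's implementation. It is a simplified reader that assumes well-formed input and ignores CONC/CONT continuation handling and character encoding. The function names and sample data are hypothetical.

```python
def parse_gedcom(text):
    """Parse GEDCOM-style lines into a dict of records keyed by xref."""
    records = {}
    stack = []  # stack[i] is the most recent node seen at level i
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(" ", 2)
        level = int(parts[0])
        xref = None
        if len(parts) > 1 and parts[1].startswith("@") and parts[1].endswith("@"):
            # Record line such as "0 @I1@ INDI"
            xref = parts[1]
            rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        else:
            rest = parts[1:]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
        node = {"tag": tag, "value": value, "children": []}
        if level == 0:
            if xref:
                records[xref] = node  # HEAD and TRLR have no xref; skipped
        else:
            stack[level - 1]["children"].append(node)
        del stack[level:]
        stack.append(node)
    return records


def child_value(node, tag):
    """Value of the first child with the given tag, or None."""
    for child in node["children"]:
        if child["tag"] == tag:
            return child["value"]
    return None


SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME Jo /Example/
1 BIRT
2 DATE 1 JAN 1900
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""

records = parse_gedcom(SAMPLE)
```

Nothing in the parser knows about INDI or FAM, which is the sense in which the syntax is general purpose: any tag hierarchy ends up in the same node structure, and lookup by id is a single dictionary access.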

Thomas Wetmore

Oct 11, 2024, 12:03:30 PM
to root...@googlegroups.com
Paul,

Other than your initial comment on Gedcom I think you have done a great job summarizing things.

I've been using Gedcom as the "core model" in my software for a long time. I don't use any Gedcom standard; I use it as the syntax to encode information. I use it for sources, evidence, personas, persons, families, database records, etc. I've gone off on a few tangents and used JSON and XML, too, and may do so again, but I've had no issues with Gedcom, well, other than from people who think it's either impossible or stupid!!

When I use the "source first" technique the first thing I do is extract personas from an item of evidence, where a persona is just a name and the few properties bound to it in the evidence. A single item of evidence can generate a number of personas (think of a census record). Beyond the personas is the item of evidence that binds them together through the roles they play and has properties of its own. I think source based genealogy should also store these personas and evidence records. And there is the source of the evidence, the census in this example. If you want it all on your computer you need that too.

How a computer program could support conclusion making is wondrous to think about. At some point in research you're going to have a bunch of personas around that you conclude refer to the same individual. There might be an individual in your database for those personas or you might have to create one. What does your software do to support the decision?

The method used by conclusion-only programs (where the evidence is just slips of paper or some computer file, and the personas are figments of your imagination) is that you tweak the conclusion individuals with information from personas that refer to them. Good luck unwinding things if you decide certain personas don't really belong.

But if your software also holds the personas you can do things differently. The model I like best is thinking of an individual as a container for all the personas I have decided refer to the individual. I also like the idea that the conclusion person has its own name, own birth, death, etc., information, and you can tweak those with info from the personas, but you never change the personas. When you change your mind, you can remove some personas from the individual, or split an individual into two; you might still have to unwind a tweak.

Instead of thinking of an individual as a simple container of personas, you can also think of an individual as a tree of personas where the tree is structured to show the order in which you made the decisions to bring them together. I mention this because many years ago I wrote a program that used this tree approach. A human being did not decide which personas made which individuals; a program's algorithms did. It built the personas into trees using different heuristics, and at the end the final trees were treated as real individuals. In this application there were billions of personas, all extracted from the internet, using some clever proprietary techniques.
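
A minimal Python sketch of the "tree of personas" idea, under the assumption that each internal node records one decision to bring its children together. It is not the program described above. The type names, reasons and sample personas are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Persona:
    """A mention of a person in one item of evidence; never modified."""
    pid: str
    name: str
    evidence: str


@dataclass(frozen=True)
class Merge:
    """A recorded decision that the children refer to one individual."""
    reason: str
    children: Tuple["Node", ...]


Node = Union[Persona, Merge]


def merge(reason, *nodes):
    return Merge(reason, tuple(nodes))


def personas(node):
    """Flatten a tree into the personas it contains, left to right."""
    if isinstance(node, Persona):
        return [node]
    result = []
    for child in node.children:
        result.extend(personas(child))
    return result


def undo_last(node):
    """Reverse the outermost decision; the personas are untouched."""
    if isinstance(node, Persona):
        return [node]
    return list(node.children)


a = Persona("P1", "Jo Smith", "census entry")
b = Persona("P2", "Joseph Smith", "marriage record")
c = Persona("P3", "J. Smith", "burial record")
step1 = merge("same spouse and parish", a, b)
individual = merge("age at burial matches", step1, c)
```

Splitting the individual is then `undo_last(individual)`, which hands back the earlier subtree and the burial persona as two separate candidates.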

Best,

Tom Wetmore

Ken Finnigan

Oct 11, 2024, 2:28:29 PM
to root...@googlegroups.com
Thanks Paul/Enno/Tom for your great comments and feedback.

The way I'm currently thinking about it lines up a great deal with how you describe it, Tom.

I envision the separate personas for each source/evidence being retained, with a heuristic determining at what point two, or more, personas can be considered the "same" individual. Possibly through providing different weighting to primary vs secondary sources/evidence, etc. Potentially also providing the ability for the user of the program to set the "tipping point" of the level of certainty that they're the "same" individual.
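
A rough Python sketch of such a heuristic, to make the "tipping point" concrete. The weights, the agreement measure and the default threshold are all assumptions picked for illustration. A real matcher would need fuzzy name and date comparison.

```python
# Assumed weights: how much a match is trusted by kind of evidence.
WEIGHTS = {"primary": 1.0, "secondary": 0.5}


def field_agreement(a, b):
    """Fraction of shared facts on which two personas agree exactly."""
    shared = [k for k in a["facts"] if k in b["facts"]]
    if not shared:
        return 0.0
    matches = sum(1 for k in shared if a["facts"][k] == b["facts"][k])
    return matches / len(shared)


def match_score(a, b):
    """Agreement scaled by the weaker of the two kinds of evidence."""
    weight = min(WEIGHTS[a["kind"]], WEIGHTS[b["kind"]])
    return weight * field_agreement(a, b)


def same_individual(a, b, tipping_point=0.6):
    """True when the score reaches the user-chosen tipping point."""
    return match_score(a, b) >= tipping_point


census = {
    "kind": "primary",
    "facts": {"surname": "Smith", "birth_year": 1850, "birthplace": "Leeds"},
}
obituary = {
    "kind": "secondary",
    "facts": {"surname": "Smith", "birth_year": 1850},
}
```

In this example the two personas agree on every shared fact, but the secondary source halves the score, so they are treated as the same individual only if the user lowers the tipping point.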

I like the idea of treating the personas as fixed, and the individual/conclusion is a combination of some set of personas. Maybe treating it akin to source control with a history of which piece of information from a specific persona was "copied" to a piece of data on an individual, providing the ability to unwind/disentangle them as needed.
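
One way to picture the source-control analogy is an ordered log of copies that can be replayed or partly reverted. The Python sketch below is only an assumption about how that could look. The class, method names and sample values are hypothetical.

```python
class Individual:
    """A conclusion person whose data is a replay of logged copies."""

    def __init__(self, iid):
        self.iid = iid
        self.history = []  # ordered log of (field_name, value, persona_id)

    def copy_from(self, persona_id, field_name, value):
        """Record that a value was copied from a persona."""
        self.history.append((field_name, value, persona_id))

    def current(self):
        """Replay the log; the latest entry for each field wins."""
        state = {}
        for field_name, value, persona_id in self.history:
            state[field_name] = (value, persona_id)
        return state

    def detach(self, persona_id):
        """Unwind every value that came from one persona."""
        self.history = [h for h in self.history if h[2] != persona_id]


ind = Individual("I1")
ind.copy_from("P1", "birth", "1850")
ind.copy_from("P2", "birth", "12 Mar 1850")
ind.copy_from("P2", "death", "1902")
```

After `ind.detach("P2")`, replaying the log falls back to the birth year taken from P1 and the death entry disappears, without either persona having been changed.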

For the storage side of things, I have considered not utilizing any kind of store and keeping everything in memory, read from and written to GEDCOM files. This approach would fit well with a tool being served through a browser, or locally, removing any concerns about what might happen to the data when stored, or whether its storage is secure and safe.

I also wonder whether there's the possibility of having evidence collected in a tool and then "playing" a GEDCOM file of individuals onto that evidence to show how the two are connected, if at all, and by which pieces.

Ken


paul...@gmail.com

Oct 11, 2024, 3:16:40 PM
to root...@googlegroups.com

 

 

Hi Tom, many thanks. I like your idea of persona tree but think of it more as an inference tree.

And it's worth mentioning too that personal names are no more than attributes of personae, certainly not a collection of values lumped without time or context into a person record.

A quick summary of my objections to the GEDCOM "model" would go something like this (and sorry, cannot atm refer back to notes for a more comprehensive set). This is really just a flavour.

* The limited range of standard record types is symptomatic of the wrong *mindset* (extensions do not compensate for that). Missing are a whole range of "buildings", ships, and other containers; geographical, habitation and administrative "regions"; institutions (companies, churches, etc). Plus, the whole idea of independent Events ("instantaneous" and extended) that link (or are linked to) so many other record types.

* Cross-references, when implemented at all, are only one-way. Those are a pale imitation of what is really needed: many-many relationships that must include roles with further qualification (such as time/duration, place) that I break out as "participation" (a role is just one attribute of that).

* Such relationships can start to tackle the current limitations in modelling-encoding flavours of interpersonal relationships (god parentage, sponsorship, informal fostering, apprenticeship, friendship, neighbourship...). As well as census enumeration (at a time, in a habitation, as a household with family and other members, each with specific relationships and other attributes).

* And then, more notably, the impossibility of representing event relationships such as birth and its civil registration, illness/injury and consequential death, succession of organisations and administrative units, sub-events (war, campaign, battle, operation). An endless list and whole new world.
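
The bullets above could be sketched in Python roughly as follows, assuming independent events, a "participation" record that carries the role and its qualifiers, and a parent link for sub-events. Every class, field and id here is hypothetical and only meant to show the shape of such a model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Participation:
    """Many-to-many link between an event and any entity."""
    entity_id: str               # person, ship, building, institution...
    role: str
    date: Optional[str] = None   # optional qualifiers on the role
    place: Optional[str] = None


@dataclass
class Event:
    eid: str
    kind: str
    participants: list = field(default_factory=list)
    parent: Optional[str] = None  # id of the enclosing event, if any


class EventGraph:
    def __init__(self):
        self.events = {}

    def add(self, event):
        self.events[event.eid] = event

    def events_for(self, entity_id):
        """Reverse lookup: every (event id, role) for one entity."""
        return [
            (e.eid, p.role)
            for e in self.events.values()
            for p in e.participants
            if p.entity_id == entity_id
        ]

    def ancestors(self, eid):
        """Enclosing events from nearest to outermost (cycle-safe)."""
        chain, seen = [], {eid}
        current = self.events[eid].parent
        while current is not None and current not in seen:
            seen.add(current)
            chain.append(current)
            current = self.events[current].parent
        return chain


graph = EventGraph()
graph.add(Event("war", "war"))
graph.add(Event("campaign", "campaign", parent="war"))
graph.add(Event(
    "battle", "battle", parent="campaign",
    participants=[
        Participation("I1", "soldier"),
        Participation("SHIP1", "troopship", place="offshore"),
    ],
))
```

Because participation is stored on the event and looked up from either side, the link is effectively two-way, and a ship takes part in an event in exactly the same way a person does.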

 

The obvious challenge with all these formalisms is making them digestible via explanation and user interface, while allowing for system operation at a more basic level(s?).

Always great to chew the cud with you, Tom.

Don't have a lot of time for that now.

Paul White

paul...@gmail.com

Oct 11, 2024, 3:21:52 PM
to root...@googlegroups.com

Ken wrote:

“… source control with a history of which piece of information from a specific persona was "copied" to a piece of data on an individual, providing the ability to unwind/disentangle them as needed.”

Thumbs up.

 

Thomas Wetmore

Oct 12, 2024, 11:21:39 AM
to root...@googlegroups.com

On Oct 11, 2024, at 3:16 PM, <paul...@gmail.com> <paul...@gmail.com> wrote:

> And it's worth mentioning too that personal names are no more than attributes of personae, certainly not a collection of values lumped without time or context into a person record.

A persona might not have a name. We might want to create one from evidence that says "his wife was born in Latvia in 1876." Presumably the evidence also provides a persona for the husband that includes information about him.

 
> A quick summary of my objections to the GEDCOM "model" would go something like this (and sorry, cannot atm refer back to notes for a more comprehensive set). This is really just a flavour.
>
> * The limited range of standard record types is symptomatic of the wrong *mindset* (extensions do not compensate for that). Missing are a whole range of "buildings", ships, and other containers; geographical, habitation and administrative "regions"; institutions (companies, churches, etc). Plus, the whole idea of independent Events ("instantaneous" and extended) that link (or are linked to) so many other record types.
>
> * Cross-references, when implemented at all, are only one-way. Those are a pale imitation of what is really needed: many-many relationships that must include roles with further qualification (such as time/duration, place) that I break out as "participation" (a role is just one attribute of that).
>
> * Such relationships can start to tackle the current limitations in modelling-encoding flavours of interpersonal relationships (god parentage, sponsorship, informal fostering, apprenticeship, friendship, neighbourship...). As well as census enumeration (at a time, in a habitation, as a household with family and other members, each with specific relationships and other attributes).
>
> * And then, more notably, the impossibility of representing event relationships such as birth and its civil registration, illness/injury and consequential death, succession of organisations and administrative units, sub-events (war, campaign, battle, operation). An endless list and whole new world.
>
> The obvious challenge with all these formalisms is making them digestible via explanation and user interface, while allowing for system operation at a more basic level(s?).

I agree with your objections. But for the foreseeable future Gedcom is the way genealogical data will be shared by the commercial genealogical market. Each program has its own internal model, and each converts Gedcom to its model on import, and to Gedcom on export. Both transformations probably lose data. I have always stuck with Gedcom as the internal database structure to remove the need for transformations. Computers are so fast that leaving all data as Gedcom node values incurs no meaningful performance penalties.

A point I stress is that Gedcom is a syntactic and a semantic standard. Most think of it for its semantics, e.g., Gedcom5.5, and so on. At the syntactic level, though, it is as flexible as XML. You can use it for ships, buildings, events, and so on. In my old LifeLines program I supported arbitrary user-defined record types, but I never ran with it.
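
To illustrate the syntactic point, here is a short Python sketch that writes an arbitrary nested record out as GEDCOM-style level-numbered lines. The `_SHIP` and `_LAUN` tags and the ship itself are made up for the example (the leading underscore follows the GEDCOM 5.5.1 convention for user-defined tags). This is not how LifeLines does it.

```python
def to_gedcom(node, level=0):
    """Render a nested {tag, value, children} node as GEDCOM-style lines."""
    parts = [str(level)]
    if node.get("xref"):
        parts.append(node["xref"])      # only record lines carry an xref
    parts.append(node["tag"])
    if node.get("value"):
        parts.append(node["value"])
    lines = [" ".join(parts)]
    for child in node.get("children", []):
        lines.extend(to_gedcom(child, level + 1))
    return lines


# Hypothetical user-defined record type for a ship.
ship = {
    "xref": "@S1@",
    "tag": "_SHIP",
    "children": [
        {"tag": "NAME", "value": "Example Star"},
        {"tag": "_LAUN", "children": [{"tag": "DATE", "value": "1912"}]},
    ],
}

ship_lines = to_gedcom(ship)
```

The writer has no knowledge of what a ship is. It only follows the level, tag and value structure, which is why the same syntax can carry buildings, events or any other record type.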
 
> Always great to chew the cud with you, Tom.
> Don't have a lot of time for that now.

I have too much time.
 
> Paul White

Tom Wetmore

