At the recent FHT workshop at BYU, the topic of GEDCOM's deficiencies
was brought up. I know there are folks on both sides of the aisle on
this one. I know some who feel GEDCOM works just fine, while there are
some who feel there are significant deficiencies.
I fall just to the side of the line of those who feel there are
deficiencies in GEDCOM, though I think there's a lot that's good
there.
I would like to start a conversation to nail down a comprehensive list
of the strengths and weaknesses that exist with the current GEDCOM
standard. If such a list already exists, please point me to it.
Do you love GEDCOM? Tell us all why.
Think GEDCOM is not sufficient? Tell us all why.
Know of folks who should be part of this conversation but aren't on
the list? Send them an invite to join, please.
Thanks,
-- Dan
________________________________
From: beyo...@googlegroups.com on behalf of Dan Hanks
Sent: Fri 3/28/2008 9:30 AM
To: beyo...@googlegroups.com
Subject: [BeyondGen] What's wrong with GEDCOM?
I think this is the crux of the question I am asking the group: are
there others for whom GEDCOM is not sufficient (as it is for you,
John)? If so, why not? What is GEDCOM not able to do that you would
like to do, as far as modelling genealogical information?
One question I have regarding GEDCOM is how well it can model
different kinds of parent-child relationships. For example, can you
indicate that child A of parent X is biological, while child B of
parent X is adopted?
Thanks for your input John,
-- Dan
I would support an open GEDCOM committee and I support you in your efforts
to create such a committee. I believe that I am sensing that same support
from the other major genealogy product vendors. The problem is that the LDS
Church currently owns the copyright for the GEDCOM spec. If we formed a
committee and approached them, I think that they would probably transfer
rights to the committee. It is certainly worth a try.
Aloha,
John Vilburn
Ohana Software LLC
Yes, the current GEDCOM spec supports this, though I doubt most genealogy apps would read it if you included it. From the GEDCOM spec:
CHILD_TO_FAMILY_LINK:=
n FAMC @<XREF:FAM>@ {1:1}
+1 PEDI <PEDIGREE_LINKAGE_TYPE> {0:1}
+1 STAT <CHILD_LINKAGE_STATUS> {0:1}
+1 <<NOTE_STRUCTURE>> {0:M}
PEDIGREE_LINKAGE_TYPE:= {Size=5:7}
[ adopted | birth | foster | sealing ]
A code used to indicate the child to family relationship for pedigree navigation purposes.
Where:
adopted = indicates adoptive parents.
birth = indicates birth parents.
foster = indicates child was included in a foster or guardian family.
sealing = indicates child was sealed to parents other than birth parents.
CHILD_LINKAGE_STATUS:= {Size=1:15}
[challenged | disproven | proven]
A status code that allows passing on the user's opinion of the status of a child to family link.
challenged = Linking this child to this family is suspect, but the linkage has been neither proven nor
disproven.
disproven = There has been a claim by some that this child belongs to this family, but the linkage
has been disproven.
proven = There has been a claim by some that this child does not belong to this family, but the
linkage has been proven.
Taking your example above:
0 @A@ INDI
1 NAME Child /A/
1 FAMC @F1@
2 PEDI birth
0 @B@ INDI
1 NAME Child /B/
1 FAMC @F1@
2 PEDI adopted
But this is not what a lot of people want to transfer. They want to store and
exchange their research progress. This includes not only the final result but
also the provenance of how the information was found.
In a couple of days I am going to give a talk at a conference in Germany
about genealogical data models, so I have already collected some information
on the topic.
1) In genealogy we explore the past (everything that existed and happened). We
know that we won't be able to cover it completely. No matter how hard we try,
we can only get very close.
2) All we know about what happened in the past is written in documents. Not
everything has been documented, and there are also incorrect documents
(created by mistake or on purpose).
3) Finally we combine information from these sources and draw conclusions
because they seem rational to us. During the research we might find other
sources that confirm these conclusions or disprove them.
For a complete documentation of genealogical research we need to provide
information about layers 2 (the sources) and 3 (the researcher's assertions)
without mixing them up. This gives us a (limited) view of layer 1.
With GEDCOM you try to model layer 1 directly, using only as much information
from layer 2 as is needed to support the layer 1 data you have created,
possibly "correcting" some of it to match. Duplicate and conflicting
information is ignored. Research progress from layer 3 is ignored completely.
Jesper
--
Jesper Zedlitz E-Mail : jes...@zedlitz.de
Homepage : http://www.zedlitz.de
ICQ# : 23890711
Hi. I’m new to this list and don’t want to appear to be pushing my own product. I’m here because I’m interested in improving the GEDCOM standard and my product’s integration with family tree applications via GEDCOM.
That said, I'd like to tell Jay that my product, facTree, is an attempt to meet his first two points. facTree forms currently exist for all of the US population schedules and allow you to input data on a form that looks like the census form while it converts that information into GEDCOM records behind the scenes. You can then import the resulting GEDCOM file into whatever family tree application you use and merge it using that application's merge capabilities. A free version that includes the form for the 1880 census is available at http://www.thegenealogyshop.com/Downloads.html. We plan to expand into other form types including birth, death, marriage, draft registration, etc.
facTree basically creates a GEDCOM that incorporates all of the facts available from a single source, both direct and indirect, based on user-customizable assumptions. The impetus was the desire stated by Jay to have a data entry mechanism that mimics the source document. This increases accuracy and speed.
Annette
Mark Turner mentioned something in his presentation that I thought was
a pretty important point. He was speaking with Elizabeth Shown Mills
(author of "Evidence" and "Evidence Explained"), who suggested that
people learn how to do genealogy research from the software they use. In
most cases that means we start with an empty form and start filling
out names, dates and places, and source materials are an afterthought.
I've given some thought to software that starts the other way around,
by asking you, "What are you looking at right now?" and guiding you
through the process to extract the information from the document you
happen to be looking at, making sure it's properly cited so that
others can more easily find your sources when they come after you. Or
perhaps the software would start even earlier in the process, by
asking you,
"What do you want to find out?"
"I want to find out more information about my grandfather."
"Do you know about when and where he was born?"
"Around 1870 in Park City, Utah".
"Ok, at that point in time Park City was known as Parley's Park. Here
are some records you may wish to search to find out more...when you
have obtained copies of these documents, come back and we'll gather
the information about those records into your database..."
I believe there was/is some software called GenSmarts that does this
kind of thing based on the data you already have in your record
manager. I'd like to see software that walks you through the research
process, perhaps getting less and less verbose as you gather more
information and become more familiar with the process. All the while
as you are gathering documents it's helping you accurately store the
information, cite the source it came from and so forth. As mentioned
in another post, this kind of software would be helping you accumulate
"facts" from source documents, and helping you to make conclusions
about those facts you have gleaned.
Underlying all this has to be some form of a "fact engine" that allows
you to store arbitrary facts about individuals, events, places, and so
forth, each of which is linked to the specific source document from
which it came, all the while allowing you to overlay the conclusions
you have made based on those facts.
I think that's why I've been a bit discouraged by trying to use a
'one-size-fits-all' data model where the data fields are already
defined. I see the solution as some kind of model that allows you to
define the fields as you go, as it were.
Thanks for your input,
-- Dan
How well does GEDCOM allow for storing more than one name for an
individual, based on different sources? I have a relation for whom I
found four different variations of her name in various sources (a DUP
history, her gravestone, a census, etc.).
The open-source Gramps software allows you to store name variants,
each tied to a specific source. Does GEDCOM allow you to do this?
<looking up the answer myself.../>
Consulting the GEDCOM standard, I see that an individual record allows
{0:M} <<PERSONAL_NAME_STRUCTURE>> records, each of which may have
{0:M} source citations associated with it. So it looks like yes,
GEDCOM supports this very thing. +1 for GEDCOM :-).
-- Dan
I mostly agree with John that the biggest detriment to GEDCOM is the
lack of authority, and thus of the ability to grow and adapt, as well
as the ability to measure compliance. With some care and feeding,
GEDCOM could be updated to handle most needs, as well as to tidy up
its eccentricities and ambiguities.
However, I'd really like to see it move towards XML. I believe that
would allow for better extensibility (namespaces) and a wealth of
tools to parse and manipulate the data. I realize that deprecates the
wealth of experience in existing GEDCOM parsers, but for future
growth I think it's worth it.
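To illustrate the namespace idea (these element and namespace names are invented for the sketch, not taken from any actual proposal), vendor extensions could live in their own namespace without colliding with the core schema:
<gedxml xmlns="urn:example:gedcom-core"
        xmlns:acme="urn:example:acme-ext">
  <individual id="I1">
    <name>Mary /Smith/</name>
    <acme:dnaHaplogroup>H1a</acme:dnaHaplogroup>
  </individual>
</gedxml>
A conforming reader could process the core elements and safely ignore (or round-trip) the acme:* extensions, instead of today's free-for-all of underscore tags.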
The GEDCOM data model is adequate, but I think a richer model that
distinguishes conclusions from extractions and adds detail about the
research process would increase progress in research and
collaboration, and likely spawn a whole new set of tools and services.
However, without "authority" or some sort of community agreement,
moving to XML or new data models is mostly moot, as they will suffer
the same fate as GEDCOM--poor compliance, ambiguity, non-standard
extensions, and the perception (or reality) of lost data.
It's not an easy problem to solve, but the lack of good data transfer
mechanisms really hinders collaboration and hurts everyone.
At worst, GEDCOM needs some modernization and clarification through a
standardized process. At best, I'd like to see a richer, more
extensible data model.
Logan
I've added the extra lines to this example for readability.
Notice the 2 BIRT records. In GEDCOM, when you have multiple events of the same type, by convention the event that comes first in the record is the preferred one. This is the one that should appear in any charts or reports where only one is shown.
You'll also notice that the first BIRT record has a source citation pointing to source S1 which in this case would be the birth certificate. The second BIRT is sourced by S2 which would point to the obituary.
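For context, a record of the kind being described would look something like this (the names, dates, and xref IDs here are hypothetical stand-ins, not the original example):
0 @I1@ INDI
1 NAME John /Doe/
1 BIRT
2 DATE 12 JAN 1870
2 SOUR @S1@
1 BIRT
2 DATE ABT 1871
2 SOUR @S2@
0 @S1@ SOUR
1 TITL Birth certificate
0 @S2@ SOUR
1 TITL Obituary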
At the conference I talked about some changes that I made to further improve this. The changes were largely aimed at improving integration and translation between the FamilySearch API XML model and the GEDCOM model and preventing data loss. You can find a copy of a document describing the alterations needed to support this here:
http://code.google.com/p/php-fsapi/source/browse/trunk/PHP-FamilySearchAPI/FamilySearch%20API%20XML%20to%20GEDCOM%20Mapping.doc
In the area of multiple opinions, the additions that were needed were support for assertion-level modification tracking (modified dates, versioning, and contributors). Basically the changes would allow you to keep track of who submitted each of the two BIRT assertions above and when they were last changed. This information is really only important in a multi-user system like PhpGedView or FamilySearch. As for your requirement to keep track of multiple assertions based on source citations, the pure GEDCOM 5.5 spec already allows for that.
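As a rough sketch of what that could look like (the _CONTRIB tag and the event-level CHAN placement are hypothetical extensions, not part of GEDCOM 5.5):
1 BIRT
2 DATE 12 JAN 1870
2 SOUR @S1@
2 _CONTRIB @U1@
2 CHAN
3 DATE 5 MAR 2008
Here each BIRT assertion carries its own contributor pointer and change date, rather than relying on the single record-level CHAN that GEDCOM 5.5 provides.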
--John