Another Merge Records question


Jay Fletcher

Nov 9, 2024, 12:57:59 PM
to GEDitCOM II Discussions
Hi John,

I'm making a modified version of the Merge Records extension to meet my needs for merging GEDCOMs that I have downloaded from an online database into my family tree. Since all of the GEDCOMs come from the same place, I can count on them having almost identical data for each person, so I would like to change the match scoring algorithm to better suit this situation.

I noticed that there is a function called DateQuality that is called several times in your Python script, but it is never defined. This is in contrast to similar functions called PlaceQuality and NameQuality that are defined.

I'm not an expert in Python, but I'm surprised that the script even functions without DateQuality being defined. Perhaps execution just passes through the calls to the function without doing anything?

Do you have a version of the extension that includes a definition of DateQuality that you could share with me? I would like to use it as part of my modifications.

Thank you,
Jay

Jay Fletcher

Nov 11, 2024, 3:16:45 PM
to GEDitCOM II Discussions
Perhaps a better question is where can I find the definition of the DateQuality function? Also, how is the command "SetTauCutoff(55.,2.)" related to the DateQuality function?

Richard Blake

Nov 11, 2024, 3:27:37 PM
to geditcom-ii...@googlegroups.com
Hi

DateQuality is defined in the GEDitCOMII.py module.
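
On the question of whether Python just passes through a call to an undefined function: it does not. Calling a name that was never defined or imported raises a NameError, so a script that runs must be getting the name from somewhere, typically an imported module. A minimal sketch of that behavior follows. The names `helper_module` and `UndefinedQuality` are hypothetical, and the stand-in `DateQuality` body is only a placeholder, not the real code in GEDitCOMII.py.

```python
import types


def call_undefined():
    """Call a name that is not defined anywhere Python can see."""
    try:
        # UndefinedQuality is hypothetical and deliberately never defined.
        return UndefinedQuality("1 JAN 1900")
    except NameError as err:
        return "NameError: " + str(err)


# Build a stand-in module in memory (assumption: it plays the role of a
# helper module; the placeholder body is NOT the real DateQuality).
helper_module = types.ModuleType("helper_module")
exec(
    "def DateQuality(text):\n"
    "    return 'called with ' + text\n",
    helper_module.__dict__,
)

# Same effect as `from helper_module import DateQuality` in a script.
DateQuality = helper_module.DateQuality

print(call_undefined())           # the undefined call fails, it is not skipped
print(DateQuality("1 JAN 1900"))  # the imported name resolves normally
```

So if a script calls a function it does not define itself, the import statements near the top of the script are the place to look for its source.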

Regards, Richard


Jay Fletcher

Nov 11, 2024, 4:02:13 PM
to GEDitCOM II Discussions
Thank you, Richard. Do you happen to know where the mathematical formulas in the DateQuality function come from? I would like to understand how they work.

Richard Blake

Nov 11, 2024, 4:17:28 PM
to geditcom-ii...@googlegroups.com
I'm afraid not. I took a look but I couldn't make sense of it. One for John I think.

John Nairn, Developer

Dec 23, 2024, 12:57:13 PM
to GEDitCOM II Discussions
About 10 years ago, I started writing a paper on Date Quality, and it explains what is implemented in the "Merge Records" extension. It has some "fuzzy" math and a final solution involving Laplace transforms (which I thought was interesting). I never finished the paper because I did not know where to send it for publication and I got involved in other things as well. If you are interested, here is a link to a rough draft of that paper:


The description of Date Quality is mostly complete. The end was to have more examples and that part is incomplete. The appendix was planned to show the algorithm used in the "Merge Records" extension. I welcome any comments. Maybe I will finish it some day.

The new GEDitCOM II version released yesterday revised the "Merge Records" extension. It added an option to automatically accept matches above a certain match score. Those match scores use Date Quality but also use a lot more (with somewhat arbitrary weighting factors).

John Nairn