GEDCOM v8: Things I had hoped would be in v7 #691
Replies: 5 comments 3 replies
-
Discussed in steering committee 26 AUG 2025
-
Thank you for your comment. I created this "discussion" entry because I think several of the proposals have taken us down a path that requires discussion before a proposal is submitted. Some proposals are too big to be one proposal! I also did not want us to forget about these topics when new proposals are made that touch on other discussions but do not incorporate those ideas. I realize planning by committee can be a frustrating process, but I see a need for slow, thoughtful discussion on these topics, since many of these changes will be hard to undo in the future if we get them wrong!
-
I must say I was very surprised to read this post and the subjects that are in it. Why? Because many of these points were already addressed in my PR #679, so I don’t quite see why they are being raised again separately. Let me highlight where the overlap is:
Further points
My PR also introduces:
A higher-level question
Many of the issues that keep being listed (citations, extensions, missing roles, missing events…) may come from the fact that the current GEDCOM design itself is not optimal anymore. Each time, the response is to patch a missing structure or add an extension, but maybe it’s time to ask whether the foundation itself is the real problem. My PR is one attempt to take that step back: to show that if the foundation is more flexible and consistent, many of these recurring problems solve themselves. Do you need more help to explain how things work? And if so, what kind of help would you prefer? I realize the PR may have been a lot to digest, but many of these points are already inside it.
-
If you don’t understand why, then I can’t tell you without writing a long dissertation on design and teamwork! You clearly don’t understand that a user can currently define their own “facts”, even before v7 GEDCOM, but that’s not the point. We are a “standards committee” and should set some FACT value standards. I’m not sure if you understand “normalization”! A user can already add all kinds of assets! I could go on, but I’ve already said a lot about these things and you have dismissed them. I’m not sure your design fixes the citation issue, or even understands the issue! If you think that adding a “Premiss” Based section to the GEDCOM transmission “. I would be in favor of starting a design group toward that outcome. So I’m looking to go back to a thoughtful design approach, but if your design continues to have legs then so be it! I’m out!
-
I do not understand this statement. I do not see a link in between definition of an extension tag and the
I do see another problem: excessive use of extension tags where standard structures are available. So you are wondering that the data transmitted by
1 EVEN
2 TYPE military
Most applications will show this. I admit it would be better to have a TYPE payload that does not depend on a language. So improvements to the standard can be made and are necessary to improve data exchange, but a lot of today's transfer problems result from using the standard in a way that is not recommended.
-
When I first saw that a new GEDCOM v7.0 was about to be published, with help from people who were part of the "BETTERGEDCOM" movement (FHISO), I had hoped that many of the structural changes we discussed would be present in the new update. Version 7 was for the most part a much-needed cleanup of the internal workings of GEDCOM v5.5.1, with limited structural changes. For example, v7 embraced international standards for language tags, media types and character encoding. The GEDCOM file standard removed line length limits (which simulated flat file constructs), modernizing the data transmission. Tags that allowed both phrases and numeric values have now been normalized! These are just a few of the changes needed to move GEDCOM forward to a more modern standard.
That being said, GEDCOM v7.x still has many issues that need to be resolved. This is my attempt (open for discussion) to outline some issues that should be looked at for the next major release. I'm not putting these items in any particular order of importance; this is more of a list as I think of them!
Normalization.
As a database designer I look heavily at the normalization of data; by this I mean looking at places where data regularly gets duplicated that could be stored in a single place. Multiple instances of the same data have a profound effect on anyone changing a GEDCOM received from another creator: the receiver needs to apply the change to every duplicate but has no way to identify them. Two such data points come quickly to mind:
The Source_Citation structure. Every assertion should cite a Source Artifact as evidence for the assertion. The current GEDCOM model associates the citation data directly with the asserted information ("FACT"). In cases where a single Source Artifact asserts multiple FACTs, each FACT receives the same citation data. This works well when creating a transmission for the first time; however, if the transmission is read into a new application and the citation data needs changes, recreating the list of citations that need the change may be impossible. The Source_Citation structure should have one place that contains the citation data.
The Place_Structure (PLAC tag). Each asserted "FACT" could occur at a place in the world. These "places" can be located on a map using latitude and longitude coordinates, which could be part of the Place_Structure. Since many FACTs can occur at the same place, for data integrity purposes each occurrence of that place should have the same data. Similar to the Source_Citation above, the initial creation of a GEDCOM can easily generate identical Place_Structures. However, an imported shared GEDCOM would have issues updating all instances of the same Place_Structure.
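To make the duplication problem concrete, here is a minimal Python sketch of what "one place that contains the data" could look like for the Place_Structure. Everything in it is an assumption for illustration: the `Place` shape, the `normalize_places` helper, the `@P1@`-style identifiers and the sample place names and coordinates are made up, and none of it is part of GEDCOM v7 or of any proposal in this thread. The same approach would apply to repeated citation data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """A place as it appears inline on a fact (hypothetical shape)."""
    name: str
    latitude: str
    longitude: str


def normalize_places(facts):
    """Replace each inline Place with a pointer to one shared record.

    `facts` is a list of (fact_tag, Place) pairs. Returns a table of shared
    places keyed by a generated id, plus the facts rewritten to use those ids.
    """
    ids_by_place = {}  # Place -> shared id
    table = {}         # shared id -> Place
    rewritten = []
    for tag, place in facts:
        if place not in ids_by_place:
            # Hypothetical pointer format, modeled on GEDCOM cross-references.
            place_id = f"@P{len(table) + 1}@"
            ids_by_place[place] = place_id
            table[place_id] = place
        rewritten.append((tag, ids_by_place[place]))
    return table, rewritten


# Made-up sample data: two facts share one place, a third uses another.
facts = [
    ("BIRT", Place("Townville, Example County", "N10.0000", "E20.0000")),
    ("MARR", Place("Townville, Example County", "N10.0000", "E20.0000")),
    ("DEAT", Place("Otherton, Example County", "N11.0000", "E21.0000")),
]
table, rewritten = normalize_places(facts)
```

With the place stored once, a correction to its coordinates is a single edit to `table`, and every fact that points at it picks up the change. Note that this only merges places that are byte-for-byte identical; deciding that two differently spelled places are the same is a separate problem.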
Extending the GEDCOM
The current GEDCOM design does not take into account new technology and understandings of some genealogical constructs.
NAME and identity structures taking into account non-western and other customs for international compliance. As we can see by the many queries that ask "How do I enter this name", GEDCOM has an incomplete understanding of the nuances of name use and identities by various cultures both modern and historical.
"FACT" (Attributes and Events) additions used in both Genealogy and Family History datasets. This board has been inundated with requests for additional tags to represent "FACTs" that various applications have used. GEDCOM needs a way both to add more attribute and event tags to the current design and to let applications add "FACT" tags that are less common or unique to their implementation.
The current method of adding a new tag by an application is to introduce an extension (a new tag value preceded by a "_") and ask the application to create an EXID entry with a URL to a site that defines the extension. In some cases these extensions are minimal and the receiving application can easily ignore many of them. However, as it pertains to new "FACT" tags, is this still an appropriate use case for an EXID? The problem is that the EXID is great information for the application developer but has very little value to the user of the software! The software may drop the new "FACT" tag or flag it as unknown, but the user may not have any way to include the tag in their software without an update from the developer, which may or may not happen in a timely manner.
For example: the data loss for an event such as _MILT ("Military") could make a transmission useless for the receiver! It might be better to create a data dictionary as part of the GEDCOM transmission for "FACT" extensions. It would outline new attributes/events with detail that the importing application can use directly, such as a display label, and a description provided in the data dictionary could help the user understand how to use the data in custom reports.
Address_Structure: As pointed out in Tineke's PR, some addressing information should be able to be associated with a specific individual at a repository. This could be an easy update to the structure, but should be discussed!
Deprecation of the Family_Record. This is a radical change that could upset the status quo. But the term "family" may not be valid in the modern world. Should we start modeling person-to-person relationships rather than connecting people to a Family Relationship? This would require some way for relationship ENUM values to be recorded in the GEDCOM as part of a Data Dictionary, so that the GEDCOM model does not have to support a never-ending list of interpersonal types. This change would also make FAM.HUSB and FAM.WIFE less of an issue. It would eliminate pointers in two places and change the way Associations, Adoptions, Fostering and other concepts are constructed.
Create an Accessioning structure. Some people want to track a photo or other objects that they received from other people or organizations. It would be wrong to use the "Source_Record" for this type of information; rather, we should provide an "Attribution" tag or record.
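The data-dictionary idea raised above for "FACT" extensions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the `display_label` and `import_facts` helpers, the `KNOWN_TAGS` table and the description text for _MILT are all hypothetical and are not defined by GEDCOM v7. The point is simply that an importing application could label an unknown extension tag from data carried in the transmission instead of dropping it.

```python
# Hypothetical data dictionary carried inside the transmission: one entry per
# extension "FACT" tag, with a display label and a description for the user.
data_dictionary = {
    "_MILT": {
        "label": "Military",
        "description": "Illustrative description text supplied by the sender.",
    },
}

# Tags the (imaginary) receiving application already understands.
KNOWN_TAGS = {"BIRT": "Birth", "DEAT": "Death", "MARR": "Marriage"}


def display_label(tag, dictionary):
    """Return a user-facing label for a tag, or None if nothing is known."""
    if tag in KNOWN_TAGS:
        return KNOWN_TAGS[tag]
    entry = dictionary.get(tag)
    return entry["label"] if entry else None


def import_facts(facts, dictionary):
    """Split (tag, payload) pairs into labeled facts and dropped tags."""
    kept, dropped = [], []
    for tag, payload in facts:
        label = display_label(tag, dictionary)
        if label is None:
            dropped.append(tag)  # no built-in support and no dictionary entry
        else:
            kept.append((label, payload))
    return kept, dropped


# Made-up incoming facts: one standard tag, one described extension,
# and one extension the dictionary does not describe.
incoming = [("BIRT", "1 JAN 1900"), ("_MILT", "Army"), ("_XYZ", "unknown")]
kept, dropped = import_facts(incoming, data_dictionary)
```

Here the _MILT fact survives the import with a readable label, while the undescribed `_XYZ` tag is still lost. A dictionary of the same kind could in principle carry the relationship ENUM values mentioned under the Family_Record item, though that would need its own design discussion.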