Txt2gedcom-sour


    The beginning of all data entry is the source document. Whenever I begin capturing data about individuals, I first capture the source information (e.g. birth certificate).
    Although Reunion can import CSV files (e.g. exported from Excel) to create individuals, it cannot import sources directly from a CSV file.

    Recently, while working on a one-place study, I needed to create individual sources for each of some 40,000 birth, marriage and death certificates. Each certificate is available as a standalone digital multimedia (PDF) document.

    For me, manually capturing each source document was not an option, especially since the data was already available as a CSV file (Excel).

    To overcome this problem, I contacted a young programmer, who wrote a custom web-based solution (TXT2GEDCOM-SOUR) to convert a CSV file into a GEDCOM file. To overcome the GEDCOM requirement that each source needs an individual (INDI) to which it is attached, the program just creates a "dummy individual" for every 256 sources. This "dummy" is simply deleted after the standard GEDCOM import.

    Since Reunion allows custom GEDCOM tags in sources and up to 50 fields per source record, a CSV table with up to 50 columns can be imported.

    Well, I'm happy with the results. I imported some 40,000 sources without a hitch, deleted a couple of hundred "dummy individuals" and voilà, I have a clean sources file with some 40,000 sources and no need for any manual capturing. The next step is to link each source to its individual(s) and to attach the digital multimedia to each source.

    Let me know if you need to convert a large CSV table in this manner. All that is needed is that line 2 of the table contains the GEDCOM tag (or custom GEDCOM tag) for each column. The title in line 1 is irrelevant and is used only for automatic character set recognition.
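For readers who want to see how such a table maps to source records, here is a minimal Python sketch. It is not the actual TXT2GEDCOM-SOUR program; the function name `csv_to_sour_records`, the comma delimiter, and the sample column names and values are assumptions made for illustration only.

```python
import csv
import io


def csv_to_sour_records(csv_text, delimiter=","):
    """Turn CSV text into GEDCOM source record lines (hypothetical helper).

    Line 1 holds column titles and is ignored, line 2 holds one GEDCOM tag
    (or custom tag such as _FILM) per column, every later line is one source.
    """
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    tags = [tag.strip() for tag in rows[1]]
    lines = []
    number = 0
    for row in rows[2:]:
        # Skip completely empty rows so they do not create empty sources.
        if not any(cell.strip() for cell in row):
            continue
        number += 1
        lines.append(f"0 @S{number}@ SOUR")
        for tag, value in zip(tags, row):
            value = value.strip()
            # Columns without a tag and empty cells produce no GEDCOM line.
            if tag and value:
                lines.append(f"1 {tag} {value}")
    return lines


# Made-up sample table: titles in line 1, tags in line 2, then the data.
sample = (
    "Title,Author,Film\n"
    "TITL,AUTH,_FILM\n"
    "Birth certificate 1/1880,Registry office,A12\n"
    "Marriage certificate 7/1881,,B03\n"
)

for line in csv_to_sour_records(sample):
    print(line)
```

The sketch ignores character set detection and does not split long or multi-line values into CONC/CONT lines, both of which a real converter would have to handle.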

    You can send your CSV file to the following e-mail address: sauerrl@me.com

    Regards
    Reiner


    SauerRL@me.com • info@reunion-de.de
    Web: http://www.schevenhuette.com
    Web: http://www.reunion-de.de

    #2
    I've done something similar in the past using some hastily-written scripts to massage text into a usable GEDCOM. Nothing on anything close to this scale, though. I'm impressed!

    Originally posted by Reiner L. Sauer
    To overcome the GEDCOM requirement that each source needs an individual (INDI) to which it is attached, the program just creates a "dummy individual" for every 256 sources. This "dummy" is simply deleted after the standard GEDCOM import.
    There is no requirement in the GEDCOM specification that each source be linked to an individual. In fact, a properly-formed GEDCOM is allowed to contain only source records and no individuals at all (at least in GEDCOM 5.x and later - GEDCOM 4 is ill-defined). Reunion 13 will complain that such a file "was not formatted according to the GEDCOM specifications," though, so a single INDIvidual is required for a successful import into Reunion. But even Reunion does not require the sources to be linked to an individual in any way. A GEDCOM file containing a single dummy INDI with no source citations can contain any number of source records. The dummy individual doesn't even have to have any content. A simple

    0 @I1@ INDI

    will suffice.
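To make that concrete, here is a rough Python sketch of a file laid out this way: one empty dummy INDI followed by any number of unlinked source records. The function name `wrap_sources` and the deliberately minimal header are my own assumptions; a strict validator or a particular Reunion version may expect more header fields (a submitter record, for example).

```python
def wrap_sources(source_lines):
    """Wrap ready-made source record lines in a GEDCOM file (hypothetical helper).

    The file contains exactly one dummy individual with no content and
    no source citations; the sources are not linked to it in any way.
    """
    lines = [
        "0 HEAD",
        "1 GEDC",
        "2 VERS 5.5.1",
        "2 FORM LINEAGE-LINKED",
        "1 CHAR UTF-8",
        "0 @I1@ INDI",  # the single dummy individual
    ]
    lines.extend(source_lines)
    lines.append("0 TRLR")  # trailer record that ends every GEDCOM file
    return "\n".join(lines) + "\n"


# Two made-up source records, just to show the resulting layout.
example_sources = [
    "0 @S1@ SOUR",
    "1 TITL Birth certificate 1/1880",
    "0 @S2@ SOUR",
    "1 TITL Marriage certificate 7/1881",
]

print(wrap_sources(example_sources), end="")
```

After the import, only the one dummy individual has to be deleted, however many sources the file holds.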
    Brad Mohr
    https://bradandkathy.com/genealogy/



      #3
      Dear Brad,
      Thanks for pointing this out. Apparently the conversion routine is over-engineered.
      Maybe next time, if there's a need.
      Thanks and regards
      Reiner
