
How I Use Genealogy Software and Online Family Trees

Roberta Estes wrote a blog post yesterday titled “Genealogy Tree Replacement – Should I Or Shouldn’t I?” on her DNAeXplained – Genetic Genealogy blog. I commented on it, but want to expand on my comments a bit.

Roberta explains the situation we all face – we may have a family tree in our genealogy software programs (e.g., RootsMagic, Family Tree Maker, etc.) and on a number of websites (e.g., Ancestry, MyHeritage, etc.).  But it’s impossible to keep them all up-to-date without making a new GEDCOM file every so often and uploading it to the online trees.  If I do this, do I delete my old tree which may have many record hints and record images attached to the tree person profiles?

Here is a list of the family trees that I actively manage and what I do to try to keep them all updated:

1) I have a master tree in RootsMagic that now has about 60,000 profiles – my ancestral families and those of my wife and my sons-in-law; plus one-name studies for Seaver/Sever/Sevier, Carringer, Auble, and Vaux; plus descendants of several other ancestral surnames; plus descendants of my 4th great-grandparents (to aid in DNA matching).

*  I do all of my data entry in RootsMagic, adding names, dates, places, events, notes, sources, and media.  I download record images that I want to save to my desktop computer files – they go into a Family file in a Surname File in a Group file – with a common file naming convention.  

*  RootsMagic permits me to “TreeShare” my person profiles with an Ancestry Member Tree, one person at a time and one event at a time, from within the program.  I can add content from RootsMagic to my Ancestry tree, or from my Ancestry tree to RootsMagic.  I do this every week.  I rarely download anything from Ancestry into RootsMagic because Ancestry’s source citations are poorly crafted and any record image I might download goes into another file with a non-descriptive name.  
*  RootsMagic permits me to “match” my person profiles to FamilySearch Family Tree profiles from within the program.  So I can add Family Tree persons, events, notes and sources to RootsMagic, or from RootsMagic to Family Tree.  
*  RootsMagic provides WebHints with links to Record Hints on Ancestry.com, FamilySearch, MyHeritage and Findmypast.  I can click right into these sites from my RootsMagic program and go to the records on the WebHints list.  I can then enter information from those WebHints right into RootsMagic.
*  RootsMagic permits me to create a GEDCOM file of all or part of my family tree, which I can then upload to another software program or to an online tree.
*  I have other family tree software programs on my desktop computer – Legacy Family Tree, Family Tree Maker, Family Tree Builder and RootsFinder.  I use these occasionally to take advantage of program capabilities that are, IMHO, better than what is in RootsMagic.  I upload a new GEDCOM file to these programs when I need to.

2) I TreeShare my RootsMagic tree with my major Ancestry tree (connected to my AncestryDNA test)  every week to keep the Ancestry tree up-to-date.  I try to prevent Ancestry record images (poorly named) and sources (poorly crafted) from coming into RootsMagic. My Ancestry tree immediately creates Record Hints for new or changed profiles, which I can then mine and add to my RootsMagic tree.  I also do searches on Ancestry.com to catch every record for a person.

*  AncestryDNA uses my Ancestry Member Tree (which has many descendants of my 4th great-grandparents!) to find Common Ancestors of my AncestryDNA Matches using the ThruLines feature.  Common ancestors are identified from my tree information, the trees of my DNA matches, and Ancestry’s Big Tree.  I have over 34,000 DNA matches, but only 400 Common Ancestor matches are identified.  

3) I upload a new GEDCOM to MyHeritage every year, keeping the previous family tree file and deleting earlier ones. MyHeritage provides Smart Matches and Record Matches for each person profile, and also matches by Source.  I can access the Record Matches from the WebHints in RootsMagic. When I find useful Record Matches on MyHeritage, I add them to my RootsMagic file.  I also do searches on MyHeritage to catch every record for a person.

*  MyHeritageDNA uses my MyHeritage tree to find Common Ancestors of my MyHeritage matches using the Theory of Family Relativity feature. I have almost 9,000 MyHeritageDNA matches, but only 8 Theory of Family Relativity identified matches.  

4) I upload a new GEDCOM to Findmypast every once in a while.  Findmypast provides Record Hints for tree profiles on their website, or through the RootsMagic WebHints feature.  I add useful information to my RootsMagic tree.  I also do searches on Findmypast to catch every record for a person.

5)  American Ancestors uses RootsFinder as their online family tree, and I uploaded a GEDCOM file there a year ago.  However, my family tree takes up 1.5 GB when I access it, and slows my desktop computer significantly.  It provides Record Hints but I rarely search for them.   I also do searches on American Ancestors to catch every record for a person.

6) I have an ancestors-only tree at FamilyTreeDNA which rarely provides any useful matches.

7)  I have an ancestors-only tree at GEDmatch which rarely provides any useful matches.

8) I match profiles in FamilySearch Family Tree (a collaborative tree) with profiles in my RootsMagic tree, or create new FamilySearch Family Tree profiles from RootsMagic profiles, and share information both ways, including sources and notes.


*  FamilySearch provides record matches for person profiles, which I can access from the tree profile or from the FamilySearch WebHint in RootsMagic.  When I find useful record matches on FamilySearch, I add the information to my RootsMagic tree.   I also do searches on FamilySearch to catch every record for a person.

9)  I have added information for most of my ancestral families to Geni.com (a collaborative tree) over time, but it takes time to add information there with or without a GEDCOM.  

10)  I have added information for most of my ancestral families to WikiTree (a collaborative tree) over time, but it takes time to add information with or without a GEDCOM.

=====================================

What genealogy software do you use, and why do you prefer it?

What online family tree(s) do you use, and how do you keep them up to date?

=============================================


The URL for this post is: 


Copyright (c) 2020, Randall J. Seaver

Please comment on this post on the website by clicking the URL above and then the “Comments” link at the bottom of each post. Share it on Twitter, Facebook, or Pinterest using the icons below. Or contact me by email at randy.seaver@gmail.com


MyHeritage Adds Huge Collection of Historical U.S. City Directories

The following announcement was written by MyHeritage:

We are pleased to announce the publication of a huge collection of historical U.S. city directories — an effort that has been two years in the making. The collection was produced exclusively by MyHeritage from 25,000 public U.S. city directories published between 1860 and 1960. It comprises 545 million aggregated records that have been consolidated from 1.3 billion records, many of which included similar entries for the same individual. This addition brings the total number of historical records on MyHeritage to 11.9 billion records.

Search the U.S. City Directories

The new city directories collection on MyHeritage is a rich source of information for anyone seeking to learn more about their family in the United States in the mid-19th to mid-20th century. The directories contain valuable insights on everyday American life spanning the time period from the Civil War to the Civil Rights Movement.

What are City Directories?

Cities in the United States have been producing and distributing directories since the 1700s as an up-to-date resource to help residents find local individuals and businesses. City directories typically list names (and spouses), addresses, occupations, and workplaces. Sometimes they include additional information.

Example: pages from the 1888 Nashville City Directory

Thanks to their level of detail, city directories can provide a viable alternative to U.S. census records during non-census years, as federal censuses are taken once every ten years, and in many cases city directories were published annually. They can also fill in the gaps in situations where census records were lost or destroyed. In 1921, a fire at the U.S. Department of Commerce destroyed most of the records from the 1890 census. Despite the loss of the records in the fire, much of the data can be reconstructed using the 1890 city directories on MyHeritage, which consist of directory books from 344 cities across the country, including 88 of the 100 most populated cities during that year.

Unique processing by MyHeritage

The city directories in this collection were published by thousands of cities and towns all over the U.S., and each directory is formatted differently. The huge amount of content and its variety made the project more challenging and required the development of special technology to process the city directories.

We first used Optical Character Recognition (OCR) to convert the scanned images of the directories into text. This process can result in errors in the output, and we created algorithms to detect and correct some of these errors.

Then, we needed to parse the records to identify the different fields in each record: names, occupations, addresses, and more. The differences in formatting between the books presented an additional challenge. Our team employed methods such as Named Entity Recognition (NER) and Conditional Random Fields (CRF) to train an algorithm using a per-book model, meaning that for each of the 25,000 books, we manually labeled a sample of the records and used it to train the algorithm how to parse that directory. Using this model, the algorithm was able to parse the entire book into a structured index of valuable historical information.

In the example below of a city directory record for Ralph McPherran Kiner, an American Major League Baseball player and broadcaster, we see how our system overcame and corrected an OCR error. The incorrect address in the 1957 record is 55801 Yorkshire av, whereas the 1958 and 1960 records list the address as h5801 Yorkshire av, and the “h” implies that Ralph is the homeowner. We inferred that the first “5” in the first record was an OCR error and should actually be an “h”, and were therefore able to determine that Ralph lived at the same address during these years.

Example of a record with an OCR error that was overcome
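
As a rough illustration of this kind of correction (not MyHeritage's actual code), the Python sketch below repairs an address token whose leading status letter was misread as a digit. The function name `fix_status_prefix` and the confusion table are assumptions made for this example; only the "5" for "h" pair comes from the Kiner record above.

```python
from collections import Counter

# Assumed OCR confusions: digit produced by OCR -> status letter it may stand for.
# Only the "5" -> "h" pair is taken from the example above.
CONFUSIONS = {"5": "h"}

def fix_status_prefix(tokens_by_year):
    """Repair address tokens whose leading status letter was misread as a digit.

    tokens_by_year maps a directory year to the raw address token,
    e.g. {1958: "h5801"}. A token is corrected only when swapping its first
    character produces a token that another year's record already contains.
    """
    counts = Counter(tokens_by_year.values())
    fixed = {}
    for year, token in tokens_by_year.items():
        first = token[:1]
        candidate = CONFUSIONS.get(first, "") + token[1:]
        if first in CONFUSIONS and counts[candidate] > 0:
            fixed[year] = candidate  # supported by another edition
        else:
            fixed[year] = token      # leave unchanged
    return fixed

records = {1957: "55801", 1958: "h5801", 1960: "h5801"}
print(fix_status_prefix(records))
```

Requiring a matching token from another year keeps the rule conservative: a lone "55801" with no supporting record is left alone rather than guessed at.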

Consolidating records and creating a searchable index

After all the information was parsed, we consolidated the records in an unprecedented way. We identified records thought to describe the same individual who lived at one particular address over several years, as published in multiple editions of the city directories. We then consolidated all of those entries into one aggregated record that covers a span of years. This reduced “search engine pollution,” wherein a search for a person would have returned multiple, very similar entries from successive years, obscuring other records. The aggregation makes it easier to spot career changes, approximate marriage dates, re-marriages, and plausible death dates. To our knowledge, the algorithmic deduction of marriage and death events from city directories is unique to MyHeritage.

In the example below, we consolidated 31(!) records from the years 1912–1959 into a single record. Based on the information collected over the years, it is likely that Alfred and Mary Albert married circa 1914. We were also able to determine that Alfred died circa 1959.

Example of a consolidated record

The aggregated record also shows that Alfred changed his profession several times during these years, and he went from being a conductor to a carpenter to a motorman.

This is the power of consolidation: it converts many “dull” records into a single, rich biography that tells a life story!
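
The consolidation idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the production algorithm: it matches people by exact name and address (a real system would need fuzzy matching), and the field names and the sample address are invented for the example.

```python
from collections import defaultdict

def consolidate(entries):
    """Merge yearly directory entries that share the same name and address.

    entries is a list of dicts with keys: name, address, year, occupation.
    Returns one aggregated record per (name, address) pair.
    """
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry["name"], entry["address"])].append(entry)

    aggregated = []
    for (name, address), group in groups.items():
        group.sort(key=lambda e: e["year"])
        occupations = []
        for entry in group:
            # Record an occupation only when it differs from the previous one.
            if not occupations or occupations[-1] != entry["occupation"]:
                occupations.append(entry["occupation"])
        aggregated.append({
            "name": name,
            "address": address,
            "first_year": group[0]["year"],
            "last_year": group[-1]["year"],
            "source_records": len(group),
            "occupations": occupations,
        })
    return aggregated

# Invented sample data; the address is a placeholder.
sample = [
    {"name": "Alfred Albert", "address": "12 Example St", "year": 1912, "occupation": "conductor"},
    {"name": "Alfred Albert", "address": "12 Example St", "year": 1913, "occupation": "conductor"},
    {"name": "Alfred Albert", "address": "12 Example St", "year": 1920, "occupation": "carpenter"},
]
print(consolidate(sample))
```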

Examples of challenging problems – and how we solved them

Multiple entries

Many published city directories saved typesetting (which was expensive) and paper by using a symbol to indicate that multiple entries had the same last name, such as ditto marks or dashes. Some entries continued onto a second line, while others occupied only one. The algorithm had to understand the difference between surname text and the text that often appears directly below it.

For instance, in the example below, the record extraction algorithm successfully inferred that Bartsch is a surname and that the ditto mark in the next line also means Bartsch.

Record extraction algorithm infers surnames from ditto marks
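
A minimal Python sketch of carrying a surname across ditto marks follows. It is only an illustration: it assumes the surname (or the ditto mark) is the first whitespace-delimited token on each line, the set of ditto symbols is a guess, and the given names in the sample are invented.

```python
# Symbols assumed to mean "same surname as the entry above".
DITTO_MARKS = {'"', "''", "--", "-"}

def resolve_surnames(lines):
    """Return (surname, rest_of_entry) pairs, carrying surnames across ditto marks.

    Assumes the surname or a ditto mark is the first whitespace-delimited token.
    """
    resolved = []
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        first, _, rest = line.partition(" ")
        if first in DITTO_MARKS:
            if current is None:
                raise ValueError("ditto mark found before any surname")
            surname = current
        else:
            surname = current = first
        resolved.append((surname, rest))
    return resolved

sample = ["Bartsch Albert lab", '" Anna wid']  # given names are invented
for surname, rest in resolve_surnames(sample):
    print(surname, rest)
```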

The algorithm also determines where a record begins and where it ends. For example, the record below spans one line:

This record, however, spans two lines:

If the algorithm hadn’t inferred this, we would have created an additional record for “Waller” and missed identifying it as the street name in the record about Wm F. While this process works very well, there are still some directories in which this type of record extraction is not 100% robust.

Abbreviations

A table of common abbreviations appears at the beginning of each city directory, listing abbreviations for names, occupations, residence status, and addresses that are used throughout the book. The records are often hard to decipher without the use of the abbreviation tables.

Abbreviation table from the 1931-1932 Jacksonville City Directory

To integrate the abbreviation tables into the collection, we manually keyed in the table from each book and used it to expand the abbreviations in the records.

Our handling of first name abbreviations in this collection is particularly helpful, because if you’re searching for a “Patrick”, we’ll find him for you even in records where he’s listed as “Patk”, so that you won’t have to think about all the possible ways to search for each name – we’ve got you covered!

In the following example, we’ve expanded the abbreviations for the occupation sten to stenographer, clk to clerk, the workplace Fla Natl Bank to Florida National Bank, and residence status r to rents. This improves readability and enables searching and matching to family trees with much higher accuracy.

Example of expanding abbreviations within a record
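
Abbreviation expansion of this kind amounts to a table lookup. The Python sketch below is an illustration only: the table holds the abbreviations mentioned above, splitting "Fla Natl" into two separate entries is an assumption, and a real table would be keyed in per book.

```python
import re

# Entries drawn from the examples above; a real table would come from each book.
ABBREVIATIONS = {
    "sten": "stenographer",
    "clk": "clerk",
    "Fla": "Florida",
    "Natl": "National",
    "r": "rents",
    "Patk": "Patrick",
}

def expand(text, table=ABBREVIATIONS):
    """Replace each whole-word abbreviation in text with its expansion."""
    return re.sub(r"[A-Za-z]+", lambda m: table.get(m.group(0), m.group(0)), text)

print(expand("clk Fla Natl Bank r"))
```

A context-free lookup like this is too blunt for real records, since a lone "r" could also be part of an address. Expanding each field (occupation, workplace, residence status) against its own section of the table would be safer.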

Important insights from the collection

Inferred life events

Consolidated city directory records enabled MyHeritage to automatically infer dates of marriage or death based on changes in the record data.

In the example below, Henry Bennett from Oakland, California most probably got married in late 1923 or early 1924, and the Oakland City Directory from 1924 lists Nancy as his wife. We therefore created a marriage event with Nancy, dated circa 1924 and clearly marked as implicit.

Example of an inferred marriage date

In the example below, Matthew and Sally Lewin are listed as spouses and reside together at 305 New Scotland Ave in Albany, New York until 1945. In the 1946 listing Sally appears as widowed, so we inferred that Matthew died circa 1946.

Example of inferred death date
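
The two inferences above can be sketched as simple comparisons between consecutive listings. This Python example is a hedged illustration rather than MyHeritage's method: the record layout (`year`, `spouse`, `widowed`) is hypothetical, and the 1923 listing without a spouse for Henry is assumed for the sake of the example.

```python
def infer_events(listings):
    """Infer approximate marriage and spouse-death years from yearly listings.

    Each listing is a dict with keys:
      year    - directory year
      spouse  - spouse name, or None when no spouse is listed
      widowed - True when the listed person is recorded as widowed
    """
    events = []
    previous = None
    for current in sorted(listings, key=lambda l: l["year"]):
        if previous is not None:
            gained_spouse = previous["spouse"] is None and current["spouse"] is not None
            if gained_spouse and not current["widowed"]:
                events.append(("marriage", current["year"], current["spouse"]))
            if not previous["widowed"] and current["widowed"]:
                events.append(("spouse_death", current["year"], current["spouse"]))
        previous = current
    return events

henry = [
    {"year": 1923, "spouse": None, "widowed": False},   # assumed earlier listing
    {"year": 1924, "spouse": "Nancy", "widowed": False},
]
sally = [
    {"year": 1945, "spouse": "Matthew", "widowed": False},
    {"year": 1946, "spouse": "Matthew", "widowed": True},
]
print(infer_events(henry))
print(infer_events(sally))
```

The years returned are "circa" dates: the event happened some time between the two directory editions being compared.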

Change in homeowner status

Throughout the records we can see if the person living at any address was a renter, denoted by an “r” in most records, if they were a boarder, denoted by a “b”, or if they were the homeowner, denoted by an “h”.

By following a consolidated record over the years, we could see if someone changed from renting to owning their home at the same address.

In this example, we see that James Thompson was a renter until 1921. Sometime between 1921 and 1923 he became the owner of his residence.

Example of change in homeowner status
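
Spotting such a change is a matter of comparing the status letter between consecutive editions. The Python sketch below illustrates the idea; the function name and the input layout (a mapping from year to status letter) are assumptions for the example.

```python
# Status letters as described above.
STATUS = {"r": "renter", "b": "boarder", "h": "homeowner"}

def status_changes(status_by_year):
    """Return (previous_year, year, old_status, new_status) for each change."""
    changes = []
    years = sorted(status_by_year)
    for prev, cur in zip(years, years[1:]):
        old, new = status_by_year[prev], status_by_year[cur]
        if old != new:
            changes.append((prev, cur, STATUS[old], STATUS[new]))
    return changes

# James Thompson: listed as a renter in 1921 and as the homeowner in 1923.
print(status_changes({1921: "r", 1923: "h"}))
```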

Finding others who lived at the same address

The city directories collection allows users to see who else has lived at the same address. Simply click on “See who else lived at this address” in the record page to run a search by address.

This feature can be useful for locating ancestors, descendants, or other family members of the person you are researching who lived at the same address in other periods. Often multiple generations of a family lived at the same address, or a family home may have been passed on from one generation to the next.

In the following example, James and Glenna Japhet lived at 623 W Olmos Drive in San Antonio, Texas.

Example record from San Antonio, Texas, 1948

When checking to see who else lived at the same address in city directory records, we see that aside from James and Glenna, another person with the last name Japhet is also listed in the directories as having lived at that address: a woman named Laverne Japhet.

Results showing others who have lived at same address

It seems as if Laverne is either James’ second wife or the same person as “Glenna L”. This opens new avenues for more research.
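
A search by address can be pictured as a lookup in an index keyed by a normalized address. The Python sketch below is illustrative only: the function names are hypothetical, and the normalization (lowercasing and collapsing whitespace) is a stand-in for whatever address matching the site really performs.

```python
from collections import defaultdict

def normalize(address):
    """Lowercase and collapse whitespace so minor variants compare equal."""
    return " ".join(address.lower().split())

def build_address_index(records):
    """Map each normalized address to the set of names listed there."""
    index = defaultdict(set)
    for name, address in records:
        index[normalize(address)].add(name)
    return index

def who_else_lived_at(index, address, exclude=()):
    """Return the other names recorded at an address, sorted alphabetically."""
    return sorted(index.get(normalize(address), set()) - set(exclude))

records = [
    ("James Japhet", "623 W Olmos Drive"),
    ("Glenna Japhet", "623 W Olmos Drive"),
    ("Laverne Japhet", "623 W  Olmos drive"),
]
index = build_address_index(records)
print(who_else_lived_at(index, "623 W Olmos Drive", exclude=["James Japhet", "Glenna Japhet"]))
```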

Cost

Searching the U.S. City Directories is free, but a subscription is required to view the records.

Users with a Data or Complete subscription can view the full records including the high-resolution scans of the original directories, confirm Record Matches, extract information from the record straight to their family trees, and view Related Records for the person appearing in a historical record they are currently viewing.

Summary

The U.S. City Directories collection on MyHeritage is a treasure trove for anyone searching for more information about their ancestors in the United States. We have worked very hard to prepare this collection for our users, and believe it is the smartest online U.S. city directory collection ever made. Over the next few months, we are planning to expand this important collection even further by publishing thousands of additional city directories. This addition will include directories from more cities, and directories published prior to 1860 and after 1960.

Search the U.S. City Directories now

Enjoy!