Soundex Makes Surnames Common for Today’s Genealogist
A high-tech algorithm called the Soundex Code indexes many genealogy records. Well, it was high tech in 1918 when Robert Russell invented it.
In a nutshell, Soundex codes provide a means of identifying words, especially names, by the way they sound. Works Progress Administration (WPA) crews working in the 1930s to organize the Federal Census data from 1880 to 1920 used them extensively. Soundex has also been used for many state and local census records and is popular in genealogy software and databases.
In the days when enumerators collected nearly all census data by walking from door to door, many of them spelled surnames phonetically. Thus, one might write "Smith" while another wrote "Smyth" and still another "Smythe." The census records were therefore indexed by the sound of each name rather than by its spelling, and Soundex was the code system used to organize this index.
Is Soundex Really Needed?
If you search many records of interest to genealogists, eventually you will need to use Soundex Codes. Why? Well, you can often find a person's entry by his or her Soundex Code even when the name has been misspelled. This becomes important when you realize that many census takers did not speak the language of the people being enumerated. In fact, during the first 150 years of US census records, many Americans could not read or write and did not know how to spell their own last names.
The spelling of many family names also has changed over the years, but often the Soundex Code remains the same.
The spelling of names varies widely in historic records, especially when language difficulties have intervened. For instance, I could not find my French-speaking great-grandparents listed in the U.S. Census. I searched and searched but found no entries for Joseph and Sophie Theriault.
I did a Soundex search. The Soundex code for Theriault is T-643. I found several entries for T-643 in Ashland, Maine, including one for the family of Joseph and Sophia Tahrihult — improperly spelled, but with the same Soundex code.
The census taker had a Scottish name, and another census page for the same town lists him as born in Scotland. I guess that he did not speak French. I wonder whether he had some difficulty speaking with my great-grandparents. Neither of them spoke English, nor could they read or write it.
No wonder Theriault became Tahrihult!
Learning the Soundex Code
The Soundex Code is easy to learn, although I still use a small reference card when I visit the archives to look at records. Every Soundex code comprises a letter and three numbers, such as W-252. The letter is always the first letter of the surname, and the hyphen is optional. The numbers are assigned to the remaining letters of the surname according to the Soundex guide shown below.
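Here is the Soundex coding guide to create a four-character code:
- 1: B, F, P, V
- 2: C, G, J, K, Q, S, X, Z
- 3: D, T
- 4: L
- 5: M, N
- 6: R
- Disregard the letters A, E, I, O, U, H, W, and Y.
- Code double letters, and side-by-side letters that share a number, only once.
- If the name yields fewer than three numbers, add zeros to the end. Ignore any letters left over after the third number.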
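The coding rules can also be sketched in a few lines of Python. This is only a minimal sketch of the commonly published American Soundex rules (letters grouped under the numbers 1 through 6, vowels and H, W, and Y disregarded, repeated numbers coded once, zeros added to reach three numbers). It is not the code used by any particular genealogy program, and the names `CODES` and `soundex` are my own. The sketch assumes that accented letters and punctuation are simply skipped.

```python
# Letter groups from the commonly published Soundex guide.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(surname: str) -> str:
    """Return a code such as 'T-643' for a surname (illustrative sketch)."""
    # Keep only the plain letters A-Z; this is an assumption of the sketch.
    letters = [ch for ch in surname.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's number still counts for repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a number
        code = CODES.get(ch, "")  # vowels and Y have no number
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated number after it is coded again
    # Keep three numbers, adding zeros if the name ran out early.
    return first + "-" + "".join(digits)[:3].ljust(3, "0")


for name in ("Theriault", "Tahrihult", "Smith", "Smyth", "Smythe"):
    print(name, soundex(name))
# Theriault T-643
# Tahrihult T-643
# Smith S-530
# Smyth S-530
# Smythe S-530
```

Both spellings of my great-grandparents' surname land on T-643, which is why the Soundex index found them when a search by spelling did not.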
Shortcomings of the Soundex Code
If a surname has a prefix, such as Van, Con, De, Di, La, or Le, the code should ignore the prefix. However, coders sometimes missed this rule, so a name might have been coded either with or without its prefix. Because the surname might be listed under either code, a thorough search of the Soundex index should include both forms.
While Soundex is a great tool, and in widespread use, it certainly is not perfect. It fails when the first letters are different. For instance, Knowles is coded as K-542, while both Noles and Nolles are N-420. Likewise, Cantor is C-536 while the similar-sounding Kantor is K-536. Keep these shortcomings in mind when using this search method.
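To make the "search both forms" advice concrete, here is a small Python sketch that lists the forms of a surname worth coding. It is an illustration only. The helper name `search_forms` is hypothetical, and the sketch assumes the prefix is written as a separate word (as in "Van Buren"), because a name such as Dean or Lambert merely begins with the same letters as a prefix.

```python
# Prefixes named in the text above.
PREFIXES = {"VAN", "CON", "DE", "DI", "LA", "LE"}


def search_forms(surname: str) -> list[str]:
    """Return the surname as written and, if it has a prefix, without it."""
    parts = surname.split()
    forms = ["".join(parts)]  # full name with the spaces removed
    # Assumption: only a separately written first word counts as a prefix.
    if len(parts) > 1 and parts[0].upper() in PREFIXES:
        forms.append("".join(parts[1:]))
    return forms


print(search_forms("Van Buren"))  # ['VanBuren', 'Buren']
print(search_forms("Dean"))       # ['Dean']
```

Each form returned would then be given its own Soundex code and looked up separately in the index.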
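One way to see the first-letter problem is to compare only the numbers in two codes. The Python sketch below does that for the codes quoted above. Comparing the numbers alone is my own illustration, not part of the Soundex system, and the helper names are hypothetical.

```python
def digits_of(code: str) -> str:
    """Return just the numbers from a code such as 'K-536' or 'K536'."""
    return "".join(ch for ch in code if ch.isdigit())


def same_numbers(code_a: str, code_b: str) -> bool:
    """True when two codes differ, at most, in their leading letter."""
    return digits_of(code_a) == digits_of(code_b)


print(same_numbers("C-536", "K-536"))  # True: Cantor and Kantor
print(same_numbers("K-542", "N-420"))  # False: Knowles and Noles
```

Cantor and Kantor are kept apart only by the first letter. Knowles and Noles differ in the numbers as well, because the silent K pushes the N into the numbered part of the code.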
Soundex Is Still Used Today
While Soundex is an older system, you can still find it in modern genealogy applications as a search option or an available tool.
Many genealogy applications, such as Family Tree Maker or RootsMagic, can calculate a Soundex code. Family Tree Maker has a Soundex Calculator available. The following figure shows the steps for finding the calculator in the 2019 version.
Many improved Soundex-style methods have been developed over the years and are in widespread use in computer databases. MyHeritage describes an improved method, Megadex, that it invented and uses in its application.
You can use the Soundex method when searching with Ancestry, as shown in the following figure. Hudson was sometimes captured as Hutson, so the Soundex search can reach further into the records for an answer. Ancestry’s search engine is powerful, and it offers similar algorithms you can use.
While many chide the US Government for a lack of good ideas, the Soundex code has reached through the generations to solve problems for modern genealogists. The algorithm has been improved and adopted in multiple modern applications.
- National Archives Explains the Soundex Indexing System
- The WPA Census Soundexing Projects
- Other Software: Yet Another Soundex Converter (YASC)
We updated this post based on an article in the Fall 2002 issue of Bluegrass Roots magazine. The editor didn’t list the author, so perhaps the editor wrote it or it was shared from another society.