SUMMIT COUNTY CHAPTER
of the Ohio Genealogical Society
P O Box 2232 Akron OH 44309-2232 
e-mail:  summitogs@yahoo.com

Using the Soundex Coding System

   A soundex code is a four-character representation of a name, based on the way the name sounds rather than the way it is spelled. In theory, this system lets you index a name so that it can be found no matter how it was spelled.

Census indexes

In the 1930s the WPA used the soundex coding system to index census records on 3x5 cards: a partial index of the 1880 census (all households with a child age 10 or younger) and the 1900 census, and a nearly full index of the 1910 census (not all states completed) and the 1920 census.

The soundex indexes of the 1880, 1900, 1910 and 1920 census records are available on microfilm at the National Archives (and its branches) and many libraries or other archives. These microfilms also can be purchased or rented from the National Archives or borrowed through Family History Centers. The names are arranged on the soundex indexes by first letter, then numerically within that letter, then alphabetically by the first name of the head of household within each different soundex code. There is usually a separate card for each individual within the household whose surname is different from that of the head of household.
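
The filing order described above can be sketched in a few lines of Python. This is only an illustration: the cards and given names below are made up, and the function name index_order is an assumption, not part of any archive's system.

```python
# Hypothetical index cards: (soundex code, given name of head of household).
cards = [
    ("S530", "William"),
    ("J525", "Mary"),
    ("S530", "Anna"),
    ("J250", "Thomas"),
    ("B525", "Henry"),
]

def index_order(card):
    """Sort key: first letter, then the number, then the given name."""
    code, given_name = card
    return (code[0], int(code[1:]), given_name.upper())

sorted_cards = sorted(cards, key=index_order)
for code, given_name in sorted_cards:
    print(code, given_name)
```

When browsing a film, this means all the S530 cards sit together, with Anna filed ahead of William.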

Besides telling where the original record can be found, the microfilmed soundex cards usually give basic information about each person in the household, such as place of residence, age, sex, relationship to head of household, state born, state where parents were born, etc. However, not all of the information contained in the original census records is included.

Figuring the code

Every soundex code consists of a letter and three numbers, such as
B525. The letter is always the first letter of the surname. The numbers
are assigned this way:

                 1  =  b,p,f,v      2  =  c,s,k,g,j,q,x,z
                 3  =  d,t          4  =  l
                 5  =  m,n          6  =  r
                 disregard  -  a,e,i,o,u,w,y,h

To figure out a surname's code, do this:           JOHNSON
   - Keep the first letter, then eliminate
     any a,e,i,o,u,w,y,h that follow it            JNSN
   - Write the first letter, as is, followed
     by the codes found in the table above         JNSN = J525

No matter how long or short the surname is, the soundex code is always the first letter of the name followed by three numbers. If you have coded the first letter and three numbers but still have more letters in the name, ignore them. If you have run out of letters in the name before you have three numbers, then add zeroes to the code:

            WASHINGTON = WSNGTN = W252 (ignore the ending TN)
            KUHNE      = KN     = K500 (add zeroes to the end)
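
The basic steps so far (drop the disregarded letters, look up the numbers, then cut or pad to three digits) can be sketched in Python as follows. The names GROUPS, CODES and basic_soundex are chosen here for illustration. This sketch deliberately leaves out the handling of double letters and side by side letters with the same value, so it is only reliable for names that have neither.

```python
# Letter values from the table above; letters not listed are disregarded.
GROUPS = {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def basic_soundex(surname):
    """Basic coding only; assumes a non-empty surname."""
    name = surname.upper()
    first, rest = name[0], name[1:]
    # Skip a, e, i, o, u, w, y, h (and anything else not in the table).
    digits = "".join(CODES[ch] for ch in rest if ch in CODES)
    # Pad with zeroes, then keep the letter and the first three numbers.
    return (first + digits + "000")[:4]

for name in ("JOHNSON", "WASHINGTON", "KUHNE"):
    print(name, basic_soundex(name))
```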

Prefixes

If you have a surname with a prefix like Van, Von, De, Di, or Le, code
it with and without the prefix because it may be listed under either
code.  Van Hoesen could be coded as VanHoesen or as Hoesen. Mac and Mc
are NOT considered prefixes.
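
As a rough sketch of the prefix advice, the Python below returns both codes for a surname written with a separate prefix word. The helper names soundex and codes_for are hypothetical. The sketch assumes the prefix is separated from the rest of the name by a space, and it treats only letters directly side by side as duplicates.

```python
GROUPS = {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}
PREFIXES = {"VAN", "VON", "DE", "DI", "LE"}   # Mac and Mc are deliberately absent

def soundex(surname):
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        value = CODES.get(ch)            # None for disregarded letters
        if value is not None and value != previous:
            digits.append(value)
        previous = value
    return (letters[0] + "".join(digits) + "000")[:4]

def codes_for(surname):
    """Code the full name, plus the name without a leading prefix word."""
    words = surname.upper().split()
    codes = [soundex("".join(words))]
    if len(words) > 1 and words[0] in PREFIXES:
        codes.append(soundex("".join(words[1:])))
    return codes

print(codes_for("Van Hoesen"))
print(codes_for("McDonald"))
```

For Van Hoesen this yields one code beginning with V and one beginning with H, so both places in the index can be checked.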

Double letters

Any double letters side by side should be treated as one letter. For
example LLOYD is coded as if it were spelled LOYD. GUTIERREZ is coded
as if it were GUTIEREZ.

Side by side letters with the same value

You may have different letters side by side that have the same code
value. For example PFISTER (P and F are both 1), JACKSON (C, K and S
are all 2). These letters should be treated as one letter. PFISTER is
coded as PSTR (P236) and JACKSON is coded as JCN (J250).

Thus, variations in spelling, or misspellings, should produce the same
code number:

                    SMITH = S530        SMITHE = S530
                    SMYTH = S530        SMYTHE = S530
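
Putting the rules together, here is a minimal Python sketch of the whole procedure, including double letters and side by side letters with the same value. The function name soundex is an assumption for illustration. One further assumption: letters separated by a disregarded letter are coded separately, which the text above does not spell out either way.

```python
GROUPS = {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(surname):
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    # The first letter's own value counts, so the F in PFISTER is dropped.
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        value = CODES.get(ch)            # None for a, e, i, o, u, w, y, h
        if value is not None and value != previous:
            digits.append(value)
        previous = value
    # Pad with zeroes, then keep the letter and the first three numbers.
    return (letters[0] + "".join(digits) + "000")[:4]

for name in ("SMITH", "SMITHE", "SMYTH", "SMYTHE", "LLOYD", "PFISTER", "JACKSON"):
    print(name, soundex(name))
```

All four spellings of SMITH should print the same code, which is the point of the system.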

Other variations

Note, however, that some names which are pronounced essentially the
same produce different codes. An example is the "tz" sound in German
names, which is normally pronounced the same as "ce" or "se." Also, the
German "B" is often pronounced as the English "P." Thus the German name
Bentz could be spelled that way or as Benz, Bens, Bents, Bennss, Bense,
Bants and Banz, or as Penz, Pentz, Pence, Pens, Pense, Pents, Penns,
Penze, Pentze, etc. Indeed, it has been found in census record indexes
under all of these - and more. Remember: Those making the index have as
hard a time reading the handwriting of census takers as we do. They
will sometimes mistake a script "z" for a "y" and record Penty instead
of Pentz, or mistake a "c" for an "e" and record Penee, for example.

Therefore, to make sure you don't miss finding your ancestor, you may
have to look under a half dozen or more different soundex codes if you
are searching for the name PENCE (soundex code P520):

   BENTZ (and equivalents) = B532       PENTZ (and equivalents) = P532
   BENZ  (and equivalents) = B520       PENZ  (and equivalents) = P520
   BENTY (and equivalents) = B530       PENTY (and equivalents) = P530
   PENEE                   = P500
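
To build a search list like the one above, the variant spellings can be grouped by code. The following Python is a sketch under the same assumptions as the coding rules in this handout (only letters directly side by side are merged); the names soundex and by_code are illustrative, and the variant list is taken from the spellings mentioned in the text.

```python
from collections import defaultdict

GROUPS = {"1": "BPFV", "2": "CSKGJQXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(surname):
    letters = [ch for ch in surname.upper() if ch.isalpha()]
    if not letters:
        return ""
    digits = []
    previous = CODES.get(letters[0])
    for ch in letters[1:]:
        value = CODES.get(ch)            # None for disregarded letters
        if value is not None and value != previous:
            digits.append(value)
        previous = value
    return (letters[0] + "".join(digits) + "000")[:4]

variants = ["Bentz", "Benz", "Bens", "Bents", "Bennss", "Bense", "Benty",
            "Pentz", "Penz", "Pence", "Pens", "Penns", "Penty", "Penee"]

# Collect every spelling under the code where it would be filed.
by_code = defaultdict(list)
for name in variants:
    by_code[soundex(name)].append(name)

for code in sorted(by_code):
    print(code, ", ".join(by_code[code]))
```

Each printed line is one place in the index to check.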

Think through the possible variant spellings (and misspellings and misreadings) of the surname you are searching before concluding that it can't be found in the soundex listings. Use your imagination. No mistake is beyond possibility! For instance, the name Pence has been indexed as Peirce (the reader mistook the written letter "n" for an "i-r" combination) and vice versa.


  Provided by: SUMMIT COUNTY CHAPTER, OGS
        P O Box 2232
       Akron OH 44309-2232
       e-mail: SummitOGS@ald.net
 

Last modified March 24, 2005
                Copyright ©2000 Summit County Chapter OH Genealogical Society. All rights reserved.