Notes on Industry, Occupation (Occup), Ancestry (Ancstry), Race, Language (Lang2) and Place of Birth (Pob)
The Census Bureau documents the above six fields differently from the others,
presumably because there are so many possible code values for each. It
has published appendices that contain the encodings.
In some cases, the encodings are at multiple levels. In Occupation, for example,
000-202 are "Managerial and Professional", which is then broken down
into sublevels, and the sublevels are in turn broken down into the lowest levels.
This requires us to choose the encoding level that we want to use as a default,
but our customers may choose to encode it any way they want to. We have
taken the lowest level categories for our default encoding.
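A minimal sketch of how a multi-level encoding like this could be looked up. The two ranges are the ones mentioned in these notes; the level names ("top", "lowest") and the function names are assumptions made for illustration, not part of the Census Bureau documentation or of our tabulation system.

```python
# Each entry: (level name, first code, last code, label).
# The ranges come from the notes; the level names are hypothetical.
OCCUPATION_LEVELS = [
    ("top", 0, 202, "Managerial and Professional"),
    ("lowest", 49, 52, "Nuclear engineers"),
]

def label_at_level(code, level, table=OCCUPATION_LEVELS):
    """Return the label whose range at the given level contains code, or None."""
    for name, first, last, label in table:
        if name == level and first <= code <= last:
            return label
    return None

def default_label(code):
    """Default encoding: use the lowest-level category."""
    return label_at_level(code, "lowest")
```

A customer who prefers a coarser encoding would call label_at_level(code, "top") instead of default_label(code).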
In some cases, the encodings have ranges that are not further broken
down, such as 049-052, which corresponds to nuclear engineers. In
cases like this, we wondered whether there was any difference between 49, 50, 51,
and 52, so we ran some analyses to find out (below).
In some cases, the opposite situation is true, where multiple labels are
given to the same code. For example, in Ancestry, the code 005 is given to
BASQUE, Euskalduna, and Euzkadi. In cases like this, we had to choose one
label to go with the code. Except where obvious errors were found (more on
that below), we chose the first label given for any particular code. Of
course the customer can relabel them as they wish, just as they can recode.
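The "first label wins" rule can be sketched as follows. The Basque rows are the example from the text; the function name and the list-of-pairs input format are assumptions made for illustration.

```python
def first_label_per_code(pairs):
    """Keep the first label listed for each code; later labels are ignored."""
    chosen = {}
    for code, label in pairs:
        chosen.setdefault(code, label)
    return chosen

# Rows in the order they appear in the appendix (example from the text).
appendix_rows = [
    ("005", "BASQUE"),
    ("005", "Euskalduna"),
    ("005", "Euzkadi"),
]
labels = first_label_per_code(appendix_rows)
```

Relabeling is then a plain overwrite of the chosen entry, for example labels["005"] = "Euskalduna".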
Wherever codes for apprentices are given (in Occupation), there is a
high-level range for the profession, and a portion of that range is assigned to
the apprentice. The remaining portion of the range is not specified.
It would seem that the unspecified portion belongs to the professional and
the specified portion to the apprentice. But we'd rather not make
assumptions, especially when we have tools that can provide answers.
In order to address our concerns (Are the ranges at the lowest level
meaningful? Did we introduce any errors when we picked out the lowest
encodings? Are there errors in the documentation? What's going on
with the apprentices?), and also to show our chosen default encodings to the
customer, we have run nationwide tabulations for each of the above 6
fields.
For each field, we accumulated weighted person count (Pwgt1), unweighted
person count (People), and weighted household count at the person level (PHouseholds).
We dimensionalized each based upon both the numeric codes and the labels.
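A rough sketch of this kind of accumulation, assuming each person record is a dictionary that carries the field's code and a person weight under a "weight" key. The record layout, the key names, and the function name are hypothetical. PHouseholds is left out because these notes do not say how it is derived.

```python
from collections import defaultdict

def tabulate(records, field):
    """Accumulate weighted (Pwgt1) and unweighted (People) counts per code."""
    totals = defaultdict(lambda: {"Pwgt1": 0, "People": 0})
    for rec in records:
        cell = totals[rec[field]]
        cell["Pwgt1"] += rec["weight"]  # weighted person count
        cell["People"] += 1             # unweighted person count
    return dict(totals)

# Made-up records, for illustration only.
sample = [
    {"industry": 10, "weight": 20},
    {"industry": 10, "weight": 25},
    {"industry": 0, "weight": 18},
]
result = tabulate(sample, "industry")
```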
The results allow us to:
- Verify that our labels match the proper codes
- See if ranges within a low level label have meaning (i.e. Is more than one
number in the range used?)
- Explore potential errors in the Census Bureau documentation
- Look into the "apprentice situation".
- Have useful tabulations for future reference.
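The range check in the list above can be sketched like this, assuming the tabulation has been reduced to a dictionary that maps each numeric code to its count. The counts shown are made up for illustration, and the function name is hypothetical.

```python
def used_codes_in_range(counts, first, last):
    """Return the codes in [first, last] that have a nonzero count."""
    return [c for c in range(first, last + 1) if counts.get(c, 0) > 0]

# Made-up counts, for illustration only.
counts = {49: 120, 53: 800}
used = used_codes_in_range(counts, 49, 52)
```

If the returned list has more than one entry, the individual numbers in that range may carry meaning and deserve a closer look.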
Here is what we've found:
- Our labels match properly.
- The ranges (at the lowest level) do not have meaning. Only one
number in each range is actually used.
- There are a few errors in the Census Bureau documentation.
Errors:
- The "industry" documentation specifies that the range 0-10
refers to "Agricultural production, crops". There are nearly
100,000,000 people coded as '0', and nearly 2,000,000 people coded as
'10'. We suspect that the people coded as '0' do not have an industry
associated with them. Given that this is well over 1/3 of the nation's
population, we suspect that they are mostly children and retired
people. But just to check, we will run an analysis to find out what
the income of these '0' coded people is. The documentation should read
that people coded as '10' are involved in agricultural crop production.
- The Ancestry encodings were corrupted when we received them, so we looked
at another source. There were a few errors in that source also.
"BELOURUSSIAN" was coded as '02', when it should have been
'102'. "Webel Druze" (under Syrian) was coded as '329', when it
should have been '429'. Both "West German" and
"GREEK" were coded as '45'. Some of the encodings are not in
numeric order, which is inconsistent with all of the other Census Bureau
files. As a general rule, we have found errors from the Census Bureau
to be few and far between. But we are not comfortable with this file,
so we ran the regular tabulation (labels & numeric codes) to check our
own work, then ran another tabulation which cross-references the data
by Ancestry and Place of Birth. We expect to see a high
incidence of people born in a country which reflects their ancestry.
This settles the GREEK/West German question (45 is Greek), and also provides a resource to identify any other
hidden errors (one was found; it is described here).
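The Ancestry by Place of Birth cross-reference can be sketched as below: for each ancestry code, find the birthplace with the largest weighted count. The tuple layout, the function name, and the sample weights are assumptions made for illustration; they are not actual tabulation results.

```python
from collections import Counter, defaultdict

def top_birthplace_by_ancestry(records):
    """Map each ancestry code to its most common birthplace, by weight."""
    by_code = defaultdict(Counter)
    for ancestry_code, birthplace, weight in records:
        by_code[ancestry_code][birthplace] += weight
    return {code: c.most_common(1)[0][0] for code, c in by_code.items()}

# Made-up (ancestry code, place of birth, weight) rows, for illustration only.
rows = [
    ("45", "Greece", 900),
    ("45", "Germany", 40),
    ("45", "Greece", 300),
]
top = top_birthplace_by_ancestry(rows)
```

In real data many people are born outside the country their ancestry points to, so a result like this is a hint for resolving an ambiguous label rather than proof.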
Copyright © Innovative Computing, Inc. 2002