Some Immigrants Have "Melted In"
Some haven't yet.
The USA is a nation of immigrants - even the "Native Americans"
immigrated from Asia long ago. To understand our "melting
pot", we need to count the "melted" separately from the
"non-melted". And there's a trick - using language - to
count them separately. Follow along and you'll
see how we learn a lot from a little tabulation. Depending on what
language and income group you're in, you may find it either very exciting,
or downright scary. But if you find yourself on the scary
side, the tabulation makes the cure pretty obvious, and available to
anyone who wants it.
You can be "Melted In" and still speak other languages.
I'll use my own home to demonstrate why a simple tabulation of
household languages might be really misleading. In my home people usually speak English, but we also speak 5 other
languages. The Census Bureau classifies households based on their most frequently used non-English language.
People reading the data might then think that we don't speak English, when
in fact English is the most common language for us.
While no one cares that my daughter
and I try to speak in Latin, they do care about the number
of households that speak in native languages, which may or may
not be English. They also care about whether or not households can
speak English, how well they speak English, and how their language relates to their income.
Those are the kinds of questions that we'll answer with this tabulation.
[For a list of languages spoken in the US & the
number of people who speak them, see The Package
and the risk-free offer below.]
Many Spanish Speakers Mostly Speak English
While poking around in this data, I found something that really
surprised me. There are a lot of households out there that
speak Spanish without having a single Hispanic person in the house.
The family knows Spanish, but if you eavesdropped on their conversations, you'd
hear mostly English. The same applies to other languages. A
family categorized as speaking Mandarin might speak mostly English - or -
they might speak all Mandarin and no English. We need to bring
another measure into the picture.
Get The True Picture
To understand our melting pot, we need to count households that are fluent in English
separately from those that aren't. It turns out that "Language Spoken At Home"
doesn't help much - all it means is that they can speak that
language, and it tells us nothing about their English ability. To
solve the problem, we check each household to see if people are fluent in
English (they speak it "Well" or "Very Well").
We also check each household to see if people are not fluent in English
(they speak it "Not Well" or "Not At All"). (Only people 5 years old and over
are considered for fluency.) We now
get 3 classifications for each household:
- Language spoken at home
- Presence(1+)/absence(0-0) of people who are not fluent in English
- Presence(1+)/absence(0-0) of people who are fluent in English
This will place each household into 1 of 3 groups (for each language):
- Households where nobody speaks English fluently (1+, 0-0)
- Households where everybody speaks English fluently (0-0, 1+)
- Households where some speak English fluently and some do not (1+, 1+)
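The grouping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_household` and the input format (a list of English-ability answers, one per household member considered for fluency) are hypothetical, not taken from the program that produced the tabulation. The text doesn't say how members who speak only English are counted, so the sketch simply works with whatever answers it is given.

```python
# Hypothetical sketch: classify one household by its members' English ability.
FLUENT = {"Very Well", "Well"}
NOT_FLUENT = {"Not Well", "Not At All"}

def classify_household(abilities):
    """abilities: list of English-ability answers, one per member considered."""
    has_fluent = any(a in FLUENT for a in abilities)
    has_not_fluent = any(a in NOT_FLUENT for a in abilities)
    if has_not_fluent and not has_fluent:
        return "nobody fluent"       # (1+, 0-0)
    if has_fluent and not has_not_fluent:
        return "everybody fluent"    # (0-0, 1+)
    if has_fluent and has_not_fluent:
        return "mixed"               # (1+, 1+)
    return "unclassified"            # no usable answers (assumption: left out)

print(classify_household(["Very Well", "Not Well"]))
```

The two presence/absence flags are computed first and the group follows from them, which mirrors the (not fluent, fluent) pairs shown in the list.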
Digging Into The Data
Get a copy of the tabulation right now using the risk-free offer
described below (along with a bonus) to see what's really going on.
Let's look at the actual numbers. The first line is for English-only speaking
households. You can see how many there are, and what their average
household income is.
The next 3 lines (lines 2-4) are for Spanish speaking households.
We can see how many are fluent in English, how many are not, and how many
are mixed. We can also see the enormous differences in
household incomes which are directly related to their fluency in
English.
Lines 5-7 show us the same data for speakers of other (non-Spanish)
Indo-European languages. Interestingly, the income pattern (based on
English fluency) is the same as in lines 2-4, yet the average incomes are
different.
Similarly, in lines 8-10 (Asian or Pacific Island language) and lines
11-13 (Other language) we see the same pattern of household income being
directly related to fluency in English. And yet, setting fluency
aside, there are very significant differences in incomes based on
languages. (Don't forget to compare foreign speaking incomes to
incomes of English-only households!) A final note: We have removed all
vacant households and all group quarters (such as prisons and nursing
homes) from this tabulation to prevent them from skewing the data.
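For readers who want to see the mechanics, here is a minimal Python sketch of this kind of tabulation: drop vacant units and group quarters, then count households and average their income for each language and fluency group. The field names (`language`, `fluency`, `income`, `kind`) and every number in the sample records are made up for illustration. They are not the tabulation's actual layout or results.

```python
from collections import defaultdict

# Made-up example records; field names and values are hypothetical.
households = [
    {"language": "Spanish", "fluency": "everybody fluent", "income": 60000, "kind": "occupied"},
    {"language": "Spanish", "fluency": "everybody fluent", "income": 40000, "kind": "occupied"},
    {"language": "Spanish", "fluency": "nobody fluent", "income": 30000, "kind": "occupied"},
    {"language": "English only", "fluency": "n/a", "income": 50000, "kind": "occupied"},
    {"language": "English only", "fluency": "n/a", "income": 0, "kind": "group quarters"},
    {"language": "Spanish", "fluency": "mixed", "income": 0, "kind": "vacant"},
]

def tabulate(records):
    """Return {(language, fluency): (household count, average income)}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [count, income sum]
    for r in records:
        if r["kind"] != "occupied":       # skip vacant units and group quarters
            continue
        key = (r["language"], r["fluency"])
        totals[key][0] += 1
        totals[key][1] += r["income"]
    return {key: (n, s / n) for key, (n, s) in totals.items()}

table = tabulate(households)
for (language, fluency), (count, average) in sorted(table.items()):
    print(language, fluency, count, average)
```

Filtering before averaging matters: a vacant unit recorded with zero income would otherwise pull its group's average down.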
The Package
There are two tabulations in this package (each given in both
"spreadsheet" .csv form, and non-spreadsheet form, for a total
of 4 files). The smaller file (lang.htm and lang.csv) is the one
discussed on this page. It's 14 lines long, including the column
headings. The larger file (language.htm and language.csv) is good
stuff that we haven't really mentioned. It is 196 lines long
(including column headings and 1 line for English-only speakers), and
lists 97 non-English languages spoken in the USA, along with the number of
people who speak these languages. So why are there 194 lines for 97
languages? Because each language is broken into two lines - one for
the number of people who are also fluent in English, and another for those
who are not fluent in English. So each language gets 2 lines.
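The line count is easy to check, and because each language has one line for fluent speakers and one for non-fluent speakers, a language's total is just the sum of its two lines. The short Python sketch below shows both steps. The row layout `(language, fluent_in_english, people)` and the counts are hypothetical placeholders, not the real columns or figures in language.csv.

```python
# Line-count check for the larger file, using the numbers stated in the text.
languages = 97
lines_per_language = 2
header_lines = 1
english_only_lines = 1
total_lines = header_lines + english_only_lines + languages * lines_per_language
print(total_lines)  # 196

# Hypothetical rows: (language, fluent_in_english, people). Counts are made up.
rows = [
    ("Language A", True, 700),
    ("Language A", False, 300),
    ("Language B", True, 50),
    ("Language B", False, 150),
]

# Add each language's two lines together to get its total number of speakers.
speakers = {}
for language, fluent, people in rows:
    speakers[language] = speakers.get(language, 0) + people
print(speakers)
```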
To get the total number of people in the USA who speak that language, just
add the two lines together. (Incidentally, I defined
"fluent" as those who speak English "Very Well" or
"Well", and "not fluent" as those who speak English
"Not Well" or "Not At All".)
Risk Free 30 Day Offer
Buy the combination of these tabulations, with instant delivery, now for $28, without
risking a penny. Look them over. If at any time over the next
30 days you decide that you are dissatisfied with your tabulations, I will give you a 100% refund.
I'll even include a thank-you for
trying our data. Here's how it works. You agree to delete the
data from your computer, and send an email to "refunds @
SliceAndDiceData.com" (no spaces). In the email tell me what
you bought, when you bought it, and how much you paid. That's all
there is to it. I will cheerfully refund your money.
Transaction Handled by Specialists
People should do what they're good at, and hire other people to do what
the other people are good at. In my case, my talent, my profession,
is writing computer programs to analyze data. I'm good at it, so
it's what I do. But I hire two companies to handle billing the
credit cards and delivering the tabulations.
PayPal has built an entire business around handling transactions.
Their entire existence depends on keeping customer credit card data
secure, making all transactions happen properly, and even intervening to
resolve disputes between customers and vendors. So naturally, I use
PayPal to handle all our transactions.
Likewise, PayLoadz has built a business around delivering files.
It's all they do, and they're good at it. They also work seamlessly
with PayPal. When you click the button below to receive your
tabulations, you'll be taken to PayPal, then to PayLoadz. At
PayLoadz you'll get your tabulations.
Some things in life are no-brainers. Like choosing PayPal to
handle transactions, PayLoadz to deliver tabulations, and risk-free offers
to try out tabulations. So click the "Buy Now" button
below to see your tabulations.
- John Grumbine, President, Innovative Computing, Inc.
The following is US Census Documentation, but it is for the 2000 Decennial
Census, not the 2007 ACS data, which is the US Census data we are using.
Unfortunately there is no documentation available for the 2007 ACS data.
So, while the following may be very useful, it may not apply 100%.
LANGUAGE SPOKEN AT HOME AND ABILITY TO SPEAK ENGLISH
Language Spoken at Home
Data on language spoken at home were derived from answers to long-form questionnaire Items
11a and 11b, which were asked of a sample of the population. Data were edited to include in
tabulations only the population 5 years old and over. Questions 11a and 11b referred to
languages spoken at home in an effort to measure the current use of languages other than
English. People who knew languages other than English but did not use them at home or who
only used them elsewhere were excluded. Most people who reported speaking a language other
than English at home also speak English. The questions did not permit determination of the
primary or dominant language of people who spoke both English and another language. (For more
information, see discussion below on Ability to Speak English.)
Instructions to enumerators and questionnaire assistance center staff stated that a respondent
should mark Yes in Question 11a if the person sometimes or always spoke a language other
than English at home. Also, respondents were instructed not to mark Yes if a language other
than English was spoken only at school or work, or if speaking another language was limited to a
few expressions or slang of the other language. For Question 11b, respondents were instructed to
print the name of the non-English language spoken at home. If the person spoke more than one
language other than English, the person was to report the language spoken more often or the
language learned first.
For people who indicated that they spoke a language other than English at home in Question 11a,
but failed to specify the name of the language in Question 11b, the language was assigned based
on the language of other speakers in the household, on the language of a person of the same
Spanish origin or detailed race group living in the same or a nearby area, or of a person of the
same place of birth or ancestry. In all cases where a person was assigned a non-English language,
it was assumed that the language was spoken at home. People for whom a language other than
English was entered in Question 11b, and for whom Question 11a was blank were assumed to
speak that other language at home.
The write-in responses listed in Question 11b (specific language spoken) were optically scanned
or keyed onto computer files, then coded into more than 380 detailed language categories using
an automated coding system. The automated procedure compared write-in responses reported by
respondents with entries in a master code list, which initially contained approximately 2,000
language names, and added variants and misspellings found in the 1990 census. Each write-in
response was given a numeric code that was associated with one of the detailed categories in the
dictionary. If the respondent listed more than one non-English language, only the first was coded.
The write-in responses represented the names people used for languages they speak. They may
not match the names or categories used by linguists. The sets of categories used are sometimes
geographic and sometimes linguistic. The following table provides an illustration of the content of
the classification schemes used to present language data.
Four and Thirty-Nine Group Classifications of Census 2000 Languages Spoken at Home
With Illustrative Examples
Four-Group Classification | Thirty-Nine-Group | Illustrative Examples
Spanish | Spanish and Spanish creole | Spanish, Ladino
Other Indo-European languages | French | French, Cajun, Patois
Other Indo-European languages | French Creole | Haitian Creole
Other Indo-European languages | Portuguese and Portuguese |
Other Indo-European languages | Other West Germanic | Dutch, Pennsylvania Dutch,
Other Indo-European languages | Scandinavian languages | Danish, Norwegian, Swedish
Other Indo-European languages | Serbo-Croatian | Serbo-Croatian, Croatian,
Other Indo-European languages | Other Slavic languages | Czech, Slovak, Ukrainian
Other Indo-European languages | Other Indic languages | Bengali, Marathi, Punjabi,
Other Indo-European languages | Other Indo-European languages | Albanian, Gaelic, Lithuanian,
Asian and Pacific Island | Chinese | Cantonese, Formosan,
Asian and Pacific Island | Other Asian languages | Dravidian languages (Malayalam, Telugu, Tamil),
Asian and Pacific Island | Other Pacific Island languages | Chamorro, Hawaiian, Ilocano,
All other languages | Navajo |
All other languages | Other Native North American | Apache, Cherokee, Choctaw, Dakota, Keres, Pima, Yupik
All other languages | African languages | Amharic, Ibo, Twi, Yoruba, Bantu, Swahili, Somali
All other languages | Other and unspecified | Syriac, Finnish, Other languages of the Americas, not reported
Household language. In households where one or more people (5 years old and over) speak a
language other than English, the household language assigned to all household members is the
non-English language spoken by the first person with a non-English language in the following
order: householder, spouse, parent, sibling, child, grandchild, in-laws, other relatives, stepchild,
unmarried partner, housemate or roommate, and other nonrelatives. Thus, a person who speaks
only English may have a non-English household language assigned to him/her in tabulations of
individuals by household language.
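The assignment rule in the paragraph above can be sketched as follows. This is a minimal Python illustration, not Census Bureau code: the function name, the tuple layout `(relationship, age, language)`, and the use of `None` for people who speak only English are all assumptions made for the example.

```python
# Relationship order as listed in the documentation above.
ORDER = [
    "householder", "spouse", "parent", "sibling", "child", "grandchild",
    "in-laws", "other relatives", "stepchild", "unmarried partner",
    "housemate or roommate", "other nonrelatives",
]

def household_language(members):
    """members: list of (relationship, age, language) tuples.

    language is None for a person who speaks only English (assumption).
    Returns the household language, or None if nobody aged 5 and over
    speaks a non-English language.
    """
    candidates = [m for m in members if m[1] >= 5 and m[2] is not None]
    if not candidates:
        return None
    # Pick the first non-English speaker in the documented relationship order.
    first = min(candidates, key=lambda m: ORDER.index(m[0]))
    return first[2]

family = [
    ("householder", 40, None),       # speaks only English
    ("spouse", 38, "Spanish"),
    ("child", 10, "French"),
]
print(household_language(family))
```

In this example the English-only householder ends up in a household tabulated as Spanish-speaking, which is exactly the effect the paragraph warns about.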
Language density. Language density is a household measure of the number of household
members who speak a language other than English at home in three categories: none, some, and
all speak another language.
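As a small illustration, the three density categories could be computed like this in Python. The function name and the input (one true/false flag per household member) are hypothetical, and the passage does not say whether members under 5 are counted, so the sketch leaves that decision to the caller.

```python
def language_density(speaks_other_language):
    """speaks_other_language: one bool per household member counted.

    Returns "none", "some", or "all".
    """
    speakers = sum(1 for flag in speaks_other_language if flag)
    if speakers == 0:
        return "none"
    if speakers == len(speaks_other_language):
        return "all"
    return "some"

print(language_density([True, False, True]))
```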
Limitation of the data. Some people who speak a language other than English at home may
have first learned that language at school. However, these people would be expected to indicate
that they spoke English "Very well". People who speak a language other than English, but do not
do so at home, should have been reported as not speaking a language other than English at
home.
The extreme detail in which language names were coded may give a false impression of the
linguistic precision of these data. The names used by speakers of a language to identify it may
reflect ethnic, geographic, or political affiliations and do not necessarily respect linguistic
distinctions. The categories shown in the tabulations were chosen on a number of criteria, such as
information about the number of speakers of each language that might be expected in a sample
of the U.S. population.
Comparability. Information on language has been collected in every census since 1890, except
1950. The comparability of data among censuses is limited by changes in question wording, by
the subpopulations to whom the question was addressed, and by the detail that was published.
The same question on language was asked in 1980, 1990, and Census 2000. This question on the
current language spoken at home replaced the questions asked in prior censuses on mother
tongue; that is, the language other than English spoken in the person's home when he or she was
a child; one's first language; or the language spoken before immigrating to the United States. The
censuses of 1910-1940, 1960, and 1970 included questions on mother tongue.
A change in coding procedures from 1980 to 1990 improved accuracy of coding and may have
affected the number of people reported in some of the 380 plus categories. In 1980, coding clerks
supplied numeric codes for the written entries on each questionnaire using a 2,000 name
reference list. In 1990, written entries were keyed, then transcribed to a computer file and
matched to a computer dictionary that began with the 2,000 name list. The name list was
expanded as unmatched entries were referred to headquarters specialists for resolution. In Census
2000, the written entries were transcribed by optical character recognition (OCR), or manually
keyed when the computer could not read the entry. Then all language entries were copied to a
separate computer file and matched to a master code list. The code list is the master file
developed from all language unique entries on the 1990 census, and included over 55,000
entries. The computerized matching ensured that identical alphabetic entries received the same
code. Unmatched entries were referred to headquarters specialists for coding. In 2000, entries
were reported in about 350 of the 380 categories.
Ability to Speak English
Data on ability to speak English were derived from the answers to long-form questionnaire Item
11c, which was asked of a sample of the population. Respondents who reported that they spoke a
language other than English in long-form questionnaire Item 11a were asked to indicate their
ability to speak English in one of the following categories: "Very well," "Well," "Not well," or "Not
at all."
The data on ability to speak English represent the person's own perception about his or her own
ability or, because census questionnaires are usually completed by one household member, the
responses may represent the perception of another household member. Respondents were not
instructed on how to interpret the response categories in Question 11c.
People who reported that they spoke a language other than English at home, but whose ability to
speak English was not reported, were assigned the English-language ability of a randomly selected
person of the same age, Hispanic origin, nativity and year of entry, and language group.
Linguistic isolation. A household in which no person 14 years old and over speaks only
English and no person 14 years old and over who speaks a language other than English speaks
English "Very well" is classified as linguistically isolated. In other words, a household in which
all members 14 years old and over speak a non-English language and also speak English less than
"Very well" (have difficulty with English) is linguistically isolated. All the members of a
linguistically isolated household are tabulated as linguistically isolated, including members under
14 years old who may speak only English.
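The definition above translates into a short check. The Python sketch below is an illustration under stated assumptions: the function name and the tuple layout `(age, speaks_only_english, english_ability)` are hypothetical, and a household with no members aged 14 or over comes out as isolated here only because the condition is vacuously true. The documentation does not address that case.

```python
def is_linguistically_isolated(members):
    """members: list of (age, speaks_only_english, english_ability) tuples.

    english_ability is one of "Very well", "Well", "Not well", "Not at all",
    or None for people who speak only English (assumption).
    """
    for age, speaks_only_english, english_ability in members:
        if age < 14:
            continue  # members under 14 do not affect the classification
        if speaks_only_english or english_ability == "Very well":
            return False
    return True

household = [
    (40, False, "Well"),   # speaks another language, English less than "Very well"
    (10, True, None),      # child who speaks only English
]
print(is_linguistically_isolated(household))
```

Note that the child in the example is tabulated as linguistically isolated along with the rest of the household, even though the child speaks only English.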
Comparability. The current question on ability to speak English was asked for the first time in
1980. From 1890 to 1910, "Able to speak English, yes/no" was asked along with two literacy
questions. In tabulations from 1980, the categories "Very well" and "Well" were combined. Data
from other surveys suggested a major difference between the category "Very well" and the
remaining categories. In some tabulations showing ability to speak English, people who reported
that they spoke English "Very well" are presented separately from people who reported their
ability to speak English as less than "Very well."