The Analysis of Telephone Penetration: An Update

This article analyzes the relationship between telephone penetration (the proportion of households with telephone service) and prices, household income, and other factors. The analysis is based on regressions using household data from the March Current Population Surveys for the years 1984 through 1987 and price data from telephone company tariffs. This article updates previous studies [2, 4, 5] in that it uses more recent data and extends the analysis by incorporating information on additional variables.


The last several years have seen substantial changes in the way telephone service is priced. In an attempt to implement cost-based pricing, fixed monthly charges have gone up while usage charges for long-distance calls have come down. One of the concerns about implementing these changes is their impact on low-income households.

It has been shown that most households without telephone service are low-income households [12]. It has also been argued that high connection charges make it difficult for such households to afford service. These findings and arguments have been used as the basis for the Lifeline and Link-Up America programs. This article provides further evidence in support of such policies.

The Data

Under contract from the Federal Communications Commission (FCC), the Census Bureau in its Current Population Survey (CPS) includes the question, "Is there a telephone in this house/apartment?" [7] The CPS is a staggered panel survey in which the households residing at particular addresses are included for four consecutive months in one year and the same four months in the following year. It is staggered in that one-eighth of the sample is replaced every month. Although the survey is conducted every month, not all questions are asked every month. The telephone question is asked once every four months, in the month that a household is first included in the sample and in the month that the household reenters the sample a year later. Since the sample is staggered, the information that is reported for any given month actually reflects responses over the preceding four months. Once a year, in March, the CPS augments its basic questionnaire with many additional economic questions. Thus the analysis here concentrates on the March CPS reports.

The March CPS data tapes contain records for households, families, and persons. The responses to the telephone question are included only on the household records. Thus, we have limited our analysis to the household records. Incorporation of information from the family and person records (such as age, race, employment status, etc.) is deferred to future research. A cross-tabulation of the responses to the telephone question with the responses to the other household questions on the March tapes can be found elsewhere [1, 3, 12].

Since the telephone question was added to the survey, there have been four March reports, for the years 1984 through 1987. The number of sample households in each of these reports has ranged from 57,985 in 1987 to 59,274 in 1985. From these samples, the proportion of households in the U.S. with telephones has been estimated to be 91.8 percent in March 1984, 91.8 percent in March 1985, 92.2 percent in March 1986, and 92.5 percent in March 1987.

Using geographical identification information on the household records, the households are divided into geographical areas, each of which falls entirely within one state. These areas consist of individual Metropolitan Statistical Areas (MSAs) and nonmetropolitan areas. The larger MSAs are further divided into central cities and surrounding (suburban) areas. Some smaller MSAs are grouped together. Since the Census Bureau has reduced the minimum population of an area required to allow disclosure from 250,000 to 100,000 people, it is possible to identify more geographical areas for 1986 and 1987 than for 1984 and 1985. For the two earlier years, 201 areas are identified, while 495 areas are identified for the two later years. The observations used in the regression analysis discussed below are a pooling of the cross-sections of the data for these areas over these four years.

The CPS data were supplemented by local rate information gathered from tariffs. These price data may contain some errors because of the difficulty of matching the Census areas with the tariffed regions and because of incomplete information from non-Bell areas.

The Model

The variable to be explained is telephone penetration. Since this variable is limited to a range of zero to one, a simple linear regression is not appropriate. Instead, a logistic curve model is used. If Z is telephone penetration and Y is a linear function of the independent variables, the model takes the form:

(1) $Z = \frac{e^{Y}}{1 + e^{Y}}$, or equivalently $Y = \log\left[\frac{Z}{1 - Z}\right]$.
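As a minimal sketch of equation (1), the transform and its inverse can be written in a few lines of Python. The function names `logit` and `inverse_logit` are illustrative and do not come from the original study; the value 0.918 is the March 1984 national estimate quoted above.

```python
import math

def logit(z):
    """Map a penetration rate z in (0, 1) to Y = log[z / (1 - z)]."""
    if not 0.0 < z < 1.0:
        raise ValueError("z must lie strictly between 0 and 1")
    return math.log(z / (1.0 - z))

def inverse_logit(y):
    """Map Y back to a penetration rate; algebraically equal to e^Y / (1 + e^Y)."""
    return 1.0 / (1.0 + math.exp(-y))

y = logit(0.918)        # March 1984 national penetration estimate
z = inverse_logit(y)    # recovers the original proportion
```

Because the transform is undefined at Z = 0 and Z = 1, the sketch rejects those values rather than returning an infinite Y.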

The above model is similar to a logit model, except that the observations are for groups of households rather than individual households. The use of a logit model for this type of data has been extensively explored by Perl [11]. Rather than try to duplicate his efforts, the goal here is to try to analyze the data without taxing the limited capabilities of the available computer hardware and software. (Attempts to do logit regression quickly ran into memory size constraints.)

This model assumes that Z asymptotically approaches one. It breaks down if Z=1 since Y is infinite. In the sample there are several areas, generally with small numbers of sample households, for which the sample proportion is one. Since it is not believed that the population proportion is really one, the sample value in those cases was replaced by [1 - (1/2n)], where n is the sample size for the area.
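The replacement rule described above can be sketched as follows. The function name and its arguments are hypothetical; the sketch covers only the case the text describes (a sample proportion of exactly one) and leaves every other proportion unchanged.

```python
def adjusted_proportion(with_phone, n):
    """Sample penetration for an area, replacing a proportion of one by 1 - 1/(2n).

    with_phone: number of sample households reporting a telephone
    n: total number of sample households in the area
    """
    if n <= 0 or not 0 <= with_phone <= n:
        raise ValueError("need n > 0 and 0 <= with_phone <= n")
    if with_phone == n:
        # Every sample household has a phone: back off from one by half a household.
        return 1.0 - 1.0 / (2.0 * n)
    return with_phone / n

smallest_area = adjusted_proportion(3, 3)   # three households, all with phones
```

For an area with only three sample households, all with telephones, the adjusted value is 1 - 1/6, so the size of the correction shrinks as the sample grows.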

Since the observations used in our analysis are based on groups of sample households, and since the sizes of these sample groups vary substantially (from three to 1,906), it can be expected that the variance of the observations will be inversely related to the sizes of the sample groups. This is a clear case of heteroscedasticity, which can easily be treated by using weighted regression, with the sizes of the sample groups as weights. The use of these weights will also compensate for the fact that there are fewer areas in the first two years than in the other two years, since those fewer areas will generally have higher weights. A comparison showing that the results from weighted least squares are superior to the results from ordinary least squares in this case can be found in Belinfante [4].

The use of weighted regression was implemented by multiplying both sides of the regression equation by the square root of the sample group size and using standard regression programs on the modified equation. (Attempts to use a more sophisticated generalized least squares program ran into computer memory size constraints.) Straightforward application of this procedure would result in the intercept being replaced by the variable weighting factor. However, as Draper and Smith [10] note, it is not necessarily appropriate to estimate a weighted least squares regression equation without an intercept unless that intercept proves to be empirically statistically insignificant. Therefore, the regression equations were estimated both with and without an intercept. The general form for estimation is thus:

(2) $Y\sqrt{n} = a + b_0\sqrt{n} + b_1 X_1\sqrt{n} + \dots + b_K X_K\sqrt{n} + u$.
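The scaling procedure in equation (2) can be illustrated with ordinary least squares on the rescaled variables. This is a sketch using NumPy, not the regression program used in the study; the function name `weighted_fit`, the flag `extra_intercept`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def weighted_fit(y, X, n, extra_intercept=False):
    """Least squares on equation (2), with every term scaled by sqrt(n).

    Returns coefficients ordered [a (only if requested), b0, b1, ..., bK].
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    root_n = np.sqrt(np.asarray(n, dtype=float))
    # The original intercept b0 becomes the coefficient of sqrt(n).
    columns = [root_n] + [X[:, k] * root_n for k in range(X.shape[1])]
    if extra_intercept:
        # The additional constant term a in equation (2).
        columns.insert(0, np.ones_like(root_n))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y * root_n, rcond=None)
    return coef

# Synthetic illustration only; these numbers are not from the CPS.
rng = np.random.default_rng(0)
n = rng.integers(3, 1907, size=50)          # group sizes between 3 and 1,906
X = rng.normal(size=(50, 2))
y = 2.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1]     # noise-free, so the fit is exact
b_no_intercept = weighted_fit(y, X, n)
b_with_intercept = weighted_fit(y, X, n, extra_intercept=True)
```

With noise-free data the added constant a is estimated as zero, which is the situation in which dropping it would be harmless; with real data its significance has to be checked empirically, as the text notes.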

The regressions were estimated using various sets of independent variables, including limited models using just a few price and income variables, as well as a "full" model incorporating 44 economic, geographic, and demographic variables. These variables are listed in Table 1 on page 13. They are based on information on the household records on the CPS data tapes, supplemented by price information obtained from telephone company tariffs.

Since some of these 44 variables are essentially alternate measures of the same concepts, there is multicollinearity among some of the variables. As noted in Belinfante [4], the multicollinearity is exacerbated by the use of weighted regression. This is partly alleviated by the large number of pooled observations (1,392). Principal components regression on the weighted variables was also tried to alleviate the multicollinearity further. The use of this technique is discussed more fully in Coxe [8, 9] and Belinfante and Coxe [6], including the selection rule of retaining only those components with an F-statistic greater than two.
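One common textbook formulation of principal components regression with an F-statistic selection rule is sketched below using NumPy. It is offered only as an illustration: the exact procedure in Coxe [8, 9] and Belinfante and Coxe [6] may differ, and the function name `pcr_f_rule`, its arguments, and the synthetic data are assumptions. The sketch expects at least two explanatory variables.

```python
import numpy as np

def pcr_f_rule(X, y, f_min=2.0):
    """Principal components regression keeping components with F > f_min.

    Returns (coefficients on the standardized variables, F per component, keep mask).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_obs, k = X.shape
    Zs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized variables
    yc = y - y.mean()                                   # centering absorbs the intercept
    _, V = np.linalg.eigh(np.corrcoef(X, rowvar=False)) # eigenvectors of correlation matrix
    W = Zs @ V                                          # mutually orthogonal component scores
    ss = (W ** 2).sum(axis=0)                           # sum of squares of each component
    gamma = (W.T @ yc) / ss                             # coefficient on each component
    resid = yc - W @ gamma
    s2 = resid @ resid / (n_obs - k - 1)                # residual variance, full model
    f_stat = gamma ** 2 * ss / s2                       # squared t-statistic per component
    keep = f_stat > f_min
    beta_std = V[:, keep] @ gamma[keep]                 # back to standardized variables
    return beta_std, f_stat, keep

# Synthetic illustration only: two of the three columns are nearly collinear.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)
y = 1.0 + X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)
beta_std, f_stat, keep = pcr_f_rule(X, y)
```

When every component is retained, the result coincides with ordinary least squares on the standardized variables; dropping low-F components trades a small bias for more stable coefficients on the collinear variables.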

All dollar values are adjusted for inflation using the Consumer Price Index (CPI) for all items. They are expressed in 1987 dollars. Since the responses to the telephone question come from a four-month period, from December through March, the average value of the CPI for those four months was used to adjust the dollar values for each year. The telephone prices used are generally those prevailing at the end of December. The poverty line estimate used in the variable POVRTY was $3,506 plus $1,887 times the number of people in the household, expressed in 1987 dollars. This is an approximation of the official Federal poverty line. The Federal subscriber line charge, before adjustment for inflation, was zero in the first two years, $1 in the third year, and $2 in the final year.
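The two adjustments described above amount to simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes that the 1987 base is the average CPI for the December through March period ending in March 1987; no actual CPI values are reproduced, and the index numbers in the example are placeholders.

```python
def poverty_line_1987(persons):
    """Approximate poverty line in 1987 dollars: $3,506 plus $1,887 per person."""
    return 3506 + 1887 * persons

def to_1987_dollars(nominal, cpi_dec_to_mar, cpi_base_1987):
    """Deflate a nominal amount using the mean CPI over December through March.

    cpi_dec_to_mar: the four monthly CPI values for the survey period
    cpi_base_1987:  the corresponding four-month average for 1987 (an assumption)
    """
    period_cpi = sum(cpi_dec_to_mar) / len(cpi_dec_to_mar)
    return nominal * cpi_base_1987 / period_cpi

line_for_four = poverty_line_1987(4)   # household of four persons

# Placeholder index values, chosen only to show the calculation.
real_value = to_1987_dollars(1.00, [100.0, 100.0, 100.0, 100.0], 110.0)
```

For a household of four, the approximation gives 3,506 + 4 x 1,887 = $11,054 in 1987 dollars.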

Regression Estimates

Initially some limited models were estimated, concentrating on price and income variables (specifically, variables 36 and 39 through 44 in Table 1). The results of these regressions are shown in Table 2 on page 14. Aside from SQRNOB, which measures the intercept effect, the most significant variables are INCOME and POVRTY, the coefficients of both having their expected signs. Both are measures of income effects, with INCOME measuring the average level of income and POVRTY measuring the proportion of households in the low end of the income distribution. The first eight regressions include both of these variables. Because of the interrelationship between these two variables, the last eight regressions were estimated without the POVRTY variable to try to get a direct estimate of a simple income effect.

Two of the price variables, PRLOW (the lowest available monthly charge) and PRINST (the installation and connection charge), have coefficients with the expected sign, albeit with small values. They are included in all of the regressions. The remaining two variables, SLC (the subscriber line charge) and LIFELN (an indicator of the presence of a lifeline program), do not have coefficients with the expected signs. The estimated positive sign for the coefficient of SLC is undoubtedly a reflection of the fact that telephone penetration rose in 1986 and 1987 when the subscriber line charge was increased. Thus there is no evidence here that the subscriber line charge resulted in a decline in telephone penetration. If anything, the reduced long-distance rates that accompanied the increased subscriber line charges may have stimulated penetration. Since the coefficient of SLC did not have the expected sign, regressions 5 through 8 and 13 through 16 were estimated excluding this variable. The fact that the coefficient of LIFELN failed to have the expected positive sign indicates that many of the current lifeline programs have not yet had a significant effect in stimulating telephone penetration. This may be partly because many lifeline programs have very restrictive qualification requirements, such as limitations allowing only the elderly to qualify, that make them inaccessible to those who could most use them. It may also be because the programs are new. Since the coefficient of LIFELN did not have the expected sign, regressions 3, 4, 7, 8, 11, 12, 15, and 16 were estimated excluding this variable.

The coefficient of SQRNOB changes significantly when the variable POVRTY is removed. This is presumably because SQRNOB takes the place of an intercept in weighted regression. Since there is some question as to whether an intercept should be included in a weighted regression, the odd-numbered regressions were estimated with an intercept, while the even-numbered regressions were estimated without an intercept. The inclusion of an intercept in the weighted regression is equivalent to the inclusion of (1/SQRNOB) as an independent variable in the original unweighted model.
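This equivalence follows directly from equation (2). Assuming, as the text implies, that SQRNOB is the square root of the sample group size n, dividing equation (2) through by that square root gives:

```latex
Y = a \cdot \frac{1}{\sqrt{n}} + b_0 + b_1 X_1 + \dots + b_K X_K + \frac{u}{\sqrt{n}}
```

The added constant a therefore appears as the coefficient of 1/SQRNOB, while b_0 plays the role of the intercept in the unweighted model. If u has constant variance, the rescaled error term has variance inversely proportional to n, which matches the heteroscedasticity that motivated the weighting.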

Finally, a full model using all 44 variables in Table 1 was estimated. This includes many other economic, geographic, and demographic variables. The results of the regressions are shown in Table 3 on page 32. The table shows expected signs of the regression coefficients. These expectations are based primarily on examination of the cross-tabulations, referred to above, of the responses to the telephone question with the responses to the other questions in the CPS [1, 3, 12]. The model was estimated in three different ways: first, using weighted least squares without an intercept; second, using weighted least squares with an intercept; and third, using weighted principal components regression retaining only those components for which the F-statistic is greater than two. (The principal components procedure requires the inclusion of an intercept.)

The results of these three estimates are generally fairly similar. There are sign differences for the coefficients of only five variables (HOUSE, NOREL, FOODST, AL2CFS, and POVRTY) and in all of these cases the coefficients which do not have the expected sign are clearly insignificant since their t-values are less than one. Of the variables in the limited models, the INCOME coefficient remains significantly positive, although somewhat reduced in magnitude, probably as a result of the inclusion of other income-related variables. The POVRTY coefficient becomes insignificant and has the wrong sign in the weighted least squares regressions, probably as a result of the inclusion of other poverty-related variables. The coefficient of SLC remains significantly positive and of similar magnitude. Thus the comments made above remain applicable. The coefficient of PRLOW remains negative, but is much smaller and insignificant, indicating that the estimates from the limited models may be overstated. On the other hand, the measurement errors that probably exist in the price data bias the price coefficient estimates toward zero. The coefficient of PRINST remains significantly negative and of similar magnitude. The coefficient of LIFELN remains negative, but becomes smaller and insignificant, indicating that the estimates in the limited models may be spurious. The coefficient of SQRNOB is still positive.

Of the other variables in the regression, the following have coefficients which are significant but not of the expected sign: ALONE, PUBH, RENTSU, AL3CSA, and NUMPER. The positive signs of PUBH and RENTSU are indications that although households in public housing receiving rent subsidy have below average telephone penetration rates, those penetration rates are higher than they would have been in the absence of housing aid. Thus it can be argued that the housing aid has given these households the ability to afford phone service and the positive coefficients are not unreasonable. The unexpected signs of the coefficients of the other three variables seem to be interrelated and related to the coefficient of NPU18, which has the expected negative sign.

Tables relating telephone penetration to the number of persons in the household, the number of persons under 18, and the number of persons of school age (five to 18) all show a decline in penetration with an increase in family size. However, the sharpest decline appears in the case of the number of persons under 18. This would indicate that households are less likely to have a phone if they contain preschool children than households of equal size with older children and/or adults instead. This could explain the positive coefficients for AL3CSA and NUMPER. The expected negative coefficient for ALONE was partly connected with the expected negative coefficient for NUMPER, since penetration for single-person households is below what would be expected from a negative relation between penetration and the number of persons. However, since the coefficient for NUMPER was positive, this could have contributed to the positive coefficient for ALONE.


This study shows the strong relationship between telephone penetration and income, which has been found in all previous penetration studies. This has been a primary rationale for the institution of lifeline programs. Similarly, this study has found a significant price elasticity for installation charges, supporting a rationale for the Link-Up program. This confirms a similar finding by Perl [11]. However, there is no evidence here that existing Lifeline programs are effective. This may be due to overly restrictive eligibility requirements in many states with such programs or the newness of the programs. Also, there is no evidence here that the imposition of subscriber line charges has decreased telephone penetration.

There is room for further improvements in the regression models. Some variables have coefficients which are not significant and can probably be safely eliminated. Other potential variables related to the characteristics of individual household members (such as age, race, employment status, etc.) should be added to the model and could be significant. In addition, more accurate and better defined local price information is being sought. Furthermore, the above analysis indicates that toll rates may be a significant additional variable.


1. Belinfante, Alexander. "Telephone Penetration and Household Characteristics." Federal Communications Commission, April 18, 1986.

2. Belinfante, Alexander. "An Analysis of Telephone Penetration." Federal Communications Commission, November 12, 1986.

3. Belinfante, Alexander. "Telephone Penetration and Household Characteristics for 1986." Federal Communications Commission, April 3, 1987.

4. Belinfante, Alexander. "An Analysis of Telephone Penetration Using Weighted Principal Components Regression." Federal Communications Commission, May 1, 1987.

5. Belinfante, Alexander. "An Analysis of Telephone Penetration." Federal Communications Commission, September 29, 1987.

6. Belinfante, Alexander and Karen L. Coxe. "Principal Components Regression - Selection Rules and Application." Proceedings of the Business and Economic Statistics Section, American Statistical Association, 1986, pp. 429-434.

7. Bureau of the Census. "Current Population Survey March 1987 Technical Documentation." U.S. Dept. of Commerce, 1987.

8. Coxe, Karen L. "Multicollinearity, Principal Components Regression, and Selection Rules for These Components." Proceedings of the Business and Economic Statistics Section, American Statistical Association, 1984, pp. 449-453.

9. Coxe, Karen L. "Principal Component Regression." Encyclopedia of Statistical Sciences. John Wiley & Sons, Vol. 7, 1987.

10. Draper, Norman and Harry Smith. Applied Regression Analysis. John Wiley & Sons, 2nd Edition, 1981, Sec. 2.11 and 6.9.

11. Perl, Lewis J. "Residential Demand for Telephone Service 1983." National Economic Research Associates, Inc., December 16, 1983.

12. Staff of the Federal-State Joint Board. "Monitoring Report, CC Docket No. 87-339." CC Docket No. 80-28b, December 1987.

Table 1. Independent Variables in Regressions

Table 2. Regression Coefficients and t-statistics for Limited Models
COPYRIGHT 1989 St. John's University, College of Business Administration
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1989 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:proportion of households with telephone service
Author:Belinfante, Alexander
Publication:Review of Business
Date:Mar 22, 1989