Survey response in a statewide social experiment: differences in being located and collaborating, by race and Hispanic origin.
This study examined whether and how survey response differs by race and Hispanic origin, using data from birth certificates and survey administrative data for a large-scale statewide experiment. The sample consisted of mothers of infants selected from Oklahoma birth certificates using a stratified random sampling method (N = 7,111). This study uses Heckman probit analysis to consider two stages of survey response: (1) being located by the survey team and (2) completing a questionnaire through collaboration with the survey team. Analysis results show that African Americans, American Indians, and Hispanics are significantly less likely to be located during study recruitment than white Americans, controlling for other demographic and socioeconomic factors. Conditional on being located, the probability of collaboration did not differ among the four groups. Findings suggest that researchers should pay attention to separate stages of respondent recruitment and improve strategies to locate members of racial and ethnic minority groups during recruitment.
KEY WORDS: Hispanic origin; noncontact; nonresponse; race; survey research
Racial and ethnic inequality in socioeconomic status (SES) and health is a serious concern among social work practitioners and researchers. White Americans have, on average, advantaged positions over minority groups in educational achievement, employment status, income, wealth, and health (Bauman & Graf, 2003; Shapiro, 2004). For example, college-educated individuals represent 55% of non-Hispanic white Americans, but only 43% of African Americans, 42% of American Indians, and 30% of Hispanics (Bauman & Graf, 2003). In terms of wellness indicators, racial and ethnic minority groups are estimated to have poorer health than white people and to have more limited access to needed medical and preventive health care (Liao et al., 2004). Despite huge gaps in socioeconomic achievement, health, and other well-being measures, understanding of the causes of and solutions to these disparities is limited (Yancey, Ortega, & Kumanyika, 2006).
One barrier to understanding racial and ethnic disparities is a lack of representative data for each racial and ethnic group. Accordingly, a methodological focus for many decades among social scientists, especially survey researchers, has been how to generate representative data from all racial and ethnic groups (Hirschman, Alba, & Farley, 2000; Johnson, O'Rourke, Burris, & Owens, 2002; Yancey et al., 2006). Knowledge and experience in achieving representative samples, recruiting survey respondents, and collecting accurate information from study target populations have recently expanded (Baines, Partin, Davern, & Rockwood, 2007; Groves, 2006), but strategies and tactics are still far from perfect (Yancey et al., 2006). One potential obstacle to collecting reliable and valid data is unequal survey response rates among racial and ethnic groups. If response rates are consistently low among certain racial and ethnic groups, it may be hard to generate data representative of these groups (Yancey et al., 2006). Little is known, however, about how and why survey response varies among racial and ethnic groups.
To shed new light on survey response by race and ethnicity, this study took advantage of data from a statewide social experiment, SEED for Oklahoma Kids (SEED OK hereinafter). This study is unique in that the sample was drawn from a sampling frame with information on demographic and social status--2007 birth certificate data provided by the Oklahoma State Department of Health (OSDH). Data from the sampling frame enabled us to compare characteristics between respondents and nonrespondents. Moreover, we separated the survey response process into two stages: (1) being located by the survey team (location), and (2) completing the questionnaire through collaboration with the survey team (collaboration). In this way, we investigated differences among white Americans, African Americans, American Indians, and Hispanics at two distinct stages of overall survey response: location and collaboration.
Reliable data from representative samples of minority groups are necessary in research on racial and ethnic disparities in SES and health. However, knowledge is limited on how to generate representative samples from these underrepresented groups (Yancey et al., 2006). In fact, we do not have a definite answer even to the question of whether response rates differ across various racial and ethnic groups (Galea & Tracy, 2007; Johnson et al., 2002). Although some studies have found that minorities are less likely to participate in scientific research (Kim et al., 2008; Yancey et al., 2006; Zaslavsky, Zaborski, & Cleary, 2002), others have found no statistical differences among groups (Groves, 2006), and the rest have shown that minority groups' response rates are higher than those of white people (Groves & Couper, 1998).
We also know little about what affects racial and ethnic differences in survey response, although previous studies have identified four possible answers. First, disparities in SES may contribute to different response rates. It is well documented that those with higher SES (for example, highly educated people) are more likely to participate in scientific research than those with socioeconomic disadvantages (Galea & Tracy, 2007; Johnson et al., 2002). Because members of racial and ethnic minority groups tend to have lower SES than white people, these differences may result in lower survey response rates.
Second, difficulty in locating members of minority groups is a challenge to recruitment. Some minority groups' geographic mobility rates are high, resulting in frequent changes in addresses and telephone numbers (Johnson et al., 2002; Oropesa & Landale, 2002). For example, low-income renters and migrant farm workers, disproportionately members of minority groups, move more frequently than other populations. In addition, people who are not white are less likely to be contacted through traditional channels of recruitment. American Indians on reservations have a lower level of access to telephone services than other groups (Kim et al., 2008). Kim et al.'s descriptive analysis has shown that American Indians' response rate is lower than the response rate for white Americans because the former's chance of being located is much lower than that of the latter (70% versus 89%). Collaboration rates for those who were located were similar between the two groups (90% versus 92%) (Kim et al., 2008). Similarly, Zaslavsky et al. (2002) showed that African Americans and Hispanics are less likely to be located than white Americans.
Third, mistrust and skepticism of research institutions may reduce minority groups' survey response rates. Individuals are more willing to participate in a study when they perceive research institutions as legitimate authorities (Johnson et al., 2002). Members of minority groups may be less likely to accept mainstream research institutions (for example, government agencies, hospitals, higher education institutions) as legitimate, however, because of historical and personal experiences of discriminatory treatment and misuse of research (for example, the Tuskegee Syphilis Study) (Buchwald et al., 2006; Fouad et al., 2000; Johnson et al., 2002; Oropesa & Landale, 2002; Yancey et al., 2006).
Fourth, potential benefits to their family or community may increase minority groups' survey response (Fouad et al., 2000). Using vignettes that describe different types of research projects, Buchwald et al. (2006) found that American Indians and Alaska Natives were more willing to participate in studies on diabetes and alcoholism, two major health concerns among these groups, than in other clinical studies, suggesting that these minority groups are more likely to participate in studies that address their communities' salient issues.
Existing studies also have methodological limitations. Most have relied on bivariate associations (Groves & Couper, 1998; Yancey et al., 2006). In addition, the majority of existing studies have reported only final response rates. Only a small number of studies (Groves & Couper, 1998; Kim et al., 2008; Zaslavsky et al., 2002) distinguished sample members who were not located from those who were located but did not collaborate with the survey team. For this reason, we cannot say when in the recruitment process differences among racial and ethnic groups occur. Furthermore, previous studies mostly focused on differences between white and African Americans, leaving other racial and ethnic minorities understudied (Buchwald et al., 2006; Des Jarlais et al., 2005; Johnson et al., 2002).
To fill gaps in existing knowledge, we examined survey response in the SEED OK baseline survey among African Americans, American Indians, Hispanics, and white Americans. Using data from Oklahoma birth certificates and the SEED OK baseline survey, we first examined descriptive characteristics and response status by race and Hispanic origin. Next, we conducted Heckman probit analyses (Greene, 2003; Sales, Plomondon, Magid, Spertus, & Rumsfeld, 2004) to see whether and how the likelihood of participating in SEED OK differed by race and ethnicity. These analyses separated sample members' probability of being located by the SEED OK survey team from their probability of collaborating with the baseline survey on the condition of being located.
Study Setting: SEED OK
SEED OK is a longitudinal social experiment that tests the policy concept of Child Development Accounts (CDAs) offered at birth. The experiment used both random selection and random assignment into treatment and control groups. For infants in the treatment group, an Oklahoma College Savings Plan (OCSP) account was automatically opened, and a $1,000 "seed" deposit was made into each account. Saving matches were also available to treatment infants from low- or moderate-income families for the first four years of the experiment (Zager, Kim, Nam, Clancy, & Sherraden, 2010).
The SEED OK study is unique in that its sampling frame was birth certificate data of all infants born in the state of Oklahoma in certain periods. The OSDH provided birth certificate data of infants born in the state between April and June 2007 (stage 1) to the SEED OK survey team. Due to a lower-than-expected response rate in the first stage of data collection, the SEED OK survey team added a second sample, consisting of infants born between August and October 2007 (stage 2). In this way, SEED OK successfully acquired a sampling frame representative of its target population, the first step in constructing a representative sample (Zager et al., 2010).
From the sampling frame of birth certificates, the SEED OK survey team used stratified random sampling. Three minority groups were oversampled: African Americans, American Indians, and Hispanics. Through oversampling, the SEED OK study created samples of these three minority groups sizeable enough for separate analyses (Marks, Rhodes, & Scheffler, 2008).
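The logic of stratified sampling with oversampling can be illustrated with a short sketch. The group labels follow the article, but the frame sizes and sampling fractions below are invented for illustration and are not SEED OK's actual design.

```python
import random

random.seed(2007)

# Hypothetical sampling frame: one record per birth certificate.
# Frame sizes and sampling fractions are illustrative, not SEED OK's design.
frame = (
    [{"id": i, "group": "white"} for i in range(40000)]
    + [{"id": 40000 + i, "group": "african_american"} for i in range(5000)]
    + [{"id": 45000 + i, "group": "american_indian"} for i in range(6000)]
    + [{"id": 51000 + i, "group": "hispanic"} for i in range(6000)]
)

# Oversample the minority strata: draw a larger fraction from each minority
# group than from the white stratum, so that each subsample is large enough
# to analyze separately.
fractions = {"white": 0.08, "african_american": 0.25,
             "american_indian": 0.22, "hispanic": 0.22}

sample = []
for group, frac in fractions.items():
    stratum = [r for r in frame if r["group"] == group]
    sample += random.sample(stratum, round(len(stratum) * frac))

counts = {g: sum(1 for r in sample if r["group"] == g) for g in fractions}
print(counts)
```

Because each stratum is drawn separately, the minority subsamples end up far larger than their share of the frame; analyses of the full sample then need sampling weights to recover population-representative estimates.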
After selecting potential study participants from the sampling frame of birth certificates, the SEED OK survey team recruited and collected data between August and December 2007 from the stage 1 sample and between January and April 2008 from the stage 2 sample. Primary caregivers of selected infants (mostly mothers) were recruited to join the SEED OK study. SEED OK had generous participation incentives: Caregivers who completed the baseline survey had a 50-50 chance of being assigned to the treatment group and thus of receiving an OCSP account with a $1,000 deposit for their infants. In addition, SEED OK offered a $40 incentive to all individuals (both treatment and control group members) who completed the baseline survey. However, the SEED OK study required survey respondents to provide their infant's Social Security number (SSN) so that an OCSP account could be opened for the child. This requirement may have reduced survey collaboration rates because some may have felt uncomfortable providing this confidential information (Marks et al., 2008).
Because birth certificate data included names and addresses but not telephone numbers, the SEED OK survey team used various methods to obtain phone numbers of sample members for interviews. First, an attempt was made to match OSDH birth certificate data to up-to-date telephone numbers and addresses, using automated databases. Second, an invitation letter, signed by Oklahoma State Treasurer Scott Meacham and sent by RTI International (the research firm implementing the baseline survey), briefly introduced the study to sample members and encouraged them to call the project's toll-free telephone number or visit the project's Web site. Third, professional tracers used commercial sources and contacted possible relatives and neighbors of sample members. Fourth, a field representative visited the best known address of the sample members not located through other methods to explain the study and encourage participation (Marks et al., 2008).
Data and Sample
This study used data from three different sources: (1) Oklahoma birth certificates; (2) SEED OK administrative data on the baseline survey recruitment process; and (3) "Population by ZIP Code Tabulation Area: 2000" data from the Oklahoma deputy treasurer for policy and administration. Out of 7,328 cases originally selected from birth certificates, 213 ineligible cases (for example, deaths of infants or mothers) were excluded. We also excluded four cases with information missing on three variables included in the Heckman probit analysis: child's birthweight and mother's marital status and birthplace. The final analysis sample was 7,111 cases, consisting of 3,264 white Americans, 1,183 African Americans, 1,355 American Indians, and 1,309 Hispanics.
The dependent variable was survey response status. This variable was created using administrative data that recorded the recruitment process of the SEED OK baseline survey. Because those who were not located may have differed from those who were located but did not collaborate with the SEED OK survey research team, we created three categories for the dependent variable: (1) not located (that is, unlocated cases), (2) located but did not complete the survey by refusing to participate or breaking off the interview (that is, located noncollaborators), and (3) located and completed the survey questionnaire (that is, respondents or located collaborators).
The independent variable was the race and Hispanic origin of infants. We created the variable from birth certificate data using the National Center for Health Statistics' vital statistics protocol (Buescher, Gizlice, & Jones-Vessey, 2005; Marks et al., 2008). If the mother was reported as Hispanic, the infant was classified as Hispanic. If information regarding the mother's Hispanic origin was missing and the father was recorded as Hispanic, the infant was also categorized as Hispanic. The SEED OK sample, however, includes no child identified as Hispanic by using the father's information; among seven cases with missing information for the mother's Hispanic origin, no child had a Hispanic father. Infants not identified as Hispanic were assigned to African American, American Indian, or white categories on the basis of the mother's race. As a result, the infant's race and Hispanic origin variable used in this study was identical to that of the mother.
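The classification protocol just described is essentially a small decision rule. A minimal sketch follows; the field names, the `None` sentinel for missing values, and the race codes are ours, not the birth certificate file's actual layout.

```python
def classify(mother_hispanic, father_hispanic, mother_race):
    """Assign an infant's race/Hispanic-origin category from birth
    certificate fields, following the vital statistics protocol described
    in the text. Field names and the None (missing) sentinel are
    illustrative."""
    # Rule 1: mother reported as Hispanic -> infant is Hispanic.
    if mother_hispanic is True:
        return "Hispanic"
    # Rule 2: mother's Hispanic origin missing, father Hispanic -> Hispanic.
    if mother_hispanic is None and father_hispanic is True:
        return "Hispanic"
    # Rule 3: otherwise use the mother's race; Asians are folded into the
    # white category in this study because their number was too small.
    return {"white": "white", "asian": "white",
            "black": "African American",
            "american_indian": "American Indian"}[mother_race]

print(classify(True, None, "black"))         # -> Hispanic
print(classify(None, True, "white"))         # -> Hispanic
print(classify(False, True, "american_indian"))  # -> American Indian
```

Because Rule 2 never fired in the SEED OK sample, the infant's category is identical to the mother's, as the text notes.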
The race and Hispanic origin variable consisted of four categories: non-Hispanic white (white), non-Hispanic African American (African American), non-Hispanic American Indian (American Indian), and Hispanic. The white category included a small percentage of Asians (1.34%) because the number was too small for a separate category. Supplementary analysis results, conducted after excluding Asians from the sample, did not differ substantively from those reported in this article.
We used demographic information from birth certificates as control variables. Information on infants included gender (1 for male, 0 for female) and birthweight. Birthweight is a reliable indicator of an infant's general health status during early childhood and affects the amount of time parents may spend on parenting. Accordingly, birthweight may have affected potential respondents' decision about collaborating with the survey. Because the relationship between birthweight and probability of responding to the survey may not be linear, we used the logarithm of birthweight in multivariate analyses. Analysis results with the actual value of birthweight were substantively identical to those reported in this study. Caregiver variables included age; education (no high school diploma, zero to 11 years of schooling; high school diploma, 12 years of schooling; some college, 13 to 15 years of schooling; college degree, 16 or more years of schooling; and missing information); marital status (1 for married, 0 for others); nativity (1 for native-born, 0 for foreign-born); and indicator of being born in Oklahoma (1 for yes, 0 otherwise). Because a substantial proportion of birth certificates (18%) did not contain information on father's age or education, we created an indicator for father's information (1 for those without information on father, 0 otherwise). Infants with fathers' information on their birth certificates may have differed from those without in terms of parents' relationship and father's involvement with childbirth and child rearing, which may have affected the caregivers' chance of survey response (Ma, 2008; Nicolaidis, Ko, Saha, & Koepsell, 2004).
We also used two variables related to the SEED OK baseline survey administration: (1) recruitment stage and (2) the number of days from the child's birth to first contact. For the recruitment stage variable, we assigned the value of "0" to those recruited at stage 1 and "1" to those recruited at stage 2. We suspected unobserved systematic differences between the two stages. For example, survey interviewers may have been more experienced in working with those in the SEED OK sample during the second stage compared with the first. The number of days between infant's birth and first contact variable was created using the child's birthday and the first contact date of each recruitment stage: July 1, 2007, for stage 1 and November 1, 2007, for stage 2. We used a logarithm of this variable in the Heckman probit analysis because the relationship between this variable and survey response may not be linear. Analysis using the actual value of the number of days between birth and first contact produced substantively identical results to those from the main model.
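Under the stage-specific first contact dates given above, the log days-to-first-contact variable reduces to a date subtraction and a log transform. The function name is ours; the dates are those stated in the text.

```python
from datetime import date
import math

# First contact date for each recruitment stage, as given in the text.
FIRST_CONTACT = {1: date(2007, 7, 1), 2: date(2007, 11, 1)}

def log_days_to_contact(birthday, stage):
    """Days from the infant's birth to the stage's first contact date,
    log-transformed as in the Heckman probit model."""
    days = (FIRST_CONTACT[stage] - birthday).days
    return math.log(days)

# A stage-1 infant born 50 days before the July 1, 2007, first contact:
print(round(log_days_to_contact(date(2007, 5, 12), 1), 3))  # -> 3.912
```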
In addition, we created a geographic residency variable using "Population by ZIP Code Tabulation Area: 2000" data. The geographic data were created on the basis of the U.S. Census Bureau's classification of five-digit ZIP code tabulation areas (ZCTAs) (for additional information, see U.S. Census Bureau Geography Division, 2011). Because the SEED OK baseline survey data contained only the first three digits of ZCTAs, for confidentiality reasons, we used the three-digit ZCTAs and merged the baseline survey data set and the geographic data set. The geographic residency variable consisted of three categories: (1) metropolitan only (three-digit ZCTAs that fall entirely in a metropolitan area); (2) nonmetropolitan only (three-digit ZCTAs classified as entirely nonmetropolitan areas); and (3) mixed (three-digit ZCTAs that are a combination of metropolitan and nonmetropolitan areas).
This study used probit regression with sample selection (Heckman probit). The analytical method identifies factors associated with survey collaboration, conditional on the probability of being located. Because only those who were located could decide whether or not to collaborate with the survey team, analyses may produce a biased estimation if different (and unobserved) characteristics between those located and those not located (selection bias) are not considered. For example, it is plausible that those who had limited English proficiency were less likely to respond to SEED OK's calls and mailed invitation and, therefore, were not located by the survey research team. Because one's ability to communicate in English was also likely to affect one's decision to participate in the SEED OK survey, this unobserved difference between located and unlocated cases would likely have biased analysis results on survey collaboration if not properly addressed. Heckman probit analysis deals with this selection bias by estimating one's probability of collaborating with the survey team (one's likelihood of completing the survey) while considering one's probability of being located (Greene, 2003; Sales et al., 2004). We conducted Heckman probit analysis using Stata.
The Heckman probit model requires at least one identifier that is supposed to affect selection (being located) but not the final outcome (collaboration) (Greene, 2003; Sales et al., 2004). We included two identifiers in the selection model: (1) whether the mother was born in Oklahoma and (2) the number of days between the infant's birth and the first contact date. Mothers born in Oklahoma were more likely to have lived in Oklahoma longer and to have a stable residence than those born elsewhere. These mothers were also more likely to have relatives in the state, which may have facilitated tracking them. Oklahoma-born mothers, however, probably would not have differed from others in their chance of collaboration if demographic and other factors were controlled. A shorter time gap between infant birth and the first contact day was expected to improve the chance of being located because it decreased the probability of moving to a new place, but likely did not affect the chance of collaborating, conditional on being located. Except for these two identifiers, both equations (location and collaboration) in the Heckman probit analysis included the same variables.
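For readers unfamiliar with the method, the likelihood behind a probit model with sample selection can be written out directly. Each case contributes one of three terms: never located, located and completed, or located but refused; the latter two involve the bivariate standard normal CDF with error correlation ρ between the location and collaboration equations. This is a minimal illustration under our own notation, not the authors' Stata estimation code; the toy data are made up.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def biv_phi(a, b, rho):
    """Bivariate standard normal CDF, Phi2(a, b; rho)."""
    return multivariate_normal.cdf([a, b], mean=[0.0, 0.0],
                                   cov=[[1.0, rho], [rho, 1.0]])

def heckprob_loglik(gamma, beta, rho, Z, X, located, collaborated):
    """Log-likelihood of the probit model with sample selection.
    Z holds covariates of the location (selection) equation, including
    the identifiers; X holds covariates of the collaboration equation."""
    ll = 0.0
    for i in range(len(located)):
        zg = Z[i] @ gamma                # location-equation index
        if not located[i]:
            ll += norm.logcdf(-zg)                    # never located
        else:
            xb = X[i] @ beta             # collaboration-equation index
            if collaborated[i]:
                ll += np.log(biv_phi(zg, xb, rho))    # located & completed
            else:
                ll += np.log(biv_phi(zg, -xb, -rho))  # located, refused
    return ll

# Tiny illustrative data set (values are made up).
Z = np.array([[1.0, 0.5], [1.0, -1.0], [1.0, 0.2]])
X = np.array([[1.0], [1.0], [1.0]])
located = [True, False, True]
collaborated = [True, False, False]
print(heckprob_loglik(np.array([0.2, 0.4]), np.array([0.1]), 0.3,
                      Z, X, located, collaborated))
```

Maximizing this function over (γ, β, ρ) is what a routine such as Stata's heckprobit does; the identifiers appear only in Z, which is what identifies the model beyond functional form.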
In addition to the analysis model previously described, we ran supplementary analyses to check the robustness of findings. First, we ran bivariate probit, instead of Heckman probit. Bivariate probit regression is another statistical method used to address a sample selection issue (Greene, 2003). Second, we ran a model using actual values of birthweight and the days between birth and first contact instead of their logarithm to consider the possibility that the associations between these two variables and survey response were linear. Third, we ran an additional analysis with a sample that did not include Asians. Due to the small number in the sample, we did not create a separate category for Asians but instead combined them with white Americans. Because Asians may have had a different probability of responding to the survey than their white counterparts, we conducted an analysis after deleting Asians from the sample. These supplementary analyses produced substantively identical results to those reported in this article. (Results from supplementary analyses are available on request.)
Based on the Heckman probit analysis results, we estimated three types of predicted probabilities by race and Hispanic origin: (1) being located, (2) collaborating with the survey team after being located, and (3) responding to the survey (a summary that considers both one's probabilities of being located and collaborating with the survey team). This approach showed the impacts of race and Hispanic origin more clearly than descriptive statistics, because the predicted probabilities were estimated while holding other factors constant (Powers & Xie, 1999). That is to say, predicted probabilities were calculated for potential respondents with the same characteristics other than race and Hispanic origin. We estimated predicted probabilities for a typical case in the sample: A male infant with a birthweight of 3,250 grams; whose mother was a 25-year-old, Oklahoma-born, married woman with a high school diploma, living in a metropolitan area; whose father's information was not missing; and who was recruited at stage 1 and born 50 days before the first contact was made by SEED OK.
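Given fitted coefficients, the three predicted probabilities reduce to normal-CDF calculations: P(located) = Φ(zγ), P(respond) = Φ₂(zγ, xβ; ρ), and P(collaborate | located) as their ratio. A sketch, with hypothetical index values rather than the fitted SEED OK estimates:

```python
from scipy.stats import norm, multivariate_normal

def predicted_probs(zg, xb, rho):
    """Three predicted probabilities for one covariate profile, given the
    linear indices zg (location equation) and xb (collaboration equation)
    and the error correlation rho. Index values passed in are
    hypothetical."""
    p_located = norm.cdf(zg)
    # Joint probability of being located AND completing the survey:
    p_respond = multivariate_normal.cdf([zg, xb], mean=[0.0, 0.0],
                                        cov=[[1.0, rho], [rho, 1.0]])
    p_collab_given_located = p_respond / p_located
    return p_located, p_collab_given_located, p_respond

# Hypothetical indices for a single 'typical case' profile.
loc, collab, resp = predicted_probs(zg=0.92, xb=-0.05, rho=0.2)
print(f"located {loc:.2f}, collaborate|located {collab:.2f}, respond {resp:.2f}")
```

Evaluating these expressions once per racial and ethnic group, holding all other covariates at the typical-case values, yields bar charts like the study's Figure 1.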
In addition to Heckman probit analysis, we replicated the analytical approach of two existing studies that focused solely on survey collaboration (Groves & Couper, 1998; Zaslavsky et al., 2002). In contrast to Heckman probit analysis, which includes both located and unlocated cases, the logistic regression used in these other two studies investigated survey collaboration with located cases only. The comparison of Heckman probit and logistic regression results showed whether and how the selection bias (unobserved differences between located and unlocated cases) affected estimations on survey collaboration.
Demographic and other characteristics of the sample, by race and Hispanic origin, are summarized in Table 1. As expected, white Americans had, on average, more socioeconomic advantages than other groups. White infants were more likely to have married and college-educated caregivers than minority infants. They were more likely to have the father's information recorded on their birth certificates, suggesting a more stable relationship between birth parents. The percentage living in metropolitan areas was much higher among African Americans and Hispanics.
Response status, by race and Hispanic origin, is presented in Table 2. In addition to the number of individuals in each category of status, Table 2 shows three summary measures of response: (1) location rate (the percentage of located cases in the entire sample), (2) collaboration rate (the percentage of located cases who completed the survey questionnaire), and (3) response rate (the percentage of complete cases in the entire sample). Different measures produced distinct results by race and Hispanic origin (see Table 2). Location rates significantly differed by race and Hispanic origin [χ²(3) = 42.31, p = .00]. Among the four groups, white Americans had the highest location rate (89%), and African Americans had the lowest rate (82%). In contrast, white Americans showed the second-lowest collaboration rate, and African Americans had the highest rate. Hispanics had low rates for both location and collaboration, and American Indians had rates in the middle for both measures. Differences in collaboration rates among the four groups were also statistically significant [χ²(3) = 9.75, p = .02]. As a result of contrasting patterns of location and collaboration rates by race and Hispanic origin, differences in overall response rates among the four groups were not significant [χ²(3) = 5.42, p = .14].
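Tests of this form can be reproduced with a standard chi-square test of independence on the located/not-located counts by group. The counts below are made up to be consistent with the group sample sizes and approximate location rates reported in the text; the article's Table 2 has the actual figures.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative located / not-located counts by group (made-up numbers
# chosen to match the sample sizes and approximate rates in the text,
# e.g., roughly 89% of 3,264 white cases located).
groups = ["white", "African American", "American Indian", "Hispanic"]
located = np.array([2905, 970, 1165, 1074])
not_located = np.array([359, 213, 190, 235])

table = np.vstack([located, not_located])  # 2 x 4 contingency table
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```

With two response categories and four groups, the test has (2 − 1) × (4 − 1) = 3 degrees of freedom, matching the χ²(3) statistics reported above.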
Results from the Heckman probit analysis are summarized in Table 3. Results for control variables were as expected. Those with a college degree were more likely than those without a high school diploma to be located and to collaborate with the survey team, consistent with findings in previous studies (Galea & Tracy, 2007; Groves, Cialdini, & Couper, 1992). Native-born mothers were more likely to collaborate once they were located by the SEED OK survey team. Foreign-born mothers may have had language and cultural barriers, which could have discouraged them from participating in the survey. In addition, these mothers may have felt uncomfortable revealing personal information if they were undocumented. Consistent with existing studies (Groves, 2006; Keeter, Kennedy, Dimock, Best, & Craighill, 2006), those living in nonmetropolitan areas were more likely to be located than those living in metropolitan areas.
Analysis of race and Hispanic origin variables revealed different patterns for location and collaboration. All three minority groups were significantly less likely to be located than the white group (the reference group), even when demographic and geographic factors were considered. Conditional on being located, however, the probability of collaboration did not differ significantly by race or Hispanic origin.
Predicted probabilities of survey response based on the Heckman probit results are presented in Figure 1. The first bars in Figure 1 present predicted probabilities of being located for the four racial and ethnic groups. Consistent with descriptive statistics, white Americans were more likely to be located than minority groups (82% versus 77% to 78%). In contrast to the descriptive statistics that showed the location rate for American Indians as being two to four percentage points higher than that of African Americans and Hispanics, predicted probabilities showed that the likelihood of being located was comparable for all three minority groups. These results suggest that differences in location rates between American Indians and the other two minority groups found in the descriptive statistics may have been caused by distinct characteristics. For example, American Indians were less likely to live in metropolitan areas (see Table 1), where the location rate was typically lower (see Table 3).
The second bars in Figure 1 show predicted probabilities of collaboration by race and Hispanic origin once located. Predicted probability of collaboration was lowest among white Americans (48%) and highest among African Americans (54%), which was consistent with descriptive statistics in Table 2. The other two minority groups' predicted probabilities of collaboration lay between those of white and African Americans: 52% for American Indians and 53% for Hispanics.
The last bars in Figure 1 show predicted probabilities of responding to the survey, combining probabilities of being located by and collaborating with the survey team. Predicted probability of survey response was lowest among white Americans (40%) and highest among African Americans (43%). Response rates among Hispanics and American Indians were around 42%. These results were somewhat different from descriptive statistics that showed the lowest response rate among Hispanics and a relatively high response rate among white Americans. These discrepancies suggest that different response rates among racial and ethnic groups may be explained, at least partially, by distinct characteristics among these groups, such as low education level among Hispanic caregivers and high educational attainment among white caregivers.
As described in the Method section, we replicated two existing studies on survey collaboration (Groves & Couper, 1998; Zaslavsky et al., 2002). Logistic regression with the located case sample showed that the three minority groups were significantly more likely to collaborate with the survey team than was the white group when demographic and geographic factors were controlled for. (Full analysis results are not reported in the tables but are available from the authors). The result differed from findings in Zaslavsky et al. (2002), where collaboration rate was estimated to be lower among African Americans and Hispanics than among white Americans. Analysis results in this study were somewhat similar but not identical to those of Groves and Couper (1998), who found that Hispanics were more likely to collaborate than were white Americans but African Americans were as likely as white Americans to collaborate. Logistic regression results indicated that different analytical approaches did not explain discrepancies in findings between this study and Groves and Couper (1998) and Zaslavsky et al. (2002).
In this study, we investigated differences in survey response by race and Hispanic origin, using data collected for the SEED OK experiment. This research offers important new insights on survey response with innovations in methodological approach. First, our analytical method compared characteristics between survey respondents and non-respondents using information from the Oklahoma birth certificate sampling frame. Second, this study addressed whether survey response rates differed by race and ethnicity at two distinct stages in the recruitment process: (1) being located by the survey team and (2) collaborating with the survey team. Third, this research included American Indians, a racial group rarely studied in survey response research.
Despite these methodological strengths, this study had some limitations. First, this study used data from only one state; hence, findings cannot be generalized to other states or the nation. Second, we were unable to create a separate category for Asian Americans because of the small number included in our sample. Third, birth certificate data did not provide a few critical variables, such as family income and homeownership. The omission of these variables may have produced biased analysis results. Fourth, a multiracial category was not available for this study. This category is desirable given that multiracial populations are growing in the United States (Hirschman et al., 2000). Fifth, some hypotheses described in the Background section were not tested directly. For example, we could not test the hypothesis that mistrust and suspicion toward research institutions reduced response rates among minority groups. We were not able to investigate whether and how the potential benefits of the research to the minority communities may have motivated survey participation.
Turning to key results and implications for social work research, Heckman probit analysis found that African Americans, American Indians, and Hispanics were significantly less likely than white Americans to be located during study recruitment. Study findings are consistent with other research that found lower location rates among minority groups (Kim et al., 2008). These results suggest that social work researchers should improve strategies to locate African American, American Indian, and Hispanic study subjects to increase response rates of these groups. This finding also calls for further investigation of why it is more difficult to locate people from racial or ethnic groups other than white (for example, high mobility rates, lack of access to a phone, mistrust, and fear).
Furthermore, Heckman probit analysis also demonstrated that, conditional on being located, the probability of collaboration did not differ between white and minority groups. If these results are confirmed by future studies, they would contradict the often-mentioned hypothesis that members of minority groups refuse to collaborate with survey researchers because of mistrust and fear (Buchwald et al., 2006; Fouad et al., 2000; Johnson et al., 2002; Oropesa & Landale, 2002; Yancey et al., 2006). However, it is also possible that the significantly lower probabilities of being located may reflect minority group members' mistrust and suspicion. Those who were suspicious and mistrusting may not have responded to mail requesting their phone numbers or answered the phone when the SEED OK survey team called. More empirical evidence is needed before drawing conclusions about the roles of fear and mistrust.
It is also noteworthy that the Heckman probit results on collaboration differed from those of the logistic regression with the located-case sample. In contrast to the Heckman probit results, the logistic regression found that the three minority groups had significantly higher chances of collaborating with the survey team than did the white group. The difference between the Heckman probit and logistic regression findings indicates the need for survey collaboration research to address selection bias. These findings also support the consideration of two distinct stages of survey recruitment: (1) locating potential respondents and (2) getting collaboration from them to complete a survey.
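The selection mechanism at issue here can be illustrated with a small simulation (a hypothetical sketch with made-up parameters, not the SEED OK data or the authors' model): when the unobserved factors that drive being located and collaborating are positively correlated, a located-case comparison shows a spurious collaboration advantage for the group that is harder to locate, even though the true group effect on collaboration is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 2, n)  # 1 = hypothetical harder-to-locate group

# Correlated errors in the location and collaboration equations
# (rho = 0.7, in the neighborhood of the rho of 0.74 in Table 3).
rho = 0.7
u = rng.standard_normal(n)
v = rho * u + np.sqrt(1 - rho**2) * rng.standard_normal(n)

located = (0.8 - 0.3 * x + u) > 0  # group 1 is harder to locate
collab = (0.0 * x + v) > 0         # true group effect on collaboration: zero

# Naive located-case comparison of collaboration rates by group
r1 = collab[located & (x == 1)].mean()
r0 = collab[located & (x == 0)].mean()
print(f"located-case collaboration gap (group 1 - group 0): {r1 - r0:+.3f}")
```

The gap is positive: because only cases with favorable unobservables clear the higher location hurdle, the located members of group 1 are a more positively selected subsample. This is precisely the selection that a located-case logistic regression ignores and a Heckman-type selection model accounts for.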
The logistic regression results also show that the distinct findings of this study and the two existing studies (Groves & Couper, 1998; Zaslavsky et al., 2002) cannot be attributed to different analytical approaches: Replication of those studies' analytical approach did not produce the same results, as described in the Results section. It is not clear what caused the disparate findings across studies, but the generous incentives available to SEED OK study participants may be an answer. As previously described, SEED OK informed those in the sample that their child would have a 50% chance of receiving a $1,000 deposit into an OCSP account and (if income eligible) savings matches if they participated in the study. Considering their higher likelihood of experiencing economic disadvantages, members of minority groups may have been attracted to SEED OK's financial incentives more strongly than their white counterparts. At the same time, the survey and study topic (that is, saving for college) may have affected individuals' likelihood of responding to SEED OK calls and materials in ways that vary by race and Hispanic origin, perhaps because of differences in the education level of potential respondents. SEED OK's request for each infant's SSN may also explain the distinct findings: This request may have had a more negative impact on the white group than on the racial and ethnic minority groups if the former was more cautious about sharing confidential information than the latter. In addition, the study population of SEED OK (caregivers of infants) differed from those of the other two studies (Medicare beneficiaries in Zaslavsky et al., 2002, and general households in Groves and Couper, 1998).
Further research is warranted on whether survey collaboration rates are the same or different among racial and ethnic groups when their chances of being located are taken into account, whether and what financial incentives affect collaboration, and whether the relationship between race and ethnicity and chances of collaboration differs by age, gender, and other characteristics.
Original manuscript received March 9, 2011; final revision received June 3, 2011; accepted July 6, 2011; advance access publication February 7, 2013.
Baines, A. D., Partin, M. R., Davern, M., & Rockwood, T. H. (2007). Mixed-mode administration reduced bias and enhanced poststratification adjustments in a health behavior survey. Journal of Clinical Epidemiology, 60, 1246-1255.
Bauman, K.J., & Graf, N. L. (2003). Educational attainment: 2000. Washington, DC: U.S. Census Bureau.
Buchwald, D., Mendoza-Jenkins, V., Calvin, C., McGough, H., Bezdek, M., & Spicer, P. (2006). Attitudes of urban American Indians and Alaska natives regarding participation in research. Journal of General Internal Medicine, 21, 648-651.
Buescher, P. A., Gizlice, Z., & Jones-Vessey, K. A. (2005). Discrepancies between published data on racial classification and self-reported race: Evidence from the 2002 North Carolina live birth records. Public Health Reports, 120, 393-398.
Des Jarlais, G., Kaplan, C. P., Haas, J., Gregorich, S. E., Perez-Stable, E., & Kerlikowske, K. (2005). Factors affecting participation in a breast cancer risk reduction telephone survey among women from four racial/ethnic groups. Preventive Medicine, 41, 720-727.
Fouad, M. N., Edward, P., Lee, G. B., Kohler, C., Wynn, T., Nagy, S., & Churchill, S. (2000). Minority recruitment in clinical trials: A conference at Tuskegee, researchers and the community. Annals of Epidemiology, 10(8), S35-S40.
Galea, S., & Tracy, M. (2007). Participation rates in epidemiologic studies. Annals of Epidemiology, 17, 643-654.
Greene, W. H. (2003). Econometric analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Groves, R. M. (2006). Nonresponse rates and nonresponse bias in household surveys. Public Opinion Quarterly, 70, 646-675.
Groves, R. M., Cialdini, R. B., & Couper, M. P. (1992). Understanding the decision to participate in a survey. Public Opinion Quarterly, 56, 475-495.
Groves, R. M., & Couper, M. P. (1998). Nonresponse in household interview surveys. New York: John Wiley & Sons.
Hirschman, C., Alba, R., & Farley, R. (2000). Meaning and measurement of race in the U.S. Census: Glimpses into the future. Demography, 37, 381-393.
Johnson, T., O'Rourke, D., Burns, J., & Owens, L. (2002). Culture and survey nonresponse. In R. M. Groves, D. A. Dillman, J. L. Eltinge, & R.J.A. Little (Eds.), Survey nonresponse (pp. 55-69). New York: John Wiley & Sons.
Keeter, S., Kennedy, C., Dimock, M., Best, J., & Craighill, P. (2006). Gauging the impact of growing nonresponse on estimates from a national RDD telephone survey. Public Opinion Quarterly, 70, 759-779.
Kim, S. Y., Tucker, M., Danielson, M., Johnson, C., Snesrud, P., & Shulman, H. (2008). How can PRAMS survey response rates be improved among American Indian mothers? Data from 10 states. Maternal and Child Health Journal, 12, S119-S125.
Liao, Y., Tucker, P., Okoro, C. A., Giles, W. H., Mokdad, A. H., & Harris, V. B. (2004). REACH 2010: Surveillance for health status in minority communities, United States, 2001-2002 (Vol. 53). Atlanta: Centers for Disease Control and Prevention.
Ma, S. (2008). Paternal race/ethnicity and birth outcomes. American Journal of Public Health, 98, 2285-2292.
Marks, E. L., Rhodes, B. B., & Scheffler, S. (2008). SEED for Oklahoma Kids: Baseline analysis. Research Triangle Park, NC: RTI International.
Nicolaidis, C., Ko, C. W., Saha, S., & Koepsell, T. D. (2004). Racial discrepancies in the association between paternal vs. maternal educational level and risk of low birthweight in Washington state. BMC Pregnancy and Childbirth, 4(10).
Oropesa, R. S., & Landale, N. S. (2002). Nonresponse in follow-back surveys of ethnic minority groups: An analysis of the Puerto Rican maternal and infant health study. Maternal and Child Health Journal, 6, 49-58.
Powers, D. A., & Xie, Y. (1999). Statistical methods for categorical data. San Diego: Academic Press.
Sales, A. E., Plomondon, M. E., Magid, D. J., Spertus, J. A., & Rumsfeld, J. S. (2004). Assessing response bias from missing quality of life data: The Heckman method. Health and Quality of Life Outcomes, 2, 49.
Shapiro, T. M. (2004). The hidden cost of being African American: How wealth perpetuates inequality. New York: Oxford University Press.
U.S. Census Bureau, Geography Division. (2011). ZIP code tabulation areas (ZCTAs). Retrieved from http://www.census.gov/geo/ZCTA/zcta.html
Yancey, A. K., Ortega, A. N., & Kumanyika, S. K. (2006). Effective recruitment and retention of minority research participants. Annual Review of Public Health, 27, 1-28.
Zager, R., Kim, Y., Nam, Y., Clancy, M., & Sherraden, M. (2010). The SEED for Oklahoma Kids experiment: Initial account opening and savings (CSD Research Report 10-14). St. Louis: Washington University, Center for Social Development.
Zaslavsky, A. M., Zaborski, L., & Cleary, P. D. (2002). Factors affecting response rates to the consumer assessment of health plans study survey. Medical Care, 40, 485-499.
Yunju Nam, PhD, is assistant professor, School of Social Work, State University of New York, University at Buffalo, 685 Baldy Hall, Buffalo, NY 14260-1050; e-mail: email@example.com. Lisa Reyes Mason, MSW, is research associate, Center for Social Development (CSD), George Warren Brown School of Social Work, Washington University in Saint Louis. Youngmi Kim, PhD, is assistant professor, School of Social Work, Virginia Commonwealth University. Margaret Clancy, MSW, is policy director, and Michael Sherraden, PhD, is Benjamin E. Youngdahl Professor of Social Development, CSD, George Warren Brown School of Social Work, Washington University in Saint Louis. Support for SEED for Oklahoma Kids comes from the Ford Foundation, Charles Stewart Mott Foundation, and Lumina Foundation for Education. The authors especially value their partnership with the state of Oklahoma: Scott Meacham, state treasurer; Tim Allen, deputy treasurer for policy and administration; James Wilbanks, former director of revenue and fiscal policy; Jeremy Hood, former intern; Kelly Baker, Derek Pate, and Sue Mallonee, Oklahoma State Department of Health; Tony Mastin, Oklahoma Tax Commission administrator; and James Conway, program administrator for information services, Family Support Services Division, Oklahoma Department of Human Services. They appreciate the contributions of staff at RTI International, especially those of Ellen Marks, Bryan Rhodes, and Jun Liu. The Oklahoma College Savings Plan program manager, TIAA-CREF, has been a valuable partner. The authors extend particular thanks to Kerry Alexander, Katrina Moore, Allison Ziegler, and Toniann Nastasi at TIAA-CREF. They also thank Julia Stevens and Carrie Freeman, at CSD, for their editing assistance.
Table 1: Sample Characteristics, by Race and Hispanic Origin (N = 7,111)

Characteristic | White Americans | African Americans | American Indians | Hispanics | Total
Infant characteristics
Male (%) | 52.02 | 52.16 | 51.22 | 54.47 | 52.34
Birthweight (in grams, M)*** | 3,277 | 3,054 | 3,305 | 3,297 | 3,249
Caregiver characteristics
Age (M) | 26.11 | 24.37 | 24.46 | 25.19 | 25.34
Married (%)*** | 68.47 | 27.81 | 47.01 | 52.02 | 54.59
Education (%)***
  No diploma | 15.01 | 22.49 | 25.02 | 49.73 | 24.55
  High school diploma | 35.45 | 40.15 | 41.25 | 32.93 | 36.87
  Some college | 24.26 | 24.34 | 21.55 | 10.62 | 21.25
  College degree | 24.82 | 12.26 | 11.88 | 5.50 | 16.71
  Missing | 0.46 | 0.76 | 0.30 | 1.22 | 0.62
Native born (%)*** | 95.44 | 95.35 | 99.26 | 37.51 | 85.49
Oklahoma born (%)*** | 61.21 | 67.62 | 83.84 | 16.58 | 58.37
Other factors
Father's information missing (%)*** | 11.21 | 37.19 | 17.42 | 18.11 | 17.99
Area (%)***
  Metropolitan | 26.62 | 66.86 | 16.01 | 59.66 | 37.38
  Mixed | 65.44 | 29.08 | 74.91 | 31.93 | 55.03
  Nonmetropolitan | 7.94 | 4.06 | 9.08 | 8.40 | 7.59
Recruited at stage 2 (%)*** | 49.08 | 48.52 | 56.16 | 49.35 | 50.39
Birth to contact (M) | 50.41 | 49.51 | 50.93 | 51.08 | 50.48
n | 3,264 | 1,183 | 1,355 | 1,309 | 7,111

Notes: In testing the statistical significance of differences among the four groups, we used the Pearson chi-square statistic for categorical variables (for example, education) and analysis of variance for continuous variables (for example, birthweight). Asterisks indicate the level of statistical significance among the four groups.
*** p < .01.
Table 2: Response Status, by Race and Hispanic Origin (N = 7,111)

Race and Hispanic origin | Unlocated cases (A) | Located noncollaborators (B) | Respondents (C) | Location rate, % [(B+C)/(A+B+C)] | Collaboration rate, % [C/(B+C)] | Response rate, % [C/(A+B+C)]
White | 364 | 1,651 | 1,249 | 88.85 | 43.07 | 38.27
African American | 209 | 503 | 471 | 82.33 | 48.36 | 39.81
American Indian | 192 | 644 | 519 | 85.83 | 44.63 | 38.30
Hispanic | 216 | 629 | 464 | 83.50 | 42.45 | 35.45
Full sample | 981 | 3,427 | 2,703 | 86.20 | 44.09 | 38.01
Pearson chi-square (df = 3) | | | | 42.31 (p = .00) | 9.75 (p = .02) | 5.42 (p = .14)

Note: In testing the statistical significance of differences among the four groups, we used the Pearson chi-square statistic for categorical variables.

Table 3: Heckman Probit Results: Probability of Collaboration Conditional on Being Located

Variable | Located, Coeff. (SE) | Collaborated, Coeff. (SE)
Infant characteristics
Race (white Americans)
  African American | -0.166*** (0.058) | 0.084 (0.052)
  American Indian | -0.150*** (0.055) | 0.046 (0.045)
  Hispanic | -0.129* (0.069) | 0.066 (0.058)
Gender (female)
  Male | 0.037 (0.039) | 0.031 (0.031)
Birthweight (logarithm) | 0.140* (0.084) | 0.023 (0.074)
Caregiver's characteristics
Age | 0.007* (0.004) | 0.002 (0.003)
Married | 0.139*** (0.049) | 0.027 (0.041)
Education (no diploma)
  High school diploma | 0.013 (0.050) | -0.023 (0.043)
  Some college | 0.064 (0.061) | -0.000 (0.051)
  College degree | 0.296*** (0.076) | 0.209*** (0.058)
  Missing | 0.340 (0.259) | 0.063 (0.199)
Native born | 0.053 (0.077) | 0.167*** (0.060)
Other factors
Father's information missing | -0.041 (0.055) | 0.063 (0.048)
Area (metropolitan)
  Mixed | 0.068 (0.045) | 0.020 (0.037)
  Nonmetropolitan | 0.149* (0.079) | -0.011 (0.064)
Recruited at stage 2 | 0.685*** (0.041) | -0.174*** (0.060)
Oklahoma-born caregiver | 0.038 (0.045) |
Log (birth to contact) | -0.042 (0.028) |
Constant | -1.200* (0.700) | -0.520 (0.637)
rho | 0.741* (0.243)
Log likelihood = -6770.81; Wald chi-square(16) = 64.59***

Notes: N = 7,111. Coeff. = coefficient. Reference groups are in parentheses.
* p < .10. *** p < .01.
Author: Nam, Yunju; Mason, Lisa Reyes; Kim, Youngmi; Clancy, Margaret; Sherraden, Michael
Publication: Social Work Research
Date: March 1, 2013