Do ABCs get more citations than XYZs?
An established strand of literature in both psychology and behavioral economics suggests that there are primacy effects on choice, that is, options being first or early in a sequence are more likely to receive attention and to be chosen (Becker 1954; Berg et al. 1955; Carney and Banaji 2012; Coney 1977; Dean 1980; Mantonakis et al. 2009; Miller and Krosnick 1998; Waltman 2012). This sort of influence on attention has been noticed and heavily exploited in the realm of advertising. One example raised by Einav and Yariv (2006) is that the 2003-2004 Los Angeles Westside Yellow Pages reveal more than 450 listed businesses with names containing a redundant initial A, as in "A-Approved Chimney Services," "A Any Way Bail Bonds," "A Budget Move," and the like.
Given that a large number of scientific research papers order their references alphabetically by first authors' surnames (Harvard-style references) (1) and that many academic conferences order their attendees in the same way, it is natural to ask whether, and to what extent, papers with first authors whose surname initials come earlier in the alphabet are more likely to be cited (the alphabetical order effect, or alphabetical bias, hereafter), and whether the order in reference lists contributes to this bias. The importance of this question is underscored by the fact that the number of citations is generally treated as an objective measure of paper quality. Indexes or measures based on citations (e.g., the H-index) are used as important references to determine everything from hiring and tenure (Ellison 2013), to wages and earnings (Diamond 1986; Hilmer, Hilmer, and Ransom 2011), to scholarship (Hamermesh, Johnson, and Weisbrod 1982; Smart and Waldfogel 1996), as well as society membership and grant funding (Berger 2013; Garfield 1999). If the alphabetical bias really exists, measures based on citations have to be reinterpreted because they are contaminated by factors other than paper quality. In addition, previous economic literature has documented that professors with earlier surname initials in the alphabet are more likely to achieve academic success (Einav and Yariv 2006; Praag and Praag 2008), and a number of studies attribute this phenomenon to the alphabetical ordering of authorship, because first authors receive most of the credit (Waltman 2012). (2) But the order in reference lists is mostly ignored (Einav and Yariv 2006; Joseph, Laband, and Patil 2005). Establishing an alphabetical primacy effect in citations thus provides another potential explanation. Finally, answering the question also helps to deepen our understanding of individual behavior, as there is little literature on how serial position affects choice after delay, especially in situations where people browse through lists without an explicit choice goal in mind. (3)
Partly due to the lack of a reliable data set on publications, no previous literature directly answers these questions, apart from a few related studies. Einav and Yariv (2006) collected data on professors in the top 30 economics departments in the United States and mentioned the effects of alphabetical order in reference lists on academic success, but they did not provide empirical evidence for it. Berger (2013) collected field data from a psychology journal, which showed that articles listed in earlier positions in a journal are cited more frequently, but he did not consider the effect of alphabetical order.
Using a unique paper sample of about 850,000 U.S. scientific articles in the Thomson-Reuters Web of Science (WoS) database, this study aims to shed light on the questions above. In this study, estimates show that the papers with first authors whose surname initials appear earlier in the alphabet get more citations, controlling for authors' likely ethnicities, number of references, number of addresses, and journal-year fixed effects. The estimates show that shifting surname initial from the bottom of the alphabet to the top is associated with a 0.44 percentile increase in rank of citations among the papers published in the same year, indicating a sizable alphabetical bias. However, no evidence supports the alphabetical order effect for non-first authors under the same settings.
One potential explanation is that order in reference lists plays a role in it because references are usually ordered alphabetically by first-author surnames in many scientific journals (Harvard style). Thus, I link the alphabetical order effect to the length of reference lists. The hypothesis is that readers would be less patient and more likely to stop without reaching the ends when scanning through longer reference lists. The empirical results show that the association of length of reference lists with alphabetical order effects mainly exists among the fields with stronger alphabetical bias. Although these results cannot rule out other possible unobservable factors potentially relevant or leading to alphabetical bias (e.g., individual characteristics correlated with surname initials), the order in reference lists at least partly contributes to that.
Finally, if alphabetical order matters when researchers search through reference lists, the alphabetical order effect should exist only in citations by authors other than those of the original cited papers, because researchers are quite unlikely to find their own previous papers by searching through reference lists. To test this, I match the cited paper sample to a pool of citing papers to divide total citations into self-citations (citations by the authors themselves) and other-citations, and find that the alphabetical order of first authors' initials is negatively associated only with the number of other-citations. Furthermore, to identify the alphabetical order effects that originate in the reference lists of citing papers, I use a difference-in-difference-in-difference econometric framework to rule out other confounding factors such as the quality of citing papers. The results provide further suggestive evidence that the length of reference lists in citing papers matters for the alphabetical order effects.
The paper is organized as follows. Section II introduces the data used in this study. Section III shows empirical results and Section IV concludes.
A. Sample Restriction and Author Ethnicities
The data used in this study are from the Thomson-Reuters WoS database, one of the two major sources of bibliometric material on scientific publications, for the years 1990-2005. The database covers articles published in more than 12,000 scientific journals and provides the number of citations until 2009 along with the related information needed in this study. (4)
First, I restrict the sample to two-, three-, and four-authored papers because (1) 65% of all papers with multiple authors have two, three, or four authors and (2) the authors who are not in the first position make a natural control group. In addition, I restrict the sample to papers in which all authors had U.S. addresses, because U.S.-based authors can be matched more accurately with their ethnicities through their last names and U.S. MSAs (metropolitan statistical areas) in the ethnicity-matching program, as described below.
Surname initials are likely to be correlated with ethnicities, and ethnicities are also related to scientific productivity measured by impact factor, citations, and number of papers (Gaule and Piacentini, forthcoming; Huang 2013a, 2013b). To address this problem, I use William Kerr's name-ethnicity-matching program to determine the likely ethnicity of authors (Kerr 2008; Kerr and Lincoln 2010). This program uses names and the MSAs in which individuals live to determine their likely ethnicity. MSAs are used in the matching because persons of a particular ethnicity live disproportionately in some MSAs, so this information helps distinguish ethnicity. There are nine categories: Chinese (CHN), Anglo-Saxon/English (ENG), European (EUR), Indian/Hindi/South Asian (HIN), Hispanic/Filipino (HIS), Japanese (JAP), Korean (KOR), Russian (RUS), and Vietnamese (VNM). For a particular author, this program provides a probability distribution over the nine groups. The matching rate for individual authors is about 82%, and I further restrict the analysis to the papers with both first and last authors' ethnicities identified. (5) Finally, the sample I use has 846,122 papers in 12 fields: multidisciplinary (2%), agriculture (11%), biology (7%), biomedicine (26%), chemistry (11%), clinical medicine (31%), engineering (8%), geosciences (4%), information computer technology (6%), material science (3%), mathematics (3%), and physics (10%). (6)
Table A1 (in the Appendix) shows how the number of papers is distributed by the initial of the first or last author's surname, and the conditional probability distribution among the nine ethnicities given a particular surname initial. The number of papers written by authors with I, Q, U, and X initials is small (fewer than 6,000), while authors with surnames beginning with S are the most common. The statistics in Table A1 clearly show that surname initials are correlated with ethnicities. For example, more than half of the authors with X, Y, and Z initials are Chinese, even though less than 20% of the papers are written by Chinese authors.
B. Citation Measures
The number of citations for each paper is provided in WoS, which sums all citations in all WoS publications until 2009. Because the number of citations follows a power law distribution (Gupta, Campanha, and Pesce 2005; Redner 2005), using the number of citations directly in linear regression models would raise potential econometric problems, as the few frequently cited papers may drive the estimates. To address this problem, I use citation rank instead. For each publication year, I define the citation rank of a paper with $c$ citations as $(N_{\text{citation}<c} / N_{\text{total}}) \times 100$, in which $N_{\text{citation}<c}$ is the number of papers with fewer than $c$ citations and $N_{\text{total}}$ is the total number of papers published in the same year. This measure ranges from 0 to nearly 100. For a particular paper, it can be understood as the proportion of articles published in the same year that have fewer citations, in percentage terms.
In addition, the results remain robust and consistent when using the number of citations directly, log(number of citations + 1), or the inverse hyperbolic sine transformation of citations as the dependent variable.
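The citation measures described in this subsection can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the tuple layout of the input, and the toy citation counts are hypothetical and are not taken from the WoS data or from the code used in this study.

```python
import math
from bisect import bisect_left
from collections import defaultdict


def citation_ranks(papers):
    """Percentile citation rank within publication year.

    papers: list of (paper_id, year, citations) tuples (hypothetical layout).
    Returns {paper_id: 100 * (papers in same year with fewer citations) / (papers in year)}.
    """
    by_year = defaultdict(list)
    for _, year, cites in papers:
        by_year[year].append(cites)
    for counts in by_year.values():
        counts.sort()

    ranks = {}
    for pid, year, cites in papers:
        counts = by_year[year]
        # bisect_left gives the number of entries strictly smaller than `cites`
        n_less = bisect_left(counts, cites)
        ranks[pid] = 100.0 * n_less / len(counts)
    return ranks


def log_plus_one(citations):
    """log(number of citations + 1), one alternative dependent variable."""
    return math.log(citations + 1)


def inverse_hyperbolic_sine(citations):
    """asinh(c) = log(c + sqrt(c^2 + 1)), another alternative dependent variable."""
    return math.asinh(citations)


# Invented toy data: four papers from 2000 and one from 2001.
toy = [("p1", 2000, 0), ("p2", 2000, 3), ("p3", 2000, 3),
       ("p4", 2000, 250), ("p5", 2001, 7)]
ranks = citation_ranks(toy)
```

In the toy data, the heavily cited paper `p4` receives a rank that is bounded by 100 regardless of how large its raw count is, which is the reason a rank-based measure is less sensitive to the long right tail of citation counts.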
III. EMPIRICAL RESULTS
A. Citations and Author Surname Initials
I assess the alphabetical order effect on citation by estimating the following equation:
(1) $\text{Citation}_{ij} = \beta_0 + \beta_1 \, \text{initial}^{\text{first}}_i + \beta_2 \, \text{initial}^{\text{last}}_i + \theta X_{ij} + \gamma_{njt} Z_{njt} + \varepsilon_i$
in which $\text{Citation}_{ij}$ is the measure of citations for paper i in journal j, which may be the citation rank mentioned earlier, or other variables such as whether paper i has any citation, whether its citations exceed the median level, and the like. The variable $\text{initial}^{\text{first}}_i$ is the surname initial order for the first author, and $\text{initial}^{\text{last}}_i$ is that for the last author. They may be the order of the initial letter in the alphabet (1-26) or sets of categorical variables, such as initial letter dummies or groups, that capture a nonlinear relationship between initial alphabetical order and citations. The coefficients on them, $\beta_1$ and $\beta_2$, are of main interest in this study because they capture the alphabetical order effect in the first-author position and the last-author position, respectively.
Because previous literature (Freeman et al. 2014; Freeman and Huang, forthcoming) finds that papers with more addresses or references gain more cites, I control for a set of covariates, $X_{ij}$, in Equation (1), including the likelihood of each author's ethnicity, categorical dummies and a linear term for the number of references, dummies for the number of addresses, and states. The addresses in WoS are provided at the organization level for each publication. The number of addresses is the number of different addresses in a paper, which is a measure of the diversity of authors across institutes or organizations.
Considering that the quality of a journal may change over time, and that the distribution of author numbers may differ across journal-year cells, I control for a group of indicators, $Z_{njt}$, consisting of the journals, publication years (1990-2005), number of authors (two-, three-, and four-authored), and all of their interactions, to capture these effects. Because field and subfield are defined at the journal level, the author number-year-journal fixed effects completely capture the variation across fields and publication years. This framework thus identifies the author surname initial order effect within each author number-year-journal cell; $\varepsilon_i$ is the error term.
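To make the within-cell identification concrete, the following minimal Python sketch removes cell means from both the outcome and a single regressor and then computes the OLS slope, which is numerically equivalent to including a dummy for every cell. It is a deliberate simplification: it keeps only one regressor (the first-author initial order) and omits the last-author initial and the covariates in Equation (1). The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def within_cell_slope(rows):
    """OLS slope of y on x after absorbing cell fixed effects.

    rows: iterable of (cell, x, y), where `cell` is any hashable key such as
    (n_authors, journal, year), x is the initial order (1-26), and y is the
    citation rank. Demeaning within cells is equivalent to adding cell dummies.
    """
    rows = list(rows)
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum of x, sum of y, count
    for cell, x, y in rows:
        t = totals[cell]
        t[0] += x
        t[1] += y
        t[2] += 1

    sxy = 0.0
    sxx = 0.0
    for cell, x, y in rows:
        sum_x, sum_y, n = totals[cell]
        x_dev = x - sum_x / n  # deviation from the cell mean of x
        y_dev = y - sum_y / n  # deviation from the cell mean of y
        sxy += x_dev * y_dev
        sxx += x_dev * x_dev
    if sxx == 0.0:
        raise ValueError("no within-cell variation in x")
    return sxy / sxx


# Invented toy data: two cells with very different citation levels but the
# same within-cell slope of -0.5.
toy_rows = [
    (("2-author", "journal A", 1995), 1, 60.0),
    (("2-author", "journal A", 1995), 3, 59.0),
    (("2-author", "journal A", 1995), 5, 58.0),
    (("3-author", "journal B", 2001), 2, 19.0),
    (("3-author", "journal B", 2001), 4, 18.0),
    (("3-author", "journal B", 2001), 26, 7.0),
]
slope = within_cell_slope(toy_rows)
```

Because the level difference between the two cells is absorbed by the cell means, the estimate reflects only how citation rank varies with initial order among papers in the same cell.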
Table 1 reports the estimates of Equation (1) and the mean values of dependent variables. Estimates in the first column show that the alphabetical order of first author surname initials is negatively associated with citation rank, but no analogous effect is present for the last author. Specifically, the citation rank in percentile increases by about 0.02 if the initial letter of the first author's surname moves one position earlier in the alphabet. Results in column 2, where I use category indicators of author surname initials, are similar and consistent: citation rank would increase by 0.34 if the initial of the first author's surname changed from the U to Z range to the A to F range. The signs of other covariates are consistent with the previous literature, though not reported here (Freeman et al. 2014; Freeman and Huang, forthcoming; Lortie et al. 2013; Lozano, Lariviere, and Gingras 2012; Vanclay 2013).
However, last authors may not be a comparable group because they are more likely to be senior while first authors tend to be junior academics, as shown in Figures A1 and A2. (7) It is possible that the alphabetical bias identified in columns 1 and 2 in Table 1 is driven by the junior status of first authors and that the insignificance of last author initials is simply a result of seniority in some way. Therefore, I use second authors as another reference group, as second authors are also more likely to be junior and are more similar to first authors. Figures A1 and A2 provide some supporting evidence for this, showing that the proportion of new names and the average impact factor are similar between first and second authors in three- and four-authored papers. I thus restrict the sample to three- and four-authored papers, and then add the variables for surname initials of the second author in the regressions, with the results in columns 3 and 4 in Table 1. The coefficients on second author initial order or categorical variables are small and insignificant, similar to those on last authors' surname initials, implying that there is no alphabetical order effect for authors in the second position, either.
Furthermore, I use dummies for the initial letters of first and last authors' surnames, estimate the coefficients, and report them in Figures 1A and 1B, respectively. The pattern in Figure 1A clearly shows that the alphabetical order of surname initials is negatively associated with citation rank for first authors. In contrast, Figure 1B, where the initials of last authors' surnames are used, shows a fairly flat pattern except for a couple of outliers, indicating that there is not much difference between them. The scales in Figures 1A and 1B are different because of outliers (V, X, and Z in Figure 1A and Q, U, X, and Z in Figure 1B). When I drop the outliers and draw a similar figure, the pattern is the same. Although the exact reason for the outliers is not clear, the small sample size for these letters may contribute to it. In addition, names starting with these letters are more likely to be Chinese or European (see Table A1), and so the coefficients may also capture some ethnicity effects. Figure 1A shows that papers written by those with Z initials are cited most frequently, which may be caused by recency effects (de Bruin 2005; Houston and Sherman 1989; Li and Epley 2009; Nisbett and Wilson 1977), which imply that the papers listed later or last in reference lists may be cited more often.
Table A2 shows that the results are robust across different author-number samples, and are consistent with the estimates in Table 1.
B. Surname Initials and Citations over Time
As discussed earlier, the alphabetical order of first author surname initials matters for citations. But how does this effect change over time? It is reasonable to expect that a paper, once cited, is more likely to be cited in the future by the same researchers due to familiarity and by other researchers because it appears more often in the literature. So the alphabetical bias should be stronger for papers published earlier.
Because WoS provides the number of citations in the first year, 2-4 years, 5-10 years, and 11-20 years, respectively, I estimate the relationship between author surname initials and citations in different time periods. I test the above hypothesis by using the papers published no later than 2002 and those no later than 1997, and report the results in Table 2. Column 1 shows that the alphabetical order effect exists in both subsamples, and that the earlier the papers are, the larger the magnitude. Columns 2 and 3 present the results when the dependent variable is the rank of first-year citations and the rank of 2-4 years citations. Estimates suggest that alphabetical order effects exist in the 2-4 years citations rather than first-year citations. Estimates in columns 4 and 5 show that the alphabetical bias is strengthened in later years. Comparing column 3 and column 4, the coefficients on first author initials change from -0.015 and -0.022 to -0.028 and -0.039, respectively. Estimates in column 5 further support this claim. Furthermore, the estimates in columns 3-5 are robust to controlling for first-year citations, though the results are not reported. These results indicate that the alphabetical order effects can only be partly explained by first-year citations. In addition, all coefficients on last author initials are smaller in magnitude and insignificant, which is fairly consistent with the results in Table 1.
The above results show that the alphabetical bias exists for the first authors and for citations with at least one year's span. One of the potential explanations originates in order of references, because many journals order their references using the initials of surnames of first authors, and the first-year citations may be mainly from reading the paper directly rather than citing from a reference list in another paper. It is possible that researchers may scan the reference lists from top to bottom to find something they are interested in, and cite one or two relevant references to support some claim or assumption needed in their own papers. Therefore, those papers with first authors' surname initials being earlier letters are more likely to attract attention and get more cites because they are usually listed on the top of the reference lists. Because the journals do not order references by the surnames of non-first authors, the alphabetical bias should only exist in first-author position, and this is fairly consistent with the results above.
However, not all journals or articles order their references in Harvard style. Science, for example, is a Vancouver-style journal. The presence of these journals in the sample casts some doubt on the above interpretation of the alphabetical bias. Unfortunately, WoS does not provide information on how articles order their references. It is possible that the alphabetical order effect identified above does not originate from reference lists and is instead caused by unobservable factors. For example, if students whose names come earlier in alphabetical order get more attention or better treatment from professors than those who come at the end, they may be more successful in their publishing efforts and get more citations.
C. Alphabetical Bias and Length of Reference Lists
This section shows that at least part of the alphabetical bias comes from the order in reference lists by investigating the relationship between the length of reference lists and the alphabetical order effect. If the reference lists of potential cited papers are longer, readers are less likely to read through the references completely, and the alphabetical order effect, if it exists, should be stronger. I test this hypothesis in three steps. First, for the papers in each field and publication year, I derive the ordinary least squares (OLS) estimates for alphabetical bias by estimating Equation (1). The mean of these coefficients is also negative, -0.018, and almost the same as the estimate in Table 1, which is -0.017. Second, I calculate the mean number of references for the papers in each field and publication year, and plot these means over publication year in Figure 2. It shows that the length of reference lists is increasing over time, which is consistent with Althouse et al. (2008). Third, I estimate the following equation to investigate the relationship between reference list length and alphabetical bias:
(2) $\text{coef}_{ft} = \gamma_0 + \gamma_1 \, \text{ref}_{ft} + \gamma_f \, \text{field}_f + \gamma_t \, \text{year}_t + \delta_f (\text{field}_f \times \text{year}_t) + e_{ft}$
in which the dependent variable, $\text{coef}_{ft}$, is the coefficient derived for field f and publication year t. The more negative it is, the larger the alphabetical bias. The key independent variable, $\text{ref}_{ft}$, is the mean number of references. The coefficient on it, $\gamma_1$, shows the association between the alphabetical bias and the length of reference lists. Field fixed effects and year fixed effects are captured by the field dummies, $\text{field}_f$, and publication year dummies, $\text{year}_t$. In addition, I allow for a different time trend in each field by controlling for the field-specific linear trend, $\text{field}_f \times \text{year}_t$, as there may be some time trend heterogeneity across fields in the length of reference lists and/or the alphabetical bias; $e_{ft}$ is the error term.
Table 3 reports the OLS point estimates of $\gamma_1$. Estimates in Panel A are unweighted, and the results in Panel B are weighted by the number of papers in the corresponding field and publication year. As there are 12 fields and 16 publication years, there are 192 observations in total. Both panels show consistent results. The first column shows that the alphabetical bias is significantly negatively correlated with the length of reference lists, indicating that the alphabetical order effect is significantly stronger in the fields with longer reference lists. I also add the lagged term in the second column, finding that the overall effects are still negative and that the coefficients on the lagged term are insignificant. I also test whether this association exists in the fields with strong alphabetical bias or in those with a weak alphabetical order effect. I drop the observations with weak alphabetical bias (i.e., drop those with $\text{coef}_{ft}$ larger than 0.05), repeat the estimation, and find that the estimated $\gamma_1$ is still significantly negative. However, when I run the regression after dropping those with $\text{coef}_{ft}$ smaller than -0.05, the estimated $\gamma_1$ becomes insignificant and smaller in magnitude. So the results are consistent with the expectation that the association between alphabetical bias and reference list length mainly exists in those fields with a strong alphabetical order effect.
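The difference between the unweighted estimates (Panel A) and the estimates weighted by the number of papers (Panel B) can be illustrated with a minimal weighted least squares sketch in Python. This is an assumption-laden simplification: it regresses the field-year coefficient on mean reference list length only, and omits the field dummies, year dummies, and field-specific trends in Equation (2). The function name and all numbers are invented for illustration.

```python
def weighted_slope(x, y, weights=None):
    """Slope from a (weighted) least squares regression of y on x with an intercept.

    x: mean number of references per field-year cell (hypothetical values)
    y: estimated alphabetical-bias coefficient per cell (hypothetical values)
    weights: number of papers per cell; None means every cell counts equally.
    """
    if weights is None:
        weights = [1.0] * len(x)
    total_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_w
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    return sxy / sxx


# Invented toy cells: longer reference lists go with more negative coefficients.
mean_refs = [20.0, 30.0, 40.0]
bias_coef = [-0.01, -0.02, -0.06]

unweighted = weighted_slope(mean_refs, bias_coef)
# Putting all weight on the first two cells changes the estimate, which is why
# the two panels need not coincide.
reweighted = weighted_slope(mean_refs, bias_coef, weights=[1.0, 1.0, 0.0])
```

A negative slope in either version corresponds to a stronger alphabetical bias in cells with longer reference lists.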
D. Self-Citations and Other-Citations
If the order of reference lists plays an important role in the alphabetical bias, it would be helpful to distinguish between self-citations and other-citations. Researchers normally know what they have published in the past and will cite their own papers when necessary and relevant. In this process, authors do not need to go through articles by others or scan reference lists in them, and thus such search processes are not relevant to self-citations. Therefore, alphabetical order effect is expected to exist in other-citations, rather than self-citations.
WoS provides a unique identifier for each paper and information about which papers cite which others. I match the cited paper sample used in Table 1 to a citing paper pool, which is composed of U.S.-based two-, three-, and four-authored papers published during 1990-2008. These citing papers have the same information as the sample used above. It turns out that 567,018 of 846,122 cited articles could be matched to at least one paper in the citing pool. The rest are not matched because they do not have any citations or because the papers citing them are not included in the citing paper pool. The first column in Table 4 shows how matched papers are distributed in our sample. Consistent with expectation, the papers written by authors with later initials in the alphabet are less likely to be matched, and papers with a higher impact factor, more references, or more addresses are more likely to be matched. The second column shows the associations of citation rank with author initial order and other covariates. The coefficient on first author surname initial order is -0.017, very close to the corresponding estimate in Table 1, that is, -0.018. In addition, the third column adds the number of citations in the first year after publication to the regression to test whether first-year citations matter for the alphabetical bias. The coefficient on first-year citations is positive and highly significant, but the coefficient on first author surname initials only changes from -0.017 to -0.016, implying that the alphabetical order effect is robust to controlling for first-year citations.
From each citing-cited paper pair, it is possible to see whether the names of any authors in the cited paper are the same as those in the citing paper. I define self-citations as any citations where the name of any author in the cited paper is the same as that of an author in the citing paper. All other citations are then treated as other-citations. Self-citations make up 24% of the sample. However, it should be noted that this definition may mistake some other-citations for self-citations because of ambiguous names: only last names and first name initials are provided in WoS, so different authors may share the same recorded name. For each cited article, I calculate the total number of matched citation pairs, self-citations, and other-citations, respectively. Then I rank the citations in the same way as in Section II.B. After that, I reestimate Equation (1) and report the coefficients on first author initials in Table 5A and Table 5B. The first columns in both tables show the results when the dependent variable is the rank of matched total citations. The coefficient on surname initial order is negative though not significant in column 1, but some evidence for alphabetical bias appears when using categorical indicators. Less variation in the dependent variable may contribute to the insignificance and small magnitude of the coefficients, since it is calculated from matched citations only. The next column in each table shows the results for self-citations. All the coefficients are insignificant and even positive, and these coefficients are likely to be underestimated due to the name ambiguity issue mentioned earlier. Thus, there is no evidence supporting the alphabetical bias in self-citations. Column 4 presents the results for other-citations in both tables. The coefficient on surname initial order is negative and significant at the 10% significance level, and those on categorical variables are also negative and jointly significant. These estimates thus provide evidence for the alphabetical bias in other-citations only, which is consistent with the hypothesis mentioned earlier.
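The classification rule for self-citations and other-citations described above can be sketched as follows. This is a minimal illustration, assuming authors are recorded as (last name, first name) pairs and compared on last name plus first initial only; the function names, data layout, paper identifiers, and author names are all hypothetical.

```python
from collections import Counter


def name_key(last_name, first_name):
    """Reduce a name to (last name, first initial), mimicking the limited name
    information described in the text. Different people can share a key."""
    return (last_name.strip().lower(), first_name.strip()[:1].lower())


def split_citations(pairs, authors):
    """Count self-citations and other-citations for each cited paper.

    pairs: list of (citing_id, cited_id)
    authors: {paper_id: [(last_name, first_name), ...]}
    A pair is a self-citation if any author key appears in both papers.
    """
    self_counts = Counter()
    other_counts = Counter()
    for citing_id, cited_id in pairs:
        citing_keys = {name_key(last, first) for last, first in authors[citing_id]}
        cited_keys = {name_key(last, first) for last, first in authors[cited_id]}
        if citing_keys & cited_keys:
            self_counts[cited_id] += 1
        else:
            other_counts[cited_id] += 1
    return self_counts, other_counts


# Invented toy example. "John Smith" and "Jane Smith" are different people but
# collapse to the same key ("smith", "j"), so paper B's citation of paper A is
# counted as a self-citation: the false positive discussed in the text.
toy_authors = {
    "A": [("Adams", "Mary"), ("Smith", "John")],
    "B": [("Smith", "Jane"), ("Lee", "Kim")],
    "C": [("Zhang", "Li"), ("Brown", "Tom")],
}
toy_pairs = [("B", "A"), ("C", "A")]
self_counts, other_counts = split_citations(toy_pairs, toy_authors)
```

The per-paper counts returned here are the quantities that would then be converted to ranks before reestimating Equation (1).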
Furthermore, by making use of the detailed information of the citing papers provided in WoS, I am able to provide some suggestive evidence that the order in reference lists may contribute to the alphabetical order effect. To rule out other confounding factors, I use the following difference-in-difference-in-difference framework to identify the alphabetical order effects originated from length of reference lists in citing papers:
(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
where $\text{Citations}^{\text{obs}}_{ij}$ is the observed citations for paper i published in journal j, $\text{IF}^{\text{citing}}_i$ is the demeaned impact factor of the papers observed citing paper i in the matched pool, and $\text{Ref}^{\text{citing}}_i$ is the demeaned number of references in the papers observed citing paper i. The reference list length of citing papers is added as a proxy for the potential intensity of reference list alphabetical order effects from citing papers. Because the number of references is positively associated with the number of citations, the impact factor of the journals where citing papers are published is controlled for to capture the "announcement" effects of being cited by other highly influential papers. The coefficient of main interest is $\delta_7$, which captures the alphabetical bias associated with reference list length, conditional on the impact factor of these papers. Estimates are reported in columns 3 and 5 in Tables 5A and 5B. The third columns in both tables show that the length of reference lists in citing papers is not associated with self-citations. The estimated $\delta_7$ is 0.003 (0.017) in Table 5A, positive and insignificant, which is consistent with the estimates in Table 5B. However, the fifth column in both tables shows that the alphabetical order effect is stronger when the reference lists in citing papers are longer, and the coefficients on the initial order become smaller and insignificant, indicating that the length of reference lists in citing papers potentially plays an important role in the alphabetical bias.
IV. CONCLUSIONS AND DISCUSSIONS
Using a large sample from WoS, this study finds that articles with first authors whose surname initials come earlier in the alphabet are more likely to be cited, and are cited more frequently, and that the relation between surname initials and citations does not exist for the second- or last-author position. The results are robust across papers with different numbers of authors and at different citation levels. The alphabetical bias is stronger for papers published earlier. Our hypothesis for this phenomenon is that the order in reference lists may play an important role because many articles sort their reference lists alphabetically by the initials of the first authors' surnames.
In a later stage of analysis, this paper links the alphabetical bias to reference lists by establishing its relationship with the length of reference lists. The results show that the alphabetical bias is stronger when the reference list is longer, and that this length-bias association mainly exists in those fields with a relatively stronger alphabetical order effect, which implies that the order in reference lists is potentially important for the alphabetical bias.
Finally, by matching the sample to a citing paper pool, I divide the observed citations into self-citations and other-citations to further test the hypothesis. Because researchers can be expected to be intimately familiar with their own previous work, alphabetical order effect arising from searching behavior is not expected in self-citations. Estimates provide consistent results showing that alphabetical order effect applies only to other-citations, where people are likely to cite items after searching through the reference lists in relevant articles. By exploiting the detailed information of the citing papers provided in WoS, I use a difference-in-difference-in-difference framework to identify the alphabetical order effects originated from length of reference lists in citing papers to rule out other confounding factors. The estimates show the alphabetical order effect is stronger when the reference lists in citing papers are longer, indicating that the length of reference lists in citing papers potentially plays an important role in the alphabetical bias.
It should be noted that almost all the papers from WoS are in natural science fields, so the results should be extended to other fields with care. For example, can the results be extended to economics? In economics journals, almost all papers order their references by the first author's surname. The alphabetical order effect may thus be more serious there, and its influence on academic development may be larger. These expectations should be tested in future studies.
In addition, although the current results suggest that the order in reference lists matters for the alphabetical bias, I still cannot rule out other possibilities. There is no evidence that the order in reference lists is the only contributor to the alphabetical bias in the first-author position, and further studies may shed light on this issue.
Some policy implications follow from the results in this paper. Above all, the results indicate that citations can be influenced by the mere alphabetical order of first authors' surname initials, which casts doubt on the use of citation counts as objective and unbiased measures of scholarship. The results suggest that first-author surname initials should be controlled for when measuring the research contribution of a given article. In addition, because the alphabetical bias possibly arises from the order in reference lists, as many scientific journals sort their references by first authors' surnames, it would be fairer to all cited works to list them in order of appearance in the citing paper.
Finally, this study provides a new potential explanation for the observed phenomenon whereby professors with earlier initials in the alphabet are more likely to enjoy academic success. Owing to the wide use of citation counts as measures of scholarship, it is possible that junior academics with early surname initials are more likely to secure higher wages or earnings, tenure, and grants simply because their last names help their papers to be cited more frequently.
MSA: Metropolitan Statistical Areas
OLS: Ordinary Least Squares
WoS: Web of Science
APPENDIX

TABLE A1
Surname Initials and Ethnicities

Probability distribution among ethnicities: CHN = Chinese; ENG = Anglo-Saxon/English; EUR = European; HIN = Indian/Hindi/South Asian; HIS = Hispanic/Filipino; JAP = Japanese; KOR = Korean; RUS = Russian; VNM = Vietnamese.

Panel A: First author
Initial    Papers   CHN   ENG   EUR   HIN   HIS   JAP   KOR   RUS   VNM
Total     846,122  0.17  0.49  0.13  0.08  0.05  0.03  0.02  0.03  0.00
A          29,253  0.01  0.45  0.09  0.23  0.11  0.03  0.01  0.05  0.00
B          63,842  0.01  0.61  0.19  0.10  0.04  0.00  0.01  0.04  0.00
C          66,772  0.29  0.47  0.07  0.06  0.06  0.00  0.05  0.01  0.00
D          34,483  0.09  0.53  0.14  0.12  0.07  0.00  0.00  0.04  0.01
E          13,053  0.01  0.62  0.19  0.07  0.06  0.01  0.00  0.04  0.00
F          28,578  0.12  0.63  0.14  0.01  0.06  0.03  0.00  0.02  0.00
G          41,275  0.10  0.51  0.16  0.11  0.08  0.00  0.00  0.04  0.00
H          55,877  0.21  0.56  0.12  0.04  0.02  0.03  0.01  0.01  0.00
I           5,284  0.02  0.25  0.09  0.15  0.05  0.31  0.02  0.11  0.00
J          21,152  0.17  0.58  0.06  0.09  0.02  0.00  0.06  0.02  0.00
K          45,205  0.07  0.37  0.17  0.11  0.01  0.07  0.13  0.07  0.00
L          59,719  0.41  0.42  0.10  0.01  0.03  0.00  0.00  0.02  0.01
M          65,751  0.04  0.63  0.12  0.07  0.06  0.04  0.00  0.03  0.00
N          16,808  0.05  0.44  0.14  0.12  0.04  0.10  0.02  0.05  0.05
O          10,997  0.03  0.53  0.12  0.01  0.08  0.14  0.05  0.05  0.00
P          35,485  0.06  0.52  0.13  0.10  0.07  0.00  0.06  0.06  0.01
Q           3,201  0.65  0.21  0.03  0.04  0.07  0.00  0.00  0.00  0.00
R          38,663  0.01  0.57  0.18  0.11  0.08  0.00  0.01  0.03  0.00
S          82,240  0.10  0.50  0.15  0.12  0.04  0.03  0.02  0.04  0.00
T          26,186  0.21  0.45  0.09  0.04  0.04  0.09  0.00  0.04  0.03
U           1,977  0.03  0.30  0.18  0.12  0.08  0.21  0.02  0.07  0.00
V          10,544  0.01  0.23  0.30  0.20  0.17  0.00  0.00  0.08  0.02
W          48,268  0.33  0.54  0.11  0.00  0.00  0.01  0.00  0.01  0.00
X           5,743  1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
Y          16,368  0.64  0.12  0.01  0.01  0.01  0.09  0.09  0.02  0.00
Z          19,398  0.79  0.06  0.08  0.01  0.02  0.00  0.00  0.04  0.00

Panel B: Last author
Initial    Papers   CHN   ENG   EUR   HIN   HIS   JAP   KOR   RUS   VNM
Total     846,122  0.08  0.62  0.15  0.06  0.04  0.01  0.01  0.03  0.00
A          26,779  0.01  0.57  0.10  0.18  0.08  0.01  0.01  0.03  0.00
B          65,273  0.00  0.68  0.18  0.08  0.03  0.00  0.00  0.03  0.00
C          60,194  0.18  0.63  0.07  0.04  0.05  0.00  0.02  0.01  0.00
D          33,714  0.03  0.64  0.14  0.09  0.05  0.00  0.00  0.03  0.00
E          14,593  0.01  0.64  0.23  0.06  0.04  0.00  0.00  0.02  0.00
F          30,870  0.05  0.71  0.15  0.01  0.04  0.01  0.00  0.02  0.00
G          43,155  0.03  0.63  0.17  0.07  0.06  0.00  0.00  0.03  0.00
H          55,799  0.11  0.68  0.14  0.03  0.01  0.02  0.01  0.01  0.00
I           4,520  0.03  0.41  0.13  0.10  0.05  0.20  0.01  0.07  0.00
J          20,335  0.07  0.74  0.06  0.06  0.02  0.00  0.02  0.02  0.00
K          42,099  0.05  0.48  0.21  0.08  0.01  0.04  0.06  0.06  0.00
L          50,246  0.23  0.57  0.14  0.01  0.03  0.00  0.00  0.02  0.00
M          72,771  0.02  0.71  0.12  0.05  0.05  0.02  0.00  0.03  0.00
N          17,135  0.04  0.57  0.15  0.10  0.03  0.06  0.01  0.03  0.03
O          11,117  0.03  0.67  0.13  0.01  0.05  0.08  0.01  0.03  0.00
P          38,434  0.03  0.64  0.13  0.07  0.05  0.00  0.02  0.04  0.00
Q           1,891  0.39  0.41  0.06  0.04  0.09  0.00  0.01  0.00  0.00
R          45,604  0.01  0.65  0.19  0.07  0.05  0.00  0.01  0.02  0.00
S          92,851  0.05  0.60  0.19  0.09  0.03  0.02  0.01  0.03  0.00
T          27,506  0.13  0.59  0.10  0.04  0.04  0.04  0.00  0.04  0.02
U           2,045  0.02  0.42  0.23  0.10  0.11  0.08  0.00  0.04  0.00
V          11,080  0.00  0.30  0.33  0.17  0.13  0.00  0.00  0.06  0.02
W          54,173  0.16  0.69  0.14  0.00  0.00  0.00  0.00  0.01  0.00
X           1,950  0.99  0.00  0.00  0.00  0.01  0.00  0.00  0.00  0.00
Y          10,783  0.53  0.25  0.02  0.01  0.01  0.08  0.05  0.04  0.00
Z          11,205  0.50  0.16  0.18  0.02  0.04  0.00  0.00  0.09  0.00

Notes: The numbers from the third column onward are the probabilities of the corresponding ethnicities conditional on the surname initial. Each row adds up to one.
TABLE A2
Citation Level and Surname Initials, by Number of Authors

Dependent variable: Rank of All Citations (0-100). Sample: (1) two-authored papers; (2) three-authored papers; (3) four-authored papers.

                                        (1)          (2)          (3)
First author surname initials (1-26)    -0.011 *     -0.019 ***   -0.025 **
                                        (0.006)      (0.007)      (0.008)
Last author surname initials (1-26)     0.003        0.001        -0.002
                                        (0.006)      (0.007)      (0.008)
Observations                            340,046      290,302      215,808
R2                                      0.523        0.523        0.527

Covariates controlled for in all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Robust standard errors in parentheses. Citation rank is defined by publication year in the whole sample. *** p < .01; ** p < .05; * p < .1.
Althouse, B. M., J. D. West, T. C. Bergstrom, and C. T. Bergstrom. "Differences in Impact Factor across Fields and over Time." Working Paper 2008-4-23, Department of Economics, University of California, Santa Barbara, 2008.
Becker, S. L. "Why an Order Effect?" Public Opinion Quarterly, 18, 1954, 271-78.
Berg, H. W., F. E. Filipello, E. Hinreiner, and F. M. Sawyer. "Consumer Wine Preference Methodology Studies at California Fairs." Food Technology, 9, 1955, 90-93.
Berger, J. "Does Presentation Order Impact Choice after Delay?" Working Paper, 2013.
de Bruin, W. B. "Save the Last Dance for Me: Unwanted Serial Position Effects in Jury Evaluations." Acta Psychologica, 118(3), 2005, 245-60.
Carney, D. R., and M. R. Banaji. "First Is Best." PLoS One, 7, 2012, 2.
Coney, K. A. "Order Bias: The Special Case of Letter Preference." Public Opinion Quarterly, 41, 1977, 385-88.
Dean, M. L. "Presentation Order Effects in Product Taste Tests." Journal of Psychology, 105, 1980, 107-10.
Diamond, A. M., Jr. "What Is a Citation Worth?" Journal of Human Resources, 21(2), 1986, 200-15.
Einav, L., and L. Yariv. "What's in a Surname? The Effects of Surname Initials on Academic Success." Journal of Economic Perspectives, 20(1), 2006, 175-88.
Ellison, G. "How Does the Market Use Citation Data? The Hirsch Index in Economics." American Economic Journal: Applied Economics, 5(3), 2013, 63-90.
Freeman, R. B., and W. Huang. Forthcoming. "Collaborating with People Like Me: Ethnic Co-Authorship within the US." Journal of Labor Economics.
Freeman, R. B., I. Ganguli, and R. Murciano-Goroff. "Why and Wherefore of Increased Scientific Collaboration." NBER Working Paper No. 19819, 2014.
Garfield, E. "Journal Impact Factor: A Brief Review." Canadian Medical Association Journal, 161, 1999, 979-80.
Gaule, P., and M. Piacentini. Forthcoming. "Chinese Graduate Students and U.S. Scientific Productivity." The Review of Economics and Statistics.
Gupta, H. M., J. R. Campanha, and R. A. G. Pesce. "Power-Law Distributions for the Citation Index of Scientific Publications and Scientists." Brazilian Journal of Physics, 35(4A), 2005.
Hamermesh, D. S., G. E. Johnson, and B. A. Weisbrod. "Scholarship, Citations and Salaries: Economic Rewards in Economics." Southern Economic Journal, 49(2), 1982, 472-81.
Hilmer, C. E., M. J. Hilmer, and M. R. Ransom. "Fame and the Fortune of Academic Economists: How the Market Rewards Influential Research in Economics." Working Paper, 2011.
Houston, D. A., and S. J. Sherman. "The Influence of Unique Features and Direction of Comparison on Preferences." Journal of Experimental Social Psychology, 25, 1989, 121-41.
Huang, W. "Are Immigrants the Best and Brightest Researchers in U.S." Working Paper, 2013a.
--. "Citation Patterns in U.S. Academic Papers." Working Paper, 2013b.
Joseph, K., D. N. Laband, and V. Patil. "Author Order and Research Quality." Southern Economic Journal, 71, 2005, 545-55.
Kerr, W. R. "Ethnic Scientific Communities and International Technology Diffusion." Review of Economics and Statistics, 90(3), 2008, 518-37.
Kerr, W. R., and W. F. Lincoln. "The Supply Side of Innovation: H-1B Visa Reforms and US Ethnic Invention." Journal of Labor Economics, 28(3), 2010, 473-508.
Li, Y., and N. Epley. "When the Best Appears to Be Saved for Last: Serial Position Effects on Choice." Journal of Behavioral Decision Making, 22, 2009, 1-12.
Lortie, C. J., L. W. Aarssen, A. E. Budden, and R. Leimu. "Do Citations and Impact Factors Relate to the Real Numbers in Publications? A Case Study of Citation Rates, Impact, and Effect Sizes in Ecology and Evolutionary Biology." Scientometrics, 94(2), 2013, 675-82.
Lozano, G. A., V. Lariviere, and Y. Gingras. "The Weakening Relationship between the Impact Factor and Papers' Citations in the Digital Age." Journal of the American Society for Information Science and Technology, 63(11), 2012, 2140-45.
Mantonakis, A., P. Rodero, I. Lesschaeve, and R. Hastie. "Order in Choice: Effects of Serial Position on Preferences." Psychological Science, 20(11), 2009, 1309-12.
Miller, J. M., and J. A. Krosnick. "The Impact of Candidate Name Order on Election Outcomes." Public Opinion Quarterly, 62, 1998, 291-330.
Nisbett, R. E., and T. D. Wilson. "Telling More than We Can Know: Verbal Reports on Mental Processes." Psychological Review, 84(3), 1977, 231-59.
Praag, C. M. V., and B. M. S. V. Praag. "The Benefits of Being Economics Professor A (Rather Than Z)." Economica, 75, 2008, 782-96.
Redner, S. "Citation Statistics from More Than a Century of Physical Review." Physics Today, 58, 2005, 49-54.
Smart, S., and J. Waldfogel. "A Citation-Based Test for Discrimination at Economics and Finance Journals." NBER Working Paper No. 5460, 1996.
Vanclay, J. K. "Factors Affecting Citation Rates in Environmental Science." Journal of Informetrics, 7(2), 2013, 265-71.
Waltman, L. "An Empirical Analysis of the Use of Alphabetical Authorship in Scientific Publishing." Journal of Informetrics, 6(4), 2012, 700-711.
(1.) Many articles do so, but other orderings exist, such as order of appearance in the article (Vancouver-style references). Unfortunately, the dataset does not indicate whether a journal uses Harvard style or Vancouver style.
(2.) This phenomenon is found in departments of economics, and most journals in economics have Harvard-style references.
(3.) Existing research mainly concerns situations where choice directly follows exposure to the options, or cases where judges and participants knew that they would be exposed to a set of options and would have to make a choice at the end of the set. The study by Berger (2013) is an exception.
(4.) http://thomsonreuters.com/products_services/science/science_products/a-z/web_of_science/. (Accessed June 2013)
(5.) Results and conclusions are robust and consistent when I use the whole sample.
(6.) The percentages in parentheses show the proportions of the papers in the corresponding field.
(7.) I used the one- to five-authored papers published after 1985 to draw Figures A1 and A2, and the author identification strategy is similar to that of Freeman and Huang (forthcoming). The figures show that last authors are less likely to be "new names" (no previous paper in WoS) in a given year and that, among those with previous papers, the average impact factor of last authors' previous papers is higher than that of first authors'.
WEI HUANG *
* I thank my advisor Professor Richard Freeman for providing the data used in this study and for his persistent guidance. I thank the editor Professor Ted Bergstrom and anonymous referees for their constructive suggestions. I also thank Yiping Huang, Mi Luo, Zijun Luo, Amanda Pallais, David Wise, Sifan Zhou, and Yi Zhou for their helpful comments. All errors are mine. I gratefully acknowledge funding support from the Department of Economics of Harvard University, the Institute for Quantitative Social Science, and the National Bureau of Economic Research. The views expressed herein are mine and do not necessarily reflect the views of the Department of Economics of Harvard University, the Institute for Quantitative Social Science, or the National Bureau of Economic Research.
Huang: Department of Economics, Harvard University, Cambridge, MA 02138. Phone 617-999-3450, Fax 617-349-3955, E-mail email@example.com
TABLE 1
Citation Rank and Surname Initials

Dependent variable: Citation Rank (0-100). Sample: columns (1)-(2), full sample (mean of dependent variable 48.3); columns (3)-(4), three- and four-authored papers (mean of dependent variable 50.68).

                                        (1)          (2)          (3)          (4)
First author surname initials (1-26)    -0.017 ***                -0.022 ***
                                        (0.004)                   (0.005)
First author initials G-T (ref. A-F)                 -0.280 ***                -0.320 ***
                                                     (0.065)                   (0.084)
First author initials U-Z (ref. A-F)                 -0.342 ***                -0.498 ***
                                                     (0.099)                   (0.128)
Last author surname initials (1-26)     0.001                     -0.000
                                        (0.004)                   (0.005)
Last author initials G-T (ref. A-F)                  -0.073                    -0.062
                                                     (0.065)                   (0.083)
Last author initials U-Z (ref. A-F)                  0.181 *                   0.164
                                                     (0.102)                   (0.132)
Second author surname initials (1-26)                             0.002
                                                                  (0.005)
Second author initials G-T (ref. A-F)                                          -0.010
                                                                               (0.083)
Second author initials U-Z (ref. A-F)                                          0.053
                                                                               (0.132)
Observations                            846,156      846,156      506,110      506,110
R2                                      0.530        0.530        0.527        0.527

Covariates controlled for: ethnicity of second author in columns (3)-(4) only. In all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Robust standard errors in parentheses. Citation rank is defined by publication year. The dependent variable is multiplied by 100, so the corresponding coefficients can be interpreted as percentages. *** p < .01; ** p < .05; * p < .1.

TABLE 2
Citation Rank and Surname Initials, by Citation Time

Sample: two-, three-, and four-authored papers. Dependent variable: (1) Rank of All Citations (0-100); (2) Rank of First Years Citation (0-100); (3) Rank of 2-4 Years Citation (0-100); (4) Rank of 6-10 Years Citation (0-100); (5) Rank of 11-20 Years Citation (0-100).

                                        (1)          (2)          (3)          (4)          (5)
Panel A: Papers published no later than 2002
First author surname initials (1-26)    -0.021 ***   -0.002       -0.015 ***   -0.028 ***
                                        (0.004)      (0.005)      (0.005)      (0.005)
Last author surname initials (1-26)     -0.002       0.007        -0.000       -0.004
                                        (0.004)      (0.006)      (0.005)      (0.005)
Observations                            653,940      653,940      653,940      653,940
R2                                      0.524        0.468        0.520        0.436
Panel B: Papers published no later than 1997
First author surname initials (1-26)    -0.031 ***   -0.007       -0.022 ***   -0.039 ***   -0.043 ***
                                        (0.006)      (0.008)      (0.006)      (0.007)      (0.008)
Last author surname initials (1-26)     0.001        0.013 *      0.005        -0.002       -0.005
                                        (0.006)      (0.008)      (0.006)      (0.007)      (0.008)
Observations                            344,821      344,821      344,821      344,821      344,821
R2                                      0.509        0.461        0.518        0.438        0.368

Covariates controlled for in all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Papers are published between 1985 and 2002. Robust standard errors in parentheses. Citation rank is defined by publication year. *** p < .01; ** p < .05; * p < .1.

TABLE 3
Alphabetical Order Effects and Length of Reference Lists

Dependent variable: Alphabetical Bias (coefficients on first author initials). Sample: columns (1)-(2), full sample; column (3), drop those larger than 0.05; column (4), drop those smaller than -0.05.

                                        (1)          (2)          (3)          (4)
Panel A: Unweighted regressions
Mean of dependent variable (unweighted) -0.0179      -0.0163      -0.0406      0.0190
Number of references                    -0.032 ***   -0.031 ***   -0.026 **    -0.012
                                        (0.011)      (0.012)      (0.010)      (0.011)
Number of references last year                       -0.004
                                                     (0.012)
Observations                            192          180          159          131
R2                                      0.285        0.246        0.405        0.314
Panel B: Weighted regressions
Mean of dependent variable (weighted)   -0.0178      -0.0165      -0.0304      0.00738
Number of references                    -0.018 *     -0.020 *     -0.015 *     -0.009
                                        (0.010)      (0.011)      (0.009)      (0.010)
Number of references last year                       0.008
                                                     (0.011)
Observations                            192          180          159          131
R2                                      0.295        0.278        0.352        0.394

Covariates controlled for in all columns: field dummies; publication year dummies; field interacting with linear time trend.

Note: Robust standard errors in parentheses. *** p < .01; ** p < .05; * p < .1.

TABLE 4
Matched Sample and First Author Surname Initials

Dependent variable: column (1), Matched (0-100), mean 67.01; columns (2)-(3), Citation Rank in Matched Sample, mean 60.96.

                                        (1)          (2)          (3)
First author surname initials (1-26)    -0.016 **    -0.017 **    -0.016 **
                                        (0.007)      (0.004)      (0.004)
Rank of citations in first year                                   0.379 **
                                                                  (0.001)
Observations                            846,156      567,034      567,034
R2                                      0.362        0.477        0.620

Covariates controlled for in all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Robust standard errors in parentheses. The cited paper sample is all two-, three-, and four-authored papers published during 1990-2005, and the citing paper sample is all two-, three-, and four-authored papers published during 1990-2008. Citation rank is defined by observed citations in each publication year in the whole sample. *** p < .01; ** p < .05; * p < .1.
TABLE 5A
Type of Citations and First Author Surname Initials in Matched Sample

Dependent variable: column (1), Rank of Matched Citations (mean 42.56); columns (2)-(3), Rank of Matched Self-Citations (mean 35.71); columns (4)-(5), Rank of Matched Other-Citations (mean 42.02).

                                        (1)          (2)          (3)          (4)          (5)
First author surname initials (1-26)    -0.008       0.005        0.004        -0.010 *     -0.008
                                        (0.006)      (0.007)      (0.007)      (0.006)      (0.006)
First author surname initials x IF of
  citing papers x References of
  citing papers                                                   0.003                     -0.027 *
                                                                  (0.017)                   (0.015)
Observations                            567,034      567,034      567,034      567,034      567,034
R2                                      0.363        0.287        0.287        0.350        0.350

Covariates controlled for: information of citing papers (a) and interactions between surname initial order and information of citing papers (b) in columns (3) and (5) only. In all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Robust standard errors in parentheses. The cited paper sample is all two-, three-, and four-authored papers published during 1990-2005, and the citing paper sample is all two-, three-, and four-authored papers published during 1990-2008. Citation rank is defined by observed citations in each publication year. (a) The information of citing papers includes the average number of references, the average impact factor of citing papers, and the interaction of the two. (b) The interactions include the surname initial order of the cited paper interacting with the average impact factor and number of references of citing papers. *** p < .01; ** p < .05; * p < .1.

TABLE 5B
Type of Citations and First Author Surname Initials in Matched Sample

Dependent variable: column (1), Rank of Matched Citations (mean 42.56); columns (3)-(4), Rank of Matched Self-Citations (mean 35.71); columns (5)-(6), Rank of Matched Other-Citations (mean 42.02).

                                        (1)          (3)          (4)          (5)          (6)
First author surname initials (reference group: A-F)
  G-T                                   -0.192 *     0.027        -0.012       -0.183 *     -0.146
                                        (0.105)      (0.116)      (0.120)      (0.103)      (0.106)
  U-Z                                   -0.133       0.163        0.128        -0.221       -0.166
                                        (0.160)      (0.177)      (0.182)      (0.156)      (0.161)
First author surname initials x IF of citing papers x References of citing papers/100 (reference group: A-F)
  G-T                                                             0.050                     -0.289
                                                                  (0.294)                   (0.260)
  U-Z                                                             0.380                     -0.635 *
                                                                  (0.435)                   (0.384)
Observations                            567,034      567,034      567,034      567,034      567,034
R2                                      0.363        0.287        0.287        0.350        0.350

Covariates controlled for: information of citing papers (a) and interactions between surname initial order and information of citing papers (b) in columns (4) and (6) only. In all columns: journal, publication year, number of authors, and their interactions; number of references categories; linear trend for number of references; author number dummies; ethnicity of first author; ethnicity of last author; state dummies.

Notes: Robust standard errors in parentheses. The cited paper sample is all two-, three-, and four-authored papers published during 1990-2005, and the citing paper sample is all two-, three-, and four-authored papers published during 1990-2008. Citation rank is defined by observed citations in each publication year. (a) The information of citing papers includes the average number of references, the average impact factor of citing papers, and the interaction of the two. (b) The interactions include the surname initial order of the cited paper interacting with the average impact factor and number of references of citing papers. *** p < .01; ** p < .05; * p < .1.
Date: Jan 1, 2015