Does the song remain the same? An empirical study of bestselling musical compositions (1913-1932) and their use in cinema (1968-2007).

An influential group of commentators assert that the public suffers when valuable copyrighted works fall into the public domain. One concern is under-exploitation, the possibility that a work without an owner will not be adequately distributed or otherwise made available to the public. According to William Landes and Richard Posner, "[A]n absence of copyright protection for intangible works may lead to inefficiencies because ... of impaired incentives to invest in maintaining and exploiting these works." (1) Congress, (2) the courts, (3) and the Copyright Office (4) all relied on this theory to support recent copyright term-extension legislation. (5) Although the only study testing this theory (conducted after the term-extension legislation had passed and been litigated) casts significant doubt on the empirical assertion of under-exploitation of public-domain works, (6) the effort by copyright owners to win further term extensions continues unabated. (7) The present study confirms doubts about under-exploitation in a more robust empirical context.

A different, and until now empirically untested, claim asserts that popular works falling into the public domain may be over-exploited in two different ways. First, a public-domain work might be "overgrazed," to use the terminology found in the tragedy-of-the-commons literature. (8) Landes and Posner assert that the value of "a novel or a movie or a comic book character or a piece of music or a painting" could be depleted in much the same way as "unlimited drilling from a common pool of oil or gas would deplete the pool prematurely." (9) Second, the value of ownerless works could be dissipated through debasing or inappropriate uses. (10) Although both the overgrazing and debasement theories of over-exploitation are based on empirical assertions about what might happen to works when they fall into the public domain, no empirical studies have yet tested these hypotheses. (11) The present study therefore fills a significant gap in the literature.

Mark Lemley identifies both the under-exploitation and the over-exploitation arguments as "ex post" justifications for protecting works, asserting that both sets of arguments provide a rationale for extending protection without reference to "ex ante" incentives to create. (12) Ex post justifications based on under- and over-exploitation worries stand in the forefront of the worldwide debate over whether copyright terms for existing works should be retroactively extended. (13) Because the standard incentive-to-create rationale cannot justify extending the term of protection for a work that already exists, (14) ex post justifications are driving copyright term-extension debates around the world, and are likely to drive the debate in the United States when the present twenty-year extension runs out in 2018.

Neither the over- nor the under-exploitation theory has gone unchallenged. Lemley scoffs at under-exploitation worries, stating that the claim "that control by a single firm is necessary to induce efficient distribution [is] theoretically flawed and empirically unsound," (15) and wondering why there is "some greater need to subsidize [by granting exclusive rights] the making of more copies of Ulysses than the making of more paper clips." (16) Amicus briefs (17) in Eldred v. Ashcroft, (18) including one signed by five Nobel Laureate economists, (19) have also rejected the under-exploitation argument, and my own empirical work concludes that popular books falling into the public domain are not under-exploited in comparison to their copyrighted counterparts. (20)

The over-exploitation theory has also come under attack. (21) Richard Epstein is a doubter, suggesting that "[a]nyone is hard pressed to believe that Shakespeare's star has been dimmed by the calamities committed in his name...." (22) So too are Lemley and Dennis Karjala, both of whom deploy market-based economic arguments to allay fears of a congestion externality caused by overuse of copyrighted works. (23) They conclude that "a belief that the original creator (or his transferee) can best manage the work in the public interest runs strongly contrary to our long-standing and fundamental reliance on free markets to allocate resources to the production and distribution of goods." (24)

Although the theoretical arguments on both sides are interesting, commentators have so far assumed (but not necessarily believed) that works falling into the public domain will be exploited at a different rate than their copyrighted counterparts. Exploitation rates are, of course, observable and ripe for empirical analysis. In Part I of the Article, I explain the methodology of my study of popular musical compositions from 1913-1932 as they appear in movies from 1968-2007. The study tracks songs from 1913-1922 as they fall into the public domain, and compares changes in exploitation rates with songs from 1923-1932 that are still protected by copyright.

Studying musical compositions has several advantages over my prior study of bestselling books. First, tracking the appearance of compositions in movies provides data on the exploitation of derivative works. (25) Musical compositions usually appear in movies as works realized by someone other than the copyright owner. In a movie we hear a recording of the composition, a derivative work under the Copyright Act. (26) Since those worried about over-exploitation inevitably warn against unauthorized derivative works as their most serious potential concern, (27) the study provides especially relevant data. Second, relying on the appearance of musical compositions in movies provides an alternative, and possibly superior, measure of availability to the counting of book editions and book publishers in my prior study. (28) Therefore, the present study's finding of no under-exploitation within my sample is not merely duplicative. Finally, and most importantly, studying songs provides the first opportunity to study claims of over-exploitation.

In Part II, the methodology and the results of the study are reported. Before they fell into the public domain, the relevant set of musical compositions from 1913-1922 appeared in one movie every 15.3 years. After they fell into the public domain, the songs appeared in movies much more frequently, about once every 3.8 years, a four-fold increase. Compositions from 1923-1932, which have always been protected by copyright, appeared in movies once every 7.8 years and 3.3 years respectively, an increase of approximately two and one-half times over the parallel periods of time. The greater rate of increase for the public-domain compositions allays worries of under-exploitation, while the lower absolute rate of exploitation suggests strongly that overgrazing concerns are misplaced. A formal statistical analysis of the data is provided in Appendix B. Part III joins the theoretical debate and suggests why self-regulation by both producers and consumers of copyrighted works explains the absence of observable market failure. Building on the data gathered here and in a prior study, I suggest when rare cases of over- or under-exploitation might occur. Identifying these cases requires defining the most slippery sort of damage--debasement of a copyrighted work--in a more precise manner than has previously appeared in the literature. The article concludes that addressing any potential market failure requires a much more narrowly tailored regulatory response than general copyright term extension that extends protection to millions of works in order to prevent a theoretical harm to a handful.


Previous studies confirm that most copyrighted works do not hold their value over time. Landes and Posner note, "fewer than 11 percent of the copyrights registered between 1883 and 1964 were renewed at the end of their twenty-eight-year term, even though the cost of renewal was small." (29) They point out that of 10,027 books published in the U.S. in 1930, only 1.7% remained in print in 2001. (30)

Even those worried about what happens when works fall into the public domain agree there is little reason to extend copyright protection to works with no current value. (31) In fact, extending copyright for those works would entail significant tracing and transaction costs, and would almost certainly be inefficient. (32) Given that no one argues for increasing protection for obscure works, the present study identified the 1,294 most popular musical compositions from 1913-1932 and focused on the seventy-four most enduringly valuable of those compositions as they appeared in movies from 1968-2007. The years 1968-2007 were chosen because the compositions from 1913-1922 began to fall into the public domain in 1988, the mid-point in that timeline. Compositions from 1913-1932 were chosen because the works published from 1913-1922 are all in the public domain, and properly renewed works published from 1923-1932 are all still protected by copyright as a result of the 1998 Copyright Term Extension Act, (33) allowing for a basically symmetrical comparison of ten years' worth of works from each group. Until extension, the effective copyright term for these works was seventy-five years, so works from 1913 fell into the public domain in 1988, works from 1914 fell into the public domain in 1989, and so on until the 1998 legislation ended the flow of works into the public domain. (34)

Studying a group of works from approximately the same era provides the opportunity to study what happened to works from 1913-1922 after they fell into the public domain, and to compare rates of exploitation with those works from 1923-1932 that remained protected. The initial data set included 601 of the most popular compositions from 1913-1922 and 693 of the most popular compositions from 1923-1932, as listed in the most accepted compilation of popular historical musical compositions. (35) All of these songs were then tracked in the Internet Movie Database soundtrack database, (36) which, at the time of the study, contained comprehensive information on almost 380,000 movies. (37)

As the present debate revolves around only those works that have substantial present value, the primary statistical analysis was performed on the seventy-four musical compositions that appeared in at least four movies from 1968-2007 (38) (although the findings hold for compositions that appear in one, two, or three movies (39)). Since current sales data or licensing information of historic compositions is mostly proprietary and unavailable, appearance in movies serves as a proxy for enduring popularity. Movie producers invest significant resources into choosing music for their soundtracks. Their goal is to please audiences. Observing their choices provides an objective and neutral indication of what historic music is likely most valuable to consumers.

A substantial majority of the compositions (forty-four out of seventy-four) were published in the six-year period from 1926-1931, indicating the significance of the golden age of Tin Pan Alley, (40) an extraordinary time period which marked the publication of many enduringly familiar works like "Bye Bye Blackbird," (41) "Blue Skies [Smiling at Me]," (42) "My Blue Heaven," (43) "Let's Do It [Let's Fall in Love]," (44) "Let's Misbehave," (45) "When You're Smiling--The Whole World Smiles with You," (46) "Bolero," (47) "Happy Days Are Here Again," (48) "Singin' in the Rain," (49) "Star Dust," (50) "Embraceable You," (51) "Georgia on My Mind," (52) "Get Happy," (53) "I Got Rhythm," (54) "Just a Gigolo," (55) and "Mood Indigo." (56) During this time, Cole Porter, the Gershwin Brothers, Harold Arlen, Hoagy Carmichael, Duke Ellington, and many others were at the prime of their famous composing careers. Since only fifteen of the compositions dated from the 1913-1922 time period, four qualifying songs (57) from 1909-1912 augment that portion of the data.

The public-domain songs were tracked during the period they were protected by copyright law and then after they fell into the public domain, seventy-five years after publication. For example, "Danny Boy" was first published in 1913 (58) and entered the public domain in 1988. So, its use in movies from 1968-1987 (twenty years) when it was protected by copyright was tracked separately from its use in movies from 1988-2007 (twenty years) when it was in the public domain. Compositions from 1914 were therefore tracked from 1968-1988 (twenty-one years) and then from 1989-2007 (nineteen years), and so on.
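The tracking windows described above follow mechanically from the publication year. The following is a minimal sketch of that arithmetic, assuming only the figures stated in the text (a seventy-five-year effective term and a 1968-2007 study window); the function and constant names are illustrative and not part of the study. It applies to the 1913-1922 songs; the copyrighted songs use the windows of the paired year a decade earlier.

```python
# Illustrative sketch only: names are hypothetical, figures come from the text.
FIRST_YEAR, LAST_YEAR = 1968, 2007  # study window for movie appearances
TERM = 75  # effective copyright term before the 1998 extension


def tracking_periods(publication_year):
    """Split the study window into the protected and public-domain periods."""
    entry_year = publication_year + TERM  # first year in the public domain
    protected = (FIRST_YEAR, entry_year - 1)
    public_domain = (entry_year, LAST_YEAR)
    return protected, public_domain


def years_in(period):
    """Count the years in a period, inclusive of both endpoints."""
    start, end = period
    return end - start + 1


for year in (1913, 1914):
    protected, public_domain = tracking_periods(year)
    print(year, protected, years_in(protected),
          public_domain, years_in(public_domain))
```

For 1913 this yields two twenty-year periods, and for 1914 a twenty-one-year period followed by a nineteen-year period, matching the examples in the text.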

In order to make the graphic comparison seen in Figure 1, the compositions in each year of the public-domain song set were matched with the corresponding year a decade later in the copyrighted song set. Thus, compositions from 1913 were paired with 1923, 1914 were paired with 1924, and so on. For example, three songs from 1913 appeared in a total of four movies from 1968-1987 (a rate of 4/60), before the songs fell into the public domain. Those same three songs appeared in twenty movies from 1988-2007 (a rate of 20/60). (59) Therefore, the single song in the data set of copyrighted songs from 1923 was also measured in the same time frame, counting its use in movies from 1968-1987 (denominated "period one") and then from 1988-2007 (denominated "period two"). The song, "Bugle Call Rag," (60) appeared in no movies from 1968-1987 (a rate of 0/20) and in four movies from 1988-2007 (a rate of 4/20). For songs from 1914 and 1924, the relevant time periods for measuring uses in movies were 1968-1988 (period one) and 1989-2007 (period two); for songs from 1915 and 1925, from 1968-1989 (period one) and 1990-2007 (period two), and so on.
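The rates quoted in this example (4/60, 20/60, 0/20, 4/20) are movie appearances divided by song-years, that is, the number of songs in the cohort multiplied by the number of years in the period. A minimal sketch of that calculation follows, using only the counts given in the text; the function name and dictionary layout are assumptions made for illustration.

```python
# Illustrative sketch only: counts are from the text, names are hypothetical.
YEARS_PER_PERIOD = 20  # both periods are twenty years for the 1913/1923 pair


def uses_per_song_year(appearances, songs, years):
    """Movie appearances divided by song-years (songs x years)."""
    return appearances / (songs * years)


cohorts = {
    "1913 (public domain)": {"songs": 3, "period_one": 4, "period_two": 20},
    "1923 (copyrighted)": {"songs": 1, "period_one": 0, "period_two": 4},
}

for label, c in cohorts.items():
    one = uses_per_song_year(c["period_one"], c["songs"], YEARS_PER_PERIOD)
    two = uses_per_song_year(c["period_two"], c["songs"], YEARS_PER_PERIOD)
    print(f"{label}: period one {one:.3f}, period two {two:.3f} uses per song-year")
```

Dividing by song-years rather than by years alone is what allows a three-song cohort to be compared with a single-song cohort on the same scale.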

The aggregate number of times the 1913-1922 songs appeared in movies during the period they were still under copyright was compared to the aggregate number of movie appearances of the 1923-1932 songs in time period one. Then, the aggregate number of times the songs from 1913-1922 appeared in movies after they fell into the public domain was compared with the aggregate number of movie appearances of the 1923-1932 songs in time period two. This comparison allows for a more straightforward explanation of the formal statistical regressions presented in Appendix B, which employ a more robust and uncontroversial, but less narratively engaging, methodology.


The goal of the analysis was to answer two questions. First, when compositions from 1913-1922 fell into the public domain, were they exploited at a significantly different rate than while they were still protected by copyright? Second, if the rate of exploitation of those songs changed after they fell into the public domain, did the change indicate signs of over- or under-exploitation in comparison with the rate of exploitation of the copyrighted songs?

A. No Evidence of Under-Exploitation

Before the compositions from 1913-1922 fell into the public domain, they appeared in movies on average at a rate of once every 15.3 years. After they fell into the public domain, they appeared in movies on average at a rate of once every 3.8 years. At first glance, this rate change appears to show a significant increase in exploitation, but the rate change must be compared to the rate of uses of copyrighted songs during the same time period. After all, it is possible that songs from this general era, regardless of their legal status, may be appearing more frequently in recent movies. This, in fact, appears to be the case. During the same comparative time periods, the rate at which copyrighted songs from 1923-1932 appear in movies increased from once every 7.8 years in time period one to once every 3.3 years in time period two. The following graph shows the comparative increase in terms of average yearly use of a song in a movie. The increase for public-domain songs went from .065 uses per year to .263 uses per year; for copyrighted songs it went from .128 uses per year to .304 uses per year.
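The two ways of stating the results, "once every N years" and "uses per year," are reciprocals of one another, and the fold increase is the ratio of the two intervals. The sketch below reproduces that arithmetic from the rounded intervals reported in the text; the names are illustrative. Because the inputs are rounded, the last figure comes out near .303 rather than the .304 reported, which presumably reflects unrounded underlying data.

```python
# Illustrative sketch only: intervals are the rounded figures from the text.
intervals = {
    "public-domain songs (1913-1922)": (15.3, 3.8),
    "copyrighted songs (1923-1932)": (7.8, 3.3),
}


def uses_per_year(years_between_uses):
    """Convert 'one use every N years' into average uses per year."""
    return 1.0 / years_between_uses


def fold_increase(interval_before, interval_after):
    """Ratio of the later rate to the earlier rate."""
    return interval_before / interval_after


for label, (before, after) in intervals.items():
    print(f"{label}: {uses_per_year(before):.3f} -> {uses_per_year(after):.3f} "
          f"uses per year, a {fold_increase(before, after):.2f}-fold increase")
```

The ratios come to roughly four for the public-domain songs and a little under two and one-half for the copyrighted songs, consistent with the figures in the text.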


Since the songs from 1913-1922 fell into the public domain, they have been used in movies an average of four times more frequently than while they were still under copyright. The songs from 1923-1932 also appear more frequently in movies over the same time period. The change for the copyrighted songs, however, was more modest, an increase of a little less than two and one-half times. The formal statistical analyses in Appendix B, not surprisingly, demonstrate that the transition from protected work to unprotected work did not render public-domain compositions under-exploited in relation to works that remained protected by copyright. Thus, public-domain songs from this era do not become orphans that are unavailable for public consumption.

As a check on the data, the relative popularity of the movies appearing in the study was measured in terms of box office receipts. After all, if musical compositions falling into the public domain only appeared in obscure art films, then a strong argument could be made that they were not as widely available as if they had appeared in blockbusters seen by millions. While the compositions from 1913-1922 were still protected by copyright, the nineteen songs appeared in films with a combined gross of $384 million, or an average of about $20 million per song. Over the same period, the fifty-five compositions from 1923-1932 appeared in films with a combined gross of $3.97 billion, an average of over $70 million per song, suggesting once again that the compositions from that era have always been more popular, irrespective of copyright status. However, after the compositions from 1913-1922 fell into the public domain, they appeared in films with a combined gross box office of $2.5 billion, an average of $131 million per song, a six-fold increase. Over the same period of time, the combined gross for the songs from 1923-1932 did not quite double, moving from $3.97 billion to $7.8 billion, or about $141 million per song. (61)
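The per-song averages above are the combined grosses divided by the number of songs in each set (nineteen and fifty-five). A minimal sketch of that check follows, using the rounded totals from the text; the names are illustrative, and because the totals are rounded the results match the reported per-song figures only approximately.

```python
# Illustrative sketch only: totals are the rounded figures from the text.
box_office = {
    "1913-1922": {"songs": 19, "before": 384e6, "after": 2.5e9},
    "1923-1932": {"songs": 55, "before": 3.97e9, "after": 7.8e9},
}


def per_song_millions(total_gross, songs):
    """Average combined gross per song, in millions of dollars."""
    return total_gross / songs / 1e6


def growth(before, after):
    """Ratio of the later combined gross to the earlier one."""
    return after / before


for label, d in box_office.items():
    print(f"{label}: ${per_song_millions(d['before'], d['songs']):.1f}M -> "
          f"${per_song_millions(d['after'], d['songs']):.1f}M per song "
          f"({growth(d['before'], d['after']):.2f}x)")
```

On these inputs the 1913-1922 set grows roughly six and one-half times, while the 1923-1932 set grows by slightly less than a factor of two.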

Surely some of the rate increase in both sets of songs is due to increased ticket prices over the time period studied. (62) The much higher rate of increase for songs from 1913-1922 nonetheless supports the conclusion of no under-exploitation. In fact, despite the popularity of the songs from 1923-1932, the public-domain songs almost pulled even in box office terms over the last twenty years, averaging $131 million per song in the public domain, as compared to $141 million per song under copyright. This may even suggest a positive public-domain effect on exploitation.

The finding of no under-exploitation is generally consistent with my prior study of bestselling fiction from the same period. (63) That research compared the 166 bestselling novels from 1913-1922 with the 167 bestselling novels from 1923-1932, and found that from 1988-2001, novels in the public domain were in print at an insignificantly different rate from novels still under copyright. (64) After 2001, however, the public-domain novels were in print at a significantly higher rate, with significantly more editions per novel. (65) In 2006, the in-print rate for the public-domain novels was 98%, as compared to 74% for the copyrighted novels. (66) A comparison of the subsets of the twenty most enduringly popular novels generated results similar to those seen in the current study. (67)

Although the music composition data show no evidence of under-exploitation, the study does not necessarily prove a positive public-domain effect on availability, like that demonstrated for public-domain books after 2001. A superficial comparison of the rate changes for music exploitation looks significant (4x as compared to 2.5x), but the logistic regressions performed in Appendix B expose the confounding effect of time as a variable, and show that the comparative rates of exploitation of public-domain and copyrighted music are not significantly different.

Why is there a positive public-domain effect with books, but not with musical compositions as they appear in film? One difference may be that the study of bestselling fiction measured the availability of copies of a work. (68) The costs of scanning a book into a computer, printing it, and selling it are quite low; many Dover versions of bestselling classics sell for less than four dollars. (69) If one chooses to publish a copyrighted book instead of a public-domain book, the additional licensing cost will have a significant effect on the overall cost of production. On the other hand, the proportional cost savings of choosing a public-domain song for a movie are likely to be much lower. Because a musical composition, whether it is protected by copyright or not, can only appear in a movie as a derivative work, the director of the film must either hire musicians or singers (or both) in order to realize a version of the composition, or she must obtain a license to use an existing recording of the composition. Creating the derivative work from "scratch" will likely entail significant costs, and the alternative of using an existing recording will likely entail the payment of a significant licensing fee to the owner of the recording. These costs will be incurred even if the underlying musical composition is in the public domain. (70)

Using a musical composition in a movie, therefore, is likely to be significantly more expensive than copying a book because it entails the creation of a new derivative work or the purchasing of a license to use a work created by someone else. A film director can save some money by telling her musical director to choose a public-domain composition for the score, (71) but the savings will be proportionally smaller than those enjoyed by the book publisher. Because of these marginal savings, it is unsurprising that public-domain musical compositions are not exploited at a significantly higher rate than protected music. (72)

B. No Evidence of Over-Exploitation

Two sorts of over-exploitation arguments have been offered by those who worry about what happens to works when they fall into the public domain. First, works may simply be overused and worn out, (73) like a song we have heard so frequently we do not want to hear it again. Second, inappropriate uses, even if infrequent, may "recode" the original meaning of a work, (74) debase it, or otherwise make it less valuable to consumers. The examples most frequently given involve uses of copyrighted fictional characters in new pornographic works. (75)

1. No Evidence of Worn-Out Songs

As noted earlier, each song in the public-domain data set appears in a movie an average of once every 3.8 years; each song in the copyrighted data set appears in a movie an average of once every 3.3 years. Appendix B shows that these rates are statistically the same. This result makes it very difficult to argue that valuable songs need owners in order to prevent them from being worn out and devalued. If copyright owners are willing to license their compositions at a higher rate than public-domain compositions are used, then the evidence against over-exploitation seems conclusive.

Even the most intense periods of usage of the public-domain songs, Danny Boy, (76) with nine movie appearances between 1993 and 2001, and After You've Gone, (77) with nine movie appearances between 1996 and 2006, do not outstrip the periods of most intense usage for compositions protected by copyright. For example, in the 1930s, Sweet Georgia Brown (78) appeared in fifteen movies; Am I Blue? (79) in seventeen movies; and Happy Days Are Here Again (80) in thirty-four movies. More recently, the Irving Berlin classic Blue Skies (81) appeared in ten movies from 1994-2004; Star Dust (82) appeared in ten movies in the 1990s; and Dream a Little Dream of Me (83) appeared in ten movies from 1995-2005. (84) Copyright owners seem to be willing to license their compositions at rates equal to or exceeding those of the most intensely used public-domain compositions. Thus, when a song falls into the public domain, the data provide no evidence that it will be over-exploited and worn out by moviemakers.

2. Debased Works?

Even if a song is not subject to overly frequent use, some worry that a handful of "inappropriate" uses might debase the value of the original work, rendering it less desirable for consumption. (85) If public-domain songs have been subjected to damaging uses, however, one would expect them to be used less frequently in movies thereafter. After all, a rational film director would not want to alienate her audiences with a composition that had been previously debased. Evidence of debasement should therefore show up as a progressive decrease in demand for public-domain music as compared to copyrighted music from the same era. The data as a whole show no evidence of this, but the number of movie uses in any particular year is too small to measure accurately whether any particular public-domain song has been damaged, as such damage might be masked by the song's inclusion in a larger set.

My previous study of bestselling fiction, however, provides some interesting evidence on individual works. In the seventy-fifth year after publication, the twenty most enduringly popular works from 1913-1922 were in print in an average of 4.7 editions per title. (86) In the eightieth year after publication, the average was nine editions per title, and by year eighty-five, it rose to 13.4 editions per title. (87) By the year 2006, an average of 26.6 editions per title were in print. (88) The data demonstrate no evidence that pervasive inappropriate uses have reduced the attractiveness of the works for production and delivery to the public. The story is the same when one looks at the individual titles. Eighteen of the twenty titles were in print in more editions in year eighty after publication than in year seventy-five. (89) All twenty experienced an increase from year eighty after publication to year eighty-five, and all twenty experienced an increase in the number of available editions from year eighty-five after publication to the year 2006. (90) Moreover, the steepness of the upward sloping curve of editions exceeds that of copyrighted works from the same era over the same periods. (91) This is not to assert, of course, that there have been no shocking uses of either the songs or the books studied. As discussed below, producer and consumer self-regulation may explain why works are likely safe from even pornographic uses.


Given the lack of empirical support, the persistence of claims that value dissipates when works fall into the public domain seems curious. In this final section, I explore the paradigmatic examples of inefficient exploitation that have been offered, and suggest under what conditions problems might occur. Previous skeptics have argued that even if value is dissipated, we should not worry when it results from the natural interaction of market forces. (92) Taking a different tack, I explore below why value may be unlikely to be dissipated at all when works fall into the public domain.

A. Under-Exploitation

In my previous work, I identified three conditions that might justify extending copyright protection to an existing work to prevent its under-exploitation: "(1) the cost of making the initial copy of a work available to the public is high; (2) the cost to free-riders of making subsequent copies is low; and (3) the newly available work does not incorporate independently protectable material." (93) The test had its genesis in arguments over whether old public-domain films needed owners in order to ensure their preservation and distribution. (94) If an old film requires a significant expenditure to repair, and yet could easily be copied and distributed without authorization once it is in digital form, the owner of the physical copy of the film may lack an adequate financial incentive to restore the film. This scenario is worrisome, however, only if the newly restored film contains no independently protectable new material, like a soundtrack added (a common practice) to an old silent film. Packaging a public-domain work so that it cannot be copied without infringing rights in newly incorporated material can effectively prevent free riding. (95) Such practices necessitate the inclusion of condition three, above.

The three-factor test should be updated in light of recent studies. For example, a study undertaken for the Library of Congress demonstrates that non-owners have been making historic sound recordings available in digital form at a higher rate than their owners. (96) In fact, there is some indication that non-owners may more efficiently husband aging films. (97) A fourth proviso should therefore be added before a conclusion of market failure is reached: (4) owners are, in fact, more willing than non-owners to preserve and distribute the work.

When the four conditions are met, perhaps the public should be concerned. It seems clear, however, that the vast majority of books, music, films, computer programs, and other works that are cheap and easy to reproduce generally do not meet these conditions. (98) In general, the copyright term seems adequate if it is long enough to stimulate the creation of the work in the first instance. Extra extension, like that provided by the Copyright Term Extension Act, is probably not justified except in a tiny fraction of cases.

In the absence of the four conditions, we should not expect to see under-exploitation problems when a work falls into the public domain. Applying the factors to musical compositions as they appear in movies helps explain why. Unlike making a copy of a book, the first condition in favor of ownership may often be met. This is because, as noted above, a musical composition as it appears in a movie is a derivative work that may be quite costly for the music director to use and make available to the public in a new form. (99) Condition two is also probably met: if the movie is in a digital format, it will be quite easy to copy. Condition three, however, is not met, and songs in movies provide a nice example of the salience of that condition. A musical composition as it abides in a soundtrack is surrounded by independently protected work, like the script, the cinematography, and the sound recording itself, whose copyright is owned by its producer. The musical composition per se, the sheet music, cannot be easily taken without offending the rights of copyright owners of neighboring works. The realization of the old public-domain work within a new protected format means that the filmmaker has few real worries about competitors free riding off of his labor. In other words, the public-domain status of the underlying musical composition should not pose a threat to its continued exploitation, which is precisely what the data analyzed above show.

B. Over-Exploitation: Worn-Out Works and Inappropriate Uses

Trademark law provides a nice example of how both sorts of over-exploitation fears discussed in Part II become operationalized in law. One of the primary bases for the enactment of the Federal Trademark Anti-Dilution Act (100) was the fear that unauthorized uses of a trademark would blur its ability to identify the source of its owner's goods or services. (101) Even if a new "KODAK Cafe" or "EXXON Telephone" were of impeccable quality, Congress feared that a proliferation of uses would render marks like KODAK or EXXON less able to call to mind their original owners. Overuse might literally wear out the marks. I am currently collecting data on whether such unauthorized uses actually occurred prior to anti-dilution protection, but there is little doubt that the "wearing out" theory motivated Congress to pass the law in 1996. (102)

On the other hand, traditional trademark infringement provides a good example of how inappropriate uses can directly alter, as opposed to just wear out, the meaning of a symbol. (103) In fact, experts routinely testify about the amount of pecuniary damage done to the value of a trademark when consumers are confused by an infringer. (104) If a garment maker sells shirts under the trademark "EXCELSIOR" and establishes a reputation for a high-quality product, a subsequent user of the trademark on inferior goods will not only lower the trademark's value to the garment maker, but also make the word "EXCELSIOR" less usable to the public. Before the infringement, "EXCELSIOR" meant high-quality shirts; afterwards, it does not. If an infringer successfully confuses consumers, then the public has been robbed of a valuable mnemonic device. The mark is debased.

Given the data presented in Part II, we need to ask why these two concerns might not have the same traction in the context of copyrighted works.

1. Worn-Out Songs? Worn-Out Anything?

As noted in Part II, in the context of musical compositions in movies, there appears to be no evidence that public-domain songs are wearing out at a higher rate than their copyrighted counterparts. But what about other media contexts, such as songs heard on the radio or in television advertising or books available in multiple editions? Is it likely that public-domain songs are being worn out via overexposure in non-movie media, or that books are being worn out due to their pervasive reproduction?

Landes and Posner, as well as Liebowitz and Margolis, recognize that congestion externalities usually are not thought to be a problem with works, like those typically protected by copyright law, that have the characteristics of non-rivalrousness and inexhaustibility. (105) They understand that a song can be sung by one or two or one thousand people at the same time (demonstrating non-rivalrousness), over and over again, day after day, without wearing out the song (demonstrating inexhaustibility). Since the marginal cost imposed by each additional user is zero, limiting access would result in a deadweight loss. In fact, if one defines the value of a good in terms of its continued usability, then overuse is theoretically impossible with pure public goods. Landes and Posner, and Liebowitz and Margolis, however, argue that the relevant measure of value is market value, not usability, and therefore posit that certain sorts of marginal additional uses of a public good may impose positive costs. (106) For example, if dozens of advertisers all chose the same song to market their products on television, the public might tire of the tune, and demand for it would drop, reducing its market value. We might, they speculate, see a musical version of the tragedy of the commons. (107)

With songs, this eventuality seems unlikely. First, the vast majority of media airplay occurs through broadcasters' acquisition of an ASCAP license. The standard license in no way restricts the number of times a song can be broadcast over any period of time. (108) In other words, copyright owners, acting through their primary agent, the American Society of Composers, Authors and Publishers, seem utterly uninterested in limiting the airplay of their compositions. Broadcasters, not copyright owners, determine how frequently the public should hear a song. Presumably, broadcasters voluntarily choose not to overplay a song for fear of alienating the public or reducing the value of a good they would like to offer in the future. Overplaying a musical composition, whether it is copyrighted or in the public domain, is bad business, a fact that copyright owners seem to recognize by not restraining broadcasters. (109) In the broadcasting context, public-domain songs seem no more likely to be worn out, therefore, than copyrighted songs. It seems specious, at least as to broadcasting, to argue that each song needs an owner to limit its use.

That leaves "background" music used in advertising, films, and television, which is not licensed through ASCAP. (110) My data casts doubt on overuse of public-domain music in movies, but over-exploitation seems unlikely in other contexts as well. With a virtually infinite commons of music to choose from, advertisers are unlikely to risk alienating the public by choosing the same theme music as too many of their peers. Decades of watching television and listening to radio support this economic intuition. (111) The traditional tragedy of the commons analogy may be inadequate to capture the market for something like music in advertising.

To illustrate the tragedy of the commons, economists tell the story of a common field that is subject to overgrazing: no one owns the field and, therefore, no one has the proper incentive to maximize its value. (112) In fact, empirical evidence shows an increase in agricultural production in England when common fields were enclosed. (113) An advertising jingle, however, presents a significantly different situation. Unlike the farmer who has limited options as to where to graze his cattle, the advertiser has thousands of songs to choose from. A farmer with a thousand choices of equally cheap and desirable fields on which to graze his cattle would rationally choose not to overgraze any particular one. It would be pointless and might cost him in the future. Overgrazing in the presence of numerous choices of fresh fields might even impose a reputational cost. So too with advertisers choosing music to sell their products. Advertisers have no reason to overgraze when musical options are plentiful, and, more importantly, when the costs associated with annoying the public are too high. Overuse of promotional music, as with broadcast music, would be a bad marketing decision that is unlikely to need regulation.

Outside of the context of background music, the role of consumer choice may also help explain the absence of overused works. Consider books, which, unlike trademarks and sometimes songs, require an element of consumer choice in their consumption. One can imagine the public getting tired of encountering a ubiquitous song or getting tricked by a misused trademark, but it is difficult to see how the multiplicity of editions of a book could make the public sick of the story. For example, My Antonia (1918), by Willa Cather, is available in at least fifty different editions by at least fifty different publishers, and exists in many formats (cheap paperback, trade paper, hard cover, large print, curricular unit, e-book, audio tape and audio CD) at prices as low as $2 and as high as $108. (114) Yet, no consumer is forced to unwillingly encounter the story or its characters. If a consumer encounters the same song in the advertising for fifty products, he or she may get tired of hearing it. The song cannot be avoided without turning off the television, switching off the radio, or avoiding places that broadcast ads. The consumer of books, however, will never be forced to consume even a single one of the fifty editions of My Antonia. It is difficult to see a work ever wearing out in a situation when the public only encounters the work when it chooses to do so. Consumer choice and avoidance can be an effective form of non-governmental regulation that prevents a work from wearing out.

As the above analysis of songs and books suggests, in order to determine the general conditions under which concerns of over-exploitation might be justified, one must consider the likelihood of private regulation by both producers and consumers of works. Consistent with the findings in this study, we should expect to find congestion in markets for intangible goods potentially protected by copyright only when three conditions exist: (1) substitutes for the good are not cheap and plentiful; (2) additional subsequent uses of the good entail no significant reputational or other costs to the producer (e.g., by alienating consumers); and (3) consumption of the good by consumers cannot easily be avoided by the consumers themselves (e.g., some advertising uses).

2. Debased Songs? Debased Anything?

The data analyzed in Part II suggests that public-domain musical compositions appear in movies with about the same frequency as one would predict similar copyrighted compositions to appear. This result suggests the songs have not been debased by inappropriate uses that render them no longer fit for public consumption. (115) My earlier study of fiction even more strongly suggests this sort of congestion is lacking. (116) Yet, worry over inappropriate uses debasing works persists. Although this article cannot claim debasement never occurs, a closer look at the conditions of potential debasement reveals that copyright extensions are not an effective means of addressing the worry.

a. Defining Debasement

Determining when a work might be debased, and whether we should be concerned, requires defining the possible harms that might be at issue. There are four main possibilities:

i) The relevant harm caused by debasement is a loss in the market value of the work. In other words, debasement occurs when consumer demand declines due to a damaging use.

ii) The relevant harm is a net loss in public welfare. If adequate substitutes exist for a good whose value is destroyed, then there is no net loss in welfare terms. Thus, even if the market value of a work declines to zero, harm may not necessarily occur. For example, if 100,000 fewer Mickey Mouse t-shirts are sold after a debasing use, but 100,000 more Goofy t-shirts are sold as a result of the same use, there may be no net loss in welfare.

iii) The relevant harm is psychic damage caused to the artist. This harm might simply be included in measuring the net effect on public welfare, but advocates of moral rights argue that the artist's right to control sometimes takes precedence over public welfare.

iv) The relevant harm is the recoding of the settled cultural meaning of the work. Again, this harm might be cast in public welfare terms, but some commentators suggest that the original meaning of a work may be worth preserving, even if subsequent changes in meaning might be welfare enhancing. (117)

The second definition, which requires a diminishment in net public welfare, seems most appropriate for several reasons. For the purposes of this paper, the possibility of debasement is most relevant as a theoretical justification for extending the copyright term of existing works, not for justifying other laws that might vindicate the European notion of moral rights. As the Supreme Court has explained on many occasions, the primary justification for copyright law in the United States is utilitarian, a balance of costs and benefits designed to enhance public welfare. (118) The wisdom of term extensions is most plausibly measured by this utilitarian yardstick. That said, net public welfare is difficult to measure. Therefore, the first definition, lost market value, may be a good practical proxy, subject to evidence of a substitution effect that indicates no net welfare loss.

But what about psychic harms to artists or the possible recoding of a work's meaning? First, copyright initially vests artists with the power to prevent almost all debasement and recoding through the right to prepare derivative works and the reproduction right. (119) Artists are given strong exclusionary rights. If an artist retains his or her copyright, the term of protection, now a minimum of ninety-five years, almost certainly lasts the life of the artist. (120) Beyond that period, the argument to protect the dead artist's psyche is weak. Even if we cared about dead artists' psyches, their copyrights are almost always transferred to a third-party publisher as a condition of publication, so term extension would typically provide no solace to aggrieved artists, just a bonus to the copyright owner.

As far as harm caused by recoding goes, copyright already stabilizes the meaning of the work while it is controlled by a single owner, under the old statute for a period of seventy-five years (now ninety-five years). It is difficult to see why the law should not at some point in time invite competition in the market for meanings. If a meaning is changed, hasn't the market spoken as to the worthiness of the new meaning? Second, initial meanings are likely to be extremely durable. Meanings do not change easily, although they can proliferate. Consumers can keep two meanings in their heads, potentially multiplying the work's value. I have seen multiple interpretations of many pieces of music and dramatic works. My initial impression co-exists with subsequent impressions. Were it not so, my children's early music recitals would have destroyed my ability to appreciate versions of the same works by famous professionals.

Finally, conventional interpretations of the First Amendment suggest that absolute deference to authorial control and settled meanings are not embedded in copyright law. (121) First, a parody, one of the most threatening and debasing forms of derivative work, is constitutionally privileged. (122) No amount of artistic outrage and angst can prevent a good parody from calling a work's value into question before the public eye and ear. Second, important theories of the First Amendment endorse a policy of competition for meaning and truth. (123) For example, the Constitution does not tolerate the suppression of works that destabilize our understanding of the Civil War or the Jim Crow Era or McCarthyism. It encourages tremendous competition over the meaning of our most treasured national moments and national heroes. Why should the Reagan presidency be open for constant reappraisal but not Porgy and Bess, (124) a work that would have fallen into the public domain in 1991 but for the passage of term extensions?

b. When Is Public Welfare Likely to be Harmed?

As suggested earlier, unless unwilling viewers and listeners are forced to consume a work, a diminishment in net public welfare seems unlikely. For this reason, the most common example of debasement seems inapt. Most commentators who worry about debasement point to unauthorized uses of fictional characters as their prime example. (125) The entire debate seems to turn on the effect of having unauthorized porn movies starring Mickey Mouse (126) or Superman. (127) Those concerned about unauthorized pornography do not seem aware of the vast amount of unauthorized "inappropriate" works that have already been produced. A quick search of the Internet Adult Film Database reveals six pornographic movies with "Cinderella" in the title, including Cinderella in Chains and its two sequels, seven with "Snow White" in the title, and a whopping twenty-three featuring Santa Claus. (128) Searches on the same database of "Apollo" and "Zeus" turn up numerous examples of gay cinematic achievement. (129) Unauthorized porn fan fiction also abounds, starring such characters as Harry Potter, Captain Kirk and Mr. Spock, and Starsky and Hutch. (130) Is there a serious argument that Cinderella, Santa, mythical Greek Gods, Harry Potter, and Star Trek characters are worth less now than before these works were produced?

Probably not. Consumer and producer self-regulation likely combine to nullify the potential negative effects of unauthorized uses of fictional characters. Consumers who would be offended by a porno Mickey will not purchase a movie or read the fan fiction setting forth his daring new exploits. Those who deliberately seek out the new Mickey will do so because the porn version enhances Mickey's value to them, rather than detracts from it. Movies, books, and images that must be deliberately sought out by consumers are unlikely to negatively affect the value of the fictional characters portrayed therein.

This observation suggests that goods, like t-shirts, that cannot be avoided by the public when the wearer strolls down the street might pose the most serious problem. This danger is probably lessened by the natural reluctance of producers and distributors to sell offensive material. The GAP is unlikely to start selling a t-shirt portraying Mickey and Goofy in bed together. In other words, producer self-regulation, like consumer self-regulation, diminishes the likelihood that serious damage will be done to an iconic character. The Internet, however, provides a venue where the reputation costs of selling offensive items like t-shirts may be low enough to sustain a market. If the GAP will not sell the offensive t-shirt, then someone online might. An Internet purchase might end up being displayed on the chest of someone walking down the street. We could potentially encounter an image portraying Mickey and Goofy in compromising circumstances, despite our best efforts to avoid it.

The number of pedestrians wearing offensive gear, however, is likely to be quite low. There are reputational costs to the wearer that will deter all but a handful of people from displaying such goods in public. And more importantly, Disney will employ its lawyers to prevent the unauthorized sale of its trademarked images. (131) Trademark law provides strong protection against unauthorized uses of franchised fictional characters. Not all characters function as trademarks, however, so the potential for an offensive Cinderella or Santa Claus t-shirt remains a possibility, although the author has never encountered one.

Beyond the t-shirt scenario, one might imagine a song used in an advertisement that creates uncomfortable associations, perhaps "La Marseillaise" used in the background of an attack ad on a euro-friendly politician. Or one might hear "God Save the Queen" sung on the radio in a particularly disrespectful way by Johnny Rotten of the Sex Pistols. (132) Although such uses may appear problematic (putting aside the First Amendment), current law and practice erect few hurdles to them. First, music publishers demand that most artists relinquish the right to control the licensing of their works. A song intended as a pro-environmental anthem may well end up in the background of a Hummer commercial despite the objection of the musicians that made it famous. Second, under present copyright law anyone can cover a song by paying the appropriate statutory fee. (133) We currently do not seem too worried that William Shatner might ruin our favorite song for us.

To generalize conditions from the discussion above, debasement of a work not protected by copyright would seem most likely when: (1) consumers encounter the good without deliberately seeking it out; (2) presenting the good to the consumer entails no reputational or other costs to the producer (e.g., by alienating consumers); (3) public consumption entails no reputational costs to the consumer; and (4) consumption is lawful (e.g., it entails no violation of trademark law, obscenity law or libel). These four conditions should be met so infrequently that the burden of proving over-exploitation should rest squarely on those who claim it is a serious problem worthy of government intervention in the market.

3. Remedies

Note that the conditions above are satisfied by the "Marseillaise" and "God Save the Queen" examples. It does not follow, however, that expansive new property rights should be created. First, the regulatory authority might conclude that there is no threat to public welfare. Evidence might show no negative effect on the market value for the potentially debased work. Or despite a loss in market value, a strong substitution effect might be shown. Or, even if recoding is included as a market-based harm, empirical evidence from psychologists, for example, might suggest that consumers are capable of retaining multiple meanings. Most importantly, even in situations where a red flag is raised because the set of stated conditions is met, the regulatory response should be narrowly tailored to the potential harm. If the problem is inappropriate t-shirts, the proper response might come from the FTC in the form of new regulations on sellers. If the problem is a poor fit between commercials and background music, then perhaps artists should be given an inalienable approval right. If the problem is an all-white-cast version of Porgy and Bess (such as the one a colleague just saw in Finland), then simple labeling requirements would be most appropriate. Under no circumstances does blanket copyright term extension, with its well-documented costs to consumers and users of works, seem to be the appropriate response.


This study of the use of popular musical compositions in film suggests that the film market for public-domain music functions as efficiently as the market for copyrighted music without any special governmental intervention, such as retroactive copyright term extension. This confirms similar research conducted on the exploitation of bestselling fiction from the same era. (134) These studies cannot prove that copyright protection beyond that required to stimulate the creation of a work in the first instance is never necessary, but they suggest that the over- and under-exploitation hypotheses are overstated. Surely the time has come to place the burden of proof on those who predict valuable works in the public domain will suffer from serious market failure. Legislation should be based on sound empirical evidence.

In the absence of concrete evidence, we are left with predicting the behavior of rational actors, which indicates that self-regulation by producers and consumers of public-domain goods will discipline the market. Their likely behavior suggests four conditions necessary for under-exploitation and four conditions necessary for over-exploitation. The rare simultaneous occurrence of these conditions demonstrates that any legislative response should be specifically targeted to a very narrow set of works. Blanket term extension to all sorts of works in all sorts of contexts, with its significant attendant costs, cannot be justified by a handful of very narrow, and unproven, hypothetical assumptions.


The song set comprises the popular historical compositions from 1909-1932 listed in Julius Mattfeld's Variety Music Cavalcade, 1620-1961: A Chronology of Vocal and Instrumental Music Popular in the United States that appeared in at least four movies from 1968-2007.
Year  Title                                                 Composer(s)

1909  By the Light of the Silvery Moon                      Edward Madden; Gus Edwards
1910  Let Me Call You Sweetheart                            Beth Whitson; Leo Friedman
1911  Alexander's Ragtime Band                              Irving Berlin
1912  It's a Long, Long Way to Tipperary                    Jack Judge; Harry Williams
1913  El Choclo                                             A.G. Villoldo; G.J.S.W.
1913  Danny Boy                                             Frederick E. Weatherly
1913  You Made Me Love You--I Didn't Want to Do It          Joe McCarthy; James V. Monaco
1914  St. Louis Blues                                       William Christopher Handy
1915  Pack Up Your Troubles in Your Old Kitbag and Smile,   George Asaf; Felix Powell
        Smile, Smile
1916  Colonel Bogey                                         Kenneth J. Alford (pseud. of Major F.J. Ricketts)
1916  I Ain't Got Nobody                                    Roger Graham; Spencer Williams & Dave Peyton
1916  Poor Butterfly (The Big Show)                         John L. Golden; Raymond Hubbell
1917  Over There                                            George Michael Cohan
1918  After You've Gone                                     Henry Creamer & Turner
1920  Avalon                                                Al Jolson & Vincent Rose
1920  Look for the Silver Lining (Good Morning, Dearie)     Bud De Sylva; Jerome Kern
1920  Whispering                                            Malvin Schonberger; John
1921  The Sheik of Araby (Make it Snappy)                   Harry B. Smith & Francis Wheeler; Ted Snyder
1922  Hot Lips                                              Henry Busse, Henry Lange & Lou Davis
1923  Bugle Call Rag                                        Jack Pettis, Billy Meyers & Elmer Schoebel
1924  The Man I Love (Strike Up the Band)                   Ira Gershwin; George Gershwin
1924  Tea for Two (No, No, Nanette)                         Irving Caesar; Vincent Youmans
1925  Manhattan (Garrick Gaieties)                          Lorenz Hart; Richard Rodgers
1925  Rhapsody in Blue                                      George Gershwin
1925  Show Me the Way to Go                                 Irving King
1925  Sweet Georgia Brown                                   Ben Bernie, Maceo Pinkard & Kenneth Casey
1925  Yes Sir, That's My Baby                               Gus Kahn; Walter Donaldson
1926  Are You Lonesome Tonight?                             Roy Turk & Lou Handman
1926  Bye Bye Blackbird                                     Mort Dixon; Ray Henderson
1926  La Cumparsita                                         G.H. Matos Rodriguez; Vincenzo Billi
1926  Someone to Watch Over Me (Oh, Kay!)                   Ira Gershwin; George Gershwin
1927  The Best Things in Life Are Free (Good News)          Bud G. De Sylva, Lew Brown & Ray Henderson
1927  Blue Skies                                            Irving Berlin
1927  My Blue Heaven                                        George Whiting; Walter
1928  I Can't Give You Anything But Love                    Dorothy Fields; Jimmy McHugh
1928  I Wanna Be Loved By You (Good Boy)                    Bert Kalmar; Herbert Stothart & Harry Ruby
1928  If I Had You                                          Ted Shapiro, Jimmy Campbell & Reginald
1928  Let's Do It (Paris)                                   Cole Porter
1928  Let's Misbehave (Paris)                               Cole Porter
1928  Makin' Whoopee!                                       Gus Kahn; Walter Donaldson
1928  Sweet Lorraine                                        Mitchell Parish; Cliff
1928  When You're Smiling--the Whole World Smiles with      Mark Fisher, Joe Goodwin & Larry Shay
1929  Ain't Misbehavin' (Hot Chocolates)                    Andy Razaf; Thomas Waller & Harry Brooks
1929  Am I Blue?                                            Grant Clarke; Harry Akst
1929  Bolero                                                Maurice Ravel
1929  Happy Days Are Here Again                             Jack Yellen; Milton Ager
1929  Honeysuckle Rose (Load of                             Andy Razaf; Thomas Waller
1929  Singin' in the Rain                                   Arthur Freed; Nacio Herb
1929  Star Dust                                             Mitchell Parish; Hoagy
1929  You Do Something to Me (Fifty Million Frenchmen)      Cole Porter
1930  Beyond the Blue Horizon                               Leo Robin; Richard Whiting & W. Franke
1930  Body and Soul (Three's a Crowd)                       Edward Heyman, Robert Sour & Frank Eyton; John W. Green
1930  Embraceable You (Girl Crazy)                          Ira Gershwin; George Gershwin
1930  Exactly Like You                                      Dorothy Fields; Jimmy
1930  Georgia On My Mind                                    Stuart Gorrell; Hoagy
1930  Get Happy                                             Ted Koehler; Harold Arlen
1930  I Got Rhythm (Girl Crazy)                             Ira Gershwin; George
1930  Just a Gigolo                                         Irving Caesar; Leonello
1930  Love for Sale (The New                                Cole Porter
1930  My Ideal                                              Leo Robin; Richard Whiting & Newell Chase
1930  On the Sunny Side of the Street                       Dorothy Fields; Jimmy McHugh
1930  Sleepy Lagoon                                         Jack Lawrence; Eric Coates
1930  Three Little Words                                    Bert Kalmar; Harry Ruby
1930  You Brought a New Kind of Love to Me                  Sammy Fain, Irving Kahal & Pierre Norman
1931  Dancing in the Dark (The Band Wagon)                  Howard Dietz; Arthur Schwartz
1931  Dream a Little Dream of Me                            Gus Kahn; W. Schwandt & F. Andree
1931  I Found a Million Dollar Baby--In a Five and Ten      Billy Rose & Mort Dixon; Harry Warren
        Cent Store (Billy Rose's Crazy Quilt)
1931  Life is Just a Bowl of                                Lew Brown & Ray Henderson
1931  Minnie, the Moocher--The Ho De 'Ho Song               Cab Calloway & Irving Mills
1931  Mood Indigo                                           Duke Ellington, Irving Mills & Albany Bigard
1931  Out of Nowhere                                        Edward Heyman; John W.
1932  It Don't Mean a Thing                                 Irving Mills; Duke
1932  Night and Day                                         Cole Porter
1932  You're Getting to Be a Habit With Me                  Al Dubin; Harry Warren

Compiled by Professor Jaxk Reeves and Kun Xu, Statistics Department, University of Georgia


A. Description

This set of data consists of seventy-four songs, composed in 1909-1932, which appeared at least four times in films from 1968-2007. The most popular songs, Star Dust and La Cumparsita, both appeared in film seventeen times in our study period. Nineteen of these songs were published between 1909 and 1922. These nineteen songs are all currently in the public domain, but were not necessarily in the public domain during the entire forty-year period of this investigation (1968-2007). The other fifty-five songs were published between 1923 and 1932, and are not yet in the public domain. This data set of seventy-four songs, each with k ≥ 4 film appearances, is used for most of the analysis, but similar analyses using thresholds of k ≥ 3, k ≥ 2, and k ≥ 1 are also included. Table 1 below contains a sample of the data.
Table 1. Popular Songs

SONG   COMPOSITION                  PUBYR  EXP    TOT   1968  1969  1970  ...  2006  2007

1      By the light of the ...      1909   1984     4      0     0     0  ...     0     0
2      Let me call you              1910   1985     4      0     0     0  ...     0     0
3      Alex-                        1911   1986     5      0     0     0  ...     0     0
4      It's a long way to           1912   1987     4      0     0     1  ...     0     0
5      El                           1913   1988     6      1     0     0  ...     0     0
6      Danny                        1913   1988    11      0     0     0  ...     0     0
...    ...                          ...    ...    ...    ...   ...   ...  ...   ...   ...
73     Night and                    1932   2027    13      0     0     0  ...     2     0
74     You're Getting to Be a ...   1932   2027     6      0     0     0  ...     0     0

total                                             537      4     4     5  ...    20    13

Original Variables:

SONG         song number (for reference purposes)
COMPOSITION  name of the song (for reference purposes)
PUBYR        publication year
EXP          year of copyright expiration
             (where PUBYR ≤ 1922, EXP = PUBYR + 75;
             where PUBYR > 1922, EXP = PUBYR + 95)
TOT          total number of appearances (in film) of that song
             during 1968-2007
T1968        number of appearances of that song (in movies) in
             year 1968
...
T2007        number of appearances of that song (in movies) in
             year 2007

The last row gives the total number of appearances of the songs in our list for each year from 1968-2007. This ranges from a low of two in 1971 to a high of forty-one in 1998.

B. Data Manipulation

As stated in the introduction, the first analysis of the popular songs concerns "availability" of songs from 1968-2007. Each song was measured at every year from 1968-2007, a total of forty time points. The forty variables T1968, T1969, ... T2007 from the original data were converted into forty rows per song, indexed by a new variable called AFPUB, the number of years after publication (for a song published in 1909, the values of AFPUB are 59, 60, ... 98 respectively). The modified data set therefore has 74 × 40 = 2,960 observations. This modified data set is called the song-year version of the popular songs. Three other variables, YR, MOV and PD, were also created from the original data set of N = 74 songs, and carried over to the new data set of 2,960 song-year events. A sample of the modified data is shown in Table 2 below:
Table 2. Popular Songs (Song-Year)

OBS    SONG   PUBYR  YR     AFPUB  PD   MOV

1      1      1909   1968   59     0    0
2      1      1909   1969   60     0    0
3      1      1909   1970   61     0    0
4      1      1909   1971   62     0    0
...    ...    ...    ...    ...    ...  ...
409    11     1916   1976   60     0    1
...    ...    ...    ...    ...    ...  ...
751    19     1922   1998   76     1    0
...    ...    ...    ...    ...    ...  ...
2959   74     1932   2006   74     0    0
2960   74     1932   2007   75     0    0

Generated Variables:

OBS    observation number
SONG   song number (same as in Table 1)
PUBYR  publication year of the song (same as in Table 1)
AFPUB  number of years after publication (as explained above)
YR     calendar year of measurement
MOV    indicator of the appearance of the song
       (1 = appears in that year; 0 = does not appear in that year)
PD     indicator of copyright status
       (1 = in the public domain; 0 = not in the public domain)
Observations where PUBYR ≤ 1922 and AFPUB ≥ 75 are in the public domain.
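The conversion from the one-row-per-song layout of Table 1 to the song-year layout of Table 2 can be sketched in a few lines of Python. This is only an illustration of the steps described above, not the code used by the compilers of the data: the function name `to_song_year`, the dictionary layout, and the key `counts` are hypothetical, and the two example songs carry only the yearly counts visible in Table 1 (all other years are left at zero here because they are not shown).

```python
def to_song_year(songs, first_year=1968, last_year=2007):
    """Expand one record per song into one record per song-year."""
    rows = []
    obs = 0
    for song in songs:
        for yr in range(first_year, last_year + 1):
            obs += 1
            afpub = yr - song["PUBYR"]          # years after publication
            count = song["counts"].get(yr, 0)   # film appearances that year
            rows.append({
                "OBS": obs,
                "SONG": song["SONG"],
                "PUBYR": song["PUBYR"],
                "YR": yr,
                "AFPUB": afpub,
                # MOV is dichotomous: several films in one year still give 1
                "MOV": 1 if count > 0 else 0,
                # Rule stated in the text: PUBYR <= 1922 and AFPUB >= 75
                "PD": 1 if (song["PUBYR"] <= 1922 and afpub >= 75) else 0,
            })
    return rows


# Hypothetical input: only the counts visible in Table 1 are filled in.
songs = [
    {"SONG": 1, "PUBYR": 1909, "counts": {}},
    {"SONG": 74, "PUBYR": 1932, "counts": {2006: 2}},
]
rows = to_song_year(songs)
```

With the full set of seventy-four songs, the same loop would produce the 2,960 song-year observations described above; dropping the rows with YR equal to 2007 would leave 2,886.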


Before presenting analysis results, it is necessary to briefly describe the tools and methodology that were used. Each of the four analyses took the same general path. First, the data were explored by numerical and graphical summaries. Then, more sophisticated analyses followed. Since the response variable in this problem is dichotomous, logistic regression was applied.

A. Exploratory Data Analysis

1. Preliminary Analysis

As shown in Table 1, the number of appearances of each popular song varies from four to seventeen, and the total number of appearances is 537 (shown in the last row of Table 1 under the variable TOT). In Figure 1, the histogram shows the frequency of song appearances. Because no song appears exactly fifteen or sixteen times in the data set, these two columns do not appear in the chart. Each song appears about seven times on average.


Through data manipulation, the appearance of a single song in a particular year becomes a dichotomous variable (MOV), zero if the song did not appear in a movie that year and one if it did. Because there were sixty-four occasions in which the same song appeared in more than one film during the same year, the total number of events in the dichotomous data set was reduced from 537 to 473 unique events. According to Table 2, of the 2,960 observations, only 312 are in the public domain and the rest are copyrighted. The percentages of these two groups are shown in Figure 2 below. The copyrighted observations are the majority with a percentage of 89.46%.


Furthermore, consider the total appearance in one year (shown in Figure 3 below). By focusing on the total appearances, illustrated by the upper line, one can see an increase after the year 1984, when the songs published in 1909 entered the public domain. The total appearances also show a sharp decrease after 1998, when the songs in our study stopped entering the public domain. At the same time, the appearance of copyrighted songs, illustrated by the middle line (during the years 1968-1987, the middle line overlaps with the upper line), shows a steady increase throughout the entire time period. Contributions from the songs in the public domain (represented by the bottom line) give a linear increase in appearance time after 1984.



On the other hand, as Figure 2 shows, the numbers of observations for copyrighted songs and for public-domain songs are not equal, so the proportion of appearances may be a more appropriate measure of the effect of copyright. To obtain it, divide the number of appearances in any year by the total number of observations in that set, separately for public-domain and copyrighted song observations. As shown in Figure 4, we can see a slight difference between the copyright statuses. Based on all of this, we can propose a null hypothesis that there is no significant difference in occurrence probability between the public-domain songs and copyrighted songs. An alternative hypothesis is that the songs in the public domain are more likely to be used in film. To decide which hypothesis is more probable we must perform further analysis.

Figures 3 and 4 also show an abnormally sharp decrease in 2007. We performed the same preliminary analysis on the popular songs that appear more than once, twice, or three times, and those graphs show abnormally sharp decreases in 2007 as well. In 2007, no public-domain song appeared in a film, and copyrighted songs appeared only three times. It is reasonable to treat 2007 as an outlier in this study (possibly caused by incomplete data), so we do not include observations from 2007 in our further analysis. After deleting the 2007 observations for all seventy-four songs, we have a total of 470 appearances: seventy-five of public-domain songs and 395 of copyrighted songs. The total number of observations for all years combined decreases to 74*39 = 2,886.
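The bookkeeping described above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the figures come from the text, while the variable names are hypothetical and not part of the study.

```python
# Song-year bookkeeping for the 74-song data set (figures taken from the text).
n_songs = 74
years_all = 2007 - 1968 + 1            # 1968-2007 inclusive
obs_all = n_songs * years_all          # song-year observations before dropping 2007

raw_events = 537                       # every appearance of a song in a film
same_year_repeats = 64                 # repeat appearances of a song within one year
unique_events = raw_events - same_year_repeats

events_2007 = 3                        # all three involved copyrighted songs
unique_events_kept = unique_events - events_2007
obs_kept = n_songs * (years_all - 1)   # thirty-nine years remain once 2007 is dropped

print(obs_all, unique_events, unique_events_kept, obs_kept)
```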

2. Popular Songs Analysis I (Availability by Song-Year)

Results of song-year analysis of the popular songs are presented in this section. The frequency table of availability ('MOV' rows) versus copyright status ('PD' columns) is shown below:
Table 3. MOV*PD Frequency Table

Frequency
(Col Percent)    Public Domain    Copyrighted     Total

Appear                75              395           470
                    24.04%          15.35%        16.29%

Not appear           237             2179          2416
                    75.96%          84.65%        83.71%

Total                312             2574          2886

Results from Table 3 show that, over the period of analysis, a film appearance occurred in 15.35% of copyrighted song-years versus 24.04% of public-domain song-years. Assuming each determination of availability is independent of the others (which is not quite true here), the frequencies shown above suggest an association between the rows and columns. But is the association statistically significant? The Chi-square test for independence of rows and columns is as follows:

$$\chi^2 = \sum \frac{(O-E)^2}{E} = \frac{(2179-2154.81)^2}{2154.81} + \frac{(237-261.19)^2}{261.19} + \frac{(395-419.19)^2}{419.19} + \frac{(75-50.81)^2}{50.81}$$

$$= 0.2715 + 2.2403 + 1.3959 + 11.5166 = 15.4242, \qquad P(\chi^2 > 15.4242) < 0.0001$$
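As a check on the arithmetic, the Chi-square statistic can be recomputed directly from the four observed counts in Table 3. The sketch below uses only the Python standard library; the variable names are illustrative, and the p-value relies on the identity that, for one degree of freedom, the upper-tail probability equals erfc(sqrt(x/2)).

```python
import math

# Observed counts from Table 3: rows = (appear, not appear), columns = (PD, CP).
observed = [[75, 395],
            [237, 2179]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n   # expected count under independence
        chi_square += (o - expected) ** 2 / expected

# Upper-tail probability for a Chi-square variable with one degree of freedom.
p_value = math.erfc(math.sqrt(chi_square / 2))
print(round(chi_square, 2), p_value < 0.0001)
```

Carried at full precision, the statistic agrees with the value reported above to two decimal places; the expected counts in the text are rounded to two decimals.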

The p-value from the Chi-square test indicates strong dependence between copyright status and the appearance of songs in a movie. The Fisher exact test for positive association (upper-tail test for large sample) follows:

$$Z = \frac{T_2 - \frac{rc}{N}}{\sqrt{\frac{rc(N-r)(N-c)}{N^2(N-1)}}} = \frac{75 - \frac{470 \times 312}{2886}}{\sqrt{\frac{470 \times 312 \times (2886-470)(2886-312)}{2886^2 \times (2886-1)}}} = 3.9265$$

where c = sum of the first column = 312; r = sum of the first row = 470; N = grand sum = 2886; $T_2$ = 75; and P(Z > 3.9265) < 0.0001.
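The large-sample statistic above is straightforward to reproduce. The following is a minimal sketch using the quantities defined in the text (T2, r, c, N); the function name `upper_tail_z` is hypothetical.

```python
import math

def upper_tail_z(t2, r, c, n):
    """Normal approximation to the upper-tail Fisher exact test for a 2x2 table."""
    mean = r * c / n                                          # expected count in the cell
    variance = r * c * (n - r) * (n - c) / (n ** 2 * (n - 1)) # hypergeometric variance
    return (t2 - mean) / math.sqrt(variance)

z = upper_tail_z(t2=75, r=470, c=312, n=2886)
p_upper = 0.5 * math.erfc(z / math.sqrt(2))   # P(Z > z) for a standard normal variable
print(round(z, 3), p_upper < 0.0001)
```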

The p-value from the Fisher exact test shows that songs in the public domain were used by moviemakers at a significantly higher rate than those that were copyrighted. The above result is based on the assumption that all observations are independent from others. It was used to determine if there exists an association that warrants further analyses. Since a strong dependency exists between copyright status and works' appearances, we proceed with further analysis. Of course, the results above are exaggerated to some extent because each song appeared, on average, about six times in the analysis, and the availability status for a particular song is surely positively correlated over time. However, even under the most severe assumption (that observations for a particular song are completely correlated, so that the sample size is exaggerated by a factor of 6), the [chi square] value obtained (15.42) would still lead one to conclude that there is very strong evidence of a public-domain effect.

3. Results for Other Thresholds

The results presented above and analyzed in the bulk of this report concern the dataset when restricted to the n=74 songs that appeared in at least four films during the thirty-nine years between 1968 and 2006. This restriction was made so as to include the songs that were clearly 'popular' over the period. On the other hand, this is a rather restrictive requirement, since it includes only seventy-four of the 1,294 popular songs released from 1909-1932, with only nineteen of these being current public-domain songs. If the threshold for inclusion were lowered from K ≥ 4 to K ≥ 3, K ≥ 2, or K ≥ 1, many more songs could be included, but the reliability of results might decrease. Table 4 below contains summaries of the data that would occur if one used other inclusion thresholds. The remainder of this report will concentrate on the K ≥ 4 case described in the first row of Table 4, and discussed heretofore, but results for the other three data sets will be presented at the end of the report.

The 'Songs' section of Table 4 divides the 'N' songs that meet the threshold requirement into those that (1) have entered the public domain (EPD) and (2) are still copyright protected (CP). It should be remembered, of course, that the 'EPD' songs were not 'PD' for the entire period of observation. The next section of the table ('All Events') counts the total number of times that a song was used in a film in the thirty-nine-year period from 1968 to 2006. This number of events ('Ev') is reduced slightly to unique events ('UEv'), since we allow a song to be counted at most once in a given year. Eligible song-years ('ESY') is given by ESY=N*39, since each song is eligible to be in a film in each of the thirty-nine years. The last column of this section, 'P,' where P=UEv/ESY, is the proportion of eligible song-years in which a song was used in a film. The last two sections, 'PD Events' and 'CP Events,' simply subdivide all song-years and associated events into those which occurred under 'PD' and those which occurred under 'CP' conditions. For all four threshold conditions, 'P' is higher under the PD conditions than under the CP conditions. One can easily perform Chi-squared tests, as was done above in Section III.1.b for the K ≥ 4 dataset, to show that the differences are significant.

One objection to these tests could be that they do not account for time effects: the 'PD' group has a higher proportion of its songs eligible during the latter years of the observation period than does the 'CP' group. So, if there is an increase in utilization rate over time due to factors unrelated to copyright status, the Chi-squared tests could overstate the importance of the copyright status effect. To investigate this, the next section of this report introduces logistic regression models, which can control for both copyright status and time (year).
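The relationships ESY=N*39 and P=UEv/ESY can be illustrated with the counts reported in Table 4. The sketch below is a plain recomputation, not the authors' code; the dictionary layout and names are hypothetical, and the CP proportions it prints are derived here from the CP event and song-year counts.

```python
YEARS = 39  # 1968-2006

# Threshold K -> (N songs, PD unique events, PD song-years,
#                 CP unique events, CP song-years), as reported in Table 4.
table4 = {
    4: (74, 75, 312, 395, 2574),
    3: (99, 76, 341, 464, 3520),
    2: (146, 91, 552, 542, 5142),
    1: (259, 113, 1058, 633, 9043),
}

summary = {}
for k, (n, pd_ev, pd_sy, cp_ev, cp_sy) in table4.items():
    esy = n * YEARS                       # eligible song-years
    summary[k] = {
        "ESY": esy,
        "split_ok": pd_sy + cp_sy == esy, # PD and CP song-years partition ESY
        "P_all": (pd_ev + cp_ev) / esy,
        "P_pd": pd_ev / pd_sy,
        "P_cp": cp_ev / cp_sy,
    }

for k, row in summary.items():
    print(k, row["ESY"], round(row["P_all"], 4), round(row["P_pd"], 4), round(row["P_cp"], 4))
```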

B. Logistic Regression

In analysis 1 (song-year level) of the popular songs, the response variable (MOV) is dichotomous (zero if the song did not appear in a film that year, and one if it did). Logistic regression is appropriate for modeling this type of response variable.

Using copyright status (PD) alone to model availability (MOV) might omit other significant factors affecting a song's appearance in films. Other variables that could be included in the model are PUBYR, AFPUB, and YR. All four variables (PD, PUBYR, AFPUB, and YR) are possible explanatory variables for MOV. Since copyright status is the explanatory variable of primary interest, it was the first variable included in the model. One should exercise care when choosing additional variables to include in the model, because some of these variables are functions of others and can create confounding effects. For example, copyright status (PD) depends solely on publication year (PUBYR) and age of the work (AFPUB), and the calendar year of the measurement (YR) is the sum of publication year (PUBYR) and age of the work (AFPUB). According to our data, the year 1984 is a key point in the observation period, because the songs in our study start to fall into the public domain in that year. We therefore create a new variable, PY84, defined as PY84=YR-1984. Since the period is another effect of interest and PY84 was not too highly correlated with PD, it was included in the model (Figure 3 shows an increase in total appearances after 1984). Including either PUBYR or AFPUB in this model (along with PD and PY84) would cause some confounding, so we did not attempt this.
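The dependencies among these variables can be made concrete with a small sketch. The relations AFPUB=YR-PUBYR and PY84=YR-1984 come from the text; the rule used for PD (a 75-year term, applied only to songs published through a cutoff year) is an assumption made here for illustration, as are the constant and function names, and the study's actual coding of boundary years may differ.

```python
BASE_YEAR = 1984       # year in which songs in the study began entering the public domain
TERM_YEARS = 75        # assumed term, based on the 1909 -> 1984 example in the text
LAST_PD_PUBYR = 1922   # hypothetical cutoff: later songs are treated as never entering the PD

def covariates(pubyr, yr):
    """Derive the candidate explanatory variables for one song-year."""
    afpub = yr - pubyr                       # age of the work, so YR = PUBYR + AFPUB
    py84 = yr - BASE_YEAR                    # calendar time measured from 1984
    pd = int(pubyr <= LAST_PD_PUBYR and afpub >= TERM_YEARS)  # assumed PD rule
    return {"PUBYR": pubyr, "AFPUB": afpub, "YR": yr, "PY84": py84, "PD": pd}

row = covariates(pubyr=1909, yr=1990)
print(row)
```

Because PD is a function of PUBYR and AFPUB, and YR is their sum, adding PUBYR or AFPUB alongside PD and PY84 would introduce the confounding described above.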

Of course, a higher raw appearance rate for PD than for CP song-years does not prove that the PD effect remains significant once other factors are taken into account. The main confounder is year, because there were many more PD-eligible songs during later years, and there seems to be a strong year effect. To investigate this, we considered a hierarchy of seven logistic models, each linear in the log-odds, where P is the probability of appearance and Q = 1 - P:

[Model 0]                          ln(P/Q) = B0

[Model 1] {PD only}                ln(P/Q) = B0 + B1*PD

[Model 1L] {Linear in PY84}        ln(P/Q) = B0 + B2*PY84

[Model 1G] {Grouped year}          ln(P/Q) = B0 + a(grp)

[Model 2L] {Linear, Additive}      ln(P/Q) = B0 + B1*PD + B2*PY84

[Model 2G] {Grouped, Additive}     ln(P/Q) = B0 + B1*PD + a(grp)

[Model 3L] {Linear, Interaction}   ln(P/Q) = B0 + B1*PD + B2*PY84 + B3*PD*PY84
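Models 0 and 1 contain only an intercept and a binary indicator, so their maximum-likelihood estimates have a closed form: the fitted log-odds equal the observed log-odds in each group. The sketch below recomputes those two fits from the Table 3 counts as an illustration; the function and variable names are hypothetical. The models involving PY84 or grouped years require iterative fitting on the song-year data and are not reproduced here.

```python
import math

def logit(p):
    """Log-odds ln(P/Q) with Q = 1 - P."""
    return math.log(p / (1 - p))

def neg2_loglik(events, trials):
    """-2 ln L for a binomial group fitted at its observed proportion."""
    p = events / trials
    return -2 * (events * math.log(p) + (trials - events) * math.log(1 - p))

pd_events, pd_total = 75, 312      # public-domain song-years (Table 3)
cp_events, cp_total = 395, 2574    # copyrighted song-years (Table 3)

# Model 0: a single intercept for all song-years.
b0_model0 = logit((pd_events + cp_events) / (pd_total + cp_total))
dev_model0 = neg2_loglik(pd_events + cp_events, pd_total + cp_total)

# Model 1: intercept for CP song-years (PD = 0) plus a shift B1 for PD song-years.
b0_model1 = logit(cp_events / cp_total)
b1_model1 = logit(pd_events / pd_total) - b0_model1
dev_model1 = neg2_loglik(pd_events, pd_total) + neg2_loglik(cp_events, cp_total)

print(round(b0_model0, 4), round(b0_model1, 4), round(b1_model1, 4))
print(round(dev_model0), round(dev_model1))
```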

In fact, Models 1L and 1G give similar results in all cases, since the trend is close to linear. The grouped method uses five blocks of eight years, but similar results occurred with ten blocks of four years. The real question is whether the B1 coefficient in Model 2L (or 2G) is significantly different from zero, or whether it can be dropped, reducing the model to Model 1L (or 1G). It turns out that, in every case, the answer is 'not significant'; there is no significant effect of PD/CP on appearance once one controls for the year effect. The fit of the models for K ≥ 4 is shown in Table 5 below.

Based on the AIC or the SBC (BIC) values in Table 5, we can pick either the model with a continuous year effect (1L, which has the lowest SBC) or the one with a grouped year effect (1G, which has the lowest AIC) as our final model. Both models lead to the same interpretation of the data: the probability of a song appearing in a film increases over time, but there is no effect due to PD/CP status.
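The model comparison can be illustrated by recomputing both criteria from the -2lnL column of Table 5, using AIC = -2lnL + 2k and SBC = -2lnL + k*ln(n). This is a sketch under stated assumptions: the parameter counts k are inferred here (for instance, five parameters for the five-block grouped-year model) rather than taken from the source, and n is taken to be the 2,886 song-year observations.

```python
import math

N_OBS = 2886  # song-year observations in the K >= 4 data set

# Model -> (-2 ln L from Table 5, assumed number of fitted parameters k).
models = {
    "0":  (2565, 1),
    "1":  (2551, 2),
    "1L": (2405, 2),
    "1G": (2398, 5),   # assumed: intercept plus four block contrasts
    "2L": (2405, 3),
    "2G": (2398, 6),
    "3L": (2403, 4),
}

criteria = {
    name: {"AIC": dev + 2 * k, "SBC": dev + k * math.log(N_OBS)}
    for name, (dev, k) in models.items()
}

best_aic = min(criteria, key=lambda m: criteria[m]["AIC"])
best_sbc = min(criteria, key=lambda m: criteria[m]["SBC"])
print(best_aic, best_sbc)
```

Under these assumptions the AIC prefers the grouped-year model and the SBC, which penalizes extra parameters more heavily, prefers the linear-year model, in line with the asterisks in Table 5.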

We also performed the same analysis for the other thresholds, and the results show the same pattern in each data set. The crucial results come from the analyses of Model 2L for each data set. In each case, the estimate of the 'PD' effect ('B1' in the model) is not statistically significantly different from zero, as the P-values in Table 6 below illustrate. Thus, after accounting for the increase in appearance rates over time, there is no evidence that presence in, or absence from, the public domain has any positive or negative effect on appearance probability. This holds for all four data sets.
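The test statistics in Table 6 follow from the estimates and standard errors by the usual Wald calculation, z = B1/SE(B1), with a two-tailed P-value from the standard normal distribution. The following sketch recomputes them with the standard library; the names are illustrative, and small differences from the table in the third decimal place are expected because the published estimates are rounded.

```python
import math

# Threshold K -> (B1 estimate, SE(B1)) for Model 2L, as reported in Table 6.
table6 = {
    4: (-0.0534, 0.1504),
    3: (0.0328, 0.1505),
    2: (-0.0854, 0.1304),
    1: (-0.1171, 0.1145),
}

def wald_test(estimate, se):
    """Return the Wald z-statistic and its two-tailed normal P-value."""
    z = estimate / se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * P(Z > |z|)
    return z, p_two_tailed

results = {k: wald_test(b1, se) for k, (b1, se) in table6.items()}
for k, (z, p) in results.items():
    print(k, round(z, 3), round(p, 3))
```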


A naive analysis of the data (the Chi-squared and Fisher's tests of section IV.1) demonstrates a clear difference in song availability between copyrighted and public-domain works, with the latter having significantly more appearances in films. A serious objection to this analysis is that it controlled neither for period effects nor for the popularity of the songs considered. After performing the logistic regression analysis to control for time-period effects, we find that copyright status plays no significant role in affecting the probability of a song's appearance in a film.


The previously unpublished data in this Appendix shows the number of publishers from year 60 after publication to year 2006 and suggests no debasement of works after they fell into the public domain at year 75.
Title                      Author                Pub Yr    60   65   70  75/PD  80/PD  85/PD   2006 status

Pollyanna                  Porter                1913       0    0    0      5      4     10   30
O Pioneers!                Cather, Willa         1913       2    1    0      3      8     13   38 print/5
Sons and Lovers            Lawrence, D.H.        1913       3    5    5      7     10     14   24 print/7
Dubliners                  Joyce                 1914       2    2    4      2     10     11   24
Tarzan of the Apes         Burroughs, Edgar      1914       2    2    3      3      5     10   39 print/6 ebooks
Of Human Bondage           Maugham, Somerset     1915       6    5    5      3      6     11   18 print/3
The Song of the Lark       Cather, Willa         1915       1    1    2      3      5     10   17 print/2
The Lone Star Ranger       Grey, Zane            1915       0    1    2      2      4      4   18 print/4
A Portrait of the Artist   Joyce, James          1916       3    3    6      4     12     15   34 print/4
The Magnificent            Tarkington, Booth     1918       3    1    3      2      3      7   18
My Antonia                 Cather, Willa         1918       2    1    4      4     18     42   50 print/3
Winesburg, Ohio            Anderson, Sherwood    1919       2    1    2      6     12     26   29 print/5
This Side of Paradise      Fitzgerald, F. Scott  1920       2    1    3      6     12     23   24 print/6
Main Street                Lewis                 1920       2    3    3      8     11     27   27
The Age of Innocence       Wharton, Edith        1920       1    2    3     12     15     35   35 print/7
Scaramouche                Sabatini, Rafael      1921       0    1    1      5      3     18   18
Babbitt                    Lewis                 1922       2    3    4     10     12          30
The Beautiful and the      Fitzgerald, F. Scott  1922       1    2    2      2     11          18
Captain Blood              Sabatini, Rafael      1922       0    0    0      2      8          22
Ulysses                    Joyce                 1922       4    5    6      8                 19

Totals for 20 books                                        38   40   58     93    181    268   532 print/
Ave. Publ/Ed Print Per Book                               1.9    2  2.9    4.7      9   13.4   26.6

Table 4. Summary of Data Sets Based on Inclusion Threshold (K)

            Songs              All Events                  PD Events            CP Events
K         N   EPD   CP     Ev   UEv    ESY       P      UEv   ESY       P      UEv    ESY

≥ 4      74    19   55    537   470   2886  0.1629       75   312  0.2404      395   2574
≥ 3      99    23   76    612   540   3861  0.1399       76   341  0.2229      464   3520
≥ 2     146    40  106    706   633   5694  0.1112       91   552  0.1649      542   5142
≥ 1     259    79  180    819   746  10101  0.0739      113  1058  0.1068      633   9043

Table 5. Summary of Seven Hierarchical Models for (K ≥ 4) Dataset

Model      B0         B1         B2         B3      -2lnL     AIC       SBC

0       -1.6371       .          .          .        2565    2567      2573
1       -1.7077    +0.5571       .          .        2551    2555      2567
1L      -1.9642       .       +0.0598       .        2405    2409      2421 *
1G      -1.7840       .       [GRP 5]       .        2398    2408 *    2437
2L      -1.9602    -0.0534    +0.0603       .        2405    2411      2429
2G      -1.7708    -0.0530    [GRP 5]       .        2398    2410      2446
3L      -1.9704    +0.4635    +0.0618    -0.0365     2403    2411      2435

Table 6. Parameter Estimates of B1 for 2L Models

DataSet    B1-estimate    SE(B1)    z-stat    2-tailed P

K ≥ 4        -0.0534      0.1504    -0.355      0.7241
K ≥ 3        +0.0328      0.1505    +0.218      0.8277
K ≥ 2        -0.0854      0.1304    -0.655      0.5120
K ≥ 1        -0.1171      0.1145    -1.022      0.3067

