New developments in readership research.
The primary objective of readership research is to provide a 'currency' of readership estimates by which the value of advertising in the titles concerned can be assessed and traded. The most usual estimate is Average Issue Readership (AIR). There are a variety of ways of arriving at an estimate of AIR. Most surveys employ the Recent Reading model, which assumes that respondents who claim to have read or looked at any issue of a title within its publication interval prior to the day of interview count as readers of an average issue.
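The Recent Reading rule described above can be sketched in a few lines. This is an illustration only: the interval lengths, the function names and the `(days_since_last_read, weight)` input layout are assumptions made for the example, not the definitions used by any particular survey.

```python
# Hypothetical publication intervals in days; a real survey defines these per title.
PUBLICATION_INTERVAL_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}


def is_average_issue_reader(days_since_last_read, interval):
    """Apply the Recent Reading rule to one respondent and one title.

    days_since_last_read: whole days before the interview day on which the
    title was last read (1 = yesterday), or None if no reading is claimed.
    """
    if days_since_last_read is None:
        return False
    return days_since_last_read <= PUBLICATION_INTERVAL_DAYS[interval]


def average_issue_readership(claims, interval):
    """Weighted share of respondents counted as average-issue readers.

    claims: list of (days_since_last_read, weight) pairs, one per respondent.
    """
    total_weight = sum(weight for _, weight in claims)
    reader_weight = sum(
        weight for days, weight in claims
        if is_average_issue_reader(days, interval)
    )
    return reader_weight / total_weight


# Three respondents for a weekly title: read 3 days ago, 10 days ago, never.
claims = [(3, 1.0), (10, 1.0), (None, 2.0)]
print(average_issue_readership(claims, "weekly"))
```

The estimate depends entirely on the claimed recency, so any error in recalling when a title was last read feeds straight into the result.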
There are other models, such as Through-the-Book and Frequency, as well as variations of the Recent Reading model itself. Traditionally, much debate has been focused on the relative drawbacks and potential biases of each of these models. No one model is ideal and some are more practical than others. Indeed, it is unlikely that an ideal model is possible so long as we have to rely on the respondent's awareness of what are usually not memorable events, however skilfully prompted.
In the meantime, the Recent Reading model continues to be dominant around the world. According to Erhard Meier's most recent 'Summary of current readership research' (1999), of the 62 total audience readership surveys included only four do not use the Recent Reading model or a variant of it. The new generation of readership surveys in China, Russia and Latin America all employ Recent Reading. Switzerland's MACH survey, formerly Frequency-based, has recently presented first results using Recent Reading.
However, it is universally acknowledged that the Recent Reading model cannot help but contain certain biases, though these biases may partly (and to an unquantified degree) compensate for each other. 'Replicated reading' occurs when issues of a publication are read over an extended period of time (leading to overestimation), and 'parallel reading' when two or more issues are read in the same publication interval (leading to underestimation). Furthermore, the Recent Reading model is at the mercy of how accurately the respondent is able to recall when they last read a title. They may forget, offer their usual rather than specific reading behaviour or recall their reading as being more recent than it actually was ('telescoping').
There are variants on the Recent Reading model which are designed to remove or lessen the inherent biases, but these do not appear to be gaining ground. FRY (First Read Yesterday) is used in Finland and Norway (Forbruker & Media -- newspapers only). It shortens the recall period to yesterday, thereby eliminating replicated (but not parallel) reading. This, however, is at the cost of sample size: only a minority of readers of titles other than dailies will happen to have read them yesterday. FRY was used in the Netherlands for over ten years, before practical difficulties forced a return to Recent Reading.
FRIPI (First Reading In the Publication Interval) relies on the respondent recalling when they first read an issue. If the respondent is able to do this, then replicated reading will be eliminated. The 'if' is a big 'if', however, as the respondent's powers of recall and recognition are further stretched and the question sequence must be extended for the titles concerned. FRIPI has been used on the South African All Media & Products Survey since 1988. It was also tested in the Netherlands at the time of switching from FRY, but not adopted. The guardians of SummoScanner judged that the corrections over basic Recent Reading were sufficiently small not to warrant the extra effort and interviewing time required (Tschaoussoglou 1997). Indeed, given well-founded concerns about respondent overload, and the continual pressure to survey more titles, it can be argued that extending and complicating the question sequence is not a step in the right direction.
So far an alternative to Recent Reading which is both commercially viable and technically desirable has proved elusive. Recent Reading remains a practical and relatively inexpensive method of collecting readership claims for the hundreds of publications typically included in any one survey, bearing in mind inevitable limitations to the respondent's memory and patience.
Through-the-Book (TTB) is now rare. Having been abandoned by Simmons in the USA in 1995, and by Canada's PMB in 1999, it is now only used by Australia's Roy Morgan for business (but not consumer) magazines. The TTB technique involved establishing whether respondents remembered reading actual issues of magazines, or stripped-down versions. Aside from technical debate of its merits and weaknesses, particularly in relation to the age of the issues used, the practical reality is that interviewers cannot carry the prompt material necessary for any survey covering a sizeable number of titles, as most do. Neil Shepherd-Smith suggested in his paper 'The ideal readership survey' (1999) that it might be possible to revisit Through-the-Book by storing articles and pictures on CD 'to identify each issue beyond any doubt'. However, transferring the printed page to screen inevitably changes the medium, however good the reproduction. The size, scale and physical experience of turning and looking at 'pages' are likely to be quite different. This may not aid recognition, especially for the infrequent or casual reader.
Of the alternatives, there is some suggestion that a Frequency-based model is currently attracting most attention. Frequency has been used for some time in Sweden (Orvesto) and Denmark (Index). More recently it has been introduced in the USA by Simmons, though the other US service, MRI, uses Recent Reading. Mail surveys using Frequency are already well established in the US in some specialist markets such as farming. In Norway, the Forbruker & Media survey now collects magazine data by means of a self-completion questionnaire using Frequency, while the newspaper data are collected separately by Computer Assisted Telephone Interviewing (CATI) using FRY.
So far, all the total population readership currencies using a Frequency model calculate AIR from claimed frequency of reading by taking those frequency claims at face value, or close to it. For instance, if a respondent says they have read about half the issues of a particular title, the probability of AIR is taken as being 0.5. In Sweden, Orvesto takes nominal probabilities for some frequency claims, but varies the values allocated to 'Almost no issues' and 'Almost all issues' according to the publication interval of the title concerned. These adjustments are made 'on judgement'.
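Taking frequency claims at face value amounts to averaging a per-respondent reading probability. The sketch below assumes a hypothetical five-point frequency scale and nominal probabilities; the scale wording and the values are illustrative and are not those of Orvesto or any other survey.

```python
# Hypothetical frequency scale with nominal (face-value) probabilities.
NOMINAL_PROBABILITY = {
    "no issues": 0.0,
    "about a quarter": 0.25,
    "about half": 0.5,
    "about three quarters": 0.75,
    "all issues": 1.0,
}


def air_from_frequency(claims, probabilities=NOMINAL_PROBABILITY):
    """Estimate AIR as the weighted mean reading probability.

    claims: list of (frequency_claim, weight) pairs, one per respondent.
    probabilities: mapping from each scale point to a reading probability.
    """
    total_weight = sum(weight for _, weight in claims)
    expected_readers = sum(probabilities[claim] * weight for claim, weight in claims)
    return expected_readers / total_weight


claims = [("about half", 1.0), ("all issues", 1.0), ("no issues", 2.0)]
print(air_from_frequency(claims))
```

A judgement-based adjustment of the kind described above would simply pass a different `probabilities` mapping for the titles concerned, for example one with altered values at the extreme scale points.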
External calibration of the probabilities has not been widely pursued as a practical option, though arguments in its favour have been propounded since the 1960s on the basis that frequency of readership may itself be over- or understated by respondents. The resulting biases are likely to vary by title, depending, for instance, on publication interval, what proportion of readership is infrequent and so on. If there is an increasing spotlight on Frequency models, external calibration will no doubt be revisited.
The considerable advantage of a Frequency-based model is that it asks less of the respondent. He or she no longer has to answer both recency and frequency questions. Furthermore, he or she is asked to answer in terms of 'usual' behaviour, which may come rather more easily than recollecting when specific reading occasions took place. There are other advantages, such as the ability to include publications with an irregular publication interval, which are usually excluded by Recent Reading.
Magazines take time to build their audience. A monthly magazine, for instance, will not accumulate all of its readership within a month of first going on sale. In an extreme example, a magazine in a doctor's waiting room may still be accumulating 'first-time' readers (who have not seen that particular issue before) years after its on sale date. Whether these readers are of any use to advertisers is a different matter.
The time it takes for a particular title to build readership is obviously highly relevant to media planning, especially when an advertising campaign's success depends on linking exposure to purchasing decisions. Planners need to know when a magazine is read, or at least on what time-scale they can expect it to accumulate readership. The commercial pressure to incorporate such information into media planning models has been growing.
A number of investigations into audience accumulation have been carried out since the 1960s, producing differing evidence about just how much time magazines take to build their audiences and, crucially, how this varies by type of magazine. One would expect a monthly magazine to take longer than a weekly, for instance, but the content may also be relevant, along with a range of other factors such as distribution, provenance, in-home versus out-of-home readership and the demographic profile of readers.
Two new pieces of work on audience accumulation were presented at the most recent Worldwide Readership Symposium by MRI in the United States (Baim et al. 1999) and Mediaxis in Belgium (Debeer et al. 1999). Although the two studies were very different in their methodology, both were driven by a determination not only to investigate the issue as previous studies had done, but to incorporate the findings directly into media planning models.
The MRI study was based on over 1,000 respondents who filled in week-long diaries recording their daily reading. Relatively little was asked (the name of the magazine, issue date, whether this was the 'first time' of reading and whether the reading was at home or not). After pilot work, MRI was confident that respondents had little difficulty in reporting the correct issue date and understanding the 'first time reading' concept when it was posed to them in this way. The resulting accumulation curves sometimes suggested a slower build of audience than previous studies. As expected, the curves varied according to publication interval and content. Particularly striking were the accumulation curves by in-home/out-of-home readership, the latter being markedly slower to build and 'lagging by some 10-20% at key points after the on sale date'. A much larger scale study to capture the first-time readership of over 10,000 respondents is now planned by MRI.
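An accumulation curve of the kind discussed here is a cumulative distribution of first-time reading over the days since the on sale date. The following is a minimal sketch under simplifying assumptions: each first-time reading event is reduced to a single number (days after on sale), and the curve is expressed as a share of the first-time readers observed within the chosen horizon. It does not reproduce MRI's procedure, which had to combine week-long diaries covering issues of different ages.

```python
from collections import Counter


def accumulation_curve(first_read_days, horizon):
    """Cumulative share of observed first-time readers by day since on sale.

    first_read_days: for each first-time reading event, the number of days
    after the on sale date on which it occurred (hypothetical input layout).
    horizon: last day to report; element d of the result is the share of
    observed first-time readers reached by day d.
    """
    counts = Counter(first_read_days)
    total = len(first_read_days)
    curve = []
    running = 0
    for day in range(horizon + 1):
        running += counts.get(day, 0)
        curve.append(running / total)
    return curve


# Four first-time readings: two on the on sale day, one a day later, one on day 3.
print(accumulation_curve([0, 0, 1, 3], horizon=3))
```

Computing separate curves for in-home and out-of-home events would allow the kind of comparison reported above.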
In Belgium, the Mediaxis project interviewed over 2,500 respondents face to face in central locations. Each respondent was asked about up to four magazine titles that they had read, encompassing the six most recent issues of each magazine. Questions were based on the Recent Reading model followed by Through-the-Book in the same interview to enable comparison between the two. Unlike the MRI experiment, in the context of a face-to-face interview, answers to 'first-time reading' were demonstrated to be insufficiently reliable. The researchers commented: 'Using the word "first" in audience research definitely seems like asking for trouble, judging on our experience and some FRY studies.' They conclude that publishing interval and frequency of reading (i.e. regular versus occasional reading) are the most powerful determinants of accumulation, but make no mention of in-home versus out-of-home reading. On the other hand, on the basis of the sample they had, they found little evidence that socio-demographics had a bearing on accumulation. Although TV weeklies accumulated significantly faster than other types of magazine, other comparisons (apart from those between weeklies and monthlies) were not significant. Their most surprising finding was that the estimate of 'First Issue Readership' via Through-the-Book was on average 65% higher than the traditional Recent Reading estimate. It seems that there is more work to be done to understand and reconcile that comparison.
Although not all the results of these two projects are in complete accord (as has been the case with previous work into audience accumulation), both have fed directly into new work on models of audience accumulation. MRI is developing its MMF model, and Mediaxis its MagTime software. Larger-scale versions of these tests should allow for more understanding of how not only types of magazine differ, if indeed they do, but also the variation between individual titles.
A further quantified understanding of 'first-time reading' will also feed into the ongoing debate on how to improve or move away from the Recent Reading model or, indeed, whether some form of external calibration might be a more achievable option. The opportunities that panels and diaries offer to provide calibration and detail alongside the main currency have long been mooted. Such projects are expensive, of course, especially if conducted on a scale likely to be specifically rather than generally illuminating, and can only be funded if there is a strong will to do so in the market.
New technology for data collection
One of the areas where there has been most scope for development is in the use of new technology for data collection. The most notable recent development has been the introduction of Double-Screen-Computer Assisted Personal Interviewing (DS-CAPI) by the AEPM magazine survey in France from January 1999.
Single-Screen-CAPI was first used by the British National Readership Survey (NRS) in 1992. Nearly a decade later there are still only four general population readership surveys conducted by CAPI: the British NRS; the French L'Audience de la Presse Magazine (AEPM) (double-screen); the Belgian MMP CIM; and most recently the Italian AUDIPRESS. In South Africa, CAPI trials are nearing completion with the intention to introduce CAPI in the near future. There have of course been CAPI trials elsewhere, and also trials of Computer Assisted Self Interview (CASI) and CASI Audio, in Canada and the UK, for instance. Most recently, the Dutch SummoScanner has been running parallel quantitative tests of CAPI and CASI with an additional test of CAWI (computerised self-completion on the Web) using an internet panel, CAPI@Home (Soels & Tschaoussoglou 1999). The outcome of this internet trial will be of particular interest to those considering how the internet might be used as a practical tool for readership research in the not too distant future.
In France, the guardians of the AEPM were attracted to Double-Screen CAPI as being a potentially more efficient and respondent-friendly way of collecting data for 140-plus magazines, especially as the publishers considered it essential that each title be presented individually, rather than on a grouped title card. The interviewer reads questions from a miniaturised laptop and records the answers there. This laptop also sends images of relevant magazine mastheads, response grids and so on to a second screen in the form of a tablet propped on a stand in front of the respondent. Crucially, this second screen enabled the AEPM to implement more thorough and complex rotations of the individually presented titles than had been possible with 140 masthead cards (De Langhe & Le Van Truoc 1999). At the same time the survey could accommodate more magazines. The first year of official data has confirmed that these enhanced rotations have had the desired effect, reducing order effects for titles shown in the final quintile over the first quintile by 39% (Marx 2000). There are still concerns, however, about the overall length of the media list.
The additional control of interviewing processes that CAPI allows is well documented and not unique to Double-Screen-CAPI. In this instance, improved control was of particular interest because the AEPM is conducted by three different research agencies. Since the introduction of CAPI, differences in the data collected by each agency at the initial readership filter question have been considerably reduced.
Overall, levels of readership at the filter question have risen for all publication groups, and this is attributed to computerised control of the process of showing mastheads. The AEPM also believes that respondents pay more attention to mastheads shown on-screen than they do to paper prompts. Not surprisingly, there are also changes in the AIR levels recorded, although here it is more difficult to untangle the specific effects of DS-CAPI from other changes to the recency question.
The French experience has demonstrated very clearly the benefits of enhanced presentation and control which DS-CAPI offers. It will be interesting to see how much further this can be taken to justify the investment (and disruption) that the upgrade in technology requires. There will be opportunities not just to transfer existing survey designs and show material on-screen, but also to rethink those designs and the reaction and interest of the respondent.
In the UK, NRS has announced it wishes to pursue a 'paperless' interview as the chosen method of data collection for the next NRS contract, so it seems likely there will be further developments on this front in the near future.
Concern over markedly declining response rates took centre stage for the first time at the most recent Worldwide Readership Symposium. Most alarming were reports from the USA of response rates as low as 25-35% to regional readership surveys. More than once, the spectre of legal action was raised.
There were no solutions offered, but many old strategies were revisited. There seemed to be a growing awareness that although most of the factors impacting response rate trends are outside the control of the market research community, some of the strategies for chasing response might be making things worse long-term. Ivor Thompson spoke of numerous callbacks and persistent efforts to convert refusals (Thompson 1999). He urged a compromise, which took 'intrusiveness' into account.
Given the very different nature of refusers versus non-contacts, and most importantly their reading habits, it was demonstrated that response rate strategies which were more successful at targeting one or the other group of non-respondents could impact on readership levels accordingly. The possibility of response rate strategies aggravating rather than reducing bias in title-by-title readership estimates is important, although it makes the way forward even more fraught.
Recent studies confirm that refusers tend to be older, less educated, of lower social grade and income (Windle 1996). They may also be different in character and outlook from respondents with similar socio-demographics who do participate in market research. What we know of the refusers' reading habits indicates, not surprisingly, that they read fewer publications overall (and as one would expect, some types of publication fare worse than others).
Non-contacts tend to be younger, better educated and of higher social grade and income. Awareness of their reading habits is, for understandable reasons, more sparse. Reference is usually made to respondents interviewed at later call-backs, who tend to read more than respondents interviewed on earlier calls, and are particularly likely to be readers of certain sorts of publication, such as quality daily papers.
To give an example of how specific response rate strategies can impact on readership results, Ivor Thompson described what happened when a survey employed monetary incentives midway. A proportion of refusals were successfully converted. There were some differences in their profile compared to those respondents who participated without incentive; for instance, their average household income was lower. Readership levels for the newspapers concerned were lower among the 'incentivised' respondents.
On the other hand, there is evidence that many non-contacts would be prepared to cooperate, if a way could be found of accessing them at a convenient time. Data from the British NRS suggest that around half the respondents interviewed at later calls are willing to participate without reservation. One would expect strategies aimed at achieving contact, such as extended interviewing periods and call-backs, to have a quite different impact on readership levels, were they able to improve response rates by any significant degree.
Some work is being done as to how readership research might adapt to the ever more hectic lifestyles of respondents, particularly in urban areas. For instance, the CASI/CAPI/CAWI experiments in the Netherlands have been particularly motivated by concerns about falling response rates to the SummoScanner, which is currently conducted by CATI. The CAWI experiment, which allowed respondents to complete the survey 'in their own time', found that 20% of respondents completed the questionnaire after 10 o'clock in the evening, a time usually outside the scope of interviewers attempting to achieve personal contact (Jansen Van Doorn 2000).
In the UK, experiments are underway to test a mixed methodology approach for the NRS. Tests are being conducted in London where the decline in response rates is particularly acute. If it is not possible to interview using CAPI because the selected person is rarely at home (or it is rarely convenient for them to do a personal interview) then a self-completion option is offered. Initial tests have been encouraging enough to prompt a large-scale quantitative test of both response levels and how the readership data vary from those collected face to face. The latter will, of course, be critical in assessing if and how it is possible to integrate the two data sets, though the assessment is complicated by the fact that not only is the methodology different, but the additional respondents are likely to have a demographically skewed profile which will be relevant to their reading behaviour.
Demands to measure more titles
The commercial pressure to measure many publications within one survey is strong, and perhaps the greatest influence of all in shaping methodology. Worldwide, the average number of titles measured per survey is around 200. In Great Britain, each NRS respondent is currently shown just under 300 titles, plus potentially up to almost 100 newspaper sections (depending on readership of the parent title).
Many national readership surveys report a queue of magazines wishing to be measured as part of the standard currency, and are looking for ways to accommodate them without overburdening the respondent. A lengthy media list may have a negative impact on readership levels if it taxes the respondent's attention and interest. The interviewer's role is also relevant. He or she knows better than the respondent the length of the task ahead, and may rush the respondent through in order to 'keep' the interview. In addition to concerns about data quality, the length of the interview may have a bearing on response rates.
Traditionally, the most obvious solution to a media list deemed overlong has been to suggest it is split and, once surveyed separately, put back together by means of fusion. There are, of course, technical issues as to the quality of the fusion process itself. Furthermore, unless the funds are found for extra sample, there will also be an increase in the standard errors of estimates based on a subdivided sample.
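The cost of subdividing the sample can be illustrated with the textbook standard error of a proportion. The sketch assumes simple random sampling and ignores design effects and any additional error introduced by fusion, so it understates the real-world penalty; the readership level and sample sizes are invented for the example.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of an estimated proportion p from a simple random sample of n."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical title with 10% readership, measured on a full and a halved sample.
full = proportion_standard_error(0.10, 10_000)
half = proportion_standard_error(0.10, 5_000)
print(full, half, half / full)
```

Under these assumptions, halving the sample on which a title is measured raises its standard error by a factor of the square root of two, whatever the readership level.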
The Dutch SummoScanner recently braved these issues by splitting and fusing their survey (Tschaoussoglou & van der Noort 2000). They were partly forced into finding a solution by the Dutch Anti-Trust Authority who ruled that they could not deny entry to a new business magazine. They were also concerned that the combined impact of an increasing media list and declining response rate might be having a negative impact on the levels of respondents screened in as readers. The overall sample size was increased from 24,000 to 32,000 and the print media list divided into four sets. Each respondent will see two of these sets, but in four different combinations. In this way media behaviour can be included as a variable in the four-way fusion process.
Sadly, the reduction in interview length from 30 minutes to 20 minutes has not halted the decline in response rate. Overall readership levels have not increased, but appear rather stable in comparison with previous data.
In Italy, AUDIPRESS has introduced a novel solution to an over-long media list, namely to split the media list across two interviews conducted with the same respondent. However, this seems to have resulted in severe problems with response rate.
It may be, however, that there are new ways to think about splitting the media list. Looking at the issue from the respondent's point of view, the vast majority of titles they are shown in a readership interview are irrelevant. The British NRS shows almost 300 individual titles, but the average respondent has read only 16 of these in the past year. More specifically, of the 240 or so consumer magazines shown, they have read an average of eight in the past year.
In the UK, Ipsos-RSL has outlined preliminary proposals which would select which magazines a respondent is shown on the basis of a few demographic and interest questions asked before the readership questions. All the respondents with demographics rendering them more likely to read a group of magazines would be sampled, and in addition a proportion of those much less likely to be readers. Any missing readership claims can be modelled from the latter sample, while none of the readers in the prime demographics is missed. Although, as with fusion, some increase in standard errors is inevitable, depending partly on how well the segmentation process can predict readership, these increases are not of the order seen after subdividing the sample regardless of who is likely to read what. There may also be an improvement in the quality of the respondent's claims, not only because he or she is shown fewer titles, but because a higher proportion of those titles are likely to be relevant to the respondent.
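One simple way to picture the proposal is as a two-group estimate: everyone in the prime demographics is asked about the titles, while only a subsample of the less likely group is asked and their reading rate is projected onto the rest of that group. The sketch below uses this projection as a stand-in for the modelling step, which the proposals do not specify; the function name and inputs are hypothetical.

```python
def estimate_readership(likely_claims, unlikely_sampled_claims, unlikely_group_size):
    """Estimate the readership share for a magazine group.

    likely_claims: True/False readership claims from ALL respondents in the
        prime demographics (every one of them is shown the titles).
    unlikely_sampled_claims: claims from the subsample of less likely
        respondents who were shown the titles.
    unlikely_group_size: total number of less likely respondents in the
        survey, including those not shown the titles.
    """
    readers_likely = sum(likely_claims)
    # Assumption: readers among the unasked respondents occur at the same
    # rate as in the subsample that was asked.
    rate_unlikely = sum(unlikely_sampled_claims) / len(unlikely_sampled_claims)
    readers_unlikely = rate_unlikely * unlikely_group_size
    total_respondents = len(likely_claims) + unlikely_group_size
    return (readers_likely + readers_unlikely) / total_respondents


likely = [True, True, False, False]
unlikely_sample = [False, False, False, True]
print(estimate_readership(likely, unlikely_sample, unlikely_group_size=16))
```

The better the screening questions separate likely from unlikely readers, the lower the reading rate in the subsampled group and the less the projection contributes to the variance of the estimate.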
In some countries, there is also a market requirement to measure specific newspaper sections. The latter is a relatively recent phenomenon, which has raised new methodological issues.
Half the readership surveys described in Erhard Meier's Summary provide data relating to newspaper sections. Some use their standard model to do this, others ask habitual or frequent reading or use topic interest. Two new techniques have been developed in Denmark and in the UK. These techniques, although very different, were both based on the realisation that it was necessary to develop new techniques rather than simply extend existing ones. Here the issue is not so much of respondent overload as respondent confusion. Respondents do not necessarily distinguish sections in the same way that publishers do. There are many different types of section and ways in which they are presented within the paper (in-paper, pull-out, stand-alone), and to make things even more interesting, some of them change quite often or are not necessarily published on a regular basis. Given that we can hardly expect readership of a particular section to be an outstanding event in the respondent's memory, the usual problems of identification and recall are magnified.
In Denmark, Gallup's solution has been to gather data on sections by means of a separate survey which is run continuously (Arnaa & Mortensen 1999). They then transfer probabilities of readership for the sections to the main readership survey, according to a segmentation based on the key variables shaping readership and frequency of reading the parent paper. In this way data are reported for 112 newspaper sections. This technique offers great potential to satisfy demands for additional data over and above the capacity of the main readership survey. Indeed, Gallup has used a similar technique to explore how to incorporate data on newspaper websites after they have been collected off-survey.
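The transfer of section probabilities can be sketched as a two-step calculation: estimate a reading probability per segment from the separate sections survey, then assign that probability to each main-survey respondent in the same segment. The segment definition used here (an age band paired with frequency of reading the parent paper) and the function names are assumptions for illustration; Gallup's actual segmentation is described in Arnaa & Mortensen (1999).

```python
from collections import defaultdict


def segment_probabilities(section_survey):
    """Share of each segment reading the section in the separate survey.

    section_survey: list of (segment, read_section) pairs, where segment is
    any hashable key and read_section is True or False.
    """
    counts = defaultdict(lambda: [0, 0])  # segment -> [readers, respondents]
    for segment, read_section in section_survey:
        counts[segment][0] += int(read_section)
        counts[segment][1] += 1
    return {segment: readers / n for segment, (readers, n) in counts.items()}


def transferred_readership(main_segments, probabilities):
    """Mean transferred probability across main-survey respondents.

    main_segments: the segment of each main-survey respondent. Every segment
    must also appear in the sections survey.
    """
    return sum(probabilities[s] for s in main_segments) / len(main_segments)


frequent_young = ("15-34", "frequent")
rare_old = ("55+", "rare")
section_survey = [
    (frequent_young, True), (frequent_young, False),
    (rare_old, False), (rare_old, False),
]
probabilities = segment_probabilities(section_survey)
main_segments = [frequent_young, frequent_young, rare_old, rare_old]
print(transferred_readership(main_segments, probabilities))
```

Because each main-survey respondent carries a probability rather than a yes/no claim, the transferred data can be used in reach and frequency calculations alongside the main currency.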
In the UK, newspaper sections are measured within the main NRS, but using specially developed show material in a separate part of the interview (Birt 1997). A set of A4 showcards are used, one for each Saturday or Sunday newspaper concerned, depicting the logos and stylised outlines of all physically separate sections (but not in-issue sections). Respondents who have read the parent paper in the past year are asked about past year readership for all the sections shown, even though recency is collected for only some of the supplements. Development work indicated that in order to minimise confusion it was helpful to show and prompt all the separate sections in the context of one another, along with the 'main section' of the paper. After extensive testing, this solution has produced credible figures for the marketplace. The list of sections for which recency data are collected has recently been doubled to around 40 including, for instance, financial and entertainment sections.
Although NRS now measures those sections accounting for nearly 90% of the display advertising revenue placed in British newspapers, there is a demand to extend the list further and in particular to include key sections where they appear in-paper.
So far, measurement of the online versions of print media brands is developing quite independently of print readership research, as befits a distinctly different new medium. Although there seems to be growing confidence that the new media brands will not cannibalise their parents' readerships, so far there has been relatively little progress in measuring and tracking the overlap between online and print consumption, on a title/site specific basis. This will only be worthwhile as and when the print brands achieve significant penetration online. However, it will be an important element in understanding how print/online audiences are evolving and what commercial opportunities they offer. We should also have an eye to what respondents come to regard as 'reading' and whether there is a new set of opportunities for 'title confusion'.
If there is a theme to the various developments in print media research it appears to be towards greater flexibility. This may take the form of mixing measurement techniques for different types of print media within the same survey or of exploring how a combination of data collection methods might improve response rates. There is a renewed focus on the potential of using separate surveys to gather data which can be fed into the main currency, or used alongside it. Such flexibility may help readership researchers meet the respective challenges from market and field.
Katherine Page is the Research Director for the National Readership Survey contract at Ipsos-RSL, and previously ran Ipsos-RSL's international readership measurement surveys. She has focused on print media throughout her career.
Arnaa, K. & Mortensen, P. (1999) Automatic segmentation for reach/frequency estimation of newspaper sections and internet papers. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Baim, J., Frankel, M. & Agnosti, J. (1999) Magazine audience accumulation: developments of a measurement system and initial results. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Birt, H. (1997) The measurement of newspaper sections readership on the UK NRS. Worldwide Readership Symposium 8, Vancouver. London: BMRB/IPSOS-RSL Ltd.
Debeer, V., Peeter, S. & Lanckriet, T. (1999) Magazines need time -- the build-up of magazine audiences over time. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
De Langhe, E. & Le Van Truoc, O. (1999) The CAPI double-screen questionnaire in the French NRS: from clay model to practice. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Jansen Van Doorn, O. (2000) Summo Pilot 2000. EMRO 2000. Unpublished.
Marx, J-L. (2000) The double-screen CAPI readership survey. How it improves data. EMRO 2000. Unpublished.
Meier, E. (1999) Summary of current readership research. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Shepherd-Smith, N. (1999) The ideal readership survey. Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Soels, B. & Tschaoussoglou, C. (1999) What's new pussycat? CATI, CAPI and CASI: or what else? Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Thompson, I.W. (1999) Can we balance intrusiveness with the needs of readership measurement? Worldwide Readership Symposium 9, Florence. London: BMRB/IPSOS-RSL Ltd.
Tschaoussoglou, C. (1997) From FRY to FRIPI. Worldwide Readership Symposium 8, Vancouver. London: BMRB/IPSOS-RSL Ltd.
Tschaoussoglou, C. & van der Noort, W. (2000) Split and fusion -- SummoScanner evaluation. EMRO 2000. Unpublished.
Windle, R. (1996) Public Co-operation in Market Research. London: Market Research Society.
Publication: International Journal of Market Research
Date: Dec 22, 2000