Consumer language, patient language, and thesauri: a review of the literature.


Online social networks, or social networking sites (SNS), have been a feature of the web since 1997. SNS are websites that allow users to "(1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system" [1]. More and more SNS target people who define themselves by communities according to geographic location, sexual orientation, belief systems, ethnicity, education, and countless other social attributes [2]. Online information, advocacy, and support organizations oriented to specific medical diagnoses were among the first communities of Internet users [3], and SNS for patients are now also a part of the web landscape--making "community" the "killer app in health care" [4]. In fact, the Pew Internet and American Life Project identifies 39% of US "e-patients" as users of social networks, particularly users aged 18-29 [5], which implies a long potential lifespan for this trend. For example, PatientsLikeMe [6] hosts patient communities in 16 varied diagnostic categories, including approximately 5% of all amyotrophic lateral sclerosis (ALS) and primary lateral sclerosis patients in the United States [7, 8]. This SNS incorporates not only a bulletin board, but also clinical tools. Community members report symptoms to find other "patients like them"; "tagging" of symptoms becomes useful data "emergent from shared information" [9].

It is clear that for health information systems to meet the needs of laypersons in a Web 2.0 era, these systems must incorporate patient experiences in order to be useful to SNS-using patient communities. This requires that system builders have an understanding of lay language. Keselman et al., in a 2008 white paper sponsored by the American Medical Informatics Association Consumer Health Working Group, point to the highly text-based nature of health communication and the critical importance of vocabulary to the comprehension of textual messages. The problem is bidirectional: Not only do many consumers have difficulty with medical terminology, but information systems constructed to understand medical terminologies have difficulty with consumer language. For this reason, consumer health vocabulary research and development is identified in this white paper as a strategy for improved, consumer-centered health communication [10].

What do we know about the language of laypersons, inside and outside the formal health care setting? How do we discover it? And how has this knowledge been used to contribute to controlled vocabularies and thesauri for health information representation and provision?

The discussion that follows reviews (1) research in library and information studies (LIS) and in medicine and nursing, in which the language of users is related to controlled vocabularies and thesauri for information systems, and (2) studies in these domains that focus on the direct language, including health language, of laypeople. "Laypeople" are frequently "consumers" in both large bodies of literature, a group that includes but is not entirely synonymous with "patients." The article concludes with a reflection on the implications of these findings for social networks devoted to patients and the patient experience.


Searches were conducted during the month of May 2010, and databases were searched from the beginning of each file, as noted below.

Consumer language

To locate studies on the relationship of consumers to controlled vocabularies and thesauri, the library and information studies databases used were Library and Information Science Abstracts (LISA; CSA, file begins 1969-), Library and Information Science and Technology Abstracts (LISTA; EBSCO, file begins 1961-), and Library Literature and Information Science/Library Literature and Information Science Retro (HW Wilson, current file begins 1984, Retro file begins 1905). No document types were excluded in these searches. The only limitations were to English-language articles (note that LISTA does not permit limiting by language of publication, so filtering had to be done manually). The LISA search strategy was:

[(DE=("controlled vocabulary" OR "thesauri")) AND (KW=user OR end-user OR customer OR client)]

Total articles found were 109. The LISTA search strategy was:

(DE=Subject headings) AND (user OR end-user OR customer OR client)

"User OR end-user OR customer OR client" was searched as "All Text." Total articles found were 162. The Library Literature search strategy was:

((user <in> Keyword OR end-user <in> Keyword OR (customer OR consumer OR client) <in> Keyword) AND (("Terminology" OR "Authority control" OR "Indexing vocabularies") OR ("Indexing vocabularies" OR "Thesauri")))

Total articles were 109 in the current file and 5 in the retrospective file. Terms not identified as "keyword" were searched as "Subject."

Patient language

To locate studies on patient language, the databases searched were PubMed MEDLINE (file begins 1965) and CINAHLPlus (EBSCO, file begins 1937). Document types were chosen to focus on research about language, as opposed to research on patients in which language was one of multiple facets being investigated. For PubMed MEDLINE, included document types were: meta-analysis, review, address, bibliography, biography, classical article, comment, comparative study, congress, corrected or republished, duplicate publication, English abstract, evaluation study, festschrift, government publication, historical article, interview, journal article, introductory journal article, lecture, published erratum, retracted publication, technical report, and validation study. No document types were excluded in the CINAHL search. Controlled vocabulary terms in Medical Subject Headings (MeSH) and CINAHL Headings were all major. All articles were limited to the English language. The PubMed MEDLINE search strategy was:

["patient language" {searched in All Fields}] OR [Patients AND (Terminology as Topic OR Vocabulary OR Communication Barriers OR Language)]

Total articles found were 170. The CINAHL search strategy was:

[Patients AND (Language OR Vocabulary, Controlled)] OR "patient language" {searched as keyword in All Text}

Total articles found were 52.

Eliminated from the literature review that follows were articles on bi- or multilingual patients when the focus was the native language of the patient (for example, preparation of patient education materials in Spanish or translation of a survey or test instrument into another language) and articles about email communication between physicians and patients, unless terminology was the major point of the article.


The library and information science databases had an overlap of 29 articles (8%, 5 due to internal database duplication, leaving 24 or 6%). Only 5 citations (1%) were shared by LISA, LISTA, and Library Literature. Fourteen articles were found in LISA, 16 in LISTA, and 9 in Library Literature's combined files that dealt with user contributions to controlled vocabulary and thesauri (Table 1). In the health-related databases, CINAHL and MEDLINE had an overlap of 5 articles (2%). One hundred forty-one relevant articles were identified in MEDLINE and 7 in CINAHL relating to patient language (Table 1).
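
The overlap percentages above can be checked with a few lines of arithmetic. The sketch below assumes that each percentage is the overlap count divided by the sum of the retrieval totals reported for that group of databases; the text does not state the denominator, so this is an inference that happens to reproduce the rounded figures.

```python
# Retrieval totals as reported in the search descriptions above.
lis_totals = {
    "LISA": 109,
    "LISTA": 162,
    "Library Literature (current)": 109,
    "Library Literature (Retro)": 5,
}
health_totals = {"MEDLINE": 170, "CINAHL": 52}


def pct(part, totals):
    """Share of `part` in the summed retrieval totals, as a whole percent.

    Assumption: the denominator is the sum of all articles retrieved
    across the databases in the group.
    """
    return round(100 * part / sum(totals.values()))


# Raw overlap, overlap after removing internal duplicates, three-way overlap.
lis_pct = [pct(n, lis_totals) for n in (29, 24, 5)]
# Overlap between the two health-related databases.
health_pct = pct(5, health_totals)

print(sum(lis_totals.values()), lis_pct)
print(sum(health_totals.values()), health_pct)
```

Under this assumption the library and information science denominator is 385 articles and the health-related denominator is 222.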

Consumer language: library and information science literature

Knowledge representation by users, not builders, is a key element of Web 2.0, because user-generated keywords, or tags, are a critical feature of SNS. Tagging by web users enables retrieval of information according to community attributes of interest. The community orientation of tagging is so strong that a person who customizes narrowly--with tags perceived as esoteric, idiosyncratic, and thus socially unuseful--is called a "selfish tagger"; the individual is compiling a "personomy" [11]. Tufekci [12] argues that SNS are a feature of "expressive" Internet use, and an SNS user's profile has been called "a representation of the self" [13].

An SNS for a specific population can be seen as a community information system, depending on "a tight interplay between the organization of knowledge and communicative processes within communities of practice" [14]. There must exist a complete and accurate understanding of the community, the information it exchanges, and the recipients of that information. Information systems can represent the institutions that create them; text can "represent the institution as an authoritative source in information provision and decision-making procedures affecting the patient" [15]. However, system and user need to understand each other, or "We may be alienating a user community by not speaking their language" [16].

When an SNS is focused on health status, knowledge representation for information exchange presents a particular challenge. In the Web 2.0 era, builders and users may be two discrete but overlapping groups of people. User-created content may drive the information provided by the system. Arguably, folksonomies become collections of user-created descriptors for that user-created content.

The estimated 20% of web users who browse via links are best served by lists of readable, understandable, meaningful labels as guides to navigation [17]. Tags that are ranked by popularity can be labels and constitute a "self-rewarding positive loop" [18]. Immediate community feedback in SNS assists retrieval: "A user can clarify and focus on her own image or concept of her need. The structure of the index language serves as a catalyst" [19]. Furthermore, the knowledge represented in user-generated tags is graphically displayed simultaneously as the medium and the object of collaboration [20]. Individual members of purposeful communities thus increase the quality of knowledge representation geometrically, because their representations engender knowledge work by other users in turn.

The field of information science is where the automation of controlled vocabularies began. An early question asked by information scientists was: Who controls the controlled vocabulary? One author found user-generated terms not useful: "[I]f no [controlled] index vocabulary is available, do not attempt to generate your own" [21]. However, others argued for a "committee approach" to thesaurus construction, suggesting that users could be a part of this approach: A "thesaurus for sociology or political science should not only be useful for the specialists, but also for the common man" [22].

One early research study on personal indexing systems suggested that personal index terms themselves could populate a classification system. One advantage was that the user would be choosing keywords out of the full text of the document: "They are his own selection and he knows them.... As the user, he should in any case have a strong voice in the selection" [23]. Strong and Drott argued almost twenty years later for "a mechanism ... for users to suggest their own facets when they do not find the thesaurus facets adequate" [24].

In information science, however, these were minority voices. LIS studies of users and terms have not relied on user-submitted terms. Instead, researchers have investigated users' choices from existing lists to expand queries [25] or to test the "fit" of captured terms. Today's tagging researchers do the same thing with captured tags: Library of Congress Subject Headings cataloging terms (Adler [26]), Connotea user tags, and MeSH terms (Lin et al. [27]), while Daly and Ballantyne solicited tags from a known image community [28]. However, in all of these "extent-of-match" studies, users are not, themselves, presented with a thesaurus. Researchers who ask users for terms typically do so to increase the relevance of returned results. Palmquist and Balakrishnan wanted to understand more about users' unexpressed ideas [29], while Wacholder and Liu [30] investigated searchers' preference for human-constructed versus computer-generated term sets. Again, neither study related users' terms to either a controlled vocabulary or thesaurus.
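
The general idea of an "extent-of-match" comparison can be sketched as follows. This is a minimal illustration, not the method of any study cited above: it assumes exact matching after case and whitespace normalization, and the tags and controlled terms are invented examples.

```python
def normalize(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def extent_of_match(user_tags, controlled_terms):
    """Split user tags into those found in the controlled vocabulary and
    those not found, and report the matched share.

    Assumption: a "match" means string equality after normalization.
    Real studies may also count partial or synonym matches.
    """
    vocab = {normalize(t) for t in controlled_terms}
    tags = [normalize(t) for t in user_tags]
    matched = [t for t in tags if t in vocab]
    unmatched = [t for t in tags if t not in vocab]
    share = len(matched) / len(tags) if tags else 0.0
    return matched, unmatched, share


# Hypothetical data for illustration only.
user_tags = ["Heart Attack", "high blood pressure", "hypertension", "sugar"]
controlled_terms = ["Myocardial Infarction", "Hypertension", "Diabetes Mellitus"]

matched, unmatched, share = extent_of_match(user_tags, controlled_terms)
print(matched, unmatched, share)
```

In this toy example only one of the four tags matches a controlled term, which is the kind of gap such studies set out to measure.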

In information science research, then, user queries become proxies for the larger, implicitly expressed subtextual information need. The researcher/scientist mediates between user and thesaurus, as librarians have historically mediated between user and system. As in indexing research, "Few studies mention user participation" [31]. Transaction log analysis and similar studies are called exemplars of "user-centered" research [32]. However, users are typically represented in this literature primarily as end users: people existing in relation to systems, not thought about unless they are sitting at a terminal, and certainly not people explicitly involved in controlled vocabularies and thesaurus building. Three notable exceptions in the LIS literature are found, all involving dictionary building: actual musicians contributing to a "musician's word list" [33], heroin addicts compiling a glossary of drug addiction [34], and indigenous language-speaking community members helping to develop a dictionary of medical terms [35].

Thesaurus builders, unlike dictionary builders, tend to present an existing list of thesaurus terms to users and then ask for their opinions, rather than solicit terms themselves. Commercial and government systems developers typically solicit user feedback as a continuous quality improvement process, as described for the development of the CINAHL thesaurus [36]. User interaction with the existing controlled vocabulary is occasionally mentioned in the literature as a maintenance issue [37]. But in general, the user as contributor has been absent from the literature of thesaurus development since the 1980s. Is the situation any different in the realm of health care?

Patient language: medical and nursing literature

Human medicine is the only scientific field in which the subject literally tells the scientist what the problem is. [38]

When the ideas of "patient" and "language" are discussed in the literature of medicine and nursing, the intention is seldom to discuss "language of patients." Instead, patients are typically represented either as "receivers" of language or as members of a group that health care providers are talking or writing about. For example, the patient is a member of health care's target audience for patient education (Lambert et al. investigated causes of patient confusion about drugs [39]; Stapleton et al. studied the effect of word usage in midwife consultations [40]). Other researchers study what we call patients in health policy [41], mental health [42], and particularly obstetrics [43]. Finally, health care professional language is itself a subject of much study, inasmuch as it relates to patient comprehension of the language and its impact on the patient receiving it (Nordby provides an excellent summary [44]). Researchers consider semantic gaps in the meaning of "asthma" [45], "life expectancy" [46], "gift" (in the context of organ donation) [47], "back pain" [48], "black" and "white" (in the context of moral values) [49], and "euthanasia" and "assisted suicide" [50]. The following discussion excludes these themes, each of which has a copious literature of its own, but instead considers a smaller body of work: studies concerning language used by patients.

The importance of data contributed by the patient--and of asking the patient the right questions--was realized as early as 100 AD by Rufus of Ephesus, who wrote, "It is important to ask questions of patients because with the help of those questions one will know more exactly some of the things that concern disease, and one will treat the disease better" [51]. The cognitive anthropologist Charles Frake stressed the importance of questions and answers in linguistic discovery of representations of illness: "For every response, the set of inquiries which appropriately evoke the response should also be discovered" [52].

Literature about patients' own language does suggest some general conclusions about the usefulness of patient-generated verbal descriptors for understanding patients' health status. It also identifies some factors affecting these verbal descriptors and their susceptibility to processing by information systems. The oldest and best-documented form of patient language must be the chief complaint, "the patient's primary reason for seeking medical care" expressed during medical history-taking [53]. William Osler stressed eliciting the patient's own words whenever possible, writing, "In taking histories ... ask no leading questions.... Give the patient's own words in the complaint" [54]. Later medical educators extended this logically to suggest that physicians literally, verbally, echo the language of the patient during the interview, "[A]voiding, of course, the use of four letter words or obscenities." Orthographic rules exist for history-taking precisely in order to distinguish patient language from clinician-mediated language: "Responses should be recorded as nearly as possible verbatim which should be indicated by quotation marks" [53], a phenomenon still observed in natural language processing research. Medical informatics research has considered the chief complaint as a data source. No standard coding terminology exists for data of this kind, which exhibits characteristics "idiosyncratic to a specific area or hospital" [55]. This, in turn, presents problems for natural language processing, among them the communication hurdles encountered in any verbal expression, such as synonyms and paraphrasing.

Patient language appears in the literature of health care in several other domains. First is the work by developers of pain assessment instruments. Pain is an experience so personal, so individual, and so subjective that no "gold standard" exists for describing clinical pain [56]; thus, "The only way to successfully assess pain is to believe the patient" [57]. Reliance on patient language can cause difficulties in knowledge representation, however, when words fail the patient: "No pathognomonic sensory descriptor exists for neuropathic pain" [58]. Thus, language in pain assessment instruments was first used to help quantify how much pain the patient was in and, using "the verbal judgment of the patient," to objectively represent that subjective experience [59]. This high subjectivity of pain descriptors means, too, that they have little predictive power in discriminating between kinds of pain or kinds of patients with pain. Putzke et al. found of pain instruments that "12 of 15 words ... [were] selected by >20% of subjects across a variety of pain populations" [60], for example, acute versus chronic pain sufferers [61], but Mauro et al. found no statistically significant differences [62]. In pain research, patients serve not only as recipients, but also as judges of pain instruments (of the Verbal Rating Scale [59, 63], the Visual Analogue Scale [56, 59, 63, 64], the Verbal Descriptor Checklist [64], and the Tursky Pain Perception Profile [61], among others). The ubiquitous McGill Pain Questionnaire [65] has been translated into other languages, a process itself documented in the research literature [56, 62, 66].

Most importantly, patients themselves have spoken about words and pain. A majority in one study preferred verbal over quantitative scales, finding words "easier to understand" and saying they "felt more comfortable using words than numbers," because verbal descriptors "allowed them better communication with their physician about their pain experience" [67]. Verbal descriptors may convey greater subtlety of meaning than numbers, because they use "natural language of the person ... and [have] an inherent face validity" [61]. Clark, Gironda, and Young found a direct correlation between years of education and a preference for verbal, as opposed to numeric, scales [67]. Other cultural factors have been shown to play a role in how people verbally express pain [68], including "anxiety, fear and depression" and age, which has been found to affect subjects' interpretation of verbal pain descriptors [69]. Particular cultures have "particular semiotics of pain expression" [70], with a terminological effect: "In some languages more than a dozen specific pain terms are in common use, each indicating a particular pain experience, while in other languages a single inclusive term is the norm" [62].

As was noted above, patient language has been investigated as a contributor to physician-patient communication dysfunction, but researchers interested in semantic gaps usually present patients with a list of words, as opposed to asking patients to generate words. For example, Vincent et al. [71] studied patient terms for worsening, and Levin [72] looked at Xhosa patient versus English-speaking physician labels for the same health concepts. Bernstein reminds us, like Osler before him, that "patients alone know what they are feeling" [73], but the patient's verbal expression of that feeling needs to be understood by the listener. Only Yoos et al. asked open-ended questions of young asthmatics and their parents and allowed them to "generate" terms for their own asthma symptoms [74].

Patient language has also been investigated from sociocultural perspectives. Rondahl, Innala, and Carlsson were concerned about sexual orientation--of both nurses and patients--as a factor implicated in "very cautious communication from both personnel and patients" [75]. Schouten and Meeuwesen's three-year review of medicine and psychology literature found linguistic barriers to be a key predictor of cross-cultural communication problems [76]. The most common type of study found involved ethnic minority patients visiting white physicians, who were then compared with white patients visiting white physicians. Language concordance was found to have an effect in several of these studies [77]. Gender was investigated by pain researchers Vodopiutz et al., but few gender differences were found [78]. Bischoff, Hudelson, and Bovier asked what difference gender made to language discordance between physician and patient and found interpreters to function not just as mediators of language, but as mediators of culture [79].

Vocabulary development is one way in which information systems communicate with users. The information needs of health care professionals have, of course, been the driving force behind many medical informatics initiatives, and their origins are reflected in these systems' vocabularies. For example, Stetson et al. attempted to develop an ontology modeling the contribution of clinical communication problems to clinical errors [80]. Health paraprofessionals like medical librarians have also served as human mediators of needs for the purposes of system design [81], particularly for ontology development [82]. The most extreme expression of user and librarian involvement found thus far has been the Faculty Research Interests Project (FRIP) at the University of Pittsburgh [83]. The University of Pittsburgh's clinical faculty and researchers were profiled in a database that was pre-populated with mined MeSH terms attached to their publications. These authors were asked to supply additional author-generated keywords as they thought necessary. The keywords were then displayed online alongside the MeSH terms as part of the researcher's profile. Searchers of FRIP could thus use these keywords side by side with a browsable thesaurus, as terminological assists and search augmentations [84]. Users were generating terms to describe themselves, or, at least, that portion of themselves revealed by their MEDLINE-indexed publications. This prefigured the practice of tagging in Web 2.0 by several years.

However, FRIP was concerned with profiling biomedical clinicians and researchers. What about systems profiling consumers and patients? How have lay people contributed their own language to knowledge representation in health care information systems?

In medical informatics, the professional-lay communication gap has been operationalized as "the consumer health vocabulary problem" [85]. Consumer health language research, and specifically vocabulary development targeting consumers, was identified by Keselman et al. as one important informatics strategy for addressing professional-consumer communication problems in health care [10]. Medical informatics researchers have explored the extent and severity of the consumer-professional gap [86, 87], its role in communications dysfunctions, and the implications for health literacy initiatives [88]. Exploration in this "consumer health vocabulary" subdomain, however, has typically been done to describe and demarcate what "consumer" territory exists, highlighting its difference from professional territory. It is not done explicitly in the interest of vocabulary development and control, although enhancing thesauri through addition of synonyms and entry terms is a desired corollary outcome.
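
The role of synonyms and entry terms in a thesaurus can be illustrated with a small data structure. The sketch below is hypothetical: the class name, method names, and example terms are invented for illustration and do not describe any system or vocabulary discussed in the literature reviewed here.

```python
class EntryTermIndex:
    """Map lay entry terms to a preferred (controlled) term.

    Hypothetical illustration: lookups are exact after lowercasing and
    whitespace normalization, and each entry term points to one
    preferred term.
    """

    def __init__(self):
        self._preferred_for = {}

    @staticmethod
    def _key(term):
        return " ".join(term.lower().split())

    def add_entry_term(self, entry_term, preferred):
        # The preferred term is also registered so it resolves to itself.
        self._preferred_for[self._key(preferred)] = preferred
        self._preferred_for[self._key(entry_term)] = preferred

    def lookup(self, term):
        """Return the preferred term, or None if the term is unknown."""
        return self._preferred_for.get(self._key(term))


# Invented example terms.
index = EntryTermIndex()
index.add_entry_term("heart attack", "Myocardial Infarction")
index.add_entry_term("high blood pressure", "Hypertension")

print(index.lookup("Heart  Attack"))
print(index.lookup("hypertension"))
print(index.lookup("sugar"))
```

A lay term with no entry, such as the last lookup here, returns nothing, which is the situation that consumer vocabulary research aims to reduce.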

Where does lay language come from in medical informatics? Various extant "consumer" texts, sources as diverse as the Dictionary of American Regional English [85] and emails [88], have been explored as sources of terms. An example is a large extent-of-match study, focusing on consumer utterances and using data from the Medical Library Association-funded project Ten Thousand Questions [89]. Consumers and patients are not typically "present" in this research: They are represented in the aggregate, and in absentia, by the trails they leave in the system. For example, patients are represented by query logs at [90], MedlinePlus [91], or "Ask-A Doctor" sites [92]. Other online consumer "speech" includes emails to a university health website [93], a cancer information service [94], or nurses caring for specific patients [95, 96]. Web-based bulletin board posts have also been studied for their characteristics and content [97, 98]. Typically, the data generated in these research explorations derive from text capture and are obtained from large groups of anonymous "speakers," operationalized as "consumers" because any more specific identity is unknown to the researcher. If consumers, including patients, are consulted for their term preferences in an information systems context, it is typically via focus groups and interviews, such as those described by Slaughter, Ruland, and Rotegard [99]. These focus groups are feedback groups: Like the previously discussed patients in pain studies, they rate existing lists but do not build new lists. One interesting near-exception was the study done by Zeng et al., using a mix of transaction log analysis and patient interviews. Queries logged at a "Find A Doctor" site were mined for content, and individual patients seen at the same hospital were interviewed in order to test their understanding of frequently used terms taken from the query logs [100]. 
Most common, unfortunately, is the kind of study in which the verbal expressions of patients are channeled by clinicians, so that the data are always secondhand. For example, plastic surgeons were interested in developing a set of lay synonyms for body parts to help in clinical consultations about liposuction and "made lists ... we have heard from our patient populations over the years" [101].

Health communications researchers have written about the medical interview and its impact on patient-physician relations. Lindfors and Raevaara found that a physician's questions erected a kind of conversational scaffold, producing a direct effect on the patient's elicited story. Homeopaths asked, "How do you use alcohol?" This encoded the homeopathic physician's assumption that alcohol was used, and patients then structured their responses in that direction [102].

Another example of physician-structured interaction was found by Farmer, Roter, and Higginson, who studied emergency department patients and discovered that patient language can not only be transformed by physicians but is apparently conversationally transmissible. Patients were admitted to the emergency department with symptoms of chest pain and then observed during medical history taking. Health care providers perceived the patient's expressions of pain as "vague," which frustrated them. They then began to structure the interactions by verbalizing a "menu of potential responses," naturally associated with their professional, clinical preconceptions about the likely basis of the pain. The prompting of the emergency department physicians began to structure the patient's own verbal expressions. The more trouble the patient had in communicating, the more suggestions were on the physician's "menu." After repeated interviews, the patient was "trained" to express symptoms in more focused--and more professional--language, increasingly citing physician expertise and increasingly quoting physician statements that incorporated more medical terminology, for example, "My doctor said I have angina." The conclusion of these researchers could have been stated by any website designer or information architect who attempts to create a browsable web-based ontology: Patients "required to select from a finite number of descriptors ... may not find a word that accurately reflects their experience" [103]. In fact, it may reflect somebody else's experience: their doctors'.

Finally, pain researcher Nutkiewicz studied thirty-two children from ethnically diverse backgrounds seen for chronic pain. He compared the use of "oral testimony," or "the children's own words," to find that physician language is "informative and directive" versus the patients' "expressive, subjective, and experiential" narrative: "Children and their doctors have two separate orientations toward pain and employ two separate vocabularies to describe pain that are inextricably linked to these orientations.... [They] appear as tacit and embedded approaches to disease and illness" [104]. Nutkiewicz is describing the challenge that is central to understanding patient language and the patient's part in the conversation that is health care: to "name the illness in a way that is meaningful to physician and patient, wherein the patient's experience of illness is validated and accepted untransformed [italics added] and then later reconciled with the physician's diagnostic categories" [105].


This literature review has identified a serious lack of information about consumer contributions to controlled vocabulary, an area that appears to be under-researched both inside and outside of health care. The growing interest in folksonomy research among library scientists, information scientists, and computer scientists is a positive sign--but only a sign--that this deficiency is potentially reversible. Health care researchers need to engage with laypeople in their roles as patients to ensure that information systems can truly support health communication between laypeople and health care providers.

It can be argued that systems that deliver information run the risk of reinforcing the information designer's communication biases. For example, Beach has argued that research into the medical interview resembles the interview itself: It reiterates "medical authority and the institutional character of professional/lay communication" [106]. The same might be said of research into patient language, inevitably constricted by the fact that the patient is a patient, being observed and recorded by people seeing the individual as a patient. Sarangi contends that the assessment of "patient talk" has in fact meant attention to "responses to physician questions" [107]. As Keselman et al. have written, "Paradoxically, there is voluminous literature on the information needs of health care professionals but very little on those of patients and little about the needs of the general public. In practice, systems design is typically guided by the providers' perception of patients' information needs, rather than by actual needs assessment" [10]. This shows the effect of what Foucault called the "clinical gaze": By defining the person as a patient, the physician also defines the direction and the content of the subsequent conversation [108].

Making internalized understanding visible for the use of others has always been a difficult task. Decades of informatics work on clinical data standards make clear that symptoms, like other expressions of the lived patient experience, are both "unconscious and procedural ... hard to formalize and communicate to others" [109]. Forsythe commented on the same problem for representing clinical information: "the tacit, taken-for-granted, non-standardized information so essential to comprehension in particular situations" [110]. She used ethnographic methods precisely because of their value in eliciting implicit knowledge for explicit representation.

What are the consequences of Foucault's clinical gaze for information systems that serve the needs of patients and consumers? Oudshoorn and Somers looked at three Dutch patient organizations and the websites they built. One site, maintained by health professionals, focused on clinical depression. The second and third were maintained by patients, people living with cancer and repetitive stress injuries (RSI). A dichotomy was found between "implicit" and "explicit" techniques used for knowledge representation by these three organizations. The depression site relied on an implicit form of modeling, in which no input was sought from site users: "We have tried to think from the perspective of the target group ... just the two of us sitting in front of the computer and giving comments to each other." The patient-driven RSI developers also relied on implicit design, but this time informed by their personal expertise as RSI patients: "You try to imagine what kind of questions will be asked, and what problems actually exist." Only one organization, devoted to the information and advocacy needs of young people with cancer, demonstrated even vaguely explicit methods in its website design, and knowledge representation in this organization was based not only on personal experience, but also on "extensive interactions with young people" who had cancer [111]. The implicit-explicit distinction is another knowledge representation challenge.

Bringing the patient to the center of the stage means accepting, first, that "diagnostic outcomes rely very much on the accounts that patients give" and that "patients' contributions--volunteered or elicited--play a central part in what is diagnosed and how" [107]. Librarians, medical informatics researchers, and systems developers need to stop seeing consumers and patients as passive recipients of terminologies and ask, instead, for help in developing the terminologies.

To be sure, involvement of laypeople in knowledge representation has its own challenges. Smith and Wicks, who studied symptom expressions in the PatientsLikeMe online community, found that although 42% of patients' self-initiated descriptions of symptoms concurred with the source vocabularies of the Unified Medical Language System, community members often confused diagnoses with symptoms and represented unusual and not necessarily useful aspects of their clinical conditions as "symptoms," for example, descriptions of clinical events or the time of day the symptom occurs [112]. Patient symptom expressions hint at the considerable challenges of web-based information systems for health communication and networking between physicians and patients.

But even with these acknowledged limitations--which should themselves be the focus of much new research--patients are still the target audiences of patient-oriented websites, and patient participation is reinforced by the strong value of empathy and identity politics in online community. This is another positive sign for improving clinical terminologies with consumer and patient feedback. Implicit knowledge representations generated by outsiders, what Oudshoorn and Somers call the "I-Methodology" [111], may not do justice to the patient's experience. Social networking does not merely permit the transcendence of boundaries that distinguish and separate individuals and communities; it can also render obsolete the authorities empowered by the existence of those boundaries [113]. As a result, in the health care experience, the insider-outsider distinction may now be difficult to make.

DOI: 10.3163/1536-5050.99.2.005


The author gratefully acknowledges the valuable contributions of all the people who reviewed this manuscript as it evolved.

Received August 2010; accepted November 2010


[1.] Silver D. Smart start-ups: how to make a fortune from starting online communities. New York, NY: Wiley; 2007. p. 5.

[2.] Boyd D, Ellison NB. Social network sites: definition, history and scholarship. J Comput Mediated Comm. 2008 Oct;13:210-30.

[3.] Jadad AR, Enkin MW, Glouberman S, Groff P, Stern A. Are virtual communities good for our health? BMJ. 2006 Apr 22;332(7547):925-6.

[4.] Sarasohn-Kahn J. The wisdom of patients: health care meets online social media [Internet]. California Healthcare Foundation, Health Reports; Apr 2008. p. 2 [cited 29 Oct 2010]. < -wisdom-of-patients-health-care-meets-online-social-media>.

[5.] Fox S, Jones S. The social life of health information [Internet]. Pew Internet and American Life Project; 11 Jun 2009 [cited 28 Oct 2010]. < Reports/2009/8-The-Social-Life-of-Health-Information.aspx>.

[6.] PatientsLikeMe. PatientsLikeMe [Internet]. c2005-2010 [cited 12 Nov 2010].

[7.] Wicks P. The value of openness: charting the course of PLS and PMA [Internet]. 11 Aug 2009 [cited 12 Nov 2010]. < -course-of-pls-and-pma/>.

[8.] Spastic Paraplegia Foundation. General information [Internet]. The Foundation; 19 Feb 2010 [cited 28 Oct 2010].

[9.] Gruber T. Ontology of folksonomy: a mash-up of apples and oranges. Int J Semantic Web Inf Syst. 2007;3(2):111.

[10.] Keselman A, Logan R, Smith CA, Leroy G, Zeng-Treitler Q. Developing informatics tools and strategies for consumer-centered health communication. J Am Med Inform Assoc. 2008 Jul-Aug;15(4):473-83. DOI: 10.1197/jamia.M2744.

[11.] Hotho A, Jaeschke R, Schmitz C, Stumme G. Information retrieval in folksonomies: search and ranking. Lect N Comp Sci. 2006;4011:411-26.

[12.] Tufekci Z. Grooming, gossip, Facebook and Myspace. Inf Comm Soc. 2008 Jun;11(4):544-64.

[13.] Gross R, Acquisti A, Heinz HJ. Information revelation and privacy in online social networks. In: WPES '05: Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society (WPES); 7 Nov 2005; Alexandria, VA. New York, NY: ACM Press. p. 71-80.

[14.] Wenger E. Communities of practice: learning, meaning and identity. Cambridge, UK: Cambridge University Press; 1998.

[15.] Dray S, Papen U. Literacy and health: towards a methodology for investigating patients' participation in healthcare. J Appl Linguistics. 2004;1(3):127.

[16.] Bearman D, Trant J. Social terminology enhancement through vernacular engagement: exploring collaborative annotation to encourage interaction with museum collections. D-Lib [Internet]. 2005 Sep;11(9) [cited 28 Oct 2010]. < .html>.

[17.] Nielsen J. Designing web usability. Indianapolis, IN: New Riders; 2000.

[18.] Zhang L, Wu X, Yu Y. Emergent semantics from folksonomies: a quantitative study. Lect N Comp Sci. 2006;4090:168-86.

[19.] Soergel D. Organizing information: principles of database and retrieval systems. Orlando, FL: Academic; 1985. p. 239.

[20.] Suthers DD. Collaborative knowledge construction through shared representations. In: HICSS '05: Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS); 3-6 Jan 2005; Waikoloa, HI. In: ACM Digital Library [Internet]. [cited 28 Oct 2010].

[21.] Jahoda G. Information storage and retrieval systems for individual researchers. New York, NY: Wiley-Interscience; 1970. p. 27.

[22.] Ghose A, Dhawle AS. Problems of thesaurus construction. J Am Soc Info Sci. 1977 Jul;28(4):211-7.

[23.] Cooney S, Cooney S. Standard procedure for generating personal classifications and indexes. J Inform Sci. 1980 Apr;2(2):81-90.

[24.] Strong GW, Drott MC. A thesaurus for end-user indexing and retrieval. Inform Proc Mgmt. 1986 Dec;22(6):487-92.

[25.] Shiri A. Metadata-enhanced visual interfaces to digital libraries. J Inform Sci. 2008 Dec;34(6):763-5.

[26.] Adler M. Transcending library catalogs: a comparative study of controlled terms in Library of Congress Subject Headings and user-generated tags in LibraryThing for transgender books. J Web Libr. 2009 Oct;3(4):309-31.

[27.] Lin X, Beaudoin JE, Bui Y, Desai K. Exploring characteristics of social classification. In: Furner J, Tennis JT, eds. Advances in classification research. Vol. 17: Proceedings of the 17th American Society for Information Science and Technology (ASIS&T) Special Interest Group/Classification Research (SIG/CR) Workshop; 4 Nov 2006; Austin, TX [Internet]. [cited 28 Oct 2010]. <http://citeseerx>.

[28.] Daly E, Ballantyne N. Ensuring the discoverability of digital images for social work education: an online "tagging" survey to test controlled vocabularies. Webology [Internet]. 2009 Jun;6(2):1-16 [cited 28 Oct 2010]. <http://>.

[29.] Palmquist RA, Balakrishnan B. Using a continuous word association test to enhance a user's description of an information need: a quasi-experimental study. In: Borgman C, Pai EYH, eds. ASIS '88: Proceedings of the 51st American Society for Information Science (ASIS) Annual Meeting; 23-27 Oct 1988; Atlanta, GA. Medford, NJ: Learned Information. p. 160-3.

[30.] Wacholder N, Liu L. User preference: a measure of query-term quality. J Am Soc Inf Sci Technol. 2006 Oct;57(12):1566-80.

[31.] Matusiak KK. Towards user-centered indexing in digital image collections. OCLC Syst Serv. 2006;22(4):283-98.

[32.] Shiri AA, Revie C, Chowdhury G. Thesaurus-assisted search term selection and query expansion: a review of user centered studies. Knowledge Org. 2002;29(1):1-19.

[33.] Webb HB. The slang of jazz. Am Speech. 1937;12(3):179-84.

[34.] Lewy MG, Preble E. Tragic magic: word usage among New York City heroin addicts. Psych Q. 1973;47(2):228-45.

[35.] Mbananga N, Mniki S, Oelofse A, Makapan S, Lubisi M. A model of developing medical terms in indigenous languages: a step towards consumer health informatics in South Africa. Stud Health Technol Inform. 2004;107(pt 2):1216-8.

[36.] Fishel CC, Graham KE, Greer DM, Gupta AD, Lockwood DK, Prime EE. CINAHL list of subject headings: a nursing thesaurus revised. Bull Med Libr Assoc. 1985 Apr;73(2):153-9.

[37.] Marshall J. Controlled vocabularies: a primer. Key Words. 2005 Oct-Dec;13(4):120-4.

[38.] Smith RC. The patient's story: integrated patient-doctor interviewing. Boston, MA: Little, Brown; 1996.

[39.] Lambert BL, Dickey LW, Fisher WM, Gibbons RD, Lin SJ, Luce PA, McLennan CT, Senders JW, Yu CT. Listen carefully: the risk of error in spoken medication orders. Soc Sci Med. 2010 May;70(10):1599-608.

[40.] Stapleton H, Kirkham M, Thomas G, Curtis P. Language use in antenatal consultations. Br J Midwifery. 2002 May;10(5):273-7.

[41.] Deber RB, Kraetschmer N, Urowitz S, Sharpe N. Patient, consumer, client, or customer: what do people want to be called? Health Expect. 2005 Dec;8(4):345-51.

[42.] Hensley MA. Why I am not a "mental health consumer." Psychiatr Rehabil J. 2006 Summer;30(1):67-9.

[43.] Baskett TF. What women want: don't call us clients, and we prefer female doctors. J Obstet Gynaecol Can. 2002 Jul;24(7):572-4.

[44.] Nordby H. Medical explanations and lay conceptions of disease and illness in doctor-patient interaction. Theor Med Bioeth. 2008;29(6):357-70.

[45.] Houle CR, Caldwell CH, Conrad FG, Joiner TA, Parker EA, Clark NM. Blowing the whistle: what do African American adolescents with asthma and their caregivers understand by "wheeze?" J Asthma. 2010 Feb;47(1):26-32.

[46.] Partridge B, Hall W, Lucke J, Underwood M, Bartlett H. Mapping community concerns about radical extensions of human life expectancy. Am J Bioeth. 2009 Dec;9(12):W4-5.

[47.] Shaw R. Perceptions of the gift relationship in organ and tissue donation: views of intensivists and donor and recipient coordinators. Soc Sci Med. 2010 Feb;70(4):609-15.

[48.] Barker KL, Reid M, Lowe CJM. Divided by a lack of common language? a qualitative study exploring the use of language by health professionals treating back pain. BMC Musculoskelet Disord. 2009 Oct 5;10:123.

[49.] Sherman GD, Clore GL. The color of sin: white and black are perceptual symbols of moral purity and pollution. Psychological Sci. 2009 Aug;20(8):1019-25.

[50.] Parkinson L, Rainbird K, Kerridge I, Carter G, Cavenagh J, McPhee J, Ravenscroft P. Cancer patients' attitudes toward euthanasia and physician-assisted suicide: the influence of question wording and patients' own definitions on responses. J Bioeth Inq. 2005;2(2):82-9.

[51.] Sigerist HE. A history of medicine. Vol. 1. Primitive and archaic medicine. New York, NY: Oxford; 1951. p. 586.

[52.] Franklin KJ. Some comments on eliciting cultural data. Ling Anthrop. 1971 Oct;13(7):339-48.

[53.] Small IF. Introduction to the clinical history. Flushing, NY: Medical Examination Publishing; 1971. p. 46.

[54.] Bean WB, ed. Sir William Osler aphorisms: from his bedside teachings and writings. New York, NY: Schuman; 1950. p. 37.

[55.] Dara J, Dowling JN, Travers D, Cooper GF, Chapman WW. Evaluation of preprocessing techniques for chief complaint classification. J Biomed Inform. 2008 Aug;41(4):613-23.

[56.] De Conno F, Ripamonti C, Caraceni A, Saita L. Palliative care at the National Cancer Institute of Milan. Support Care Cancer. 2001 May;9(3):141-7.

[57.] Williamson A, Hoggart B. Pain: a review of three commonly used pain rating scales. J Clin Nurs. 2005 Aug;14(7):803.

[58.] Cruccu G, Truini A. Sensory profiles: a new strategy for selecting patients in treatment trials for neuropathic pain. Pain. 2009 Nov;146(1):5-6.

[59.] Ohnhaus EE, Adler R. Methodological problems in the measurement of pain: a comparison between the verbal rating scale and the visual analogue scale. Pain. 1975 Dec;1(4):379-84.

[60.] Putzke JD, Richards JS, Ness T, Kezar L. Interrater reliability of the International Association for the Study of Pain and Tunks' spinal cord injury pain classification schemes. Am J Phys Med Rehabil. 2003 Jun;82(6):437-40.

[61.] Morley S, Pallin V. Scaling the affective domain of pain: a study of the dimensionality of verbal descriptors. Pain. 1995 Jul;62(1):39-49.

[62.] Mauro G, Tagliaferro G, Montini M, Zanolla L. Diffusion model of pain language and quality of life in orofacial pain patients. J Orofac Pain. 2001;15(1):36-46.

[63.] Closs SJ, Briggs M. Patients' verbal descriptions of pain and discomfort following orthopaedic surgery. Int J Nurs Stud. 2002 Jul;39(5):563-72.

[64.] Duncan GH, Bushnell MC, Lavigne GJ. Comparison of verbal and visual analogue scales for measuring the intensity and unpleasantness of experimental pain. Pain. 1989 Jun;37(3):295-303.

[65.] Melzack R. The McGill pain questionnaire: major properties and scoring methods. Pain. 1975 Sep;1(3):277-9.

[66.] Radvila A, Adler RH, Galeazzi RL, Vorkauf H. The development of a German language (Berne) pain questionnaire and its application in a situation causing acute pain. Pain. 1987 Feb;28(2):185-95.

[67.] Clark ME, Gironda RJ, Young RW. Development and validation of the pain outcomes questionnaire-VA. J Rehabil Res Dev. 2003 Sep-Oct;40(5):381-96.

[68.] Greenwald HP. Interethnic differences in pain perceptions. Pain. 1991 Feb;44(2):157-63.

[69.] Tammaro S, Berggren U, Bergenholtz G. Representation of verbal pain descriptors on a visual analogue scale by dental patients and dental students. Eur J Oral Sci. 1997 Jun;105(3):207-12.

[70.] Rondahl G, Innala S, Carlsson M. Heterosexual assumptions in verbal and non-verbal communication in nursing. J Adv Nurs. 2006 Nov;56(4):373-81.

[71.] Vincent SD, Toelle BG, Aroni RA, Jenkins CR, Reddel HK. "Exasperations" of asthma: a qualitative study of patient language about worsening asthma. Med J Aust. 2006 May 1;184(9):451-4.

[72.] Levin ME. Different use of medical terminology and culture-specific models of disease affecting communication between Xhosa-speaking patients and English-speaking doctors at a South African paediatric teaching hospital. S Afr Med J. 2006 Oct;96(10):1080-4.

[73.] Bernstein J. Fine wigns. Clin Orthop Relat Res. 2010 Apr;468(4):1165-7.

[74.] Yoos HL, Kitzman H, McMullen A, Sidora-Arcoleo K, Anson E. The language of breathlessness: do families and health care providers speak the same language when describing asthma symptoms? J Pediatr Health Care. 2005 Jul-Aug;19(4):197-205.

[75.] Jackson JL. Communication about symptoms in primary care: impact on patient outcomes. J Altern Complement Med. 2005;11(suppl 1):S51-6.

[76.] Schouten BC, Meeuwesen L. Cultural differences in medical communication: a review of the literature. Patient Educ Couns. 2006 Dec;64(1-3):21-34.

[77.] Ngo-Metzger Q, Sorkin DH, Phillips RS, Greenfield S, Massagli MP, Claridge B, Kaplan SH. Providing high-quality care for limited English proficient patients: the importance of language concordance and interpreter use. J Gen Intern Med. 2007 Nov;22(suppl 2):324-30.

[78.] Vodopiutz J, Poller S, Schneider B, Lalouschek J, Menz F, Stollberger C. Chest pain in hospitalized patients: cause specific and gender-specific differences. J Womens Health (Larchmt). 2002 Oct;11(8):719-27.

[79.] Bischoff A, Hudelson P, Bovier PA. Doctor-patient gender concordance and patient satisfaction in interpreter-mediated consultations: an exploratory study. J Travel Med. 2008 Jan-Feb;15(1):1-5.

[80.] Stetson PD, McKnight LK, Bakken S, Curran C, Kubose TT, Cimino JJ. Development of an ontology to model medical errors, information needs, and the clinical communication space. Proc AMIA Symp. 2001:672-6.

[81.] Peng P, Aguirre A, Johnson SB, Cimino JJ. Generating MEDLINE search strategies using a librarian knowledge-based system. Proc Annu Symp Comput Appl Med Care. 1993:596-600.

[82.] Bhavnani SK, Bichakjian CK, Schwartz JL, Strecher VJ, Dunn RL, Johnson TM, Lu X. Getting patients to the right healthcare sources: from real-world questions to strategy hubs. Proc AMIA Symp. 2002:51-5.

[83.] University of Pittsburgh. Faculty interests in the health sciences [Internet]. Pittsburgh, PA: The University; 2006-2007 [cited 29 Oct 2010].

[84.] Friedman PW, Winnick BL, Friedman CP, Mickelson PC. Development of a MeSH-based index of faculty research interests. Proc AMIA Symp. 2000:265-9.

[85.] Patrick TB, Monga HK, Sievert ME, Houston Hall J, Longo DR. Evaluation of controlled vocabulary resources for development of a consumer entry vocabulary for diabetes. J Med Internet Res. 2001 Jul-Sep;3(3):E24.

[86.] Zeng QT, Tse T. Exploring and developing consumer health vocabularies. J Am Med Inform Assoc. 2006 Jan-Feb;13(1):24-9.

[87.] Smith CA, Stavri PZ, Chapman WW. In their own words? a terminological analysis of e-mail to a cancer information service. Proc AMIA Symp. 2002:697-701.

[88.] McCray AT. Promoting health literacy. J Am Med Inform Assoc. 2005 Mar-Apr;12(2):152-63.

[89.] Keselman A, Smith CA, Divita G, Kim H, Browne AC, Leroy G, Zeng-Treitler Q. Consumer health concepts that do not map to the UMLS: where do they fit? J Am Med Inform Assoc. 2008 Jul-Aug;15(4):496-505.

[90.] McCray AT, Ide NC, Loane RR, Tse T. Strategies for supporting consumer health information seeking. Stud Health Technol Inform. 2004;107(pt 2):1152-6.

[91.] Zeng QT, Tse T, Crowell J, Divita G, Roth L, Browne AC. Identifying consumer-friendly display (CFD) names for health concepts. Proc AMIA Symp. 2005:859-63.

[92.] Slaughter LA, Ruland CM, Vatne TM. Constructing an effective information architecture for a pediatric cancer symptom assessment tool. Proc AMIA Symp. 2006:1102.

[93.] Sievert MEC, Patrick TB, Reid JC. Need a bloody nose be a nosebleed? or, lexical variants cause surprising results. Bull Med Libr Assoc. 2001 Jan;89(1):68-71.

[94.] Smith CA, Stavri PZ, Chapman WW. In their own words? a terminological analysis of e-mail to a cancer information service. Proc AMIA Symp. 2002:697-701.

[95.] Brennan PF, Aronson AR. Towards linking patients and clinical information: detecting UMLS concepts in e-mail. J Biomed Inform. 2003 Aug-Oct;36(4-5):334-41.

[96.] Hsieh Y, Hardardottir GA, Brennan PF. Linguistic analysis: terms and phrases used by patients in e-mail messages to nurses. Stud Health Technol Inform. 2004;107(pt 1):511-5.

[97.] Tse T, Soergel D. Exploring medical expressions used by consumers and the media: an emerging view of consumer health vocabularies. Proc AMIA Symp. 2003:674-8.

[98.] Smith CA. Nursery, gutter, or anatomy class? obscene expression in consumer health. Proc AMIA Symp. 2007:676-80.

[99.] Slaughter L, Ruland C, Rotegard AK. Mapping cancer patients' symptoms to UMLS concepts. Proc AMIA Symp. 2005:699-703.

[100.] Zeng Q, Kogan S, Ash N, Greenes RA, Boxwala AA. Characteristics of consumer terminology for health information retrieval. Methods Inf Med. 2002;41(4):289-98.

[101.] Coleman K, Coleman WP, Flynn TC. Lexicon of areas amenable to liposuction. Dermatol Surg. 2006 Apr;32(4):558-61.

[102.] Lindfors P, Raevaara L. Discussing patients' drinking and eating habits in medical and homeopathic consultations. Commun Med. 2005;2(2):137-49.

[103.] Farmer SA, Roter DL, Higginson IJ. Chest pain: communication of symptoms and history in a London emergency department. Patient Educ Couns. 2006 Oct;63(1-2):138-44.

[104.] Nutkiewicz M. Diagnosis versus dialogue: oral testimony and the study of pediatric pain. Oral Hist Rev. 2008 Winter-Spring;35(1):11-21.

[105.] Epstein RM, Quill TE, McWhinney IR. Somatization reconsidered: incorporating the patient's experience of illness. Arch Intern Med. 1999 Feb 8;159(3):215-22.

[106.] Beach WA. Diagnosing 'lay diagnosis.' Text Interdiscip J Study Discourse. 2001 Jun;1-2:13-8.

[107.] Sarangi S. This issue of Communication & Medicine. Commun Med. 2008;5(2):103-4.

[108.] Foucault M. The birth of the clinic: an archaeology of medical perception. Smith AMS, translator. New York, NY: Pantheon; 1973.

[109.] Spaniol M, Klamma R, Springer L, Jarke M. Aphasic communities of learning on the web. Int J Dist Ed Tech. 2006;4(1):31-45.

[110.] Forsythe DE. Using ethnography to investigate life scientists' information needs. Bull Med Libr Assoc. 1998 Jul;86(3):402-9.

[111.] Oudshoorn N, Somers A. Constructing the digital patient: patient organizations and the development of health websites. Inf Comm Soc. 2006 Oct;9(5):657-75.

[112.] Smith CA, Wicks PJ. PatientsLikeMe: consumer health vocabulary as a folksonomy. Proc AMIA Symp. 2008:682-6.

[113.] Matusiak KK. Towards user-centered indexing in digital image collections. OCLC Syst Serv. 2006;22(4):283-98.


* The unstructured language of patients has been explored in numerous dimensions including linguistic, sociocultural, and medical informatics studies.

* Patients do not typically contribute their language directly to health information systems; rather, physicians and information systems themselves serve as mediators.

* Folksonomies developed in Web 2.0 have the potential to change this communication dynamic.


* Information professionals need to be aware of the built-in biases of clinical information systems and their vocabularies.

* Librarians and information scientists can use their expertise to assist in building and maintaining more consumer-friendly tagsets, ontologies, and thesauri.


Catherine A. Smith, MA, MILS, MSIS, PhD, Assistant Professor, School of Library and Information Studies, University of Wisconsin-Madison, 600 North Park Street, #4255, Madison, WI 53706

Table 1
Citations reviewed

                                                           Total citations   About user to thesauri
Database                                                   retrieved         or patient language

Library and Information Science                            109               14
Library and Information Science and Technology Abstracts   162               16
Library Literature (current and Retro file)                114               9
PubMed MEDLINE                                             170               141
CINAHL                                                     52                7

COPYRIGHT 2011 Medical Library Association

Author: Smith, Catherine A.
Publication: Journal of the Medical Library Association
Date: Apr 1, 2011