Literature searches in the conduct of systematic reviews and evaluations.

Summary: A literature search is an essential part of a systematic review or meta-analysis of the biomedical literature; these reviews have become the gold standard for determining what qualifies as 'evidence-based' medicine. Combining searches of English-language databases with searches of the large Chinese-language databases can identify new, potentially important sources of data that are not included in traditional English-only reviews. Restricting the search to a limited subset of databases, or using inappropriate methods to identify relevant articles within each database, can lead to biased results and incorrect conclusions. This article introduces common English and Chinese databases, describes the search engines available for conducting searches, discusses the basic methods and common pitfalls of conducting searches, and provides an example of a search to highlight these issues.

Keywords: literature review; publication bias; databases; bibliography; systematic review; meta-analysis

Systematic evaluation of literature is a relatively new method in biomedical research. If sufficient studies with comparable methodologies are identified, a meta-analysis that pools the results of the original studies (considered a type of secondary data processing [1]) can be conducted. The results of such systematic reviews and meta-analyses are often used as the highest level of evidence available to support changes in the clinical guidelines for the treatment of various illnesses (i.e., 'evidence-based medicine'). However, biases in literature searches that occur because of incomplete coverage of databases or errors in the search strategy can seriously undermine the internal validity of systematic reviews and meta-analyses. [2,3] Researchers conducting systematic reviews and meta-analyses must carefully choose appropriate databases and use multiple search methods to find all relevant publications on the topic of interest. This issue has become more important as an ever-increasing proportion of the global medical literature appears in non-English publications, particularly Chinese and Spanish.

1. Selection of databases

Ulrich's Periodicals Directory currently lists more than 56,800 active academic journals, including more than 23,500 peer-reviewed journals. [4] About half of these journals are life science or biomedical journals published by over 2000 publishers, and about 26.6% of these are in non-English languages. It is almost impossible to search all of these journals one by one, so a variety of abstract-based databases that cover different subsets of these journals have been developed to help clinicians and researchers identify relevant literature when deciding how best to treat a specific class of patients or when conducting systematic reviews or meta-analyses. The journals covered and the timeframe of the included publications differ between databases, so each database has its own strengths and limitations. Some databases are largely focused on biomedical research (e.g., MEDLINE and EMBASE), some are limited to clinical trials (e.g., the CENTRAL database of the Cochrane Collaboration), some include a stronger health services component (e.g., CINAHL), some include social science topics relevant to health (e.g., the Social Science Citation Index in the Web of Science search engine), some are focused on a specific field (e.g., PsycINFO collects articles from publications relevant to psychology), some are limited to non-English languages (e.g., SinoMed only includes Chinese-language journals from mainland China), some are region-specific (e.g., LILACS is focused on Latin America, and TEPS is limited to journals published in Taiwan), and some are country-specific (e.g., the CiNii database in Japan and the IndMed database in India). Researchers conducting literature searches need to understand the coverage and limitations of the various databases and select the databases that provide the best fit for the topic of interest. As the proportion of the global medical literature appearing in non-English languages (particularly Chinese and Spanish) increases, it is increasingly important to include databases that provide good coverage of journals in other languages.

Searches of the major international databases such as MEDLINE, EMBASE, and PsycINFO can be conducted using their built-in search systems or by using authorized third-party platforms such as OVID and Web of Knowledge. The search expressions differ slightly between systems. One advantage of OVID is that it allows users to specify the distance between keywords using 'adj'. For example, the term 'generalized adj/2 anxiety' in OVID means that 'generalized' and 'anxiety' should be within a distance of two words of each other. The search therefore finds articles that contain 'generalized social anxiety' or 'generalized anxiety'. This function has not been made available in PubMed and other platforms. Besides user-specified keywords, most biomedical databases support the use of Medical Subject Headings (MeSH terms). MeSH terms serve two main purposes. First, they combine different expressions of one subject into one term. For example, the MeSH term 'Dementia' (i.e., 'dementia [MeSH]') includes 'dementia' and 'amentia'. Second, MeSH terms are organized into hierarchies. Searches using an upstream term can be expanded to include all of its downstream terms using the 'exp' function. For example, 'exp Dementia[MeSH]' searches all articles tagged with terms including Alzheimer's Disease, Huntington's Disease, Lewy Body Disease, and Kluver-Bucy Syndrome. However, the expression of these MeSH terms differs between databases (see Table 1).
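The two search features described above, proximity matching and hierarchical expansion of subject headings, can be illustrated with a minimal Python sketch. It is not the implementation used by OVID or any other platform. The function names, the counting convention (word positions at most n apart, in either order), and the one-level hierarchy are assumptions made for illustration; the narrower terms are the ones named in the text.

```python
import re

def within_distance(text, first, second, n):
    """Return True if `first` and `second` occur at most n word positions
    apart, in either order (an assumed counting convention)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= n for i in first_pos for j in second_pos)

# Toy excerpt of a subject-heading hierarchy (hypothetical structure,
# limited to the narrower terms mentioned in the text).
HIERARCHY = {
    "Dementia": [
        "Alzheimer's Disease",
        "Huntington's Disease",
        "Lewy Body Disease",
        "Kluver-Bucy Syndrome",
    ],
}

def explode(term, hierarchy):
    """Return the term followed by all of its narrower terms, recursively."""
    expanded = [term]
    for child in hierarchy.get(term, []):
        expanded.extend(explode(child, hierarchy))
    return expanded

# Both phrases from the text match a distance of two words.
print(within_distance("generalized social anxiety", "generalized", "anxiety", 2))
print(within_distance("generalized anxiety", "generalized", "anxiety", 2))
# An 'exploded' heading covers the heading and everything beneath it.
print(explode("Dementia", HIERARCHY))
```

A record is retrieved by the exploded heading if it is tagged with any term in the returned list, which is why searching the upstream term alone can miss articles indexed only under a downstream term.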

MEDLINE indexes more than 5000 biomedical journals published since 1960 in more than 70 countries, with a total of more than 20 million articles covering a wide spectrum of life and biomedical science including basic and clinical medical science, nursing, dentistry, pharmacology, nutritional science, environmental science, public health, and health care management. The vast majority of articles (~90%) are published in English. About half are from the United States, and 80% of articles have English abstracts. Every week, approximately 2000 to 4000 new articles enter the system. There are multiple platforms for searching MEDLINE, including OVID, Dialog, ProQuest, EBSCO, ISI, and PubMed. Although the search languages differ slightly between platforms, all of them support the use of MeSH terms and Boolean combinations of keywords. OVID was the first web-based MEDLINE search engine and has gained popularity among researchers in the United States and Europe. Since its launch in 1997, PubMed has become another popular platform around the world (including China) because it is the only free search engine for MEDLINE. PubMed also includes articles that are still undergoing the indexing process (in the Pre-Medline system); MeSH terms are not available for these articles. In addition to MEDLINE, PubMed also includes articles from PubMed Central (PMC), which was established in 2002 by the United States National Library of Medicine (NLM) and provides access to the full text of articles free of charge.

EMBASE is another international database commonly used in biomedical research. It indexes over 5000 journals from around the world covering biomedicine, pharmacology, public health, and social medicine. It does not cover dentistry, nursing, or veterinary medicine. Like MEDLINE, EMBASE can be searched using the OVID platform. However, EMBASE uses Emtree subject headings instead of MeSH terms. One advantage of EMBASE is that it has 61 Emtree terms in pharmacology, which facilitates searches related to clinical drugs.

CINAHL is a database in nursing which covers over 3000 journals with more than 2.80 million articles in 17 related fields including nursing, biomedical research, alternative medicine, and dentistry. Similar to PubMed, articles in the indexing process are placed in the Pre-CINAHL system.

The LILACS database includes more than 700,000 articles about clinical trials, cohort studies, and systematic reviews published in over 880 journals in Latin America and the Caribbean since 1986. [5] In place of MeSH terms, LILACS uses approximately 32,000 DeCS terms as subject headings, 27,000 of which are taken directly from MeSH.

SinoMed is a Chinese database that includes more than 5.5 million articles published since 1978 in more than 1800 Chinese journals in basic and clinical medicine, public health, pharmacology, traditional Chinese medicine, and other related fields. SinoMed uses MeSH terms and additional terms for traditional Chinese medicine to index every article. In addition to searches based on free keywords, SinoMed supports searches based on subject headings and on terms from the Chinese Library Classification, which improves users' ability to identify relevant articles and systematic reviews. In contrast to SinoMed, the other full-text Chinese-language databases available in mainland China (CNKI, Wanfang, and Chongqing VIP) lack comprehensive search platforms, do not have reliable subject heading functions, and do not include articles from many biomedical journals because of copyright restrictions. For example, none of these databases includes articles published before 1989, CNKI does not include articles from the 115 journals published by the Chinese Medical Association Publishing House since 2007, and Wanfang does not include articles published by the journals sponsored by the Chinese Medical Doctors Association.

PsycINFO is a commonly used database in psychology that indexes publications since 1872 from more than 1900 academic journals in psychology from more than 50 countries in over 35 languages. Web of Knowledge is a popular platform for searching PsycINFO. Besides searching 'Topic' using free keywords, one can conduct searches using 'Descriptors' in PsycINFO to improve the coverage of searches.

Cochrane CENTRAL is the registry with the broadest coverage of clinical trials; it includes more than 400,000 such reports. [6] Users can search using free keywords or MeSH terms. By applying the 'trial' filter in the system, users can restrict their searches to clinical trials registered in the Cochrane system. Although many completed trials are retrievable on MEDLINE and EMBASE, ongoing trials are only available in the Cochrane CENTRAL registry. This can provide a more up-to-date picture of certain research topics when conducting a literature review.

2. PICOS-based design of search strategies

In evidence-based medicine, the construction of a research question should be guided by the PICOS tool, which identifies the following five components of clinical evidence for systematic reviews: patients/problems (P), interventions (I), comparison (C), outcomes (O), and study design (S). [7] (see Table 2)

For a clearly defined research question, the search strategy is usually devised to address 'P', 'I', and 'S'; 'C' and 'O' are usually addressed during the screening of articles. In MEDLINE and EMBASE, search terms about 'P' and 'I' should include relevant free keywords and MeSH terms and are combined using the 'or' Boolean function. Study design ('S') is generally clear. Here, we provide an example to show the conduct of such searches in MEDLINE via the OVID platform (see Table 3). The research question is whether perazine can effectively treat schizophrenia.

In Table 3, #1 and #2 are free keywords, and #3 searches 'schizophrenia' as a MeSH term; 'exp' expands the search to the narrower headings under 'schizophrenia' in order to improve coverage. #5 to #9 in the 'I' column are also free keywords. '*' invokes the wildcard function, which matches every word that begins with the given stem; for example, 'perazin*' matches 'perazin' and 'perazine', and 'pernazin*' matches 'pernazinum'. #12 and #13 in the 'S' column search for articles tagged as randomized controlled trials or controlled clinical trials. #14 to #18 search for articles that contain certain keywords in the title or abstract. #20 eliminates studies tagged as animal studies. The final search strategy combines all three portions using the 'and' function. Searches in EMBASE, CINAHL, and SinoMed are conducted in a similar fashion.
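The logic of this strategy (alternatives within a PICOS component joined with 'or', the components joined with 'and', and an exclusion applied to the study design portion) can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, the term lists are abbreviated from Table 3, and the resulting string is a generic Boolean expression rather than the exact syntax of any particular platform.

```python
def any_of(terms):
    """Join alternative terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"

def build_query(population, intervention, design, exclude=None):
    """AND together the OR-groups for P, I, and S.

    If `exclude` is given, it is removed from the study design group
    with NOT before the three groups are combined.
    """
    design_part = any_of(design)
    if exclude:
        design_part = f"({design_part} NOT ({exclude}))"
    return " AND ".join([any_of(population), any_of(intervention), design_part])

# Abbreviated term lists taken from the example strategy.
population = ["schizophren*", "dementia praecox", "exp schizophrenia[Mesh]"]
intervention = ["perazin*", "taxilan*", "pernazin*", "perazine[Mesh]"]
design = [
    "randomized controlled trial[pt]",
    "controlled clinical trial[pt]",
    "placebo[tiab]",
]

query = build_query(
    population,
    intervention,
    design,
    exclude="animals[Mesh] NOT human[Mesh]",
)
print(query)
```

Keeping each component as a separate list makes it easy to rerun the same strategy with a platform's own field tags, or to drop the study design group when a broader search is wanted.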

To illustrate the relative coverage of the four Chinese databases (see Table 4), we applied the following specifications to SinoMed, CNKI, Wanfang, and Chongqing VIP: P=depression, I=antidepressant medication, C=placebo, O=any measure of effectiveness, S=RCT; that is, we searched for randomized controlled trials on the effect of antidepressants in the treatment of depression. Table 4 shows the search results. Judged by the number of publications retrieved, SinoMed outperforms the other three databases: searched alone, it found 983 of the 1229 articles identified across all four databases (79.98%), about twice as many as any of the other three databases (407 to 492 articles). Similarly, when two or three databases were searched jointly, the combinations that included SinoMed found substantially more articles than those that did not. A similar but weaker trend is observed when we shift the focus to case-control studies on suicide or suicide attempt.
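The percentages in Table 4 express the number of articles found by a database as a share of the articles found by all four databases together. A minimal Python sketch of that arithmetic for the single-database rows follows; the counts are copied from Table 4, while the variable and function names are illustrative assumptions.

```python
# Articles found by all four databases combined (Table 4).
TOTAL = {"intervention": 1229, "risk_factor": 112}

# Articles found by each database when searched alone (Table 4).
SINGLE = {
    "SinoMed": {"intervention": 983, "risk_factor": 73},
    "CNKI": {"intervention": 407, "risk_factor": 63},
    "Wanfang": {"intervention": 464, "risk_factor": 69},
    "Chongqing VIP": {"intervention": 492, "risk_factor": 65},
}

def coverage(found, total):
    """Percentage of the combined result set found by one search."""
    return round(100 * found / total, 2)

for name, counts in SINGLE.items():
    shares = {kind: coverage(n, TOTAL[kind]) for kind, n in counts.items()}
    print(name, shares)

# Ratio of SinoMed to the best of the other single databases.
for kind in TOTAL:
    best_other = max(c[kind] for db, c in SINGLE.items() if db != "SinoMed")
    print(kind, round(SINGLE["SinoMed"][kind] / best_other, 2))
```

The same function can be applied to the two-database and three-database rows to check how much each additional database adds to the combined result set.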

3. Discussion

Besides computerized searches of online databases, researchers should also hand-check the reference lists of relevant articles and, when necessary, search other resources (including technical reports, conference papers, and theses) for unpublished studies. In addition, indexing in databases lags behind the publication of articles. The lag is especially long (~3 months) in the Chinese databases (e.g., SinoMed, CNKI, Wanfang, and Chongqing VIP). Therefore, researchers should also search the major journals in the field directly for the most recent publications.

Xiaochun QIU, Cheng WANG (*)


Xiaochun Qiu graduated from Tongji Medical College in 1993 and received his master's degree in Informatics in 2002. Since 1993, he has been working at the Shanghai Jiao Tong University Medical School Library where he is currently the Deputy Librarian and the head of the Department of Medical Literature Review. He is also the Deputy Director of the Library Society of the China Medical Council and an executive member of the National Medical Literature Review Association of China. His research interest is quantitative analysis of literature and literature research in evidence-based medicine.

Shanghai Jiao Tong University School of Medicine Library, Shanghai, China

(*) correspondence: Dr. Wang Cheng, mailing address: RM 502, Shanghai Jiao Tong University School of Medicine Library, RD Chongqing 280, Shanghai.

Table 1. Comparisons of commonly used English and Chinese databases

database  field                          geographic coverage            subject heading

MEDLINE   medicine, pharmacology,        global with a focus on North
          and nursing                    America
EMBASE    medicine, public health,       global with a focus on Europe  Emtree
          and pharmacology
PsycINFO  psychology and psychiatry      global                         Descriptors
CINAHL    nursing and health care        global
LILACS    medicine, public health,       Latin America and the
          pharmacology, and nursing      Caribbean
SinoMed   medicine, public health,       mainland China                 MeSH, traditional Chinese
          pharmacology, traditional                                     medicine headings
          Chinese medicine, and nursing
CENTRAL   clinical trials                global


Table 2. The PICOS tool

PICOS                  key question

patients/problems (P)  Who are the patients or what are their problems
                       (e.g., main health conditions, comorbid
                       conditions, and other clinically significant
intervention (I)       What is the intervention under consideration
                       (e.g., diagnosis, treatment, or prognosis-related
comparison (C)         Is there a standard intervention to compare
outcome (O)            What are the ultimate goals of the intervention?
study design (S)       What is the study design or the intervention

Table 3. An example of PICOS search strategy

P (Schizophrenia)           I (Perazine)                     S (RCT)

#1 schizophren*             #5 perazin*                      #12 randomized controlled trial[pt]
#2 dementia praecox         #6 taxilan*                      #13 controlled clinical trial[pt]
#3 exp schizophrenia[Mesh]  #7 pernazin*                     #14 randomized [tiab]
#4 or 1/3                   #8 piperazin*                    #15 placebo [tiab]
                            #9 phenothiazine tranquilizer*   #16 randomly [tiab]
                            #10 perazine[Mesh]               #17 trial [tiab]
                            #11 or 5/10                      #18 groups [tiab]
                                                             #19 or 12/18
                                                             #20 animals [Mesh] not human [Mesh]
                                                             #21 #19 not #20

#4 and #11 and #21

Table 4. Analysis showing the cross coverage of search results for
different types of studies using the 4 different Chinese-language
databases

                                   search for clinical   search for risk
                                   intervention          factor
                                   studies (a)           studies (b)
                                   n (%)                 n (%)

Articles in a single database
SinoMed                             983(79.98%)           73(65.18%)
CNKI                                407(33.12%)           63(56.25%)
Wanfang                             464(37.75%)           69(61.61%)
Chongqing VIP                       492(40.03%)           65(58.04%)
Articles in 2 databases
SinoMed+CNKI                       1027(83.56%)           93(83.04%)
SinoMed+Wanfang                    1061(86.33%)           91(81.25%)
SinoMed+Chongqing VIP              1130(91.94%)           88(78.57%)
CNKI+Wanfang                        517(42.88%)           89(79.46%)
CNKI+Chongqing VIP                  527(42.88%)           85(75.89%)
Wanfang+Chongqing VIP               553(45.00%)           84(75.00%)
Articles in three databases
SinoMed+CNKI+Wanfang               1080(87.88%)          104(92.86%)
SinoMed+CNKI+Chongqing VIP         1091(88.77%)          102(91.07%)
SinoMed+Wanfang+Chongqing VIP      1107(90.07%)          101(90.18%)
CNKI+Wanfang+Chongqing VIP          659(53.62%)           98(87.50%)

Articles in all 4 databases        1229                  112

(a) search for articles about randomized controlled trials (RCTs) for
depression using any antidepressant versus placebo
(b) search for case-control studies about risk factors for suicide or
suicide attempt
Author:Qiu, Xiaochun; Wang, Cheng
Publication:Shanghai Archives of Psychiatry
Date:Jun 1, 2016
