Leveraging open and public access content from government-funded research.
Without access to the resources held by subscribing institutions, researchers and the public face individual article fees. These can range from less than $20 to more than $100 per article. How can the public benefit from their investment in R&D?
Before blaming any participant for the paywalls, consider how much work goes into the scholarly publishing process. In its simplest form, researchers write grants; funding agencies distribute awards; the actual research is done; and results are written and submitted to journals. The publisher facilitates the peer-review system, provides a number of services and tools, and often augments content with media, advanced search capabilities, and other added content. Libraries, especially academic and research libraries, are the primary customers of publisher content, providing database and journal access to their communities.
MANDATES, POLICIES, AND INITIATIVES
Mandates, policies, and initiatives are changing the research landscape. What are the current and emerging search strategies? In exploring policies and platforms, the fundamental point to understand is that publisher policies, together with government and agency policies, drive full-text access across platforms.
The February 2013 release of a memorandum from the White House Office of Science and Technology Policy (OSTP) began a major shift in access and delivery of scientific content in the United States. The memo, titled "Increasing Access to the Results of Federally Funded Research," mandates that the direct results of federally funded scientific research be made available to the public. The memo addresses both scholarly articles and data. Federal agencies receiving in excess of $100,000,000 in federal research and development funding were asked to submit plans to support public access to research. Now, more than 2 years later, the plans have been approved and are being released and implemented.
The National Institutes of Health (NIH) has led the way in sharing government-funded research with the public. Since 2005, accepted manuscripts and links to final published versions of articles have been available in PubMed Central (PMC). There is a vast amount of biomedical and life sciences literature available on the site.
PubMed Central International (PMCI) is an outgrowth of PMC, with partnerships in Canada and Europe. PMC Canada is sponsored by the Canadian Institutes of Health Research (CIHR), the National Research Council's Canada Institute for Scientific and Technical Information (NRC-CISTI), and the National Library of Medicine (NLM). Europe PMC is sponsored by the Wellcome Trust and other European funding agencies. Both the European and Canadian PMC databases receive their final published articles directly from the U.S. central site.
From its inception, PMC has created sophisticated search tools. Among its newer features is PubReader, introduced in late 2012, which optimizes content for tablets and smaller screens, and is available across a range of browsers. In February 2015, PMC made the plain text format of articles available on its FTP site. This facilitates text and data mining via downloading gzipped archive files. The extracted full text is for all of the articles in the PMC open access subset. When images, tables, and supplementary data are not needed, this text mining capability presents a powerful analysis option. Each file is greater than a gigabyte. Files are updated weekly on Saturdays, and researchers are expected to comply with copyright. PMC training continues to expand and includes handouts, videos, and tutorials.
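The text mining workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not PMC's own tooling: it assumes the archive is a gzipped tar file whose members are plain text files ending in .txt, and it builds a tiny in-memory sample archive (with made-up file names) so that it runs without any download. The function names are hypothetical.

```python
import io
import tarfile
from collections import Counter


def count_term_in_archive(archive_bytes: bytes, term: str) -> Counter:
    """Count case-insensitive occurrences of `term` in each .txt member of a gzipped tar."""
    counts = Counter()
    needle = term.lower()
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar:
            # Skip directories and anything that is not a plain text file.
            if not member.isfile() or not member.name.endswith(".txt"):
                continue
            handle = tar.extractfile(member)
            text = handle.read().decode("utf-8", errors="replace").lower()
            counts[member.name] = text.count(needle)
    return counts


def build_sample_archive(files: dict) -> bytes:
    """Build a small in-memory .tar.gz (assumed layout) so the sketch is self-contained."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, text in files.items():
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Made-up sample content standing in for extracted article text.
sample = build_sample_archive({
    "journal_a/article1.txt": "Biosurveillance methods. Biosurveillance data.",
    "journal_b/article2.txt": "No matching term here.",
})
term_counts = count_term_in_archive(sample, "biosurveillance")
```

For a real archive of more than a gigabyte, it would be more sensible to pass a file path to tarfile.open and process one member at a time rather than hold the whole archive in memory.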
Among the long-established PMC tools is a controlled vocabulary, Medical Subject Headings (MeSH). MeSH searches provide an authoritative, focused approach to advanced searches. MeSH has been adopted as an alternative search taxonomy for at least one publisher database, IEEE's Xplore Digital Library. The ability to bring a PubMed Central search into Xplore is very powerful. The public can search Xplore, read through abstracts, and request articles from local public libraries. Scientists and engineers can join IEEE and receive access to the Digital Library. Many academic and research institutions subscribe to Xplore, giving employees and students institutional access. MeSH in IEEE is a great example of a publisher-funder partnership.
DEPARTMENT OF ENERGY
The U.S. Department of Energy (DOE) was the first federal funding agency to publish a final plan to fulfill the requirements of the White House OSTP memorandum. PAGES (Public Access Gateway for Energy & Science) is the DOE's open portal for funded research. The database is currently in beta and will contain metadata in addition to links to the best available version of an article--either the publisher version or the author's accepted manuscript. This infrastructure provides a hybrid approach of centralized metadata and decentralized content.
PAGES functionality will be enhanced through collaborations with several publishing support organizations. One such collaboration, with FundRef, will allow PAGES to provide a standardized metadata element to identify agency funding sources for published research. Standardized names will enable searching of publications funded by specific agencies.
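To see why standardized funder names matter for search, consider the following minimal Python sketch. It is an illustration only: the alias table, the sample records, and the function names are all hypothetical, and a real registry would supply the authoritative names. The idea is simply that variant spellings collapse to one standard name, so that a search by agency finds every funded publication.

```python
# Hypothetical variant-to-standard mapping; not taken from any real registry.
FUNDER_ALIASES = {
    "doe": "U.S. Department of Energy",
    "department of energy": "U.S. Department of Energy",
    "u.s. department of energy": "U.S. Department of Energy",
    "nih": "National Institutes of Health",
    "national institutes of health": "National Institutes of Health",
}


def standardize(name: str) -> str:
    """Map a free-text funder name to its standard form, if one is known."""
    key = " ".join(name.lower().split())  # lowercase and collapse whitespace
    return FUNDER_ALIASES.get(key, name.strip())


def group_by_funder(records):
    """Group publication titles under the standardized funder name."""
    grouped = {}
    for title, funder in records:
        grouped.setdefault(standardize(funder), []).append(title)
    return grouped


# Made-up acknowledgment data as it might appear in article metadata.
records = [
    ("Paper A", "DOE"),
    ("Paper B", "Department of  Energy"),
    ("Paper C", "NIH"),
    ("Paper D", "Acme Fund"),
]
grouped = group_by_funder(records)
```

Without the standardization step, Paper A and Paper B would sit under two different funder labels and a search on either one would miss the other.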
Search results can be filtered by full text or citation-only, as well as by author. As PAGES evolves from beta, further facets will become available. Citations can be exported in several file formats, such as CSV or XML, as well as in a number of citation formats. Results can be sorted by date or by relevance.
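Once citations are exported as a CSV file, the same filtering and sorting can be repeated offline. The Python sketch below is a hedged example: the column names and the sample rows are assumptions made for illustration, not the actual PAGES export schema.

```python
import csv
import io

# Hypothetical export; the column names are assumptions, not the PAGES schema.
SAMPLE_EXPORT = """title,author,publication_date,full_text_available
Grid storage review,"Lee, A.",2015-03-02,yes
Catalyst study,"Ortiz, M.",2014-11-15,no
Neutron imaging,"Lee, A.",2015-05-20,yes
"""


def load_citations(csv_text: str):
    """Parse exported citations into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def full_text_by_author(rows, author: str):
    """Keep full-text records for one author, sorted newest first."""
    matches = [
        row for row in rows
        if row["author"] == author and row["full_text_available"] == "yes"
    ]
    # ISO-style dates (YYYY-MM-DD) sort correctly as plain strings.
    return sorted(matches, key=lambda row: row["publication_date"], reverse=True)


rows = load_citations(SAMPLE_EXPORT)
newest_first = full_text_by_author(rows, "Lee, A.")
titles = [row["title"] for row in newest_first]
```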
CENTERS FOR DISEASE CONTROL AND PREVENTION
The Centers for Disease Control and Prevention (CDC) uses Stacks as its digital archive. The National Oceanic and Atmospheric Administration (NOAA) is partnering with the CDC for its repository infrastructure. The CDC will be the systems provider, and the NOAA Central Library will be the content manager for NOAA's public access repository solution.
CDC Stacks is an online archive with several collections. The archive includes journal articles along with health resources such as the Health Alert Network and digitized historic collections such as the first 30 years of the Morbidity and Mortality Weekly Report. Collections can be searched or browsed, and many search features and facets are built into the system. A simple search box is featured, and advanced search is available. The advanced search allows search across all collections or within specific collections. Once the content is chosen, three faceted search boxes are available with Boolean logic.
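The logic behind a three-box Boolean search can be illustrated with a short Python sketch. This is not how CDC Stacks is implemented; it is a simplified model that assumes each box holds a field and a term, that the boxes are joined by AND, OR, or NOT, and that they are evaluated from left to right. The record fields and sample records are invented.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    operator: str  # "AND", "OR", or "NOT"; ignored for the first clause
    field: str     # hypothetical record field, e.g. "subject"
    term: str


def matches(record: dict, clauses) -> bool:
    """Evaluate the clauses left to right against one record."""
    result = None
    for clause in clauses:
        hit = clause.term.lower() in record.get(clause.field, "").lower()
        if result is None:
            result = hit                  # first box: no operator applies
        elif clause.operator == "AND":
            result = result and hit
        elif clause.operator == "OR":
            result = result or hit
        elif clause.operator == "NOT":
            result = result and not hit   # NOT excludes records that contain the term
        else:
            raise ValueError(f"Unknown operator: {clause.operator}")
    return bool(result)


# Invented records standing in for archive entries.
records = [
    {"title": "Influenza surveillance report", "subject": "Influenza", "genre": "Journal Article"},
    {"title": "Influenza vaccine pamphlet", "subject": "Influenza", "genre": "Pamphlet"},
    {"title": "Measles outbreak summary", "subject": "Measles", "genre": "Journal Article"},
]

query = [
    Clause("", "subject", "influenza"),
    Clause("OR", "subject", "measles"),
    Clause("NOT", "genre", "pamphlet"),
]
found = [record["title"] for record in records if matches(record, query)]
```

Left-to-right evaluation is a simplifying choice here; a search system may apply a different operator precedence, so it is worth testing a query with a known result before relying on it.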
Search results can be sorted by relevance, last modified, published date, title, or volume/issue. Facets include Subject; Name as Subject; Place as Subject; Genre, such as Bibliography or Conference Publication; and Document Type, such as book, journal article, newsletter, or pamphlet. Within a single result, content can be downloaded, emailed, printed, or shared. Facets within a result include Related Subjects Across Collections or You May Also Like, a titles recommendation section. Authors and document types are hyperlinked, and there is a section for funding sources. In addition to the PDF version of the content, there is frequently an option to download text and supporting content, such as pictures and graphs.
To track forthcoming CDC public access publications, navigate to the Coming Soon tab at the top of the CDC site. Articles are listed with Title, Author(s), Date Published, Source, Description, and Public Access Available Date.
Federal agencies are working together on discovery platforms and services. CENDI, a federal membership organization made up of the major science agencies and other stakeholders, has helped facilitate collaborative conversations. A U.S. cross-agency search platform, Science.gov, is a result of CENDI support. There are currently 15 agencies represented, and more will be added. Search results include tabs for text, multimedia, and data. Facets include Topics, Authors, and Dates. Results can also be narrowed by content source. Alerts are available. Selected science topics are highlighted on the front page to take people to directory listings. A resource for the advanced searcher is a link to taxonomies and thesauri. Science.gov provides listings of agencies with their corresponding vocabularies.
The Clearinghouse for Open Research of the United States (CHORUS) has tools and services for funders, publishers, institutions, researchers, and the public. On the funder level, government agencies such as the Department of Energy are using CHORUS to drive their own search databases. CHORUS linking is enabled by CrossRef. Relationships with CLOCKSS and Portico ensure preservation and archiving of content. Researchers can use the CHORUS Search function to find research published by specific agencies and organizations. Several facets for narrowing search results are available.
An initiative of the Association of Research Libraries (ARL) with support from the Association of American Universities (AAU) and the Association of Public and Land-grant Universities (APLU), SHARE wants to maximize the impact of scholarly research. It collects, connects, and enhances scholarly metadata across the research life cycle to identify the various elements as part of a single research project. As of May 2015, SHARE has processed more than 750,000 research release events from 38 content providers and launched the beta version of SHARE Notify.
Parallel to the United States efforts are a number of international initiatives to create access to publicly funded research. In June 2012, a report stemming from a working group of individuals representing funders, publishers, and academia was released. "Accessibility, Sustainability, Excellence: How to Expand Access to Research Publications," also known as the "Finch Report," is an attempt to build a model for access to publicly funded research. Progress on recommendations and other initiatives is tracked on a Finch implementation site and reported by the Research Information Network (RIN). [For more information about the "Finch Report," see Joanne Ptolomey's article: "Finch and Open Access: Debating the Future of Academic Publishing," Online Searcher, Vol. 37, No. 1, Jan./Feb. 2013: pp. 31-34.--Ed.]
The University of Nottingham Centre for Research Communications is home to Securing a Hybrid Environment for Research Preservation and Access (SHERPA). SHERPA includes a number of scholarly communication tools, including the Directory of Open Access Repositories (OpenDOAR). OpenDOAR is an international directory of repositories and has grown to nearly 3,000 instances. In addition to academic repositories, agency archives such as the National Institutes of Health in the U.S. and the Wellcome Trust in the U.K. are included. The OpenDOAR team checks each listing for accuracy and adds metadata to aid in search normalization.
OpenDOAR searchers can identify individual repositories or search for content across repositories. Cross-disciplinary subjects are strongly represented, as are health, medicine, science, and technology. The social sciences are also part of OpenDOAR, with the arts and humanities areas growing. Content is largely full text. The OpenDOAR Search tool uses the Google Custom Search Engine and its indexes to search across repositories. This tool is currently on trial, and the result lists do not yet include facets for narrowing results.
The WorldWideScience Alliance, an international collaboration, built the website WorldWideScience.org. Maintained by U.S. DOE's Office of Scientific and Technical Information (OSTI), this global science gateway provides federated searching of databases and portals worldwide. Microsoft Translator is used to translate queries and results. Ten languages are currently supported: Arabic, Chinese, English, French, German, Japanese, Korean, Portuguese, Russian, and Spanish. More than 100 databases from more than 70 countries are indexed, and much of the content is full text. Data sources are included. Database functionality is similar to Science.gov. Both gateways are supported by Deep Web Technologies.
A quick comparison of OpenDOAR and WorldWideScience.org highlights content differences. Using biosurveillance as a search term, OpenDOAR returns approximately 47,400 hits from repositories worldwide. The identical search in WorldWideScience.org returns 1,065 results from international agency resources. OpenDOAR, still in beta, made available only the first 100 of the 47,400 stated results. Most of these results did lead to full text, but some led to paywalls.
The WorldWideScience.org results included many non-English-language titles. The Translate option is available on the results page. A test of several French titles from specific agencies revealed that content on secure pages could not be translated. Several sets of French and Russian metadata also could not be translated, along with accompanying websites. The full text was either not available or the actual document maintained its original language. While full translations are likely available among WorldWideScience.org content, this ability is still more a novelty than a feature.
EVOLUTION OF SCHOLARLY PUBLISHING
The scholarly publication landscape is evolving. This article provides an overview of several platforms, from institutional repositories to international collaborations. Government legislation, funding agency policies, and changing publisher business models will continue to impact search and access. Some of the larger changes will occur in media and data content. Paths to access at a global scale with the ability to perform text analysis, cross-disciplinary search, and data analytics are being discussed. Watch for further developments in CHORUS, SHARE, and OpenDOAR. The National Science Communication Institute (nSCI) is facilitating an international discussion on an Open Scholarship Initiative and an All Scholarship Repository. Conversations that involve all stakeholders are critical.
Government-funded research is an important piece of the evolving content landscape. While search is evolving, the direction is clear. More content across platforms and formats will be available. Search tools will continue to become more sophisticated. Staying abreast of change and competencies is now, and will remain, a constant need. Remain connected, keep reading, and search well.
Dee Magnoni (firstname.lastname@example.org) is research library director at Los Alamos National Laboratory.
Comments? Email the editor-in-chief (email@example.com).
Table 1: Choosing a Platform for Your Government-Funded Content Search

Funder | Policy Link | Applies to Funding Awarded as of (date) | Search Platform
Bill & Melinda Gates Foundation | gatesfoundation.org-how-we-work-general-information-open-access-policy | 1-Jan-2015 | To be specified
CDC | cdc.gov/od/science/docs/Final-CDC-Public-Access-Plan-Jan-2015_508-Compliant.pdf | 1-Jul-2013 | CDC Stacks
DoD | dtic.mil/dtic/pdf/dod_public_access_plan_feb2015.pdf | estimate 2015 | DTIC
DOE | energy.gov/sites/prod/files/2014/08/f18/DOE_Public_Access%20Plan_FINAL.pdf | 1-Oct-2014 | PAGES
NASA | http://science.nasa.gov/media/medialibrary/2014/12/05/NASA_Plan_for_increasing_access_to_results_of_federally_funded_research.pdf | 1-Oct-2015 | PubMed Central
NIH | grants.nih.gov/grants/NIH-Public-Access-Plan.pdf | 2008 | PubMed Central
NIST | nist.gov/data/upload/NIST-Plan-for-Public-Access.pdf | 1-Oct-2015 | PubMed Central
NOAA | docs.lib.noaa.gov/noaa_documents/NOAA_Research_Council/NOAA_PARR_Plan_v5.04.pdf | 1-Feb-2015 | CDC Stacks
NSF | nsf.gov/pubs/2015/nsf15052/nsf15052.pdf | 1-Jan-2016 | PAGES partnership
USDA | usda.gov/documents/USDA-Public-Access-Implementation-Plan.pdf | 1-Jan-2016 | PubAg

Table 2: A Sampling of Agency Policies

Source | When to Use | Notes
OpenDOAR | Allows search at the repository level. Both broad in international coverage and specific in single institution search. | Each repository is visited by OpenDOAR staff to check accuracy of information. Staff also assign metadata to allow broader searching across repositories.
Agency platforms | Use individual funding agency platforms for specific agency research. Ideal for known funder. Can leverage specific platform search tools. | Platforms are at various stages of development. Search environments range from the highly developed, centralized PubMed Central to the beta, distributed PAGES to the exploratory Bill & Melinda Gates Foundation strategy.
CHORUS | Search by known agency. Search results have several facets available: sub-agencies and research centers; category (subject or topic); type (conference paper or journal article); year; publication name; publisher. Can search many agencies within same search environment. | CHORUS Search is in beta. Results do not currently indicate whether article is publicly available or behind a paywall. The CHORUS team is working on this and other functions. CHORUS linking is enabled by CrossRef. Relationships with CLOCKSS and Portico ensure preservation and archiving of content.
Google Scholar | Search for all scholarly content, across content-type. Includes all funding types. | Google Scholar goes beyond publicly funded research and is included here for breadth.
National Academies Press | Search for ebooks published by the Academies. Academies include National Academy of Sciences, National Academy of Engineering, Institute of Medicine, and National Research Council. | Print books may be ordered for a charge. PDF downloads are free.
Publisher sites | Use specific publisher site when access is available through institutional funding or other means. Search and browse publications. | Special issues on specific topics, added media, and other supplemental material are often available.
Science.gov | Search by topic, regardless of agency. Search results have topic, author, and date facets. Set up alerts within searches. | Science.gov is a federated gateway to the research of more than 60 databases from 15 federal agencies.
WorldWideScience.org | Search by topic across international agencies and organizations. Search results include the following facets: topics, country, authors, publications, publishers, dates. | WorldWideScience.org is a federated gateway to national and international science portals.
Date: Jul 1, 2015