Is It Time for a Universal Academic Search Engine?
We who teach information literacy have to contend with a large number of diverse systems. Our messages to students are often, "You can do that with this system but not with that system," or, "This one has subject filters; that one doesn't." What is more, even scholars complain that academic search engines are too challenging (Innovative Research Report, April 2019, "Academic Libraries Have the Most Trusted Resources, but Their Tools Are Hard to Use"; go.iii.com/2019-academic-trends.html).
I'm wondering if the time has come to consider creating a universal academic search engine using the best features of existing engines. But that's ridiculous, isn't it? The major platforms out there compete with one another. They spend a great deal of money and time crafting their search tools. There is no way these competitors are going to give up their best features, let alone get together to establish anything in common.
So why consider it at all? Simply because info lit instructors are frustrated with tools that lack the functionality we know would improve them. There is also too much variety among them, which serves users poorly.
Maybe what follows is only a useless thought exercise, but I'd like to pursue it, even if only to figure out what an optimum universal engine might look like. I'll focus on four platforms: EBSCO, ProQuest, JSTOR, and Elsevier. I'll also throw into the mix Google Scholar. In each case, I'll look at great features to adopt and gaps that other platforms might fill.
Full confession: I'm an immense fan of the EBSCO platform overall. Some features are a plus: the left-column location for filters, the commitment to metadata, the subject heading system that merges various vocabularies into one integrated language, the easy creation of permalinks to both individual citations and searches, and the easy navigation within the platform. EBSCO has led the way in moving as far as possible from the limitations of a keyword-based search tool, providing many opportunities to limit and manipulate results. Our students learn how it works fairly quickly and are soon comfortable using its advanced features.
Does that mean EBSCO is the model for a potential universal academic search engine? Not entirely. One vexing problem is that the features designed to work with results (in the right column) are clunky. While you can easily save multiple results to a folder, you have to open each record to get to the citation and export features. Other databases, such as JSTOR, have those features at the citation level without having to open records. A second challenge is that the citation creator, while it works well with APA, does not do as well with a number of other bibliographic styles, where small nagging errors are common. The EBSCO platform insists on adding its own URLs when a citation lacks a DOI.
Overall, the EBSCO platform is strong, but using the results page to good effect requires a number of clicks to get at many of the features. It works better with the search filtering than it does with the tools to manipulate search results.
ProQuest has adopted many of the features first introduced by EBSCO--the left column filter system, plus the ability to distinguish item types, date ranges, subject, and language. While subject headings are unique to a particular database, rather than being crosslinked to all ProQuest databases, the subject heading filter has an "include" or "exclude" feature, which can be very helpful in narrowing to, or screening out, records. Like EBSCO, ProQuest has permanent links in its records.
ProQuest's citation feature is available on the results page. You can select multiple items and then create citations without having to load your search results into a folder, as is done in EBSCO. ProQuest uses the RefWorks citation tool, so the formatted citations look quite good, despite the addition of permanent links, a feature that will frustrate anyone trying to read formatted reference lists.
Like EBSCO, a ProQuest user can create a personal account to store citations. An additional feature, available because ProQuest owns RefWorks, is that users can synchronize their ProQuest account with RefWorks to create automatic downloads to the RefWorks bibliographic manager.
ProQuest has a few benefits to offer, although, for the most part, it has followed EBSCO's lead.
For all the superb quality of resources available in JSTOR, its search function is very limited. A look at its bibliographic records explains why this is. Quite simply, JSTOR's metadata is minimal. Beyond the citation and permanent link, each record only has "topics," which, unlike subject headings, are broad-based and more like categories than subjects.
JSTOR does shine in providing download, cite, and save links prominently to the right of each citation. The citation feature, while adding permanent links that users usually have to remove, offers clear and well-formatted results. The export function is somewhat hidden: it's found in the citation link. In JSTOR, as in EBSCO and ProQuest, you can also export multiple citations.
The real genius with JSTOR may well be its Text Analyzer, which I wrote about in the September/October 2017 InfoLit Land column ("The Benefits and Pitfalls of Text-Match Searching," Online Searcher, vol. 41, no. 5, pp. 57-59). This is a tool that allows users to manipulate results in dramatically new ways, starting with a search on a body of full text and ending with multiple nuanced filters. When I show the Analyzer to my students, they are amazed and ask why this tool is available only in JSTOR.
In many ways, Elsevier's ScienceDirect is like other databases--basic search, advanced search, filters in the left column, and standard features to download or export full text. But it is important to recognize that this is a science database, despite its inclusion of non-scientific journals. It hyperlinks authors, with affiliation and email information, because those options are important to the scientific community. It offers the ability to share citations on social media. It provides an HTML version of the article when you open the record, along with a hyperlinked table of contents, because researchers in the sciences know there is a standard structure to articles and want to get at the "good stuff" quickly.
Like other scientific databases, subject headings are not a priority. Why? Simply because scientific terminology is so standardized that keyword searches bring up reliable and relevant results. This means that ScienceDirect is non-hierarchical, unlike a platform such as EBSCO's, where you can use subject headings to narrow and focus results. Woe betide the beginning researcher who searches ScienceDirect on "polar bears" and gets nearly 5,000 results. True, you can filter by date, item type, and journal name. There is even an open access option. But the researcher will have to do a new search with more keywords to get the results down to a manageable number.
I know that some accuse the categories and hierarchies of subject heading systems of doing everything from limiting options to forcing searchers into establishment/colonial categories. At the same time, I constantly see students, struggling to focus their thinking, use subject headings to narrow the scope of their results and cut the number of citations they need to deal with. Sorry, but any universal academic search engine will require subject categories to refine searches.
The world's most popular academic search engine, Google Scholar (GS), also needs to be considered. In some ways, it is already close to a universal tool. Searching is simple. It has great features, such as connections to citing literature, a related items search, a citation formatter, and even the ability to connect with the holdings of a local academic library. But GS is a mess. Let me explain.
Most searches generate thousands of citations. Other than a clunky date limiter or the option to use the (almost hidden) advanced search to narrow to title words or specific publications, there is no way to cut the number of search results to something manageable. Not only is there limited metadata, but there is no way to filter for type of material: book, article, conference paper, etc.
GS is big, pretty much comprehensive, and a really weak search tool.
FEATURES OF AN OPTIMUM UNIVERSAL ACADEMIC SEARCH ENGINE
I suppose that best practices in search engines are always going to be, to some extent, a matter of personal choice, but the following seem to me to be essentials:
1. Movement from simple to complex
In the era of Google, most searches are going to begin with a keyword search. So, a simple keyword box is OK at the beginning, as long as there are multiple ways to refine things on the results page. Thus, the best search engine is going to get much more complex on the results page. Unless there are effective ways to cull a result set of 12,000 down to a result set of 95 and provide superior focus in the final set of citations, the search engine is not useful.
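The winnowing described above can be pictured as successive filters applied to a result set. Here is a minimal, purely illustrative Python sketch; the records and filter fields are invented for the example and do not reflect any vendor's actual data model:

```python
# Illustrative only: a toy result set and the kind of successive
# filtering (date range, item type, subject) a results page should
# support. All records here are invented.

records = [
    {"title": "Polar bear ecology", "year": 2018, "type": "article",
     "subjects": ["Polar bear", "Climate change"]},
    {"title": "Arctic mammals", "year": 2001, "type": "book",
     "subjects": ["Arctic regions"]},
    {"title": "Sea ice decline", "year": 2019, "type": "article",
     "subjects": ["Climate change", "Sea ice"]},
]

def refine(results, year_from=None, item_type=None, subject=None):
    """Apply whichever filters are given; leave the rest alone."""
    out = results
    if year_from is not None:
        out = [r for r in out if r["year"] >= year_from]
    if item_type is not None:
        out = [r for r in out if r["type"] == item_type]
    if subject is not None:
        out = [r for r in out if subject in r["subjects"]]
    return out

hits = refine(records, year_from=2015, item_type="article",
              subject="Climate change")
print([r["title"] for r in hits])
```

The point of the sketch is that each filter composes with the others, so a searcher can start broad and tighten progressively without rebuilding the query from scratch.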
2. Controlled vocabularies for subject and author
While some of us may think that the day of the subject heading is over, we need to recognize that all topics are hierarchical. Simple keywords don't reveal the ways in which a topic is embedded within a larger category and has narrower categories below it. A robust subject heading system, in which EBSCO remains the leader, is essential, although ProQuest's "include" and "exclude" options are a good addition. The best search engine will also have linked author metadata to find other works by the same author.
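The difference between a flat keyword match and a hierarchical vocabulary can be shown with a tiny sketch. The mini-thesaurus below is invented for illustration and is not drawn from any real subject heading system:

```python
# Illustrative only: a tiny invented subject thesaurus showing why a
# hierarchical controlled vocabulary narrows better than bare keywords.
# Each heading maps to its narrower terms.

narrower = {
    "Mammals": ["Bears", "Whales"],
    "Bears": ["Polar bear", "Grizzly bear"],
    "Whales": [],
    "Polar bear": [],
    "Grizzly bear": [],
}

def expand(heading):
    """Return a heading plus all headings nested beneath it."""
    terms = [heading]
    for child in narrower.get(heading, []):
        terms.extend(expand(child))
    return terms

# Searching on a broad heading also retrieves records tagged with any
# narrower heading, something a flat keyword match cannot do.
print(expand("Bears"))
```

A search on "Bears" in such a system would also retrieve records tagged "Polar bear" or "Grizzly bear," which is exactly the embedding of topics within larger categories that simple keywords fail to reveal.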
3. Simple means to work with results
JSTOR's highly visible download, save, and cite icons next to each citation in the result list make it easy to determine what to do with the products of your search. ProQuest's function to select and cite multiple citations right from the results page is helpful. The more a database can provide all citation manipulation features right on the results page, the better users can ensure that their searches actually produce something of value.
4. Good metadata
Records with author and subject links, abstracts, and permanent links are essential. Most academic databases, except Google Scholar and JSTOR, do this quite well. The optimum search engine enhances whatever metadata it starts with to make it as strong as possible.
5. Full lists of article references and citing literature
This was always a specialty of Web of Science, but is echoed in the new Dimensions database, as well as in Google Scholar's citations feature.
6. An interface that is intuitive and relatively easy to learn
A lot of databases seem to have been built for scholars; however, most users are students, and many of them have limited search skills. We don't want just a single search box with a few filters. At the same time, in a more complex tool, everything needs to make sense so that users can relatively quickly learn how to optimize more sophisticated search features. EBSCO, for example, has focused much of its development research on balancing ease of use with complexity.
7. An alternative
Here I am referring to another means of searching the database, as seen most encouragingly with JSTOR's Text Analyzer, which puts result filtering totally in the hands of the user. If JSTOR could release this tool to other database providers, we would have the alternative that we need.
A UNIVERSAL ACADEMIC SEARCH ENGINE?
The idea of a universal academic search engine is a pipe dream. I admit it. Companies providing databases protect their own bells and whistles ferociously. So why am I even suggesting such a thing? Simply because we need to think about what a really good search engine should look like. We need to advocate for best practices. If the current players are unwilling to give up their turf, maybe someone will build a better engine from scratch.
William E. Badke
Trinity Western University
William E. Badke (firstname.lastname@example.org) is associate librarian at Trinity Western University and the author of Research Strategies: Finding Your Way Through the Information Fog, 6th Edition (iUniverse.com, 2017).
Comments? Email the editor-in-chief (email@example.com).
InfoLit Land | William E. Badke | January 1, 2020