Good and bad examples of government portals: not all government portals are created equal. (Digital Librarianship).
The DOE Portal
This Department of Energy portal (http://www.energy.gov) is a very well-designed one with clear paths for those who want to fly through the department's numerous topical units and areas and drill down for details when something catches their fancy. It also has a good alphabetic index with direct links to the related site(s). Within the library section, the featured site as of this writing is the Renewable Resource Data Center, which is an excellent starting point for research by major energy categories.
If drilling down is not your favorite way to get information, the home page of DOE offers a simple query cell with the option of an advanced screen. Simply typing in the query [air leakage solar energy] will bring up 562 mostly consumer-oriented documents, the vast majority of them relevant or highly relevant for the topic. The search function is provided via FirstGov, which gathers the information from predefined DOE domains that are visible to its search engine, including both the administrative and laboratory sites. It also offers an exemplary advanced query template (see Figure 1) if you need to run more refined queries than a string of keywords, which are (correctly) "AND-ed" together by the software automatically. It is exemplary because even a novice could make sense of the many search options, such as exact phrase searching, the level of sites (federal, state), the domains, the document formats to be searched (including ASCII, HTML, XML, Word, Excel, PowerPoint, and PDF), and many other options. Although it does not include the invisible sites (the documents that are stored in proprietary databases), it provides a link to the Energy Library (which is kind of a portal) and to Online Databases, which takes you (and me) to the EnergyPortal.
The EnergyPortal (http://kratos.osti.gov:1999) lets you run your query across a variety of databases, including several from DOE and others from government agencies such as the Department of Defense and NASA, which are grouped under Multidisciplinary Databases. There are 24 total open-access databases to choose from, and you can search a maximum of 10 at a time. (See Figure 2.) Be warned, however, that using more than five databases often chokes the portal search program.
The concept is good. It is much better to let the user choose one or more databases that may have relevant items for a specific topic than to dump tens of thousands of bibliographic records (of often-substandard and certainly non-standard content and format as they've been received from 30-plus publishers) into an energy science database. This has been the case with the much over-hyped PubSCIENCE database, as I just discussed in the October 2002 issue of Information Today ["Should PubSCIENCE Go the Way of Caesar?", p.32]. This undoubtedly increased the size of PubSCIENCE, along with the massive volume of duplicate and triplicate records. (See Figure 3.) It also greatly diluted its content and quality. The deal to have links from those records to the journal articles was touted as the ultimate in information technology, but it is not. For example, for the medical journals, PubMed has a far better solution.
The EnergyPortal search works very well for single-word queries, like biofuel, culling matching records quite quickly from a number of databases selected by the user. Importantly, in the single-word query scenario, the portal search finds the same number of records as you would by going to each of the sites and typing in the same query. Sometimes there are limits, like a maximum of 100 records from the Energy Citations database (because the native software puts the first-round limit at 100).
Searching with more than one term, however, is poorly implemented, and gives a very misleading impression of the coverage of several databases. The problem is that the Help file of EnergyPortal claims that "If a phrase is entered, the AND operator is assumed" (for the space between the words). This would be good, as it is with the DOE portal search. Unfortunately, it is not true. Unidirectional adjacency is assumed in executing the search, meaning that the words entered in your query must appear adjacent in the same order as entered in the documents to qualify as a hit. This spells big trouble when an unsuspecting user believes that the query words will be interpreted with an AND relationship. No wonder the software returns empty-handed from all the sites after searching for the exact phrase of [air leakage solar heating]. With the promised AND relationship between words, that query should have returned 857 records from the four databases. The zero-hit result is so absurd that hopefully it would catch the user's attention, and cause her to question the result and to try again typing in her own AND between words.
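The gap between the promised AND and the adjacency actually applied is easy to see on a toy scale. The following is only a minimal sketch, not the portal's code: the three sample documents, the whitespace tokenizer, and the function names are all invented for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return text.lower().split()


def matches_and(query, document):
    """True if every query word occurs somewhere in the document (implicit AND)."""
    words = set(tokenize(document))
    return all(term in words for term in tokenize(query))


def matches_adjacent(query, document):
    """True only if the query words appear side by side, in the order typed."""
    terms = tokenize(query)
    words = tokenize(document)
    n = len(terms)
    return any(words[i:i + n] == terms for i in range(len(words) - n + 1))


# Made-up sample documents, not records from any DOE database.
docs = [
    "solar heating systems lose efficiency through air leakage",
    "air leakage solar heating retrofit guide",
    "wind turbine siting in hawaii",
]

query = "air leakage solar heating"
and_hits = [d for d in docs if matches_and(query, d)]
adjacent_hits = [d for d in docs if matches_adjacent(query, d)]

print(len(and_hits), "hits with AND;", len(adjacent_hits), "with adjacency")
```

Under AND, the first two documents qualify; under unidirectional adjacency, only the second does, because the first contains all four words but not in that sequence. Scale that up to a real database and a longer query, and the hit count can easily drop to zero.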
The situation gets more dangerous when a few records are retrieved but they represent only a fraction of the relevant records the databases may have. Here's an example: The query [Hawaii solar energy], interpreted as an exact phrase, finds two matching records from the DOE Infobridge database (which it searches full text), three from its Energy Citations database, and none from PubSCIENCE. Doing the exact same search at the original sites would yield 918 records from Infobridge, 114 records from Energy Citations (which I limited to journal articles and the time frame of 1974-2002/09/10), and one from the Archive subset of PubSCIENCE. The odd thing is that if the original query were simply passed on to the search programs of Infobridge and Energy Citations, the portal software would retrieve the correct number of records, because those programs do consider a space as an AND operator. PubSCIENCE indeed interprets a space as unidirectional proximity, but the portal software should pass the query to it with ANDs inserted between words, just as the Help file promises.
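What the Help file promises amounts to a small per-database translation step. The sketch below is hypothetical: the lookup table, the function name, and the assumption that a backend accepts a literal AND between words are mine, not OSTI's; only the default behaviors listed in the table come from the tests described above.

```python
# Hypothetical lookup: how each backend treats a bare space between words.
# "AND" = every word must be present; "ADJ" = words must be adjacent, in order.
SPACE_MEANS = {
    "Infobridge": "AND",
    "Energy Citations": "AND",
    "PubSCIENCE": "ADJ",
}


def rewrite_for_backend(query, backend):
    """Return the query string to send so that the backend applies AND logic."""
    terms = query.split()
    if SPACE_MEANS[backend] == "AND":
        # The backend already ANDs space-separated words; pass them unchanged.
        return " ".join(terms)
    # Otherwise spell the operator out (assumes the backend accepts "AND").
    return " AND ".join(terms)


for name in SPACE_MEANS:
    print(name, "->", rewrite_for_backend("Hawaii solar energy", name))
```

A table like this is simple to maintain for a handful of databases run by one office. It gets harder, though no less necessary, once the portal has to cover databases whose defaults are set by other agencies.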
Now, all these are databases developed and maintained under the auspices of OSTI, the Office of Scientific and Technical Information, so knowing the default parameters is not a big deal. You can imagine how bad the case becomes when adjusting queries for databases of other offices, let alone other agencies.
The Subject Portals
There are a dozen subject portals offered by OSTI (http://www.osti.gov/subjectportals), which are very redundant, typically searching three databases that can be searched through the EnergyPortal. The subject portals are visually more appealing but the software is the same, so it makes the same mistakes as the EnergyPortal, and then some. In all my tests, the Concentrating Solar Power portal crashed. (See Figure 4.) It did not find any item from the National Renewable Energy Laboratory (NREL) (which by the way has good software), where I found one matching item when I searched it directly. The point is not finding only one item of course, but the reason why the subject portal's search program can't find it when it clearly exists.
Not All Created Equal
Portals are meant to facilitate your information retrieval process. Some do so, but others provide a rather obfuscating and inaccurate service. Those users who like the convenience of one-stop searching may get shortchanged by bad portals. OSTI, for example, has many problems streamlining its data flow (which is obviously very flawed), resulting in thousands of duplicate and triplicate records in PubSCIENCE--but not in the Energy Citations database. This is obviously because the same records were added again and again to PubSCIENCE. Portals should not add further confusion to the highly redundant slicing and dicing of the master database(s) in order to spawn seemingly new products and services. Portals should help to integrate those databases and to allow users to limit their search to one or more domains, such as articles and reports, by virtually segregating the components of the master database when they wish to do so.
If you want to see more proof of my points, check my Web site. I had limited space for illustrations here, but I will post extensive illustrations at http://www2.hawaii.edu/~jacso/extra.
I did find one advantage in spending a tad too much time with energy databases that are not really of interest to me. The patent errors in the design and implementation of some of the components of the energy information system are so "good" that I can use them in my upcoming systems analysis course. I will use the DOE example to demonstrate to students the good and the bad of an actual digital information system that has been much in the limelight lately. I hope that my use of this example will help my students learn to avoid the worst mistakes.
Peter Jacso is associate professor of library and information science at the University of Hawaii's Department of Information and Computer Sciences. He is also a columnist for Information Today, and a popular conference speaker. His email address is firstname.lastname@example.org.
|Title Annotation:||Information Service Review|
|Publication:||Computers in Libraries|
|Article Type:||Product/Service Evaluation|
|Date:||Nov 1, 2002|