Looking Beyond the Major Search Engines.

Shirley Duglin Kennedy is an information systems consultant at Honeywell's aerospace facility in Clearwater, Florida. Her book, Best Bet Internet: Reference and Research When You Don't Have Time to Mess Around, published by the American Library Association, is now available. (See a list of URLs from the book at http://www.ala.org/editions/openstacks/bestbet/index.html.) Her e-mail address is skennedy@reporters.net.

The Internet has vast resources of untapped information available

Puffery-what a wonderful word, and an apt description for the manner in which the major search engines tout the size and quality of their respective databases. To list a few, "Who's the new Godzilla of the Net at 140 million pages?" (http://www.altavista.digital.com), "the #1 rated search engine" (http://www.hotbot.com), "access to more than 50 million Web pages, 140,000 pre-selected Web site listings, and thousands of Usenet postings" (http://www.excite.com), "Over 500,000 of the best pages on Web" (http://www.infoseek.com), "Combining the Best of the Web with Premium Material" (http://www.nlsearch.com).

But experienced Internet users-especially those of us who search professionally for our daily bread-know that even the biggest and best generalized Web search engine cannot begin to comprehensively index the incredible, ever-growing, and mutating body of data that comprises the digital universe. Gradually, you come to realize that there are huge categories of specialized online resources-beyond those requiring a subscription fee, like Dialog, etc.-that will never make it into the databases of AltaVista, HotBot, et al.

Welcome to the "hidden Internet." The ability to mine its rich veins is what separates the men/women from the boys/girls in the tumultuous realm of Net searching. What constitutes the hidden Internet? Well, for one thing, there are the repositories of "formatted" files-PDF (Adobe Acrobat) and PostScript files being the most prominent examples. The "spiders" that crawl the Web, sucking up the text of documents for the various search engines, cannot "look" inside these formatted files to index the words contained therein. They can index only words that link to a particular formatted file. Think about all the government documents, product specification sheets, and research papers that are available only as formatted files.
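
To make that limitation concrete, here is a minimal sketch in Python of what such a spider can capture: only the words of the link pointing at a formatted file, never the words inside the file. The class name, function name, and list of extensions are all hypothetical, chosen for illustration; no real search engine's crawler is being described.

```python
from html.parser import HTMLParser

# Assumption: "formatted" files are recognized by extension alone.
FORMATTED_EXTENSIONS = (".pdf", ".ps")


class AnchorTextIndexer(HTMLParser):
    """Collects the link text of anchors that point at formatted files."""

    def __init__(self):
        super().__init__()
        self.index = {}            # href -> list of lowercased link words
        self._current_href = None  # set while inside a qualifying <a> tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(FORMATTED_EXTENSIONS):
                self._current_href = href
                self.index.setdefault(href, [])

    def handle_data(self, data):
        # Only text that sits inside a qualifying link gets indexed.
        if self._current_href is not None:
            self.index[self._current_href].extend(data.lower().split())

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def index_formatted_links(html):
    """Return {href: [words]} for every link to a PDF or PostScript file."""
    parser = AnchorTextIndexer()
    parser.feed(html)
    parser.close()
    return parser.index
```

Fed a page that links to a spec sheet with the words "Widget Spec Sheet," the sketch files the document under those three words and nothing else; whatever the PDF itself says stays invisible to the index.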

Navigating the Hidden Internet

Then there is the great number of online publications-newspapers, magazines, etc.-that have much to offer in the way of content, but require registration and login before their information may be accessed. Even if the registration is free (e.g., http://www.nytimes.com), Web spiders are pretty much barred at the door since, for all their technological sophistication, they can't fill out the requisite online form. For this reason, it's worth knowing about Excite's NewsTracker (http://nt.excite.com), which can search at least a few days' worth of about 300 different online publications. Apparently, the spider is programmed to log in where and when necessary to search content. Thus, NewsTracker can be useful to a certain extent for current awareness, but its search engine is only as good as Excite's technology (i.e., no field searching or limiting).

For research in the physical or social sciences, the Web is an abundant source of data sets-collections of statistics on topics like labor and employment, atmospheric and meteorological events, industrial and agricultural production, etc. Users retrieve this information by selecting or specifying variables via a specially designed form interface available on the site. Thus, these databases are not directly accessible/browsable by Web spiders. A good example of one of these is ArcData Online (http://www.esri.com/data/online), where the Environmental Systems Research Institute has collected basemap data and thematic data sets that allow users to create on-the-fly themed maps based on household income, crime rates, etc. (While you're here, be sure to visit the Map Gallery where you'll find maps created by past visitors, up-to-date U.S. Geological Survey topographic maps, U.S. street data, and recent earthquake maps.)

A similar situation exists among the plethora of subject-specific Web databases where results appear as dynamically created HTML pages-pages that don't "exist" until they are pulled together as a result of an individual search. Think about the average online yellow pages directory. You search for all the pizza places in a particular ZIP code, and a page of listings materializes before your eyes. But you know this specific set of results is not stored in the yellow pages database as a whole page. Another person who searches for all the Pizza Huts in a particular city or state may well receive some of the same entries you saw if the geography is compatible.
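
For the curious, the yellow pages scenario can be sketched in a few lines of Python. Everything here is made up for illustration: the three listings, the ZIP codes, and the function names. The point is only that the database stores rows, and the HTML page is assembled fresh for each search, so there is no stored page for a spider to find.

```python
from html import escape

# Hypothetical directory rows; a real yellow pages database would hold many more.
LISTINGS = [
    {"name": "Pizza Hut", "zip": "33755", "city": "Clearwater"},
    {"name": "Corner Slice", "zip": "33755", "city": "Clearwater"},
    {"name": "Pizza Hut", "zip": "33602", "city": "Tampa"},
]


def search(listings, **criteria):
    """Return the rows whose fields match every keyword criterion exactly."""
    return [
        row for row in listings
        if all(row.get(field) == value for field, value in criteria.items())
    ]


def render_results_page(rows):
    """Build an HTML results page on the fly; nothing is stored as a page."""
    items = "\n".join(
        f"<li>{escape(r['name'])}, {escape(r['city'])} {escape(r['zip'])}</li>"
        for r in rows
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"
```

Calling render_results_page(search(LISTINGS, zip="33755")) and render_results_page(search(LISTINGS, name="Pizza Hut")) yields two different pages that share the Clearwater Pizza Hut entry, which is the overlap described above.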

Another significant part of the hidden Internet is the often-overlooked collection of non-Web resources-ftp archives, gopher servers, telnet sites, and many mailing list archives. I've had excellent luck prowling specific university department ftp servers for research papers by professors and graduate students. I can often find something instantly-and for free-that I would otherwise have had to order from a vendor. (Note that these papers tend to be formatted files-mainly PostScript. Unless your printer can handle PostScript, you'll need a PostScript viewer utility. Get the scoop from Aaron "Wigs" Wigley's Internet PostScript Resources page at http://yoyo.cc.monash.edu.au/wigs/postscript.)

OK. So what's the best way to approach the hidden Internet? Well, as with so many other information quests on the Net, don't reinvent the wheel. Undoubtedly, someone who knows more than you do about the subject in which you're interested has already "been there, done that." I have a particular fondness for and admiration of academic librarians. When I need to find reliable Internet resources in a subject area with which I'm unfamiliar, I've learned to place my trust in academic library subject specialists. Large academic library Web sites can be a gold mine of pointers to hidden Internet resources.

Gary Price, at George Washington University's Gelman Library, has compiled an annotated list of "DIRECT LINKS to the search interfaces of resources that are not easily searchable from general search tools such as AltaVista, HotBot, and Infoseek" (http://gwis2.circ.gwu.edu/gprice/direct.htm). This is an excellent jump station for anyone who wants to inhale the essence of the hidden Internet. The links are esoteric; Price has loosely categorized them under some general headings such as archives and library catalogs, news sources and serials, business/economics, etc. Don't miss the rich collection of ready reference tools, which includes such specialized resources as the U.S. Department of Defense's Dictionary of Military Terms (http://www.dtic.mil/doctrine/jel/doddict) and Edmund's List of Incentives and Rebates-"programs offered by (automobile) manufacturers to increase the sales of slow selling models or to reduce excess inventories"-(http://www.edmunds.com/edweb/Incentives.html). Price's page is not the fanciest one you'll see on the Internet, but he's found some real jewels to share.

Meanwhile, on the other side of the world, at Nanyang Technological University in Singapore, library Web administrator Amanda Harizan has put together a collection of "special engines to search on topics ranging from architecture, fine arts, cybercafes to even orchids" (http://web.ntu.edu.sg/library/specialcat2.htm). This site is especially deserving of a bookmark for its exhaustive listing of links to worldwide regional search engines, classified by country names. A broad assortment of subject-specific search engines originating in Asian countries is useful for anyone with business interests in that part of the world.

One of my favorite all-time fishing holes is The Internet Sleuth (http://www.isleuth.com). It's an ever-expanding collection of search interfaces to a vast number of specialized databases, grouped by topic. What's cool is that you can search any of them from right here-sometimes several at a time-and brief hints are provided. Or you can follow the links to the actual databases themselves-about 3,000 at last count. There's also a "research forum" where you can ask questions about finding things on the Web.

Get Spooked

On a related-well, not really-note, a guy named William Knowles maintains that "the major intelligence agencies" are scanning "messages floating around the Internet, looking for something interesting." As a matter of fact, he says, many people think this is why "the Internet is so slow." OK. For what it's worth, Knowles has posted his collection of "spook words"-terms the intelligence agencies are ostensibly scanning for in digital communications-at http://www.dis.org/erehwon/spookwords.html. Here's an idea: Copy the list and paste it between the header tags of your own Web page and see who taps your phone or knocks on your door.

Sardine, chameleon man, Monica, silicon pimp, toad, fish data havens, UNIX

For Experienced Netters

Walt Howe (http://members.delphi.com/walthowe/web/index.html), who manages the Publishing on the Web and Navigating the Net forums on Delphi Internet (among other things), maintains, "You are an old timer on the nets if you remember when:

* Most Web pages had a Netscape gray background by default.

* Everyone used Mosaic for graphical browsing.

* Those wonderful Vatican exhibit Web site pictures took 40 minutes to download and view. And you did it anyway, because it was so new and exciting!

* Most Web pages were text only.

* You used vi or emacs to write your Web pages.

* The best search guide to the nets was at CERN.

* The best search guide to the nets was VERONICA.

* The best search guide to the nets was WAIS, at think.com.

* The best search guide to the nets was ARCHIE.

* You used command-line ftp.

* You read your mail with PINE, ELM or VMSMAIL.

* Gopher was the new, exciting Internet protocol that made the nets user friendly."

Wallow in a bit more nostalgia in a nicely written history of the Internet available in Walt's Navigating the Net forum at http://www.delphi.com/navnet/faq/history.html.

bird dog, domestic disruption, ionosphere, mole, keyhole, mixmaster, flame, infowar, Steve Case

Adobe Photoshop Aids

Adobe Photoshop is an amazing program for creating and manipulating graphics. Alas, most of you reading this column-and most of us reading this publication-did not start out to be graphic artists. And yet, maybe we are now nursemaiding Web pages that could use some pizzazz. Creating Cool Photoshop 4 Web Graphics, by David D. Busch (IDG Books, 1997, ISBN: 0-7645-3033-X, $29.99) is worth a look. Rather than attempting to teach you to master this extremely complicated piece of software, Busch shows you, step by step, how to do the things you need/want to do-create buttons, retouch photos and make photo montages, use filters, choose the proper file format for your graphics, etc. Virtually all of the information provided in this book will also be useful to those upgrading to Photoshop 5.0.

Meanwhile, Photoshop mavens may want to grab a copy of O'Reilly and Associates' Photoshop in a Nutshell, by Donnie O'Quinn and Matt LeClair (1997, ISBN: 1-56592-313-8, $19.99). This is the one that takes you through every tool, command, palette, menu, and submenu, including their major uses ... and misuses.

On the other hand, if you just want to make a few 3-D buttons, there's a neat online toy that can help you. MediaBuilder's ButtonMaker (http://www.buttonmaker.com) can turn any GIF-even animated ones-into "a nicely beveled button." You can pick one of the images from their library, plug in the URL of an image located somewhere on the Web, or transform an image that resides on your own hard drive-fast, fun, free.

nitrate, LEXIS-NEXIS, SIGDASYS, Capricorn, artichoke, captain, rebels, curly, Tangimoana Beach, Armani

Road Trip Planning

Where do you want to go today? Probably not to the Department of Justice. But if you're planning a summer road trip, you may want to check out Rand McNally's online road construction database (http://www.randmcnally.com/construction/index.html) to see if your potential route will be disrupted by heavy machinery and workers in orange vests. Search by U.S. state or Canadian province, by road type (Interstate, state, etc.), or by date. The database is updated monthly.

Wine for the Uneducated

Enjoy wine and want to learn more about it? Intimidated and/or nauseated by wine snobs? Surf on over to The Uneducated Palette: Wine for the Rest of Us (http://www.enjoywine.com), created and maintained by "a man and a woman who are learning about wine themselves." (Not only that-this couple actually met online and were due to get married in the Napa Valley in May.) They concentrate on the basics here, and the emphasis is on wines costing less than $15 per bottle that are widely available in most locations.

Study Incentives

Something to enrich the mind of the teenage male in your household: "FREE download of Vocabulary Rasher for Windows, a unique educational shareware program that helps you enhance your vocabulary while entertaining and motivating you with pictures of swimsuit models" (http://www.zoft.com/vocabulary).

Secure Internet Connections, Firewalls, plutonium, William Gates, illuminati, clone, Flu, Loin, Ft. Meade, burned, indigo, wire transfer, Bubba the Love Sponge ...
COPYRIGHT 1998 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1998 Gale, Cengage Learning. All rights reserved.

Author: Kennedy, Shirley Duglin
Publication: Information Today
Article Type: Directory
Date: Jul 1, 1998