The ASIDIC 1997 fall meeting; speakers focused on search-and-retrieval technologies and techniques.
What a coup of a keynoter! The printed copy of the speakers' biographies stated, "Andrew Elston is currently evaluating opportunities to continue his career in publishing and information services while he oversees the final closing of operations at NewsNet, Inc. this month."
It appeared that Elston was at ASIDIC to tell us why he thinks NewsNet failed. NewsNet was a 15-year-old, established online database of about 1,000 newsletters and other news formats. It spent a lot of money building interfaces to the Web and went live there just 2 years ago. But when it became clear that NewsNet was no longer a competitive product, the parent company first tried to sell it, then simply gave it up.
One important factor was that what appeared on the Web was the same proprietary product as its online version. NewsNet acquired new users on the Net, but the kind that didn't stick around after retrieving a--meaning one--quick answer. The traditional users who migrated from online to the Web and ordinarily stayed online to obtain an average of 5 documents were now getting an average of 1.2 documents and flitting to other news sites, many of which delivered the same information for free. Since there are many competing sites for news--much of it free--that competition simply began to kill off NewsNet.
Elston rued that NewsNet fell victim to its failure to understand that Web users would behave very differently from online users. He noted, in his understandably pessimistic mood, that Knight-Ridder cut off DIALOG, which became Knight-Ridder Information Services, then hung it out to dry, and has since been trying to sell it.
Elston predicted that DIALOG will break up. I also heard this from a respected colleague who predicted the break-up in about 5 years.
Let's Get Some Perspective
But, wait up. I'm not quite sure I go along with these predictions on the basis of the NewsNet experience. I brought my love of historical perspective to bear and thought about what advantages online databases had over printed reference texts back in the 1960s when online was considered revolutionary. Online solved important interdisciplinary problems in the sense that good answers could be obtained from a variety of interdisciplinary reference works--and fast.
Many of the databases had controlled indexing, and skilled professionals could search successfully through Boolean and its sophistications.
That was the value-added aspect lacking in the NewsNet product in its shift to the Web. It had nothing more to offer than its text, so it became just another unindexed site, even though it was probably better organized than most of the free sites. It is understandable that it failed.
But that doesn't mean DIALOG and its databases will fail, fall, or crack up that fast. Databases within DIALOG will fail if they have nothing more to offer than text with no real intelligent added-value features. Will that kill off DIALOG? It depends on whether downsizing in corporate information centers continues unabated. There are indications that, after the downsizing onslaught, there may be a new middle road to the role of intermediary.
Elston spoke of the resurgence of newspapers in print in the 1990s. The recent move of The New York Times in dividing its present version into more sections seems germane. Another interesting point made by Elston concerned venture capital. The venturers are not looking at established companies but at those with uncharted futures. The young, untried technology bucks are looking good. The venturers are now true adventurers.
In a summary wrap-up at the end of a conference on search engines a couple of years ago, one point made by James Callam of the University of Massachusetts was that "Boolean is dead! Long live Boolean!" Recent conferences, including this one, have led me to wonder about "Online is dead! Long live Online!" even though the NewsNet demise was a rather frighteningly speedy one.
I was intrigued by IsoQuest and its Data Extraction Technology Tool-Kit. Tony Hall reviewed how far a 15-month-old software company has come with its NetOwl [see related news announcement in the Internet Publishing Today section]. He spoke of an automatic-indexing software tool that can browse dynamically for company names, place names, and people names, and can also automatically extract pieces of text from the full text of an article, thus providing a useful summarization of that article.
Automatic abstracting rears its head once more. Individual, Excalibur, and Infoseek are three retrieval companies that have bought into this product. Entity extraction, the more scholarly name for this kind of retrieval, will be the subject of a major talk presented by the president of IsoQuest, Paul Jacobs, at the "Search Engine and Beyond" conference to be held in Boston April 1-2, 1998.
OCLC's Kilroy Project
Terry Noreault reviewed OCLC's Kilroy Internet database project. OCLC is reaching out to explore the Internet resources at some 800,000 Web sites to establish databases of these resources. Significantly, OCLC is developing statistical means to find these resources and feed them into its traditional Dewey and LC classifications automatically (Scorpion). It is enhancing its classification schema and states that "automatic assignment of classification is feasible." I really don't believe in classification systems for the long run, but apparently, "Classification is dead! Long live Classification!"
Those of you who have been reading Sue Feldman's articles realize what a good communicator she is. Her article in the May 1997 issue of Searcher on search engines is worth reviewing.
In her ASIDIC presentation, Feldman said that searching the Web is good for finding an answer--one good one, that is. If that's what you want, the Web's the place to go. Some may think that's a bit exaggerated, but she wanted to emphasize that that's as far as good searching will go on the Internet. For end users, that's where it's at.
Feldman was particularly good at discussing spamming problems. Other barriers she covered were the size of the Internet, rapid changes of Web pages, and inaccessible text. I liked the fact that she wasn't too enthusiastic about Excite's power-search feature, which was a trend back to Boolean. She stated outright that Boolean is not for Web searching. There are many who naively think that adding Boolean to search engines is a sign of progress. It isn't for end users.
I don't mean to suggest that Feldman is anti-Web. She isn't. She is for improvement and played a devil's advocate's role--much needed at this time.
Sue Lachance spoke of Infoseek's search engine features, opening with "Is it the World Wide Web or the World Wild Web?" The features she discussed were automatic phrase recognition, proper name recognition, distributed search, topical directories created with neural network NET technology, and quality indexing guidelines. She had little to say about distributed search, which is an important new development out of Infoseek. The company has received a patent for a method of searching the Web via multiple search engines, a technique that is expected to be fully implemented by the beginning of next year. Infoseek president Steve Kirsch will be addressing this at the Boston meeting in April.
Yes, you've probably guessed it by now. Announcing this Boston meeting is self-serving. I have designed and will chair the program. Please attend anyway. I promise a landmark conference. For more information, contact me or visit the Web site (http://www.infonortics.com) and click on "Search Engines Meeting."
The Gorilla Story
Mark Chussil of Advanced Competitive Strategies, Inc. conducts War Games and War Colleges for corporations and other institutions. He is one of my favorite speakers. This was the fourth time I've heard him, and I never tire of his presentations. He was asked to speak because Harry Collier usually designs programs to contain at least one speaker from an allied but remote sphere related to the audience. In this case, we learned a bit about a competitive-intelligence technique.
I paraphrase here his opening story, which teaches us about out-of-the-box thinking, something that's important in these changing times.
In experimenting to see how intelligent a gorilla was, a graduate student shut one up in a room to see how long it would take him to learn to use the doorknob to get out. After a long period of disinterest on the part of the gorilla, the student entered the room to release him, whereupon the gorilla picked up the student and threw him against the wall, thus creating a large hole in the wall through which the gorilla then exited.
We learn three lessons from this story: Never assume that there is only one answer to a question, never assume that you know the best answer, and never assume that you are smarter than a gorilla.
After revealing what simulation is all about and what a corporation goes through in applying itself to War Games and the War College, Chussil ended his talk with an Arnold Palmer quote, "The more I practice, the luckier I get."
If you're at all involved in trying to make major competitive changes and decisions, you should consider Chussil's technique, and you too may get luckier.
Gordon Short of Excalibur spoke on advanced techniques in imaging, which are becoming more and more feasible and thus commercial. He spoke of Kanji recognition (recognizing strokes), face recognition (recognizing patterns), scene-change detection (image similarity/difference), and general-image searching ("gestalt," color-shape-texture).
Excalibur seems quite advanced in the commercialization of these techniques. Short showed a picture of a waterlily, which he used as a reference image, and asked his system to retrieve eight similar objects. The eighth likeness was a bunch of bananas, and one could actually detect the seemingly absurd relationship. However, one could limit the search to flowers and come up with a more relevant set.
Concept-based retrieval of images would be the next revolution in image retrieval, and Short predicted it was 3-5 years off. (Brenner's law says to double every prediction you hear or see.)
Smart, Dumb, Dumber
David Bellick is search schema manager of MSN Publishing & Tools of the Microsoft Corporation and has been analyzing 2,000 queries he chose randomly from the Web. He started out by saying that NET IR is still inadequate and is technology driven. We all know that, but what he had to say further was evident, although I hadn't realized it--that the end user on the Net today represents the intelligentsia: the people who are likely to have been to college, the big earners who can afford a computer at home as well as at work.
So now I realize we have three levels of users: 1) the smart professionals who know how to search, 2) the smart end users who are pretty dumb at searching, and 3) a mass of uneducated end users who may possibly be yet dumber at searching when they finally begin to use the computer.
Bellick did a KWIC index of the terms used in the 2,000 queries and surprisingly found 4,528 total terms used and only 2,807 of them unique. The top terms were sex (17.5 percent), computer/Internet (14.9 percent), entertainment (14.1 percent), recreation/leisure (12.7 percent), business/investing (5.6 percent), and medicinal/fitness (4.6 percent).
I wonder how intelligent the intelligentsia really are. Think about it.
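For readers who want to try a tally like Bellick's on their own query logs, here is a minimal sketch in Python. It is not Bellick's method, which he did not detail beyond calling it a KWIC index. The function name term_stats, the naive whitespace tokenization, and the sample queries are all assumptions made for illustration.

```python
from collections import Counter


def term_stats(queries):
    """Return (total_terms, unique_terms, counts) for a list of query strings.

    Hypothetical helper: tokenization is a plain lowercase whitespace split,
    which is an assumption and not how the MSN study necessarily worked.
    """
    counts = Counter()
    for query in queries:
        counts.update(query.lower().split())
    total = sum(counts.values())   # every term occurrence
    unique = len(counts)           # distinct terms only
    return total, unique, counts


# Made-up sample queries, for illustration only.
sample = ["cheap flights boston", "boston weather", "Cheap hotels"]
total, unique, counts = term_stats(sample)
print(total, unique)               # 7 total terms, 5 unique
print(counts.most_common(3))       # most frequent terms first
```

Applied to the figures reported above, 4,528 total terms against 2,807 unique ones works out to roughly 1.6 occurrences per unique term, so most terms in the sample appeared only once or twice.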
Ev Brenner managed the Central Abstracting & Indexing Service of the American Petroleum Institute for 30 years and is now a well-known information industry observer. He can be reached by e-mail at firstname.lastname@example.org.
Title Annotation: Association of Information and Dissemination Centers
Date: Nov 1, 1997