AltaVista, Compaq and IBM Researchers Create World's Largest, Most Accurate Picture of the Web; `Bow Tie' Theory Shows the Web is Not as Connected as Previously Thought.

Business Editors/Technology Writers

NOTE TO MEDIA: Photo available on BW PhotoWire/AP PhotoExpress, NewsCom, PressLink and on Business Wire's Web site at www.businesswire.com

SAN JOSE, PALO ALTO and SAN MATEO, Calif.--(BUSINESS WIRE)--May 11, 2000--Scientists from IBM Research, Compaq Corporate Research Laboratories and AltaVista Company have completed the first comprehensive "map" of the World Wide Web, and uncovered divisive boundaries between regions of the Internet that can make navigation difficult or, in some cases, impossible.

Previous studies, based on small samplings of the Web, suggested that there was a high degree of connectivity between sites, as evidenced by recent reports on the "small world Web" and 19 degrees of separation. Contrary to those preliminary findings, the new study -- based on analysis of more than 500 million pages -- found that the World Wide Web is fundamentally divided into four large regions, each containing approximately the same number of pages.
The findings further indicate that there are massive constellations of Web sites that are inaccessible by links, the most common route of travel between sites for Web surfers. Developing the "Bow Tie" Theory helped explain the dynamic behavior of the Web and yielded insights into its complex organization. These discoveries will help computer scientists better understand the structure of the Internet, and lead to new technologies and design advances that will speed and simplify e-business.

"Bow Tie" Theory Explains the Four Regions of the Web

The image of the Web that emerged through the research was that of a bow tie. Four distinct regions make up approximately 90% of the Web (the bow tie), with approximately 10% of the Web completely disconnected from the entire bow tie.

The "strongly-connected core" (the knot of the bow tie) contains about one-third of all Web sites. Web surfers can easily travel between these sites via hyperlinks; this large "connected core" is at the heart of the Web.

One side of the bow contains "origination" pages, constituting almost one-quarter of the Web. "Origination" pages are pages that allow users to eventually reach the connected core, but cannot be reached from it.

The other side of the bow contains "termination" pages, constituting almost one-quarter of the Web. "Termination" pages can be accessed from the connected core, but do not link back to it.

The fourth and final region contains "disconnected" pages, constituting approximately one-fifth of the Web. Disconnected pages can be connected to origination and/or termination pages but are not accessible to or from the connected core.
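The four regions described above can be illustrated on a toy link graph. The following is only a minimal sketch, not the researchers' method: the release does not say how the regions were computed, so the sketch assumes a textbook approach (Kosaraju's algorithm to find the largest strongly connected component, then forward and backward reachability from it). The function names `reachable`, `largest_scc` and `bow_tie`, the label "isolated" for pages with no connection to the bow tie at all, and the sample links are all hypothetical.

```python
from collections import defaultdict


def reachable(starts, adj):
    """Return every node reachable from `starts` by following edges in `adj`."""
    seen = set(starts)
    stack = list(seen)
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_scc(nodes, fwd, rev):
    """Kosaraju's algorithm (iterative): largest strongly connected component."""
    order, seen = [], set()
    for root in nodes:  # pass 1: record nodes in order of DFS completion
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(fwd.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd.get(nxt, ()))))
                    break
            else:  # all successors explored: node is finished
                order.append(node)
                stack.pop()
    best, assigned = set(), set()
    for root in reversed(order):  # pass 2: sweep the reversed graph
        if root in assigned:
            continue
        assigned.add(root)
        comp, stack = {root}, [root]
        while stack:
            node = stack.pop()
            for prev in rev.get(node, ()):
                if prev not in assigned:
                    assigned.add(prev)
                    comp.add(prev)
                    stack.append(prev)
        if len(comp) > len(best):
            best = comp
    return best


def bow_tie(links):
    """Split the pages in `links` (source, target pairs) into bow-tie regions."""
    fwd, rev, nodes = defaultdict(list), defaultdict(list), set()
    for src, dst in links:
        fwd[src].append(dst)
        rev[dst].append(src)
        nodes.update((src, dst))
    core = largest_scc(sorted(nodes), fwd, rev)
    termination = reachable(core, fwd) - core  # reached from the core only
    origination = reachable(core, rev) - core  # can reach the core only
    rest = nodes - core - origination - termination
    # Ignore link direction to see which leftover pages still touch the bow tie.
    undirected = {n: fwd.get(n, []) + rev.get(n, []) for n in nodes}
    attached = reachable(core, undirected)
    return {
        "core": core,
        "origination": origination,
        "termination": termination,
        "disconnected": rest & attached,
        "isolated": rest - attached,
    }


# Hypothetical sample: a, b, c form the core; o links in; t is linked to;
# x hangs off o; p and q are cut off from everything else.
links = [("a", "b"), ("b", "c"), ("c", "a"), ("o", "a"),
         ("c", "t"), ("o", "x"), ("p", "q")]
regions = bow_tie(links)
```

In this sample, page x plays the role of a "disconnected" page: it is linked from an origination page but cannot reach, or be reached from, the core. Pages p and q stand in for the roughly 10% of the Web that the study found to be cut off from the bow tie entirely.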
Impact of the Study

With the Bow Tie Theory, and its new explanation of the structure of the Internet, scientific and business communities will now be able to:

-- Design more effective Web crawling strategies. Crawling, then indexing, is the fundamental method employed by search engines to organize the Internet. To achieve more complete coverage, AltaVista and other search engines will be able to develop more advanced crawl strategies to capture more of the Web.

-- Increase the effectiveness of e-commerce. Through the design of more effective browsing, advertising, measuring and modeling, e-commerce sites may decide to use different strategies for attracting surfers from various regions. For example, an "origination site" will have to increase its efforts to be easily found by Web crawlers. Once the site is linked to the connected core, its strategy may then shift to other traffic-generating measures.

-- Analyze the behavior of Web algorithms that make use of link information. Because many search engines use link information in ranking algorithms, they become targets for link "spamming" intended to create an artificial increase in a site's linkage.

-- Predict and capitalize upon the continued evolution of the Web. The researchers believe that the Bow Tie structure will be maintained as the Web grows. While some pages may evolve into the connected core, new pages will continue to be created in all three other regions.

-- Create mathematical models. Researchers can now develop new models to study the growth of the Web and possibly predict the emergence of new, yet unexplored phenomena on the Web.

This study -- the largest ever to be conducted on the topography of the Web -- is part of an ongoing, collaborative project by AltaVista, Compaq and IBM. The researchers expect to update the study on a regular basis from data collected using AltaVista's search engine and advanced connectivity server software with a Compaq AlphaServer system containing 16 gigabytes of RAM, enough to hold the entire Web map in memory. IBM Research analyzed the data and contributed to the development of the "Bow Tie" Theory.

The initial findings will be presented simultaneously at the 9th International World Wide Web Conference, Amsterdam (May 15-19) and at the ACM PODS 2000 Conference, Dallas (May 14-19). Visit the following link to retrieve the "Web Map/Bow Tie Theory" conference paper (posted after May 14): http://www9.org/papers/papers.html/ (members of the press community can request an advance copy of the conference paper by contacting the press contacts at the companies).
AltaVista Company

AltaVista Company is the premier knowledge resource on the Internet at www.altavista.com. Building on its strong search heritage and patented technology, AltaVista unlocks the vast Internet to provide the richest, most relevant information access across multiple dimensions, including: Web pages, shopping, up-to-the-minute news, live audio and video, and community resources. AltaVista offers informative services including the multi-dimensional AltaVista Search, pure Web page Raging Search (www.raging.com) from AltaVista, AltaVista Shopping.com, AltaVista Live! personalized portal, and AltaVista Free Internet Access. AltaVista is a majority-owned operating company of CMGI, Inc. (Nasdaq:CMGI), Andover, Mass. AltaVista is headquartered in Palo Alto, Calif.

Compaq Computer Corporation

Compaq Computer Corporation, a Fortune Global 100 company, is one of the largest suppliers of computing systems in the world. Compaq designs, develops, manufactures and markets hardware, software, solutions, and services, including industry-leading enterprise computing solutions, fault-tolerant business-critical solutions, and communications products, commercial desktop and portable products, and consumer PCs.
Compaq products and services are sold in more than 200 countries directly to businesses, through a network of authorized Compaq marketing partners, and directly to businesses and consumers through Compaq's e-commerce Web site at http://www.compaq.com. Compaq markets its products and services primarily to customers from the business, home, government, and education sectors. Customer support and information about Compaq and its products and services are available at http://www.compaq.com.

IBM Research

For more information on IBM Research, go to http://www.research.ibm.com.

Note: A Photo is available at URL: http://www.businesswire.com/cgi-bin/photo.cgi?pw.051100/bw1