Computing 'fusion trees' to explode barriers.
Developed by Michael L. Fredman of the University of California, San Diego, and Dan E. Willard of the State University of New York at Albany, this mathematical recipe, or algorithm, theoretically could speed the process of shuffling data by circumventing a key step in traditional sorting and searching methods. Whereas standard methods compare two numerically encoded items at a time, the new approach provides a way to compare many numbers at once.
Although not yet applicable to any practical situation, this theoretical result overturns long-held notions, described in nearly every computer-science textbook, concerning the speed of certain algorithms. "When Fredman and I originally made this discovery, people continually asked us where we were cheating," Willard says.
Computer scientists have generally assumed that sorting involves successively comparing two items or numbers at a time to see which one should precede the other. In the worst possible case for a given selection of items, such a process requires making a specific number of comparisons that can be calculated for any number of items. That number represents a kind of minimum speed limit for sorting.
In effect, Fredman and Willard found a way to raise that minimum speed limit. Instead of relying solely on comparisons of number pairs, they manipulate the data and organize the numbers into a new form, dubbed a "fusion tree," which enables them to compare one number against many others during a single computational step. A saving results because the number of instructions needed to manipulate the data is less than the number of comparisons normally required in sorting.
The algorithm developed by Fredman and Willard performs this feat by considering only a small portion of the string of ones and zeros, or bits, used to express each number. "We noticed that when you want to compare one number with several, you don't need all the bits of each of the numbers you are comparing," Willard says.
Their sorting method provides a way to select the relevant bits from a given set of numbers, group these subsets into a single expression and perform certain arithmetical operations to make multiple comparisons in a single step. A similar approach raises the minimum speed limit for searching.
At present, the new algorithms significantly outpace standard methods only when the number of items in the database is sufficiently large. "The complicated nature of these algorithms, as they are currently constituted, does not allow for improved efficiency unless they are being applied to vastly large amounts of data," Fredman says. "Nevertheless, these results suggest that future avenues of research may eventually benefit from the discovery."
That research could lead to practical, efficient methods for sorting items in databases much larger than any in existence today. Fredman and Willard are also exploring the possibility that their approach can speed other algorithms developed in the past to solve various problems, such as finding the shortest or least costly telephone or transportation network to link a number of cities.
"Their work . . . will have profound influence on theoretical computer science for years to come," comments B. Gopinath of Rutgers University in Piscataway, N.J.
A paper describing the new algorithms has been accepted for publication in the JOURNAL OF COMPUTER AND SYSTEM SCIENCES.