Xerox Scientists Apply Insights from Ethnography to Develop New Way to Categorize Documents

ROCHESTER, N.Y. -- Employing the same ethnographic methods used to observe the social order on a Polynesian atoll or document the culture of natives in southern Siberia, Xerox Corporation (NYSE: XRX) scientists have injected more human know-how into text mining, the practice of using computer analysis of documents to extract new information. The result is better categorization, with higher-quality, customized results.

In a paper titled "Work Practice in Research: A Case Study" being presented here today at the International Council on Systems Engineering symposium, Nathaniel G. Martin, an ethnographer and computer scientist in the Xerox Innovation Group in Webster, N.Y., described the new technology.

Categorization is a powerful form of text mining. It associates a document with subject categories that a computer learns from a "training set" of documents that a subject matter expert has classified by hand. The new software program improves the speed and accuracy of categorizing systems because it helps the subject matter expert interactively create the training set, choosing and refining the categories and the conditions under which they are applied.
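To make the idea of an interactively built training set concrete, here is a minimal sketch in Python. It is not the Xerox software, whose internals the announcement does not describe. The class name `ExploratoryCategorizer`, the word-overlap scoring, and the sample service-log entries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class ExploratoryCategorizer:
    """Toy categorizer (hypothetical) whose categories can be added at any time."""

    def __init__(self):
        # category name -> word counts accumulated from hand-labeled examples
        self.word_counts = defaultdict(Counter)

    @staticmethod
    def tokenize(text):
        # Lowercase and split on whitespace; a real system would do far more.
        return text.lower().split()

    def label(self, text, category):
        """Record an expert's hand label; an unseen category is created on the fly."""
        self.word_counts[category].update(self.tokenize(text))

    def suggest(self, text):
        """Return the category sharing the most words with the text, or None."""
        tokens = self.tokenize(text)
        best, best_score = None, 0
        for category, counts in self.word_counts.items():
            score = sum(1 for token in tokens if token in counts)
            if score > best_score:
                best, best_score = category, score
        return best


# Invented sample entries, standing in for short service-log records.
categorizer = ExploratoryCategorizer()
categorizer.label("paper jam in feeder tray", "paper handling")
categorizer.label("toner streaks on printed page", "image quality")

# No existing category shares a word with this entry, so nothing is suggested.
first_guess = categorizer.suggest("fuser overheating error")

# The expert responds by creating a new category while reading the logs.
categorizer.label("fuser overheating error", "fuser")
second_guess = categorizer.suggest("fuser error after warm-up")
```

In this sketch the training set and the category list grow together: when no suggestion fits, the expert's next label both defines a new category and supplies its first training example.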
The technique could improve results from traditional categorizing systems and is particularly useful for classifying short documents, according to Martin.

The scientists' discovery grew out of a request from a Xerox engineering group for help analyzing service logs, the record of calls from service technicians in the field to company engineers about problems with production printer and copier operation. The engineering group was manually classifying these logs so it could identify the most important problems and devote its efforts to solving them. The group asked Xerox Innovation Group (XIG) scientists to develop an algorithm that would automate the way service log problems were grouped into categories.

A traditional categorizing system would have learned from the work the group had already done, following the classification pattern already defined by the user. The categories would then remain static.

However, when Martin and his colleagues used ethnographic techniques such as conducting open-ended interviews and videotaping an engineer as he continued to categorize the service logs, they realized that what he was doing did not fit the traditional description of categorizing.

"Instead of performing a routine task of applying a predetermined label to each log in a highly constrained fashion, we saw that he was constructing additional categories as he read the logs," Martin said.

Working with the subject matter expert, the Xerox scientists developed a system that allows the expert to develop categories dynamically in a way a machine-learning system could not.

"The new system allows exploratory categorization that falls somewhere between categorization and clustering," Martin said.
"It provides categories into which text data can be organized, but it allows the subject matter expert to change the categories as new ones are discovered."

This new technique reduced the time required to categorize the service logs from a week to a few minutes, and the group is more productive. Now the new software program is being used in other Xerox organizations to analyze unstructured responses such as comments from customers. Xerox has applied for a patent on the technology.

In addition, at the International Council on Systems Engineering (INCOSE) symposium, Anthony M. Federico, vice president, platform development for the Xerox Production Systems Group, will give a keynote speech tomorrow on "System Engineering in Advanced Color Imaging." Symposium attendees can also tour Xerox's Webster research and manufacturing complex to learn about the principles of color digital printing and how paper choice impacts printing.

Xerox Corporation is one of the world's top technology innovators. Document and content management is an active area of research and has yielded innovative technologies to streamline document-intensive processes, bridge the paper and digital worlds, and make it easier to manage information in multiple languages.

NOTE TO EDITORS: For more information about Xerox and to receive its RSS news feed, visit www.xerox.com/news and www.xerox.com/innovation. XEROX(R) is a trademark of XEROX CORPORATION.