Creating order out of chaos with taxonomies: the increasing volume of electronic records and the frequency with which those records change require the development and implementation of taxonomies--a classification system of topics or subject categories--to maximize efficient retrieval of records for legal, business, and regulatory purposes.

According to an online article from LAW.COM, more than 90 percent of new business records are created electronically, and 40 percent of them are never converted to paper. This deluge of mostly unstructured digital data, documents, e-mail, and instant messages raises serious issues related to retention, storage, and accessibility. Meanwhile, regulators and stakeholders compel organizations to enhance transparency, demonstrate accountability, and implement controls. The higher the level of public scrutiny by regulators and stakeholders, the greater the need organizations have for applying management controls.

Some types of comprehensive record searches (e.g., divestitures, due diligence
investigations, and electronic discovery in response to courts and regulators) are difficult to conduct without taxonomies. To maximize efficient and effective retrieval of records for legal, business, and regulatory purposes, organizations must develop and implement taxonomies and metadata to complement text searching, provide multiple access points to information, and incorporate retention requirements.

Methods for Organizing Information

Historically, classification systems were expressly developed to classify physical objects that existed in physical locations (see "A Little History" sidebar), but technological advancements in the twentieth century brought an explosion of information--both digital and physical--that forever changed notions of classifying an "information collection."

The information contained in physical and digital records changes frequently--daily, weekly, monthly--and sometimes without warning. Frequent updates, such as modification or deletion of items, are critical when time-sensitive information is involved, but updates can be very disconcerting to users who find themselves hunting for moving targets. A 2004 Delphi Group research report indicated that constantly changing information was the biggest impediment
to relocating and retrieving information. Seventy-three percent of survey respondents reported they spend 10 to 20 percent of their work week searching for information.

Compounding the problem of frequently changing information is the increasing volume of hardcopy and digital records that organizations maintain for legal and business purposes. A 2000 Reuters study indicated that "Every day, approximately 20m [20 million] words of technical information are recorded." (As a testament to the volatility of information, the Reuters article is no longer available online.) While the largest library collection in the world, the Library of Congress, consists of nearly 128 million items, a large organization can easily maintain tens of millions of physical and digital records.

One tool that is instrumental for managing increasing records volume is a taxonomy: a structured, often hierarchical, classification system of topics or subject categories. Taxonomies speed up the process of retrieving records because end users can select from subject categories or topics, enabling them to narrow the search field and find relevant information rather than relying solely on the blank text search field and their ability to construct an effective query. Taxonomies also provide "serendipitous
guidance," according to a 2003 The Information Management Journal article by Denise Bruno and Heather Richmond, because additional information can be inferred from seeing where a topic resides in the taxonomy's context.

End users who are not knowledgeable about a particular topic might begin a search process by navigating through the taxonomy. When an area of interest is discovered, a text search against only the records in this particular area of the taxonomy could be executed. Conversely, the user might start with a text search producing hundreds or thousands of records. Through the integration of a taxonomy, the results could be displayed as a customized set of folders that organize the content by related topics.

According to the Delphi Group report, enterprise content management (ECM) products enable taxonomy integration, allowing users to search across repositories, present records from multiple repositories in response to user queries, and personalize these responses based on the requestor's relationship to the enterprise. In most organizations, there is still no way to search for electronic records in multiple repositories except to search each repository separately.
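
The two search patterns described above (browse to a branch and then run a text search inside it, or run a text search first and group the hits into taxonomy "folders") can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the Category class, the function names, and the sample records are hypothetical and do not come from any ECM product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a hierarchical taxonomy (hypothetical structure)."""
    name: str
    children: list[Category] = field(default_factory=list)
    records: list[str] = field(default_factory=list)

    def add(self, child: Category) -> Category:
        """Attach a subcategory and return it so calls can be chained."""
        self.children.append(child)
        return child

    def find(self, name: str) -> Category | None:
        """Depth-first lookup of a category by name."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def all_records(self) -> list[str]:
        """Records filed in this category or anywhere beneath it."""
        found = list(self.records)
        for child in self.children:
            found.extend(child.all_records())
        return found


def scoped_search(root: Category, category_name: str, term: str) -> list[str]:
    """Text search restricted to one branch of the taxonomy."""
    branch = root.find(category_name)
    if branch is None:
        return []
    return [r for r in branch.all_records() if term.lower() in r.lower()]


def group_hits(node: Category, term: str, path: str = "") -> dict[str, list[str]]:
    """Text search over everything, with hits grouped by category path."""
    here = f"{path}/{node.name}" if path else node.name
    groups: dict[str, list[str]] = {}
    hits = [r for r in node.records if term.lower() in r.lower()]
    if hits:
        groups[here] = hits
    for child in node.children:
        groups.update(group_hits(child, term, here))
    return groups


# Made-up sample taxonomy and records.
root = Category("Records")
legal = root.add(Category("Legal"))
legal.add(Category("Contracts", records=["Supplier contract 2004",
                                         "Lease contract renewal"]))
legal.add(Category("Litigation", records=["Discovery request: supplier dispute"]))
root.add(Category("Finance", records=["Supplier invoice March"]))

print(scoped_search(root, "Legal", "supplier"))
print(group_hits(root, "supplier"))
```

In this sketch the scoped search never returns the Finance record, because the user has already narrowed the field to the Legal branch, while the grouped search returns all three matching records under their category paths.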
Despite compelling arguments for using taxonomies in records and information management, according to Gartner, more than 70 percent of organizations that invest in such initiatives do not achieve their target return on investment due to under-investment in taxonomy development. It is worthwhile then to compare trade-offs associated with alternatives such as buying pre-built taxonomies, building taxonomies, and automatically generating taxonomies.

Buying Pre-existing Taxonomies

Pre-built taxonomies covering common business functions, including legal, information technology, human resources, and sales and marketing, are available from search-technology and ECM companies. Vendors also offer taxonomy templates with specific industry terminology for corporate, government, and education sectors. For the corporate sector, for example, there are taxonomies for aerospace; architecture and design; automotive; finance and accounting businesses; commodities; chemistry; earth science; engineering; international business; law; life and medical sciences; pharmaceuticals; physics and astronomy; textiles; and utilities.

Industry associations are another source of pre-built taxonomies. In the oil and gas industry, for example, the Petrotechnical Open Standards Consortium (POSC) and the PPDM Association, with work done by Shell Expro and Flare Consultants, have produced an exploration and production (E&P) taxonomy catalog.
The catalog, which includes a standard set of metadata attributes for E&P information, provides a logical, standardized way to index and catalog information so that it can be easily identified and retrieved in the right context.

There are also worthwhile pre-built taxonomies in the public domain. The Taxonomy Warehouse (www.taxonomywarehouse.com) provides a free directory of 501 taxonomies, thesauri, classification schemes, and other authority files from around the world, plus information about taxonomy references, resources, and events. The taxonomies are classified by 73 subject domains, such as patents, real estate, and taxation, and each has ordering information.

The use of pre-built taxonomies has pros and cons. Clearly, pre-built taxonomies can speed up the taxonomy creation process, enabling organizations to deliver immediate results while still allowing taxonomies to be fine-tuned for organization-specific requirements. Pre-built taxonomies have been checked for consistency so that an accounts payable invoice is not called a "bill" in one subject category and a "posting" in another. Furthermore, they incorporate industry best practices and can introduce a more efficient and effective method for organizing records.

A significant disadvantage of pre-built taxonomies is that because they are not specific to an organization and its objectives, they have limited applicability. Each organization has its own culture and its own way of categorizing. Using pre-built taxonomies will introduce unfamiliar terminology and make user training more time-consuming.
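
The consistency check mentioned above, in which an accounts payable invoice should not appear as a "bill" in one category and a "posting" in another, can be sketched as a preferred-term lookup. This is only one possible way to run such a check; the synonym table, the function names, and the sample labels below are invented for illustration.

```python
from __future__ import annotations

# Hypothetical synonym table: variant label -> preferred term.
PREFERRED_TERMS = {
    "bill": "accounts payable invoice",
    "posting": "accounts payable invoice",
    "ap invoice": "accounts payable invoice",
}


def clean(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def normalize_label(label: str) -> str:
    """Map a label to its preferred term, or return it cleaned if unknown."""
    key = clean(label)
    return PREFERRED_TERMS.get(key, key)


def find_inconsistencies(labels_by_category: dict[str, list[str]]) -> dict[str, set[str]]:
    """Return preferred terms that appear under more than one variant label."""
    variants: dict[str, set[str]] = {}
    for labels in labels_by_category.values():
        for label in labels:
            variants.setdefault(normalize_label(label), set()).add(clean(label))
    return {term: seen for term, seen in variants.items() if len(seen) > 1}


# Made-up labels as they might appear in two subject categories.
labels_by_category = {
    "Finance": ["Bill", "Purchase order"],
    "Procurement": ["Posting", "purchase order"],
}

print(find_inconsistencies(labels_by_category))
```

Here "Bill" and "Posting" are both flagged as variants of the same preferred term, while "Purchase order" is not flagged because both categories use the same label once case is ignored.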

Building Taxonomies

Unlike pre-built taxonomies, a custom-developed taxonomy can be very specific to an organization, its objectives, and culture. The developer has control over the selection of terminology to make sure it reflects the understanding and needs of an intended audience, as well as the range of content to which it will be applied. In some cases, building a taxonomy is the only solution because there are no other existing taxonomies that cover a particular area of interest.

The primary disadvantage of building a taxonomy is the time it takes. It is much faster to use a pre-existing taxonomy--or even to customize a pre-existing one that is compatible in scope and application. Trying to customize an incompatible taxonomy could be just as time-consuming as building a new one and even more challenging. Another disadvantage to building a taxonomy is that it is usually more expensive than buying a pre-existing one. Despite the disadvantages, however, most companies still build their own taxonomies while leveraging the use of pre-built taxonomies when possible.

In constructing and implementing a taxonomy, the goal is to develop a conceptual organizational structure that can be used to classify and search for information. The general process is roughly the same whether a manual or an automatic approach is used. Four interrelated phases must be considered:

Phase 1: Planning and analysis
Phase 2: Design, development, and testing
Phase 3: Implementation
Phase 4: Maintenance

Phase 1: Planning and Analysis

Planning and analysis is the most critical phase of taxonomy development. It requires gaining a thorough understanding of the total information environment in which the taxonomy will be implemented and developing a realistic strategy for integration.
Activities in this phase are designed to do the following:

* assess resources involved in the taxonomy project and determine how the taxonomy will be used. If necessary, identify outside consulting resources to assist
* identify categories to be used and decide on the taxonomy's structure
* select a development strategy and identify appropriate technology for developing the taxonomy as well as categorizing content
* budget for development and ongoing maintenance

Information from the planning phase is used to firm up the project plan and determine key milestones that demonstrate success.

Phase 2: Design, Development, and Testing

Changes to a taxonomy can be painful after implementation, so taxonomies should be designed for both short-term and long-term needs to minimize change when the organization's structure changes. "People do not like information architecture to change," content management consultant and author Gerry McGovern said. "Spend the time to get it as right as possible [the] first time."

Design and development, therefore, should be an iterative process based on feedback from stakeholders at every major stage of the process. Develop a high-level structure and test with stakeholders. Modify the structure based on their feedback and then test again until general consensus indicates that taxonomy objectives are being met.

Phase 3: Implementation

Good planning and design provide a solid foundation for implementing the taxonomy.
However, smooth implementation can be achieved only if people, processes, and technologies have been identified and prepared for this phase. The change management process begins early in the project through open communication and expectation setting. It is formalized with stakeholder training on the processes to be used around categorizing new information, searching and retrieving information, and using the technologies employed in the effort.

Phase 4: Maintenance

Even when taxonomy developers consider short-term and long-term needs in the planning process, change is inevitable. A taxonomy is a strategic part of an organization's information architecture that will be maintained for many years. It will evolve as business needs change and as sophistication and understanding grow around records management. Documentation of decisions made throughout the development and implementation process will be instrumental for efficiently assessing requests for change and making changes to the taxonomy as necessary. The change management infrastructure (people, process, and technology) that was implemented in Phase 3 should be maintained for the life of the taxonomy.

Developing Manual and Automated Taxonomies

There are two basic strategies to building taxonomies: top-down or bottom-up.
In "Best Practices in Taxonomy Development and Management," authors Laura Ramos and Daniel Rasmus said that a top-down strategy--usually developed manually--offers control over the broad general concepts found at the highest taxonomy levels and is useful for aligning the taxonomy with business strategy and goals. A bottom-up strategy uses automated technologies to extract basic concepts from the content itself and make generalizations about them. Both strategies have advantages and disadvantages, and both are important for taxonomy development and implementation.

Manual taxonomy development offers significant control over the meaning and arrangement of concepts and can be deliberately shaped to reflect common knowledge and practice in an organization. However, manual categorization of documents to the concepts in the taxonomy is low in accuracy simply because of the human judgment involved. Where automated classification methods are used with manually developed taxonomies, it is a significant task to "train" the tools to categorize documents to the taxonomy, and it may be impossible to train an automated tool if there is not sufficient distinction in the meaning of categories. The cost of developing and maintaining a manual taxonomy is high because it is a resource-intensive process.

Automatic classification tools can automate the process of categorizing content for an already developed taxonomy or generate the taxonomy structure itself. Tools that automatically generate the taxonomy structure apply various algorithms (statistical analysis, Bayesian probability, and clustering) to a corpus of documents in a bottom-up strategy.
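
To make the idea of "training" a tool to categorize documents against an already developed taxonomy more concrete, the following Python sketch uses a very small naive Bayes classifier with add-one smoothing. It is a stand-in chosen for illustration, not the method of any particular vendor's product; the class name, the category names, and the training sentences are all made up, and a real tool would need far more training text per category.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesCategorizer:
    """Tiny multinomial naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> training documents seen
        self.vocabulary = set()

    def train(self, category, text):
        """Record one example document for a taxonomy category."""
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the category with the highest log-probability score."""
        total_docs = sum(self.doc_counts.values())
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up training examples for two hypothetical taxonomy categories.
categorizer = NaiveBayesCategorizer()
categorizer.train("Legal/Contracts", "supplier contract renewal terms and signatures")
categorizer.train("Legal/Contracts", "lease contract amendment signed by both parties")
categorizer.train("Finance/Accounts Payable", "supplier invoice payment due amount")
categorizer.train("Finance/Accounts Payable", "invoice payment remittance for vendor")

print(categorizer.classify("signed contract amendment"))
print(categorizer.classify("vendor invoice payment"))
```

The sketch also shows why categories need distinct meanings: if two categories were trained on nearly the same vocabulary, their scores would be close and the classifier would have little basis for choosing between them.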
An automatically generated taxonomy offers little control over the meaning and arrangement of high-level concepts and, consequently, requires significant refinement in order to make sense to users and be more reflective of the way they view information. These tools can categorize a larger number of documents more accurately and faster than humans. However, addition of new concepts requires that the tool be trained to recognize each new concept before content can be automatically classified.

The cost of automatically deriving a taxonomy structure is also high because some time-intensive tasks still require human intervention. People must still examine each category to see if it is fit for purpose and if it is named appropriately. Human judgment must determine if some categories should be deleted or new ones added. They must also determine if the final taxonomy "matches" human understanding and purpose.

Selection of a taxonomy strategy and associated tools should be based on the goals of the taxonomy development project. The best solutions will use a combination of the strategies--a top-down approach to develop the higher-level categories in the taxonomy aligned with business strategy, and a bottom-up approach to refine lower-level concepts and enable automatic categorization of content.

Identifying Best Practices

As business awareness and use of taxonomies have grown over the years, the following best practices have emerged for successful taxonomy development:

* Make sure the taxonomy is clearly related to business strategy. This will provide one standard against which to measure success and help in controlling the scope of work to be done.
* Incorporate existing taxonomy and metadata resources whenever possible. Some resources may be available internally. Others may be found in the public domain (e.g., country codes published by the United Nations Code for Trade and Transport Locations and API well numbers published by the American Petroleum Institute).
* Even in large and complex taxonomies, make sure categories are well-defined and distinct. If the meaning of categories is too similar, it will be hard for both people and machines (automatic tagging and categorization) to make distinctions.
* Iterative development is key. Develop a high-level taxonomy, test it with users, expand it, and test again. This technique will increase the probability that the right concepts are identified and encourage buy-in from stakeholders.
* Keep the taxonomy as simple as possible. Decompose to a useful level but avoid so much detail that information becomes fragmented.
* Provide for adequate resources to maintain the taxonomy. Taxonomies are not static and will change over time.
The appropriate change management infrastructure (people, process, and technology) must be put into place to support necessary change.

Planning for the Long Term

Faced with an ever-growing challenge to provide efficient search and retrieval across growing record repositories, organizations are looking for ways to create order out of chaos, and taxonomies are a primary tool. Taxonomies enhance searching for records because end users can select from standardized categories and hierarchical structures of information, enabling them to narrow the search field and find relevant information faster. Good planning and design provide a solid foundation for establishing the taxonomy, and people, processes, and technologies must be identified and prepared for implementation. A taxonomy, like a records retention schedule, is a strategic part of an organization's information architecture, and maintenance will require a long-term investment of human and financial resources.

A Little History ...

Taxonomy originated in the life sciences and can be traced back to Aristotle's theory of categories. "He espoused the idea that things are placed into the same category on the basis of what they have in common," author Arlene Taylor wrote in her book The Organization of Information, and they are arranged hierarchically with things either inside or outside the container.

Among the earliest applications of classification to knowledge were 10 broad categories used by Callimachus, a Greek poet and scholar, for classifying works in the Library of Alexandria.
These 10 categories remained fairly stable until the late Middle Ages but expanded significantly in the nineteenth century with the rapid growth of libraries and an increased need to provide users with easier access to books. Two large classification systems were developed to address this need and were put into widespread use: the Dewey Decimal Classification and the Library of Congress Classification.

By the end of the nineteenth century, a movement was underway in Europe to go beyond providing access to books. The Universal Decimal Classification system was not developed as a library classification, Taylor said, but rather as a means to organize and analyze documents.

At the Core

This article
* explains how instrumental taxonomies are to managing the e-records deluge
* describes advantages and disadvantages of buying versus building taxonomies
* details the four phases of building taxonomies
* provides best practices for developing taxonomies

References

Bruno, Denise and Heather Richmond. "The Truth about Taxonomies," The Information Management Journal 37, No. 2 (March/April 2003).

Delphi Group. "Information Intelligence: Content Classification and the Enterprise Taxonomy Practice." June 2004. Available at www.delphigroup.com/research/whitepapers/20040601-taxonomy-WP.pdf (accessed 28 March 2005).

Ingram, Brian. "Locate Smoking Guns Electronically." LAW.COM.
29 September 2003. Available at www.law.com/special/supplement/e_discovery/smoking_gun.shtml (accessed 28 March 2005).

Knox, Rita E. and Debra Logan. "What Taxonomies Do for the Enterprise." Gartner Research. 10 September 2003. Available at www.gartner.com/resources/117200/117204/117204.pdf (accessed 28 March 2005).

McGovern, Gerry. "A Step-by-Step Approach to Web Classification Design." October 2002. Available at www.gerrymcgovern.com/la/wcd.pdf (accessed 28 March 2005).

Petrotechnical Open Standards Consortium. "Work Program Summary for 2003." 3 February 2004. Available at www.posc.org/workprgm/summary_2003.shtml (accessed 28 March 2005).

Ramos, Laura. "Decision Criteria for Undertaking a Taxonomy Development Project." Giga Information Group. 8 January 2002.

Ramos, Laura and Daniel Rasmus. "Best Practices in Taxonomy Development and Management." Giga Information Group. 8 January 2003.

Reuters Studies. "The Reuters Guide to Good Information Strategy." Dow Jones Reuters Business Interactive Limited, 2000.

Taylor, Arlene. The Organization of Information. Englewood, Colorado: Libraries Unlimited Inc, 1999.

Susan L. Cisco, Ph.D., CRM, FM, is a Project Manager with Iron Mountain Enterprise Solutions and Services.
She holds an M.L.S. and Ph.D. in Library and Information Science from The University of Texas at Austin and is a published RIM author and educator. She may be contacted at scisco@ironmountain.com.

Wanda K. Jackson, Ph.D., is an IT professional with special expertise in taxonomy development that has enabled her to develop enterprise-wide taxonomies for several large companies. She holds a Ph.D. in Library and Information Science from the University of Texas at Austin and is a certified Project Management Professional.