Basis Technology Introduces the Rosette Arabic Language Analyzer; First Commercially Available Analyzer for Arabic Developed Entirely in the United States.


Business Editors/High-Tech Writers

CAMBRIDGE, Mass.--(BUSINESS WIRE)--March 4, 2003

Addresses the Needs of U.S. Government Agencies for Search and Retrieval of Arabic Documents

Basis Technology, the leading provider of globalization software and services, today introduced the Rosette(R) Arabic Language Analyzer (ARLA), the first commercially available analyzer for Arabic text developed entirely in the United States. ARLA is the latest addition to Basis Technology's suite of Rosette Language Analyzers, which also includes products for Chinese, Japanese, and Korean. Developed in response to the needs of the U.S. Intelligence Community, the new product is designed to plug into mainstream search engines and data mining products to facilitate search and retrieval of information written in Arabic.

"One of the most pressing issues facing the Intelligence Community today is the need to quickly and accurately identify, analyze, and extract information in foreign languages and scripts," said Glenn Nordin, Assistant Director Intelligence Policy (Language), Department of Defense. "Because U.S. Government computer systems are largely designed to work with the Latin alphabet and US character sets, processing information in Arabic is a difficult undertaking. In the absence of universal transliteration standards, human transcription of foreign text into the Latin alphabet can result in significant corruption of the data and mismatches in searches. Finding solutions that enable intelligence analysts to extract and disseminate information in the original language and script could be of critical importance."

ARLA is a multi-platform, high-performance linguistic engine for analyzing Arabic documents. It performs orthographic and lexical normalization of text, including removal of grammatical affixes (such as conjunctions, prepositions, and pronouns) that complicate search and retrieval. ARLA utilizes advanced computational linguistics and specialized lexica to convert plural nouns, including broken plurals, to their singular forms.

The new product is a component of the Rosette Globalization Platform, a comprehensive software suite which enables multilingual information processing. Other components include the Rosette Core Library for Unicode (RCLU), a portable framework for implementing Unicode, and the Rosette Language Identifier (RLI), which automatically identifies the language and encoding of incoming documents. RLI now supports over forty written languages, including Arabic, Farsi, transliterated Arabic, and transliterated Farsi.

"Linguistics technology is beginning to play an increasingly important role when it comes to ensuring national security," said Everette Jordan, Director of the National Virtual Translation Center, an organization jointly sponsored by the FBI and CIA under the USA Patriot Act. "Because of the enormous volume of multi-lingual intelligence information that must be analyzed with limited human resources, technologies that can assist in sifting, sorting, and finding critical information are essential in ensuring that threats are detected as quickly as possible. Whereas the U.S. Government cannot endorse any one product over another, we are pleased to see that companies are responding to the government's call for solutions to these difficult issues."

"Search and retrieval of information in Arabic documents is highly complex," explained Glenn Adams, Technical Director Emeritus of the Unicode Consortium, and co-author of the Unicode Standard. "For example, Arabic incorporates affixes and infixes indicating grammatical elements such as conjugation, prepositions, and pronouns. Searching through documents for an exact match to a particular search term will miss many relevant hits. Searching for "book" ("kitaab") will not return the Arabic term for "the books" ("alkutub"). ARLA solves this problem and many others like it, resulting in a more accurate and comprehensive search that doesn't miss relevant terms because of slight grammatical variations."
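
The "kitaab" versus "alkutub" mismatch described above can be illustrated with a small Python sketch. It is not ARLA's algorithm or API, which the release does not describe. The prefix list, the one-entry plural lexicon, and the function names are all hypothetical, and the code works on Latin transliterations rather than Arabic script to keep the example readable.

```python
# Toy illustration only: hand-made data, not ARLA's lexica or interface.
# Hypothetical transliterated prefixes (article, conjunction, prepositions).
PREFIXES = ("wal", "bil", "al", "wa", "bi", "li")
# Hypothetical lexicon mapping a broken plural to its singular form.
BROKEN_PLURALS = {"kutub": "kitaab"}


def normalize(token: str) -> str:
    """Strip at most one known prefix, then map a broken plural to its singular."""
    token = token.lower()
    for prefix in PREFIXES:
        # Keep a stem of at least three letters to limit over-stripping.
        if token.startswith(prefix) and len(token) - len(prefix) >= 3:
            token = token[len(prefix):]
            break
    return BROKEN_PLURALS.get(token, token)


def exact_search(query: str, tokens: list[str]) -> list[str]:
    """Return only the tokens identical to the query."""
    return [t for t in tokens if t == query]


def normalized_search(query: str, tokens: list[str]) -> list[str]:
    """Return the tokens whose normalized form equals the normalized query."""
    target = normalize(query)
    return [t for t in tokens if normalize(t) == target]


document = ["alkutub", "kitaab", "walkutub", "qalam"]
print(exact_search("kitaab", document))       # only the identical token
print(normalized_search("kitaab", document))  # also finds the prefixed plurals
```

A naive prefix stripper like this one will sometimes remove letters that belong to the stem, which is one reason a production analyzer relies on lexica rather than string rules alone.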

Together with the other language components of the Rosette Globalization Platform, ARLA enables Federal law enforcement and intelligence agencies to expand their ability to detect and monitor intelligence originating in a foreign language, even when searching documents with terms which have been transcribed into the English alphabet.

"A key issue when searching Arabic text is the fact that names may be transcribed into English with many varied spellings, even though there will be far fewer ways of writing the same name in Arabic," said Carl Hoffman, CEO of Basis Technology. "For example, there are over thirty different commonly-used English spellings for the name of Libya's ruler, all of which correspond to the unique spelling of his name in Arabic. Our software can be used to build applications that allow users to search and retrieve information in Arabic documents using "phonetic approximation"--spelling the name the way it sounds--without having knowledge of the many varied transliteration schemes. This significantly increases the likelihood of non-Arabic speakers locating the critical information for which they are searching."
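
The release does not say how Basis Technology implements phonetic approximation. The Python sketch below shows one simple way the general idea could work, under assumed rules: reduce each Latin spelling to a rough consonant skeleton so that variant transliterations of a name share one key. The substitution table and the function name phonetic_key are hypothetical and cover only a few letter patterns.

```python
from collections import defaultdict

# Hypothetical substitution rules; a real system would need far more coverage.
DIGRAPHS = (("kh", "k"), ("gh", "g"), ("dh", "d"))
SINGLE_LETTERS = str.maketrans({"q": "k", "g": "k", "y": "i"})


def phonetic_key(name: str) -> str:
    """Reduce a Latin-script spelling to a rough consonant skeleton."""
    s = "".join(ch for ch in name.lower() if ch.isalpha())
    for src, dst in DIGRAPHS:
        s = s.replace(src, dst)          # fold digraphs before single letters
    s = s.translate(SINGLE_LETTERS)      # merge letters often used interchangeably
    s = "".join(ch for ch in s if ch not in "aeiou")  # drop vowels
    collapsed = []
    for ch in s:
        if not collapsed or collapsed[-1] != ch:      # collapse doubled letters
            collapsed.append(ch)
    return "".join(collapsed)


# Index a few attested English spellings of one surname by their key.
spellings = ["Qaddafi", "Gaddafi", "Kadhafi", "Khadafy", "Gadhafi", "Hoffman"]
index = defaultdict(list)
for spelling in spellings:
    index[phonetic_key(spelling)].append(spelling)

# A user who types the name "the way it sounds" retrieves every indexed variant.
print(index[phonetic_key("Kadafi")])
```

Keys this coarse trade precision for recall: unrelated names can collide on the same skeleton, so a practical system would rank or filter the candidates rather than treat every key match as a hit.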

ARLA is available for immediate shipment with plug-ins either available or under development for Convera RetrievalWare(R), FAST Data Search(TM), Microsoft(R) SQL Server(TM), and Oracle(R) Text/interMedia.

About Basis Technology

Basis Technology is the leading provider of products and services for software globalization and multilingual information processing. The company provides high-performance, highly reliable software components through its Rosette(R) Globalization Platform, a suite of interoperable products designed for applications that analyze and process all the world's languages. The company also provides rapid deployment engineering services covering all aspects of globalization, including source code audits, project management, software re-engineering, and global quality assurance.

Top-tier software vendors, content providers, multinational enterprises, and government agencies rely on Basis Technology's solutions for Unicode compliance, language identification, multilingual search, normalization, and transliteration. Customers include industry leaders Amazon.com, America Online, Convera, Fast Search & Transfer (FAST), Google, Hewlett-Packard, IBM, L.L. Bean, Overture Services, PeopleSoft, Siebel Systems, Software AG, and Verity.

Company headquarters are located in Cambridge, Massachusetts, with branch offices in San Francisco, California; Herndon, Virginia; and Tokyo, Japan. For more information, visit www.basistech.com or call 800-697-2062.
COPYRIGHT 2003 Business Wire
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.