
A hands-on guide for multilingual Web sites.

Globalization is on everyone's mind. Big companies are printing brochures in Korean, Finnish and Japanese, and publishing their company newsletters in Spanish and French. But check out the web sites of the Fortune 500, and time and again you see only one language: English. This is surprising.

The Internet is the ultimate global information source, but American companies - even multinationals - are largely ignoring its ability to reach literally every human being with access to a computer. Perhaps English-only companies feel that making foreign-language web pages is just too hard to do. The fact is that translating a web site is not that difficult. Here is how it's done.


To start the process, a company must decide which languages are right for its web site. That choice will, of course, depend on the target readership. The most popular languages in corporate America are French, Spanish, German, Japanese and Chinese; a second tier would include Swedish, Portuguese, Italian, Russian and Korean. In addition, for some languages, a dialect must be specified. For example, Portuguese has two dialects, European and Brazilian, and Spanish has three main variants: European, Mexican and South American. For the most part, the differences among dialects are fairly subtle. The contrast between British and American English is a good illustration, with differences in spelling (colour vs. color) and vocabulary (lorry vs. truck) accounting for most of the variation. Nevertheless, these subtle differences are immediately apparent to any native reader, so the proper choice of dialect is an important consideration. Chinese requires a similar choice: it has two distinct writing systems, one used in China and Singapore, the other in Taiwan, Hong Kong and other Asian countries.

The next step is to identify which web pages should have language versions. The home page will be translated, naturally, as will most if not all pages that directly link to the home page. Beyond that, there should be two criteria for translation: marketing and maintenance. A page that emphasizes marketing, sales and corporate identity is a better candidate for translation than, say, a page about employment opportunities. As for maintenance, a page that changes regularly (for example a weekly events update) is a poorer candidate for translation than a page that is expected to remain stable for a long time.


Now that the languages are chosen and a set of relatively high-value, low-maintenance web pages is decided upon, translation can begin. The ideal translator is a professional writer with a knowledge of languages and expertise in the subject matter being covered. Translators should write exclusively in their native language, no matter how good their competence in other languages may be. Like all writers, translators need their editors. In fact, a translation is simply not complete until it has been thoroughly vetted by a professional editor. Assembling a professional translator-editor team for each language is crucial to the success of any multi-language web site. With the final translation in hand, the project can move on to the HTML phase.


HTML (short for HyperText Markup Language) is a set of codes that define the content and structure of web pages. Embedded in these codes is the text that users see on the screen. Within HTML, the basic unit of text is the paragraph. So there will be a sequence of codes, then a paragraph of text, then more codes, then another paragraph of text, then more codes, then another paragraph, and so on. The task here is simply to replace each English-language paragraph with its foreign-language equivalent, using the same cut-and-paste techniques that everyone is familiar with. If the operator is careful enough to avoid mixing up the paragraphs or making inadvertent changes to the code, it's easy. Of course, the new pages must be renamed, relinked, and placed within the hierarchy of the site, but these are standard webmaster functions that have nothing to do with the language aspect of the project. It's essentially a matter of cut-and-paste, and voila: the page is in another language.
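The paragraph-by-paragraph swap described above can also be sketched in a few lines of Python. This is only an illustration, not a production tool: the function name `replace_paragraphs`, the sample page and the French sentences are all made up for the example, and the sketch assumes each unit of text sits in a simple `<p>...</p>` element with no nested paragraphs.

```python
import re

# Matches one <p>...</p> element: opening tag, inner text, closing tag.
PARAGRAPH = re.compile(r"(<p\b[^>]*>)(.*?)(</p>)", re.DOTALL | re.IGNORECASE)


def replace_paragraphs(html, translations):
    """Return html with each paragraph's text swapped for its translation.

    Tags and attributes are copied through untouched; only the text
    between <p> and </p> changes. Translations are consumed in order.
    """
    found = len(PARAGRAPH.findall(html))
    if found != len(translations):
        # Refuse to guess when the counts differ, so paragraphs cannot get mixed up.
        raise ValueError(f"{found} paragraphs but {len(translations)} translations")
    remaining = iter(translations)
    return PARAGRAPH.sub(lambda m: m.group(1) + next(remaining) + m.group(3), html)


# Hypothetical English page and its French translation.
english_page = (
    '<h1><img src="logo.gif" alt="Logo"></h1>\n'
    '<p class="intro">Welcome to our company.</p>\n'
    '<p>We build bridges.</p>'
)
french_page = replace_paragraphs(
    english_page,
    ["Bienvenue dans notre entreprise.", "Nous construisons des ponts."],
)
print(french_page)
```

The count check mirrors the article's warning about mixing up paragraphs: if the translator delivers more or fewer paragraphs than the page contains, the sketch stops rather than pasting text into the wrong place.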

Cutting and pasting paragraphs within HTML is the ideal solution, as it's easy to do and the resulting page looks just like the original. Unfortunately, this technique works only with languages that use the Roman alphabet. This includes all the Western European languages, but none of the Eastern European languages. Also excluded are Chinese, Japanese and Korean. The reason for this limitation resides with the nature of fonts.


A font is a set of characters. A typical font contains the letters from A to Z, the numbers 0 through 9, plus punctuation marks and other special characters. Each character is identified in the computer by one byte of information, such as 00110010. This "1 byte = 1 character" system can accommodate only 256 different characters (there are 2 kinds of bits - 0 and 1 - and there are 8 bits per byte: 2^8 = 256). That's plenty for the Western languages, because our alphabet has just 26 letters, leaving more than enough slots for curiosities like π and β.

But the 256-character limit poses a problem for languages such as Chinese, which has well over 10,000 characters. They simply cannot be represented on the computer using only one byte of information each. So each character in a Chinese font is identified in the computer by two bytes, for example 00011100 10101000. This solution allows for 65,536 characters (2^16) but it creates another problem: most Internet browsers in the West are designed for one-byte fonts, so they cannot read two-byte fonts. To get around this obstacle, it is necessary to have special "reader" software running in the background. Without that software, Chinese text looks like this: ???. As you cannot expect visitors to your site to have "reader" software, the only solution is to display these languages in art files - not as text pasted into the HTML code.

Languages such as Russian or Polish do use one-byte fonts, but to read those fonts, a user's computer must be specially configured. This configuration is easy to do, and is available on all computers. The problem is that most browsers in the West are not configured for the Eastern European languages, so Russian text will look like Chinese text: gobbledygook. Again, the solution is to display the text as art files.
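The byte arithmetic above is easy to check in Python. The sketch below is illustrative only: the article does not name specific character sets, so Latin-1, GB2312 and KOI8-R are assumptions chosen as examples of a Western one-byte set, a Chinese two-byte set and a Russian one-byte set.

```python
def slots(num_bytes):
    """Number of distinct values that num_bytes bytes (8 bits each) can hold."""
    return 2 ** (8 * num_bytes)


print(slots(1))  # 256
print(slots(2))  # 65536

# One Western character fits in a single byte (Latin-1 is an assumed example).
western = "é".encode("latin-1")
# One Chinese character takes two bytes (GB2312 is an assumed example).
chinese = "中".encode("gb2312")
print(len(western), len(chinese))

# Russian also fits in one byte per character (KOI8-R is an assumed example),
# but the reader must use the matching character table.
russian = "Привет".encode("koi8-r")

# Reading either language with a Western one-byte table raises no error;
# it simply produces the wrong characters.
misread_chinese = chinese.decode("latin-1")
misread_russian = russian.decode("latin-1")
print(misread_chinese, misread_russian)
```

The last two lines show the failure the article describes: the bytes are intact, but a reader set up for Western text interprets them against the wrong table.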


Turning text into graphics is the digital equivalent of photographing a word-processing file. The resulting image can be resized and repositioned, but its content cannot be edited. It is no longer text, but rather an image, indistinguishable (as far as the computer is concerned) from any other graphic, whether a corporate logo or a picture of the Brooklyn Bridge. Therefore it no longer matters that the object is in, say, Japanese. Any website professional can take that graphic and link it to the HTML in such a way that it will have the same look as its original English counterpart. But first it is necessary to create the graphics file itself. Step one is to make sure that the text looks exactly the way it is meant to appear on the web site. It should be formatted to match the height and width of the original English text. In general, each paragraph or independent unit of text should be converted to a separate art file. This approach gives the web-site professional more flexibility when positioning the files, and it provides for a faster download by the browser. Step two is to generate a .gif file using Photoshop (see sidebar). Although .gif generation requires specific skills, it does not require knowledge of foreign languages, so almost any company should have the resources to produce text in a graphics format.
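Linking the finished graphics into the page can be sketched as follows. This is a hedged illustration rather than a prescribed method: the helper name `image_paragraph`, the file paths and the pixel dimensions are hypothetical, and the sketch assumes one art file per paragraph, as recommended above.

```python
from html import escape


def image_paragraph(src, width, height, alt):
    """Build an <img> tag that stands in for one paragraph of text."""
    return (
        f'<p><img src="{escape(src, quote=True)}" '
        f'width="{int(width)}" height="{int(height)}" '
        f'alt="{escape(alt, quote=True)}"></p>'
    )


# Hypothetical art files, one per paragraph, listed in page order.
graphics = [
    ("ja/intro.gif", 400, 60, "Introduction (Japanese)"),
    ("ja/products.gif", 400, 120, "Products (Japanese)"),
]
page_body = "\n".join(image_paragraph(*g) for g in graphics)
print(page_body)
```

Stating the width and height of each file lets the browser reserve the right amount of space before the image arrives, which helps the translated page keep the layout of the English original.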

As complicated as the processes above may seem at first, the task really is simple: embed the text in the HTML codes for the Western-language translations, and embed the text in .gif files for the Eastern-language translations. Using this dual approach, companies can produce multilingual web sites that not only showcase their commitment to the global marketplace, but also do so in a way that is accessible to everyone online.

Creating Legible Graphics Files from Text

1. Format the text in a word processor such as MSWord 97. Use the zoom function or change the font size to get the text close to the way you want it to appear in your HTML document.

2. Do a screen capture [ALT + PRINT SCREEN] and paste the resulting .bmp file into a new Photoshop file. Be sure to move your cursor out of the way or it will also be captured.

3. Use the crop tool to eliminate the peripheral elements from the image, leaving only the text you want. Create a copy of the background layer for safekeeping.

4. To match the background color of an existing web site, go to that site and capture the background image. Paste that file into a layer of your Photoshop file. Use the eyedropper to grab the color from the background image. Create a new layer, and fill this layer with the background color. Place your text layer above the background color layer. Select the white in your text layer with the magic wand. Set "feathering" to the minimum. Again, select the white in your text layer with the magic wand. Cut the white. If your background layer is visible, you'll now see your black text over the background layer.

5. Merge layers down. You may have unwanted anti-aliased pixels in your black text. Zoom in to check. If necessary, zoom in so that the anti-aliased pixels are very large. Use the magic wand to select one of the offending shades of gray. Select "similar." Make sure that your desired background color is the foreground color on your tools palette. Use the "fill" command to fill the selection.

Repeat the selection and fill for any other unwanted anti-aliased pixels, zooming out frequently to 100 percent magnification to check the results. Export to .gif89a, and the file is ready for incorporation into your site.
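For readers who prefer to see the logic of step 5 spelled out, the clean-up of anti-aliased pixels can be sketched in Python. This is not how Photoshop works internally; it is a minimal model that treats the image as a grid of (red, green, blue) values, and the function name `flatten_antialiasing` and the sample colors are hypothetical.

```python
BLACK = (0, 0, 0)  # the text color assumed in the sidebar


def flatten_antialiasing(pixels, background):
    """Replace every pixel that is neither text nor background with background.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    A new grid is returned; the input is left unchanged.
    """
    return [
        [px if px in (BLACK, background) else background for px in row]
        for row in pixels
    ]


page_color = (255, 255, 204)  # hypothetical site background color
gray = (128, 128, 128)        # a stray anti-aliased shade
image = [
    [page_color, gray, BLACK],
    [gray, BLACK, page_color],
]
cleaned = flatten_antialiasing(image, page_color)
print(cleaned)
```

The sketch handles every stray shade in one pass, whereas the manual procedure selects and fills one shade of gray at a time.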

Gerry Dempsey is a partner at Eriksen Translations and a linguistics professor at the graduate center of the City University of New York. Robert Sussman is a partner and the resident Photoshop expert at Eriksen Translations. They can be reached at
COPYRIGHT 1999 International Association of Business Communicators
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1999, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:includes related article on creating legible graphics files from text
Author:Sussman, Robert
Publication:Communication World
Date:Jun 1, 1999