
The use of spatialized speech in auditory interfaces for computer users who are visually impaired.

Structured abstract: Introduction: This article reports on a study that explored the benefits and drawbacks of using spatially positioned synthesized speech in auditory interfaces for computer users who are visually impaired (that is, are blind or have low vision). The study was a practical application of such systems: an enhanced word processing application compared with conventional screen-reading software with a braille display. Methods: Two types of user interfaces were compared in two experimental conditions: a JAWS screen reader equipped with an ALVA 544 Satellite braille display and a custom auditory interface based on spatialized speech. Twelve participants were asked to read and process three different text files with each interface and to collect information about their form and structure. Task-completion times and the correctness of the perceived information on text styles, text alignment, and table structures were measured. Results: The spatial auditory interface proved to be significantly faster (3 minutes, 12 seconds) than the JAWS screen reader with the ALVA braille display (8 minutes, 38 seconds), F(1,70) = 391.523, p < .001, and 15% more accurate when gathering information on text alignment, F(1,70) = 28.220, p < .001. No significant difference between the interfaces could be established when comparing questions on text styles, F(1,70) = 0.912, p = .343, or table structures, F(1,70) = 1.045, p = .310. Discussion: The findings show that the auditory interface with spatialized speech is more than 160% faster than the tactile interface while remaining equally accurate and effective for gathering information on various properties of text and tables. Implications for practitioners: The spatial location of synthesized speech can be used for the fast presentation of the physical position of texts in a file, their alignment, the dimensions of tables, and the position of specific texts within tables. The quality of spatial sound reproduction can play an important role in the overall performance of such systems.

**********

Today, most computer interfaces are based on visual interaction, requiring the user to be able to see for the interface to be used effectively. Users who are visually impaired (that is, those who are blind or have low vision) compensate for the blocked visual channel by using other senses, such as the auditory channel and the sense of touch. Tactile interfaces, that is, refreshable braille displays, provide an accurate and reliable method of interaction, but are hampered by a lower reading speed. They also require extensive learning and adaptation time to be used effectively. Auditory interfaces, on the other hand, are, in most cases, more intuitive and can be used with much less prior learning, although they still require a certain amount of knowledge of the meaning of different auditory cues or different auditory icons (Sodnik, Dicke, & Tomazic, 2010).

Auditory interfaces can be divided into two major groups: speech- and nonspeech-based interfaces (Brewster, 2002). Speech interfaces are based on human speech that can be recorded and replayed or synthesized by a computer. Since speech is the most common and intuitive way of exchanging information, speech interfaces require a short or no learning period and can be used by almost everyone, provided that the user understands the language used in the interface and is not hampered by any type of hearing impairment (Schmandt, 1994). Nonspeech interfaces are used mostly as an extension of graphical user interfaces (GUIs), visual interfaces that are presented via a computer screen and manipulated by a mouse and a keyboard. In this type of interface, sound is used to inform users about important background processes or programs that are running on their computers and requiring their attention at a certain moment, such as when new e-mail messages arrive, computer viruses are detected, or when batteries on portable machines reach low levels.

There are several types of computer interfaces that are designed especially for users who are visually impaired. In most cases, the auditory and tactile interfaces are merely supplements to the standard GUIs that are intended to present the graphically oriented content with sound or to display it on a braille display. Braille displays are available only from specialty manufacturers and are therefore expensive and not available to all potential users. The tools most commonly used by visually impaired computer users are screen readers. A screen reader scans the content of a GUI and reads the text parts with the use of synthesized speech.

In general, screen readers focus mainly on the text and give almost no information on the physical structure of the document, such as window sizes, text orientation, and style. The latter can in some cases be provided with special keyboard shortcuts at a user's request. For example, a screen reader informs the user of the current location in a table by saying: "column 4 of 5, row 2 of 3."

Crispien, Wurz, and Weber (1994) used spatial sound as an extension of a screen reader. The main part of their work was spatially positioned synthesized speech, which enabled the user to identify the position of the spoken text parts in relation to the visual representation on the screen. Their audio processing was based on a Beachtron system, a professional and high-quality digital sound-processing device for audio spatialization. Crispien et al. proposed some possible applications of such a system, but did not conduct any studies or evaluations of users.

Many researchers have proposed auditory web browsers for computer users who are visually impaired that use three-dimensional or spatial sound. By three-dimensional sound, we refer to the sources of sound whose place of origin is at a specific spatial position in relation to the listener. It can be played through a set of speakers or headphones if specific sound-processing techniques are applied (Cheng & Wakefield, 2001). Goose and Moller (1999) proposed a three-dimensional audio-only interactive web browser that transformed an HTML-based web page into a virtual space with multiple spatial sources of sound. The purpose of the system was to inform the users about the physical structure of visual documents, since most of the context of HTML files is lost when only the text parts are interpreted. In Goose and Moller's case, an HTML document was read, interpreted, and played by several static and moving sources of sound. The positions of the sources represented the actual positions of the graphical elements within the document.

[FIGURE 1 OMITTED]

A different version of a three-dimensional auditory browser, called ABWeb, was proposed by Roth, Petrucci, Assimacopoulos, and Pun (2000) and Roth, Petrucci, Pun, and Assimacopoulos (1999). This browser enabled users who were visually impaired to explore and interact with web pages. The system transformed digital web documents and their segments into three-dimensional sources of sound at various spatial locations.

Synchronized Multimedia Integration Language (SMIL) is a markup language for describing different multimedia presentations and documents. Two different three-dimensional audio extensions of SMIL were proposed by Goose, Kodlahalli, Pechter, and Hjelsvold (2002) and Pihkala and Lokki (2003).

In the study presented here, we explored the benefits and drawbacks of using spatially positioned synthesized speech in auditory interfaces for computer users who are visually impaired. A standard screen reader was upgraded with spatial sound and the capability of manipulating and changing the spatial position of synthesized speech, meaning that the synthesized speech can be placed at any spatial position relative to the listener or even moved during playback. We report on a user study of a practical application of such a system: an enhanced word processing application compared with conventional screen-reading software with a braille display.

Design and architecture of the system

The entire system was developed in the Java programming language and consists of several independent modules that are combined and reprogrammed to work together. The architecture of the system is shown in Figure 1.

The system itself was described in detail in Sodnik and Tomazic (2011). Only the basic functionalities of individual modules are described next. The main part of the speech synthesis module is FreeTTS (FreeTTS, 2011), a Java text-to-speech (TTS) program that is reprogrammed to output an array of 16-bit samples that can be used in the spatial sound-processing module. The latter adds spatial properties to the synthesized speech using OpenAL (audio library; OpenAL, 2011) and JOAL (Java OpenAL; JOAL, 2011) positioning libraries. It also includes an external Massachusetts Institute of Technology Media Lab HRTF (head-related transfer function) library (Gardner & Martin, 1994) to increase the accuracy of the final localization. Sennheiser HD 270 studio headphones are used for playback. Because of the headphones' good attenuation of ambient noise (from -10 dB to -15 dB), no special quiet room is required for evaluation studies.
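
The article does not reproduce any source code. As an illustration only, the following minimal Java sketch (our construction, not the authors' implementation; the class and method names SpatialPositionSketch and anglesToCartesian are hypothetical) shows how an angular position such as those used in the coding scheme below might be converted to the Cartesian coordinates expected by OpenAL and applied to a source through the JOAL bindings. The routing of FreeTTS output into OpenAL buffers and the use of the MIT HRTF library are omitted.

// Minimal sketch: position a sound source at a given azimuth and elevation
// using the JOAL bindings for OpenAL. Assumes JOAL (com.jogamp.openal) and a
// working OpenAL device; this is not the authors' code.
import com.jogamp.openal.AL;
import com.jogamp.openal.ALFactory;
import com.jogamp.openal.util.ALut;

public class SpatialPositionSketch {

    // Convert (azimuth, elevation), in degrees, to a unit vector in OpenAL's
    // coordinate system (x: right, y: up, -z: straight ahead of the listener).
    static float[] anglesToCartesian(double azimuthDeg, double elevationDeg) {
        double az = Math.toRadians(azimuthDeg);
        double el = Math.toRadians(elevationDeg);
        float x = (float) (Math.cos(el) * Math.sin(az));
        float y = (float) Math.sin(el);
        float z = (float) -(Math.cos(el) * Math.cos(az));
        return new float[] { x, y, z };
    }

    public static void main(String[] args) {
        ALut.alutInit();                 // open the default OpenAL device and context
        AL al = ALFactory.getAL();

        int[] source = new int[1];
        al.alGenSources(1, source, 0);   // one source for the synthesized voice

        // Example: left-aligned text is spoken from 20 degrees to the left.
        float[] pos = anglesToCartesian(-20.0, 0.0);
        al.alSource3f(source[0], AL.AL_POSITION, pos[0], pos[1], pos[2]);

        // Text-style coding (pitch/rate changes) could be applied here as well,
        // for example al.alSourcef(source[0], AL.AL_PITCH, 0.8f) for a 20% decrease.

        al.alDeleteSources(1, source, 0);
        ALut.alutExit();
    }
}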

We also developed a prototype of an auditory interface for a word processing application based on the proposed spatial positioning module. The main mechanism of the application is the manipulation of the spatial positions of synthesized speech. We used the following coding scheme for the representation of text (the numbers in brackets are the spatial angles, in degrees, that define the position of the voice relative to the listener): (1) central alignment: a fixed position of the synthesized voice at coordinates (0 degrees, 0 degrees, 0 degrees), (2) left alignment: a fixed position of the synthesized voice at coordinates (-20 degrees, 0 degrees, 0 degrees), (3) right alignment: a fixed position of the synthesized voice at coordinates (20 degrees, 0 degrees, 0 degrees), and (4) justified alignment: a moving position of the synthesized voice between coordinates (-20 degrees, 0 degrees, 0 degrees) and (20 degrees, 0 degrees, 0 degrees) (see Figure 2). Table dimensions are described with spatial positions of synthesized speech as well. For example, a 5 x 3 table (five columns, three rows) is coded as shown in Figure 2: the text in a specific cell is read from the spatial position assigned to that cell. Text styles are coded in the following way: italic text: increase of pitch and rate by 20%; bold text: decrease of pitch and rate by 20%; and underlined text: increase of rate by 40%.
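
To make the coding scheme concrete, the following Java sketch gathers the alignment angles and style factors listed above in one place. It is an illustration, not the authors' code; in particular, the generalization of table-cell positions to arbitrary table dimensions is our assumption, extrapolated from the five-column, three-row example in Figure 2 (columns spread over -20 to 20 degrees of azimuth, rows over 20 to -20 degrees of elevation), and the handling of combined text styles is not specified in the article.

// Illustrative mapping from document properties to speech parameters (a sketch,
// not the authors' code). Angles are in degrees.
public class CodingScheme {

    enum Alignment { LEFT, CENTER, RIGHT }

    // Azimuth of the synthesized voice for a given text alignment.
    // Justified text is not listed here: it sweeps from -20 to 20 degrees.
    static double alignmentAzimuth(Alignment a) {
        switch (a) {
            case LEFT:  return -20.0;
            case RIGHT: return  20.0;
            default:    return   0.0;
        }
    }

    // Pitch and rate multipliers for text styles (1.0 = unchanged).
    // Combining several styles multiplicatively is an assumption.
    static double[] stylePitchAndRate(boolean italic, boolean bold, boolean underlined) {
        double pitch = 1.0, rate = 1.0;
        if (italic)     { pitch *= 1.2; rate *= 1.2; }  // +20% pitch and rate
        if (bold)       { pitch *= 0.8; rate *= 0.8; }  // -20% pitch and rate
        if (underlined) { rate  *= 1.4; }               // +40% rate
        return new double[] { pitch, rate };
    }

    // Spatial position (azimuth, elevation) of a table cell, indices zero-based.
    // Assumes the outermost cells sit at +/-20 degrees, as in Figure 2.
    static double[] cellPosition(int row, int col, int rows, int cols) {
        double azimuth   = cols > 1 ? -20.0 + 40.0 * col / (cols - 1) : 0.0;
        double elevation = rows > 1 ?  20.0 - 40.0 * row / (rows - 1) : 0.0;
        return new double[] { azimuth, elevation };
    }

    public static void main(String[] args) {
        // The middle cell of the top row of a 5 x 3 table lies straight ahead
        // and 20 degrees up, matching Figure 2.
        double[] p = cellPosition(0, 2, 3, 5);
        System.out.printf("azimuth = %.0f, elevation = %.0f%n", p[0], p[1]);
    }
}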

This work differs from previous related work in a number of ways: (1) it proposes a software solution for sound localization that is based solely on generally available hardware and can therefore be reused in other interfaces and applications, (2) it proposes an innovative coding scheme for an auditory presentation of the form and layout of text and tables in a word processing application, and (3) it reports on the results of a study of computer users with visual impairments that demonstrated that the system is highly usable. The main hypothesis was that the use of spatial sound and the spatial positioning of synthesized speech would significantly improve the usefulness of the auditory interface. We also expected the users to evaluate the new interface as a positive and interesting improvement over the traditional auditory interfaces used in their everyday interactions with a computer.

Method

We tried to establish whether the newly proposed auditory interface with spatially positioned synthesized speech could be successfully used in practice. We compared its efficiency, accuracy, and speed with those of a conventional screen reader equipped with a braille display. The comparison was made over several typical word processing tasks, such as reviewing the layout of text and tables.

PARTICIPANTS

A total of 12 adults (4 with near-total visual impairment and 8 with profound visual impairment, with a mean age of 33.4 years) were recruited through an appeal of the Intermunicipal Society of Blind and Visually Impaired People of Kranj, Slovenia. They were interviewed about their age, computer skills and braille proficiency, visual impairment, and possible hearing disabilities. The final test group was chosen on the basis of their self-reports of sufficient braille proficiency and regular use of refreshable braille displays (a minimum of 5 years of experience with braille displays), regular use of JAWS (Job Access with Speech, version 10.0.512) screen-reading software (Freedom Scientific, 2011), and no hearing disabilities. Most sessions were conducted at the participants' homes or at other locations of their choosing. The general volume of all sounds was adjusted individually by each participant to achieve optimum hearing conditions. The research was approved by the National Medical Ethics Committee (Republic of Slovenia, May 2010, Ref.: KME 21p/06/10).

EXPERIMENTAL CONDITIONS AND PROCEDURE

We compared two types of user interfaces in two experimental conditions: the BR condition (a screen reader with a braille display), using the JAWS screen reader with the ALVA 544 Satellite braille display (Optelec, 2011), and the SS condition (spatialized speech), using the custom auditory interface based on spatialized speech.

BR condition

An MS Word 2003 application running on Windows XP was used for the composition of different text documents. Multiple portions of text with different text alignments and styles were generated along with several tables of various dimensions. The ALVA 544 Satellite braille display that was used in the BR condition contains 44 reading cells and 4 navigation front keys. This type of display is now available under the name ALVA 544 Satellite Traveller braille display (Optelec, 2011; see Figure 3). There are also some additional keys for manipulating the cursor within the word processing software and functioning as the arrow keys on a conventional keyboard. The basic navigation in the document (that is, movement from line to line) can therefore be performed by using only the braille display, which decreases the need to switch between the two keyboards.

[FIGURE 3 OMITTED]

SS condition

The text was saved in the form of plain text files that could be accessed and read by the spatial positioning module. A standardized Central European QWERTZ keyboard was used in the SS condition, with the arrow keys used for basic navigation. The same document structure and layout were created as in the MS Word files in the BR condition. Figure 4 compares a portion of the same text in both experimental conditions. As can be seen in Figure 4, the text used in the experiment was simple and required merely a basic reading level to be understood correctly. The same key combinations for moving through the document were available in both experimental conditions: the arrow keys moved from line to line (up and down), and TAB or CTRL + arrow keys moved from paragraph to paragraph.
Figure 4. A sample of a word-processing file in both
experimental conditions.

MS Word file (BR condition)

She grabbed a can of beer and opened it. She smelled it and she
  couldn't smell anything. She swallowed some.

Yuck! she said. The beer was horrible. How could Daddy drink that
stuff? She put the can back into the refrigerator.

Tagged text file (SS condition)

<male_1 P=1.0 R=1.0 X=0.0 Y=0.0 Z=0.0>
She grabbed a can of beer and opened it. She smelled it and she
couldn't smell anything.
</male_1>

<male_1 P=1.2 R=1.2 X=0.0 Y=0.0 Z=0.0>
She swallowed some.
</male_1>

<male_1 P=1.0 R=1.0 X=-20.0 Y=0.0 Z=0.0>
Yuck! she said.
</male_1>

<male_1 P=0.8 R=0.8 X=-20.0 Y=0.0 Z=0.0>
The beer was horrible.
</male_1>

<male_1 P=1.0 R=1.0 X=-20.0 Y=0.0 Z=0.0>
How could Daddy drink that stuff? She put the can back
into the refrigerator.
</male_1>
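
The exact grammar of the tagged text files is documented only through the excerpt in Figure 4, where each opening tag carries a voice name (male_1) together with pitch (P), rate (R), and position (X, Y, Z) attributes. A minimal parser for files of that shape could look like the following Java sketch; it is a hypothetical reconstruction based solely on the excerpt, not the module actually used in the study.

// Hypothetical parser for the tag format excerpted in Figure 4.
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaggedTextParser {

    // One block of text together with the speech parameters of its opening tag.
    record Segment(String voice, double pitch, double rate,
                   double x, double y, double z, String text) {}

    private static final Pattern TAG = Pattern.compile(
        "<(\\w+)\\s+P=([-\\d.]+)\\s+R=([-\\d.]+)\\s+X=([-\\d.]+)\\s+Y=([-\\d.]+)\\s+Z=([-\\d.]+)>"
        + "(.*?)</\\1>", Pattern.DOTALL);

    static List<Segment> parse(String document) {
        List<Segment> segments = new ArrayList<>();
        Matcher m = TAG.matcher(document);
        while (m.find()) {
            segments.add(new Segment(
                m.group(1),
                Double.parseDouble(m.group(2)), Double.parseDouble(m.group(3)),
                Double.parseDouble(m.group(4)), Double.parseDouble(m.group(5)),
                Double.parseDouble(m.group(6)),
                m.group(7).trim().replaceAll("\\s+", " ")));
        }
        return segments;
    }

    public static void main(String[] args) {
        String sample =
            "<male_1 P=1.0 R=1.0 X=-20.0 Y=0.0 Z=0.0>\nYuck! she said.\n</male_1>";
        for (Segment s : parse(sample)) {
            System.out.printf("%s at (%.1f, %.1f, %.1f), pitch %.1f, rate %.1f: %s%n",
                s.voice(), s.x(), s.y(), s.z(), s.pitch(), s.rate(), s.text());
        }
    }
}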


A within-subjects method was used in the experiment, with 6 of the 12 participants starting with the BR condition and the other 6 starting with the SS condition. A 15-minute break was assigned to each participant before the experiment was repeated with a new condition.

PROCEDURE

The experiment consisted of six independent tasks: three tasks in each experimental condition. In each task, the participants were asked to read and process a different text file and to collect the information about its form and structure. The text files consisted of simple stories with basic reading-level requirements. Each file contained a table of different dimensions (the sizes of the tables varied from 3 x 2 to 5 x 5 and were randomly distributed to balance the complexity across conditions) and multiple different text alignments and text styles (italic or bold). Each text file consisted of 10 paragraphs of approximately 30 words with different text alignments and with at least one portion of bold or italic text. A part of such a file is presented in Figure 5.

At the beginning of the study, the participants were acquainted with the experimental procedure. Since they were all skilled in processing documents with screen readers and refreshable braille displays, no special instructions about the BR condition were provided. However, a short explanation of the SS condition was provided, concentrating mainly on the use and meaning of spatial metaphors and spatial presentations of text elements. To prove the high intuitiveness of such interfaces, we provided no special training for the participants. Before each task, the participants were given three minutes to prepare for each experimental condition and to get familiar with the interface and the interaction technique. A test document with arbitrary content was used for this purpose.

DATA COLLECTION

Four variables were evaluated: (1) task-completion times, (2) the correctness of the perceived information on text styles and alignment, (3) the correctness of the perceived information on table structures, and (4) a subjective evaluation of the interface.

Task-completion times

The duration of each task was measured automatically by the application. The timer was controlled by the experimenter: it was started when the participant indicated he or she was ready and stopped when the participant reached the end of the document and indicated he or she was finished.

Text styles and text alignment

In each file, 12 prominent layout properties were chosen that had to be perceived and reported by the participants. This was done on the basis of 12 questions regarding these properties, 1 question for each property. The experimenter asked the questions during the experiment, adjusting to the reading speed of each participant. Correct answers were awarded points: the participants could earn up to 6 points for the questions on text styles and up to 6 points for the questions on text alignment. Three examples of the test questions that were used are, What does the bold text in the first paragraph say? Is there any italic text in the second paragraph? and What type of text alignment is used in the second paragraph?

[FIGURE 5 OMITTED]

Tables

There was one table in each text file. The participants were asked to define the dimensions of the table as well as the positions or text styles of two specific text strings in the table. A point was assigned to the participants for each correct answer. Altogether, three points could be earned on the basis of three correct answers. Three examples of the test questions used are (1) How many columns and how many rows is the table composed of? (2) Which cell contains the word milk? and (3) Is the word whiskey in bold or italic?

Subjective evaluation of the interface

The users' feedback and personal evaluations of the system were also collected. After each experimental condition, every participant was asked a set of questions to obtain some descriptive and informal answers. A few examples of the questions are, How difficult was it to use the interface? How much mental and perceptual activity was required to solve a specific task? Was the coding scheme intuitive enough? In your opinion, what is the main advantage of the new interface? In your opinion, what is the main disadvantage? and How could the interface be improved? The answers were written down by the evaluator.

Results

TASK-COMPLETION TIMES

The task-completion time was logged within the application and controlled by the experimenter. An average completion time was calculated for each experimental condition from 36 time measurements (12 participants and three tasks within each interface). Figure 5 shows the average task-completion times and the corresponding confidence intervals for both interfaces.

The auditory interface with spatialized speech proved to be significantly faster than the JAWS screen reader with ALVA braille display (the average completion time was 8 minutes and 38 seconds in the BR condition and 3 minutes and 12 seconds in the SS condition). The within-subject analysis of variance (ANOVA) confirmed a significant difference between the interfaces: F(1,70) = 391.523, p < .001. The results reported here confirm the hypothesis that the auditory interface with spatialized speech is faster than the conventional screen reader with a braille display.

TEXT STYLES AND TEXT ALIGNMENT AND TABLES

The experimenter observed each participant while he or she was performing the test and asked the questions about the text styles and alignment of the following paragraph in advance, so the participant was always aware of the task and the information that had to be extracted next. Normalized values were calculated, and the results are presented in percentages. Each value was calculated as the sum of 36 values divided by the maximum number of points for each feature (style: 36 x 6 = 216, alignment: 36 x 6 = 216). Table 1 shows both the percentage and the absolute number of correct answers for both groups of tests.

Both interfaces seemed to be similarly effective for gathering information on various text properties; however, the ANOVA showed a significant advantage of the auditory interface with spatialized speech when comparing the questions on text alignment: F(1,70) = 28.220, p < .001. No significant difference between the interfaces could be established when comparing the questions on text style: F(1,70) = 0.912, p = .343.

Again, the normalized values were calculated for the questions on table structures. In this case, the maximum number of points was three (structure: 36 x 3 = 108). The results presented in Table 2 show no significant difference between the interfaces when comparing the questions on tables: F(1,70) = 1.045, p = .310.

SUBJECTIVE EVALUATION OF THE INTERFACE

The overall satisfaction with the proposed interface was high and thus encouraging. The participants mentioned the high intuitiveness of the spatial positioning of synthesized speech for the description of text layout and properties and of table structures, which also resulted in a low mental demand while solving tasks.

The positive comments referred mostly to the high level of intuitiveness of such an interface: "The spatial positioning of voices indicating text alignment is very intuitive and effective," "The presentation of tables is very intuitive and great to navigate through," "This is a great interface for a fast revision of documents," and "The interface could also be very useful in real-time editing." The negative comments referred mostly to the quality of sound reproduction: "The vertical localization of spatial sound sources could be better," "The differences in pitch and rate indicating text style could be more substantial in order to be more distinctive," and "The quality of the speech synthesizer should be improved."

Discussion

In the study, we evaluated the use of spatialized speech in auditory interfaces for computer users who are visually impaired. We developed an auditory interface similar to the interface of standard screen-reading software with the additional functionality of the spatial positioning of synthesized speech and the possibility of changing the pitch and rate. The main goal of the study was to measure and evaluate the speed and accuracy of this interface in comparison with standard screen-reading software.

The auditory interface with spatialized speech proved to be more than 160% faster than the tactile interface. The majority of participants also reported noticing this difference in speed themselves. It is important to note that although the auditory interface was faster, it did not cause any degradation in accuracy. Quite the contrary, the participants were significantly more successful in gathering information on text alignment. They had no problems separating the three different horizontal positions of synthesized speech (-20 degrees, 0 degrees, and 20 degrees). They reported some uncertainty only in the situations in which one portion of the text was aligned to the left and the next one was aligned to the right. In such cases, the speech position changed abruptly from (-20 degrees, 0 degrees, 0 degrees) to (20 degrees, 0 degrees, 0 degrees), which sounded similar to the coding rule for justified text alignment, described as a smooth change of position between the same two coordinates during speech reproduction.

Both interfaces proved to be equally effective for the presentation of table structures and for navigation through the content of tables. Most participants reported having no problems determining the horizontal sequence and positions of cells; however, some of them complained that they had problems localizing the current elevation of synthesized speech and, as a consequence, determining the vertical position of individual table cells. Some of them mentioned also having problems separating two neighboring cells or rows. In our case, the minimum spatial angle between two sources of sound in near proximity was 10 degrees in the horizontal dimension and 20 degrees in the vertical dimension.

The use of variations in pitch and speaking rate proved to be an effective way of coding information on text style, such as underlined or bold text. However, some participants suggested that the differences in pitch and rate should be increased even more to make them more distinguishable.

Four participants complained about the relatively poor quality of the synthesized speech, claiming that the speech sounded too "robotic." As was mentioned earlier, we used freely available TTS software that allowed reprogramming and postprocessing. We believe the use of a better, perhaps specially developed, speech synthesizer would also increase the quality and intelligibility of the speech output.

Nine participants confirmed the effectiveness of such an interface and commented on its high usability for word processing. The interface enables auditory feedback on the document structure and properties that otherwise cannot be gathered without a braille display. One participant specifically pointed out the excellent representation of tables and table structures, since immediate feedback on the position of the currently selected cell can be retrieved merely by localizing the synthesized speech.

With this experiment, we clearly demonstrated that significantly more information can be conveyed to the user by positioning the speech source in space. The position of the speech source can be used to describe the physical dimensions and locations of objects in the interface. We believe that the spatial positioning of synthesized speech could be an effective extension of any screen-reading software package. This feature could be used in many different situations and in various applications, from web browsers to word processors and graphic software packages.

LIMITATIONS

The current version of the interface gives information about the layout of existing documents. It is intended primarily for reviewing and not for creating new documents. Its functionality should be extended to enable real-time feedback in the process of writing and editing new documents.

We would also like to point out the well-known problem with elevation localization (Gardner & Martin, 1994; Bronkhorst, 1995; Sodnik, Susnik, Stular, & Tomazic, 2005). By elevation, we refer to the localization in the vertical dimension (up and down). The determination of the elevation of the source of sound is, in most cases, less reliable and less accurate than the determination of its azimuth or the localization in the horizontal plane. We therefore suggest that the elevation of the source of sound should not be used to present crucial information in the document. The interface should have an alternative way of presenting this information.

Another important limitation of the research was the lack of measurable information on the participants' braille proficiency, visual acuities, and hearing disabilities. These data were collected through self-reports. Sufficient braille proficiency was defined as a minimum of five years of experience with refreshable braille displays and their regular use for work or study.

References

Brewster, S. A. (2002). Chapter 12: Non-speech auditory output. In J. Jacko & A. Sears (Eds.), The human-computer interaction handbook (pp. 220-239). Mahwah, NJ: Lawrence Erlbaum.

Bronkhorst, A. W. (1995). Localization of real and virtual sound sources. Journal of the Acoustical Society of America, 98, 2542-2553.

Cheng, C. I., & Wakefield, G. H. (2001). Introduction to head-related transfer functions (HRTFs): Representations of HRTFs in time, frequency, and space (invited tutorial). Journal of the Audio Engineering Society, 49, 231-249.

Crispien, K., Wurz, W., & Weber, G. (1994). Using spatial audio for the enhanced presentation of synthesised speech within screen-readers for blind computer users. Computers for Handicapped Persons, 860, 144-153.

Freedom Scientific. (2011). Previous JAWS for Windows downloads. Retrieved from http://www.freedomscientific.com/downloads/jaws/JAWS-previous-downloads.asp

FreeTTS 1.2. (2011). A speech synthesizer written entirely in the Java programming language. Retrieved from http://freetts.sourceforge.net/docs/index.php

Gardner, B., & Martin, K. (1994). HRTF measurements of a KEMAR dummy-head microphone (Technical report 280). Cambridge, MA: MIT Media Lab.

Goose, S., Kodlahalli, S., Pechter, W., & Hjelsvold, R. (2002). Streaming speech3: A framework for generating and streaming 3D text-to-speech and audio presentations to wireless PDAs as specified using extensions to SMIL. Proceedings of the International Conference on World Wide Web (pp. 37-44). New York: Association for Computing Machinery.

Goose, S., & Moller, C. (1999). A 3D audio only interactive Web browser: Using spatialization to convey hypermedia document structure. Proceedings of the 7th ACM International Conference on Multimedia (pp. 363-371). New York: Association for Computing Machinery.

JOAL. (2011). Index of deployment archive. Retrieved from http://jogamp.org/deployment/archive/reN2.0-re2/archive/

OpenAL. (2011). Creative Labs: Connect OpenAL downloads. Retrieved from http://connect.creativelabs.com/openal/Downloads/Forms/AllItems.aspx

Optelec. (2011). ALVA Satellite braille display series. Retrieved from http://www.optelec.com/en_GB/products/braillecomputer-access/alva-satellite-braille

Pihkala, K., & Lokki, T. (2003). Extending SMIL with 3D audio. Proceedings of the 2003 International Conference on Auditory Display (pp. 995-998). Retrieved from http://lib.tkk.fi/Diss/2003/isbn9512268043/article3.pdf

Roth, P., Petrucci, L. S., Assimacopoulos, A., & Pun, T. (2000). Audio-haptic Internet browser and associated tools for blind and visually impaired computer users. Workshop on Friendly Exchanging Through the Net, 57-62. Retrieved from http://unige.academia.edu/PatrickRoth/Papers/1048737/Audio-haptic_internet_browser_and_associated_tools_for_blind_and_visually_impaired_computer_users

Roth, P., Petrucci, L. S., Pun, T., & Assimacopoulos, A. (1999). Auditory browser for blind and visually impaired users. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA (pp. 218-219). New York: Association for Computing Machinery.

Schmandt, C. (1994). Voice communication with computers: Conversational systems. New York: Van Nostrand Reinhold.

Sodnik, J., Susnik, R., Stular, M., & Tomazic, S. (2005). Spatial sound resolution of an interpolated HRIR library. Applied Acoustics, 66, 1219-1234.

Sodnik, J., Dicke, C., & Tomazic, S. (2010). Auditory interfaces for mobile devices. In Encyclopedia of wireless and mobile communications (pp. 1-9). New York: Taylor & Francis.

Sodnik, J., & Tomazic, S. (2011). Spatial speaker: Spatial positioning of synthesized speech in Java. Lecture Notes in Electrical Engineering, 68, 359-371.

Jaka Sodnik, Ph.D., assistant professor, Faculty of Electrical Engineering, University of Ljubljana, Trzaska 25, 1000 Ljubljana, Slovenia; e-mail: <jaka.sodnik@fe.uni-lj.si>. Grega Jakus, Ph.D., assistant, Faculty of Electrical Engineering, University of Ljubljana, Slovenia; e-mail: <grega.jakus@fe.uni-lj.si>. Saso Tomazic, Ph.D., full professor, Faculty of Electrical Engineering, University of Ljubljana, Slovenia; e-mail: <saso.tomazic@fe.uni-lj.si>.
Table 1
The percentage and absolute number of correct answers on text style and text alignment.

                     JAWS screen reader        Word-processing software
                     with braille display      with spatialized speech
Text style           98% (211/216)             96% (207/216)
Text alignment       83% (179/216)             98% (212/216)

Table 2
The percentage and the absolute number of correct answers on tables.

          JAWS screen reader        Word-processing software
          with braille display      with spatialized speech
Tables    100% (108/108)            99% (107/108)

Figure 2. An example of table coding for the table containing five columns and three rows. Each cell is assigned a fixed spatial position of the synthesized voice, given as a triple of angles in degrees:

Row 1 (top):     (-20, 20, 0)   (-10, 20, 0)   (0, 20, 0)    (10, 20, 0)   (20, 20, 0)
Row 2 (middle):  (-20, 0, 0)    (-10, 0, 0)    (0, 0, 0)     (10, 0, 0)    (20, 0, 0)
Row 3 (bottom):  (-20, -20, 0)  (-10, -20, 0)  (0, -20, 0)   (10, -20, 0)  (20, -20, 0)