Sign gesture representation using And-Or tree.
A group of deaf and hearing impaired people belonging to a region develops a sign language among themselves to communicate with one another. Sign language is not universal: it varies across the world and is strongly localized. Sign languages such as American Sign Language (ASL), British Sign Language (BSL), Australian Sign Language (AUSLAN) and Indian Sign Language (ISL) are used by the communities in their respective regions. Even though deaf and hearing impaired people have their own language, inadequate education facilities, the unavailability of good sign language tutors and the lack of importance given to the community by the rest of society make it difficult for them to lead the same life as other people, and unemployment is also high in the community.
To improve and ease their lives, they should be well educated. This requires a communication system that breaks the barrier between deaf and hearing impaired people and hearing people. These factors motivate the development of an automatic translation system that translates spoken language text into sign language gestures and supports good education. If a system can automatically generate sign gestures for a given spoken language word or sentence, it becomes easier for everyone to interact with deaf and hearing impaired people, and for them to interact among themselves. The system should also be designed so that it uses little memory, works without a human interpreter and does not rely on static videos.
The development of automatic translation systems began with limited vocabularies, where words and their respective sign videos are stored in a database and retrieved. This approach is restricted by the available videos and the memory space of the system. Capturing the sign videos is also demanding. The camera must be positioned to cover the fine details of the SLOM notation (hand shape, hand location, hand orientation and hand movement). The lighting must also be good enough to capture facial expression. The videos should be in a format that can be played on all systems without plug-ins, and they should be small so that more videos can be stored.
Some systems use a human interpreter for online broadcasting, as in television news, where the interpreter signs according to the text. The disadvantage is that the whole process depends on the signer and the region the signer belongs to. Sign languages have many dialects, so an end user who does not belong to that region or is not familiar with that style of signing may not understand the sign gestures. The signing style and signing speed also play an important role, so the human interpreter must be selected carefully.
Recent systems use a humanoid for signing. These humanoids are virtual avatars that can be programmed to render sign gestures. The avatar approach overcomes the drawbacks mentioned above: it needs no video capture and no expensive human interpreter. Avatars are easily programmable, can take different actions or forms depending on changes in the program code, and avoid re-taking a scene as is necessary in video capturing. One of the most interesting features of this approach is that it can be used in any region with its own sign language; it does not depend on the signer's style or region, because only the code has to be changed to match the sign gestures. The disadvantage is that the programmer must know the sign language, its sign gestures and the corresponding words.
To overcome the drawbacks of the earlier methods, a novel method of representing sign gestures is explained in the forthcoming sections. The next section deals with the representation of sign gestures using And-Or trees, and Section 3 discusses the correctness of the representation. The last section discusses the performance evaluation of this approach and its future enhancement.
Representation of sign gesture:
The sign gesture is the union of the sign phonemes. The sign phonemes (manual and non-manual features) are the basic units of the sign gestures. They are ordered according to the SLOM notation (D. Narashiman et al., 2013), which gives the sequence in which the phonemes are arranged and finally rendered. These basic units form a set comprising hand shape, hand location, hand orientation, hand movement and non-manual features. The non-manual features comprise facial expression and bodily movements. As per the SLOM notation, the rendering of a sign gesture is given as (hand shape → hand location → hand orientation → hand movement) ∨ non-manual features. In this approach the basic sign units are stored in memory and, depending on the sign gesture, they are retrieved and rendered. Recent literature has developed various sign notations to represent sign gestures, but they all fall short in the final representation of the sign gestures.
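The SLOM ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SLOM_ORDER`, `order_manual_phonemes` and `render_plan`, as well as the phoneme values used in the example, are hypothetical and do not come from the original system.

```python
# Hypothetical sketch: arrange the stored manual phonemes of a sign in SLOM order
# and pair them with the (optional) non-manual features.
SLOM_ORDER = ("hand_shape", "hand_location", "hand_orientation", "hand_movement")


def order_manual_phonemes(phonemes):
    """Return (feature, unit) pairs sorted into SLOM order.

    `phonemes` maps a feature name to the stored unit for that feature.
    Features that the sign does not use are simply skipped.
    """
    return [(name, phonemes[name]) for name in SLOM_ORDER if name in phonemes]


def render_plan(manual, non_manual=None):
    """Combine the ordered manual sequence with the non-manual features."""
    return {
        "manual": order_manual_phonemes(manual),
        "non_manual": dict(non_manual or {}),
    }


# Illustrative values only; they are not taken from a real sign dictionary.
plan = render_plan(
    {
        "hand_movement": "tap",
        "hand_shape": "flat",
        "hand_location": "forehead",
        "hand_orientation": "palm inward",
    },
    {"eyebrow_shape": "neutral"},
)
for name, unit in plan["manual"]:
    print(name, "->", unit)
```

Whatever order the phonemes are supplied in, the manual sequence comes back as hand shape, hand location, hand orientation, hand movement.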
Stokoe notation:
Stokoe notation was developed in 1960 by William Stokoe. It is written using symbols similar to those of the Roman alphabet, along with other characters that indicate the location, hand shape, hand orientation and hand movement of the sign. It is language independent and usually written horizontally (from left to right). Stokoe notation does not support non-manual features and hence is not frequently used by deaf people. For example, the American Sign Language word snake is represented in Stokoe notation as shown in figure 1.
HamNoSys notation:
HamNoSys (Thomas Hanke, 2004) was introduced by a research group at the University of Hamburg. It represents the physical action of a sign. Compared with the Stokoe notation, it has an additional set of parameters at the end of the word representation that describes the non-manual features. It is symbolic, language independent and written horizontally. HamNoSys was never intended for everyday communication. It was designed to comply with research requirements, e.g. for corpus annotation, sign generation, machine translation and dictionary construction (Tirthankar Dasgupta et al., 2008). For example, the American Sign Language word house is represented in HamNoSys notation as shown in figure 2.
Sign writing notation:
Sign Writing (Y. Bouzid and M. Jemni, 2013) was developed by Valerie Sutton in 1974. HamNoSys and Sign writing are both widely used, but Sign writing has the advantage that facial expressions can be easily represented. Sign writing represents sign gestures, helps in retrieving the signs for a particular word as an organized sequence of gestures, and does not carry the eventual linguistic content. It can be used by linguists to write in sign languages because it provides a means of representing the syntactical and the so-called phonetic and phonological aspects of signs, and it is neutral with respect to meaning. Sign writing is based on a set of graphical and schematic symbols that are highly intuitive and uses simple rules for translating symbols into signs, which makes it simple and effective for both hearing impaired and hearing people. No special technical training in sign language linguistics is required to write the signs. This intuitiveness seems to be the main driving force behind the growing acceptance of the Sign writing notation and the growing interest in written forms of sign languages. The symbols used in this system are pictures that resemble the real forms of the manual and non-manual features, as shown in figure 3.
The table above shows the various notations for the same sign gestures. The columns of the table give the word and its corresponding sign gesture, Sign writing notation, Stokoe notation and HamNoSys notation. The rows of the table show the differences between these notations. It is clear that the symbols used by these notations are hard to remember and to write. The drawback of these representations is that they give little detail about the location of the hand in the signing space; for a phrase, the representation becomes too long to understand easily, and it is difficult to memorize all the symbols. Among them, Sign writing is the simplest. To overcome these problems, an automatic system should be developed that takes any one of these notations and transcribes it into animation sequences that are easy to understand.
In the state of the art, sign gestures are generated using SWML (SignWriting Markup Language), an XML (Extensible Markup Language)-based format developed for the storage and processing of SW (Sign Writing) texts, which allows the interoperability of SW applications and promotes web accessibility for deaf people in their own natural languages. This approach overcomes the drawbacks of the earlier approaches by applying rules to derive the location of the hand in the signing space.
The proposed approach represents the sign gesture using an And-Or tree. The advantage of the proposed method over the earlier methods is that the location of the hand is stated explicitly. Figure 4 shows the And-Or tree representation of one-hand sign gestures. The root is an And node and represents the sign morphemes or gesture shown by the left hand. The children of the root represent the manual and non-manual features of the sign. The manual node and the non-manual node are each formed by the And operation over their children. The children of the manual node are hand shapes, hand locations, hand orientation, hand rotation and hand movements. The children of the non-manual node are eye shapes, eyebrow shapes, mouth shapes, nose shapes, cheek shapes and head movements. The external nodes, or leaves, represent the sign phonemes of hand shapes, hand location, hand orientation, hand movement, eye shapes, eyebrow shapes, mouth shapes, nose shapes and cheek shapes, together with dummy nodes. The phonemes are the basic units that represent the different shapes, locations, orientations, movements and facial expression components. The parent node of each group of sign phonemes (hand shape, hand location, hand movement, eye shapes, eyebrow shapes, mouth shapes, nose shapes, cheek shapes and head movements) is an Or node, whereas the node corresponding to the orientation is an And node. The orientation corresponds to hand posture, hand rotation and touch (contact with body parts). Table 3 lists the sign gesture phonemes.
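A minimal Python sketch of the one-hand tree described above may help make the node types concrete. It assumes that each child of the orientation And node (position, rotation, touch) is itself an Or node, and every leaf value such as "flat" or "palm up" is a hypothetical placeholder; only the And/Or layout follows the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    label: str
    kind: str = "leaf"  # "and", "or" or "leaf"
    children: List["Node"] = field(default_factory=list)


def or_node(label: str, options: List[str]) -> Node:
    """Or node: exactly one of the leaf phonemes is chosen."""
    return Node(label, "or", [Node(o) for o in options])


def and_node(label: str, children: List[Node]) -> Node:
    """And node: all children take part, in left-to-right order."""
    return Node(label, "and", children)


def build_one_hand_tree(hand: str) -> Node:
    # Leaf values are illustrative placeholders, not real phoneme inventories.
    orientation = and_node("hand orientation", [
        or_node("hand position", ["palm up", "palm down"]),
        or_node("hand rotation", ["none", "clockwise"]),
        or_node("hand touch", ["none", "forehead"]),
    ])
    manual = and_node("manual", [
        or_node("hand shape", ["flat", "fist"]),
        or_node("hand location", ["forehead", "chest"]),
        orientation,
        or_node("hand movement", ["none", "tap"]),
    ])
    non_manual = and_node("non-manual", [
        or_node(name, ["dummy", "marked"])
        for name in ("eye shape", "eyebrow shape", "mouth shape",
                     "nose shape", "cheek shape", "head movement")
    ])
    return and_node(f"{hand} hand sign", [manual, non_manual])


def resolve(node: Node, choices: Dict[str, str]) -> List[str]:
    """Walk the tree left to right and return the selected phonemes.

    An Or node without an explicit choice falls back to its first leaf.
    """
    if node.kind == "leaf":
        return [node.label]
    if node.kind == "or":
        wanted = choices.get(node.label, node.children[0].label)
        for child in node.children:
            if child.label == wanted:
                return [f"{node.label}={child.label}"]
        raise ValueError(f"{wanted!r} is not a leaf of {node.label!r}")
    selected: List[str] = []
    for child in node.children:  # And node: every child contributes
        selected.extend(resolve(child, choices))
    return selected


tree = build_one_hand_tree("left")
print(resolve(tree, {"hand shape": "fist", "hand movement": "tap"}))
```

Selecting one leaf under every Or node and concatenating the results of every And node yields one concrete sign gesture for the hand.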
A similar tree is constructed for the right hand. The two sub-trees are joined to form the final representation of a two-handed sign gesture, as shown in figure 5.
The next section discusses the correctness of the And-Or representation of sign gestures and also proposes the ideal representation of the tree for the various possible kinds of sign gesture.
Correctness of the tree representation:
This section gives an overview of constructing the And-Or tree from the basic sign phonemes. Sign gestures are classified as one-hand signs with or without facial expression and two-handed signs with or without facial expression.
Single hand signs with or without facial expression:
Such a sign can be represented by a single left or right And-Or tree, depending on the hand selected. If there is no facial expression, the dummy nodes of the facial expression phonemes are combined with the manual features. Otherwise the corresponding facial expression phonemes are rendered along with the manual features. Example: Father. The sign gesture for father is shown in figure 6.
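The role of the dummy nodes can be illustrated with a short Python sketch. The function names and the manual phoneme values below are hypothetical; in particular, the example does not claim to reproduce the actual phonemes of the sign for father.

```python
# Hypothetical sketch: a single-hand sign whose missing facial expression
# phonemes are filled with dummy nodes.
NON_MANUAL_FEATURES = (
    "eye shape", "eyebrow shape", "mouth shape",
    "nose shape", "cheek shape", "head movement",
)
DUMMY = "dummy"


def fill_non_manual(given=None):
    """Return a value for every non-manual feature, using DUMMY where none is given."""
    given = given or {}
    unknown = set(given) - set(NON_MANUAL_FEATURES)
    if unknown:
        raise ValueError(f"unknown non-manual features: {sorted(unknown)}")
    return {name: given.get(name, DUMMY) for name in NON_MANUAL_FEATURES}


def single_hand_sign(hand, manual, non_manual=None):
    """Build the record for a one-hand sign with or without facial expression."""
    return {
        "hand": hand,
        "manual": dict(manual),
        "non_manual": fill_non_manual(non_manual),
    }


# Placeholder manual phonemes; no facial expression is supplied.
sign = single_hand_sign("right", {"hand shape": "fist", "hand location": "chin"})
print(sign["non_manual"])
```

Because every non-manual slot always holds either a real phoneme or a dummy, the same rendering path can serve signs with and without facial expression.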
Two hand signs with or without facial expression:
These signs can be represented by joining the left and right And-Or trees through a root that is an And node; if facial expressions are present, they are generated as well. Example: Family. The sign gesture for family is shown in figure 7.
If a sign gesture is a combination of more than one sign gesture, the single left-handed and right-handed sub-trees are replicated. In this case the root is the conjunction of two or more single And-Or trees. The temporal aspects are also taken care of in this representation: the signs are ordered from left to right. Example: Brother-in-law. The sign gesture for brother-in-law is shown in figure 8.
Synchronisation at three different levels is considered in this representation: at the low level between the manual and non-manual features, at the middle level between the hands, and at the high level between the signs. An order property is specified to maintain the synchronisation. The order property is that rendering proceeds from left to right, that is, the leftmost tree is rendered first.
Low level synchronisation is between the manual and non-manual features. It can be represented as single hand sign gesture = manual ∨ non-manual (manual and non-manual); in this ordering the left hand tree is rendered first, then the right hand tree, and both are combined at the root level. Middle level synchronisation is between the two hands. Here the root is the combination of two single hand sign gesture trees, represented as two handed sign gesture = single hand sign gesture ∨ single hand sign gesture. The third level of synchronisation is between the signs and is represented as multi hand sign gesture = two handed sign gesture ∨ two handed sign gesture. At each of these levels a new root node is formed by combining the sub-trees through conjunction operations. This construction is similar to that of a leftist tree.
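The three levels and the left-to-right order property can be sketched as follows. This is a simplified illustration: trees are plain dictionaries, the helper names are hypothetical, and each manual or non-manual part is collapsed into a single leaf so that only the rendering order is visible.

```python
# Hypothetical sketch of the three synchronisation levels.
def leaf(label):
    return {"label": label, "children": []}


def combine(label, *subtrees):
    """Form a new root over the sub-trees, keeping them in left-to-right order."""
    return {"label": label, "children": list(subtrees)}


def render(tree):
    """Return the leaves in rendering order: the leftmost sub-tree comes first."""
    if not tree["children"]:
        return [tree["label"]]
    order = []
    for child in tree["children"]:
        order.extend(render(child))
    return order


def single_hand(sign, hand):
    # Low level: manual and non-manual features of one hand.
    return combine(f"{sign} {hand} hand",
                   leaf(f"{sign}/{hand}/manual"),
                   leaf(f"{sign}/{hand}/non-manual"))


def two_handed(sign):
    # Middle level: left-hand tree followed by right-hand tree.
    return combine(sign, single_hand(sign, "left"), single_hand(sign, "right"))


# High level: two signs in sequence, e.g. the parts of a compound sign.
utterance = combine("utterance", two_handed("sign-1"), two_handed("sign-2"))
print(render(utterance))
```

Each call to `combine` corresponds to forming a new root at one level, and the traversal in `render` is what enforces the order property.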
3.5 Extension of the tree structures:
A new phoneme can be inserted in constant time, because it is attached directly as a leaf (external node). The structural property of this representation is that the leaves form the basic units, so both insertion and deletion of phonemes are done in constant time.
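One way to obtain constant-time insertion and deletion is to keep the leaves of each Or node in a hash map. The sketch below assumes that the caller already holds a reference to the parent Or node and relies on the average-case constant-time behaviour of Python dictionaries; the class name and the stored unit identifiers are hypothetical.

```python
# Hypothetical sketch: leaf phonemes kept in a dictionary under their Or node.
class OrNode:
    def __init__(self, label):
        self.label = label
        self.leaves = {}  # phoneme name -> stored basic unit

    def insert(self, name, unit):
        """Attach a new leaf phoneme (average-case constant time)."""
        self.leaves[name] = unit

    def delete(self, name):
        """Detach a leaf phoneme; returns the removed unit or None if absent."""
        return self.leaves.pop(name, None)


hand_shape = OrNode("hand shape")
hand_shape.insert("flat", "unit-001")
hand_shape.insert("fist", "unit-002")
hand_shape.delete("flat")
print(sorted(hand_shape.leaves))
```

No other part of the tree is touched when a phoneme is added or removed, which is why the cost does not depend on the size of the tree.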
The avatar should have realistic body movement, facial features and individual fingers on the hands, and should be able to articulate non-manual features. Since all the basic units are in the tree structure, the sign gesture is properly rendered.
All features, characters and any other variables should be consistent throughout. Since all the features are retrieved from the same tree, consistency is maintained.
All movement, particularly that of the hands, should flow smoothly within signs and throughout the whole utterance. As the order property ensures synchronisation, fluidity is also achieved.
Conclusion and performance evaluations:
This section briefly describes avatar synthesis using the Sign writing notation and using the And-Or tree. Figure 9 shows the improper rendering of a signing avatar by the VSigns animation system (M. Papadogiorgaki et al., 2004).
The highlighted region of the figure clearly shows that the two hands overlap each other and that the sign is not properly rendered (Kgatlhego Moemedi and James Connan, 2010). This is due to the lack of information about hand orientation and hand movement. With the And-Or tree this is not the case: the animations are properly generated and rendered because the model is constructed in left-to-right order. For complex signs, rendering from the SW notation is not effective, whereas with the And-Or tree approach it can easily be achieved through synchronisation. The following graph compares sign gesture rendering by the Sign writing notation and by the And-Or notation.
In figure 10, the X-axis of the bar graph represents the words and the Y-axis represents the number of sign gestures rendered correctly with each notation. It is clear from the plot that the And-Or representation is better than the Sign writing notation for generating sign gestures. The approach is efficient and easy to use, and it synchronises the sign gestures well. The representation can be extended to more sign phonemes.
Received 12 October 2014
Received in revised form 26 December 2014
Accepted 1 January 2015
Available online 25 February 2015
Kgatlhego Moemedi and James Connan, 2010. Rendering an Animated Avatar from Sign Writing Notation. Southern Africa Telecommunication Networks and Applications Conference.
Narashiman, D., Bavatharani Suriyan and Dr. T. Mala, 2013. An Avatar Rendering Hand Gesture for Tamil Words. 12th Tamil Internet Conference, pp: 71-75.
Papadogiorgaki, M., N. Grammalidis, L. Makris, N. Sams and M.G. Strintzis, 2004. VSigns--A Virtual Sign Synthesis Web Tool. Information and Knowledge Management for Integrated Media Communication, pp: 25-31.
Thomas Hanke, 2004. HamNoSys--Representing Sign Language Data in Language Resources and Language Processing Contexts. Fourth International Conference on Language Resources and Evaluation.
Tirthankar Dasgupta, Sandipan Dandpat, Sambit Shukla, Sandeep Kumar, Synny Diwakar and Anupam Basu, 2008. A Multilingual Multimedia Indian Sign Language Dictionary Tool. 6th Workshop on Asian Language Resources, pp: 57-64.
D. Narashiman and Dr. T. Mala
Anna University, Department of Information Science and Technology, Faculty of Information and Communication Engineering, College of Engineering Guindy, Chennai-600 025
Corresponding Author: D. Narashiman, Teaching Fellow, Anna University, Department of Information Science and Technology, Faculty of Information and Communication Engineering, College of Engineering Guindy, Chennai-600 025
Table 2: Types of sign gestures.
Type 1 - Single hand signs with or without facial expression: I, Me, You, It, They.
Type 2 - Two handed signs with or without facial expression: Family, Baby, Wedding, Relative, Children.
Type 3 - Complex signs: Brother-in-Law, Grand-son, Hard Working.

Table 3: Sign Gesture-Phonemes.
SG - Sign Gesture
MF - Manual Feature
NMF - Non-Manual Feature
HS - Hand Shapes
HL - Hand Location
HO - Hand Orientation
HM - Hand Movement
S - Shape
L - Location
O - Orientation
M - Movement
HP - Hand Position
HR - Hand Rotation
HT - Hand Touch
P - Position
R - Rotation
T - Touch
ES - Eye Shape
EBS - Eyebrow Shape
NS - Nose Shape
BM - Body Movement
MU - Mouth
CS - Cheek Shape
E - Eye
EB - Eyebrow
N - Nose
MUS - Mouth Shape
|Author:||Narashiman, D.; Mala, T.|
|Publication:||Advances in Natural and Applied Sciences|
|Date:||Jun 1, 2015|