
Concept of video bookmark (videomark) and its application to the collaborative indexing of lecture video in video-based distance education.

This article describes the development of the video bookmark, hereinafter referred to as the videomark, and its application to a video-based distance-education system. With the progress of computer and network technology, many higher-education institutions such as universities now deliver their lectures over a network. Online education that uses personal computers and computer networks is one of the most promising and attractive learning methods (Harasim, Hiltz, Teles, & Turoff, 1995; Hazemi, Hailes, & Wilbur, 1998). Many tools already exist for online education, such as electronic mail (e-mail), bulletin board systems (BBS), and web-based online learning systems (Wolfe, 2000; Eisenstadt & Vincent, 1998). Currently, however, all these tools exist separately. It is proposed that a more effective learning environment can be provided by combining them.


When we build online education systems, attractive teaching materials are essential. Participants in online education are isolated from one another, and their motivation to study often declines when they encounter tedious materials. Nowadays, however, many technologies are available for creating attractive teaching materials, and multimedia in the form of text, images, and sound is widely available and easy to use on personal computers (Vince & Earnshaw, 2001). Of these media, video is the most impressive and the easiest to understand. Creating a lecture video requires less effort than preparing teaching materials in other media, and watching a lecture video enables learners to study as if they were participating in traditional classroom-style education. Usually, classroom lectures are recorded on videotape and delivered through the network, or offline, to participants who watch them and study. However, lectures usually include redundant parts; not all parts of a lecture are essential. Therefore, to watch a lecture video efficiently, it is necessary to find the essential parts of the lecture, and to find those parts, it is necessary to index the video data.

In this article, the concept of the videomark, a variant of the bookmark, is proposed. A videomark is an electronic bookmark placed in the video. Furthermore, the combination of lecture video and the bulletin board system (BBS) is also proposed. Messages posted to the BBS are treated as videomarks of the lecture video. The combination of videomarks and BBS realizes a discussion-embedded lecture video. In this system, comments that are posted by the participants to the BBS are embedded in the corresponding parts of the video automatically. The participants can read the comments on the related topic while watching the lecture video, and can also watch the corresponding part of the lecture video while reading the discussion. Therefore, the videomarks can also be used as an index of the lecture video.

The development of a prototype system that adopts the previously mentioned feature is also described in this article. This system demonstrates that the proposed relation between the lecture and the discussion facilitates participant comprehension and promotes discussion effectively.


To indicate the essential parts of the lecture video, the following two methods can be adopted.

* Post-editing of the lecture video: The original raw lecture video is post-edited and only the essential parts are delivered.

* Indexing of the lecture video: An index of the contents is provided with the raw video.

However, both of these methods are costly and labor-intensive. Therefore, automatic indexing is strongly desired. Automatic video indexing is usually performed as follows (Furht, Smoliar, & Zhang, 1995):

1. Partitioning the video data: First, identify the elemental index units for video contents. In the case of text, for example, these units are words and phrases. In the case of video, these units are video frames and frame sequences. Camera breaks and scene changes are typical cues for partitioning the video data.

2. Representation and classification: After partitioning the video data, it is necessary to represent its contents. This representation may involve the use of text, or some typical video clip.

3. Indexing and storing: After representing the elemental index units, they should be arranged by their contents for retrieval.

4. Retrieving: Finally, users retrieve the video scenes they want to watch. To retrieve scenes, metrics that represent the similarity of scenes have to be introduced. Pair-wise pixel comparison and histogram comparison are typical examples of such metrics.
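Step 4 above mentions histogram comparison as a typical similarity metric. The following sketch illustrates that metric for 8-bit grayscale frames stored as byte vectors; the 16-bin quantisation and all names are our assumptions, not taken from the cited work.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Illustrative sketch only; names and bin count are assumptions.
// Grayscale histogram of an 8-bit frame, quantised into 16 bins.
std::array<int, 16> histogram(const std::vector<std::uint8_t>& frame) {
    std::array<int, 16> bins{};
    for (std::uint8_t px : frame) bins[px / 16]++;
    return bins;
}

// Sum of absolute bin differences between two frames; a value above a
// chosen threshold suggests a camera break or scene change.
int histogramDifference(const std::vector<std::uint8_t>& a,
                        const std::vector<std::uint8_t>& b) {
    const auto ha = histogram(a), hb = histogram(b);
    int diff = 0;
    for (std::size_t i = 0; i < ha.size(); ++i)
        diff += std::abs(ha[i] - hb[i]);
    return diff;
}
```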

However, automatic indexing of the video data is difficult for two main reasons. The first reason is that video data does not have a clear alphabet. (Here, the term alphabet is used to denote the primitive description of the data.) For example, a natural language such as English has an alphabet set {a, b, c, ...} and an electric circuit drawing has an alphabet set {resistance, capacitor, transistor, ...}. However, video data does not have such a clear alphabet set. Basically, video data is a set of frames and each frame is a set of pixels. Each pixel represents the color information of the point. Therefore, pixels can be considered to be the alphabet of video data. But pixels are too primitive to be used as the alphabet of video data. Each pixel does not represent the contents of the video appropriately. The contents of video data are built by a set of video frames.

The second reason is the lack of a unified interpretation of the data. The term unified interpretation means that there is only one interpretation of a given piece of data. For example, there is no ambiguity in the interpretation of a circuit drawing. On the other hand, documents, sentences, and video scenes do not have a unified interpretation (Haga, 1997). These two reasons make it difficult to index video data automatically.

In our research, a different approach was taken. Namely, it is proposed that the index of video data should be created not automatically by computers, but by human beings. Our assumption is that only a human being can understand the contents and meaning of video data.

Furthermore, we emphasize that the index should be created not by a single person but by several people. As there is no unified interpretation of video data, an index built by one person often reflects only that indexer's viewpoint. Indexing by several people enables more objective indexing. In the case of lecture video indexing, several participants should take part in creating the index.


Collaborative Indexing

To have the index created by lecture participants, the concept of the discussion-embedded video is proposed. The discussion-embedded video is video data that virtually includes discussions, comments, and messages within the video data. All messages are inserted at the appropriate positions in the lecture video, and participants can treat these messages as the index of the lecture video. Furthermore, messages are managed hierarchically, as on a conventional BBS. In other words, the lecture video and a BBS are combined in our system. Figure 1 illustrates an example display of this combination of lecture video and BBS.

As illustrated in Figure 1, the bulletin board includes one additional component: the timeline of the lecture video (enclosed by a dashed line). This timeline represents the progress of the lecture video, and a marker on it (indicated by a black arrow in Figure 1) indicates the current play position. The rest of the bulletin board is similar to that of a conventional BBS. The right part of the board displays the messages posted by the participants. Each top-level message has a pointer to the relevant part of the timeline of the lecture video, and additional comments and replies to each initial message are attached in the tree-structure form adopted by conventional BBS systems.



In this system, participants who have watched the lecture video send messages to the BBS. A relation link between message and lecture video is created automatically when participants post their messages. Accumulated messages represent the summary of the lecture video. By viewing this board, later participants can understand the summary of the lecture video. This can be considered as an indexing of the lecture video. This index is created not by one person but by many participants. Therefore, this indexing can be called collaborative indexing.

General System Architecture

To create a discussion-embedded lecture video system, the method of distributing the lecture videos has to be considered first. The easiest method of distribution is to use a computer network: all lecture videos are stored on video servers, and participants access them by way of the network. However, watching video of sufficient quality requires a high-capacity communication infrastructure such as an optical fibre network. Currently, few people have access to such an environment. Up until now, most people have used normal telephone lines, Integrated Services Digital Network (ISDN), or Asymmetric Digital Subscriber Line (ADSL) connections to access a network. These communication lines do not have enough capacity for watching lecture videos of sufficient quality.

On the other hand, almost all modern personal computers are equipped with high-speed removable storage devices such as CD-ROM and DVD-ROM drives. The transfer speed of these removable devices exceeds several megabytes per second, which is sufficient for playing full-screen (640 × 480 pixel) motion pictures. Furthermore, there is no difficulty in creating video CDs or DVDs. Therefore, these high-speed, high-density storage devices are suitable for distributing lecture videos.

Figure 2 illustrates the general architecture of the system. In Figure 2, the "Discussion Management Server" manages the discussion: all comments and related information are stored on this server, and all messages submitted by the participants are sent to it. The server periodically sends the discussion information to the client PCs, and participants receive the most recent discussion information when they log in to the system. As the lecture videos are distributed on CD-ROM, they need not be sent over the network. Discussion information consists mainly of text data; therefore, sending it over the network does not require a high-performance infrastructure.

Our prototype system works in the following manner:

1. CD-ROMs (DVD-ROMs) that contain lecture videos are distributed in advance.

2. Participants (learners) watch a lecture video on their PCs at a time convenient to them.

3. Participants send their messages to the discussion management server. When they send messages, information on the related position in the video is sent simultaneously and automatically.

4. Messages and position information are sent to all the clients at the appropriate time, for example, at the time of logging in to the system or at certain intervals.

5. By receiving information from the discussion management server, all participants are able to get complete information on the discussion.

Roughly speaking, the video data is a sequence of still images (called "frames"). Therefore, the position in the video data can be identified by pinpointing the number of the frame. In practical terms, this means sending information on the positions in the lecture video by putting videomarks in it. Comments, messages, and discussions will follow these videomarks. Usually, bookmarks pinpoint important information. Therefore, a set of bookmarks indicates the essential parts of the lecture video. By collecting the videomarks and watching the corresponding parts of the lecture video, the participants will be able to watch the essential parts of the lecture. Furthermore, as discussions are based on the contents of the lecture, videomark information helps users to understand the discussion more deeply.
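The frame-based addressing described above can be sketched as a small data structure. This is a minimal illustration that assumes a fixed frame rate of 30 frames per second; the struct and function names are ours, not the prototype's.

```cpp
// Illustrative sketch; names and the fixed frame rate are assumptions.
// A videomark pins a BBS message to one position in the lecture video.
struct Videomark {
    int frameNumber;  // position in the video (frame count from start)
    int commentId;    // the message attached at this position
};

constexpr double kFramesPerSecond = 30.0;  // assumed fixed frame rate

// Frame number shown at a given playback time (seconds from the start).
int frameAtTime(double seconds) {
    return static_cast<int>(seconds * kFramesPerSecond);
}

// Playback time (in seconds) at which a given frame is shown.
double timeAtFrame(int frame) {
    return frame / kFramesPerSecond;
}
```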

Embedding Discussions within the Lecture Video as an Index

Conceptually, each comment is embedded in the corresponding part of the video by using the method shown in Figure 3. This method uses an allocation table that relates each comment to the corresponding part of the video. The allocation table has two sets of pointers: one points to a comment and the other to the lecture time at which the comment was attached. Each comment also has pointers to its replies; thus, the message/reply structures are managed hierarchically, which helps the participants follow the thread of a topic in the same way as on a conventional BBS. The allocation table and comment files are stored on the Discussion Management Server shown in Figure 2. As these tables consist of text data, sending them from servers to clients requires far less capacity than sending video data. Of course, a comment can in general relate to more than one part of the video. However, only a one-to-one relationship is used in the proposed system, based on the assumption that each comment has only one main reference. It is important for contributors to be able to relate a comment to the lecture easily, because this eases the mental burden on the contributor and promotes further contribution.
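A minimal sketch of the allocation table described above: each comment carries a single lecture-time pointer (the one-to-one relation), and replies are kept hierarchically. All type and member names are illustrative assumptions, not taken from the prototype.

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative sketch; all names are assumptions.
// One BBS message; replies are kept as child ids, so each topic forms
// a tree, as on a conventional BBS.
struct Comment {
    int id;
    std::string text;
    std::vector<int> replyIds;  // hierarchical message/reply structure
};

// The allocation table relates each root comment to the single lecture
// time (seconds from the start of the video) at which it was attached.
class AllocationTable {
public:
    void attach(int commentId, double lectureTime) {
        timeOf_[commentId] = lectureTime;  // one-to-one relation only
    }
    // Lecture time the comment points to; -1 if no videomark exists.
    double lectureTimeOf(int commentId) const {
        const auto it = timeOf_.find(commentId);
        return it == timeOf_.end() ? -1.0 : it->second;
    }
private:
    std::map<int, double> timeOf_;
};
```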

Functions of a Prototype System

This section describes the fundamental functions of our prototype system. Figure 4 shows a general overview of the system. The digital lecture video can be played on the Lecture Video Form ([A] in Figure 4) by using the play/stop button ([B] in Figure 4).



The fundamental functions of this system are as follows:

1. Comment contribution: Participants can contribute a comment relating to the part of the video that they are watching. When they want to comment on a part while watching the video, clicking the Comment Contribution Button ([C] in Figure 4) produces the Contribution Form ([D] in Figure 4), on which they can write their comment. This comment is then added to the comment file, and the relation between the comment and the time of that part of the video is added to the allocation table (Figure 3) automatically. Participants can not only start a topic, but also reply to every comment. Clicking the Reply Button produces the Reply Form ([E] in Figure 4).

2. Discussion display: The discussion is displayed in the Discussion Form, which presents the relation between each comment and the relevant part of the lecture ([F] in Figure 4). Although it looks similar to a general BBS, the most important difference is that all root comments are sorted according to the time of the relevant part of the lecture, not by the date of the comment. A coloured line links each root comment to a corresponding point on a slider bar that expresses the time of the lecture video. All comments on a topic are displayed together hierarchically, for every topic, in the same way as on a general BBS. If a comment is selected, its title, author, and date are coloured in blue, and its content is shown in the Content of Comment Form ([D] in Figure 4).


3. Current video time display: To show the relation between the part of the video currently being watched and the discussion, the location of the slider ([G] in Figure 4) indicates the current video time. Moreover, topics near the currently watched part are coloured in red. Therefore, participants can easily find and read the comments related to the part being watched.

4. Watching corresponding part of video: This function enables participants to watch the part of the video that corresponds to a comment that they are reading. This part of the video can be thought of as background knowledge to the comment; thus, they can understand the comment more deeply. Clicking the Watch Lecture Button ([H] in Figure 4) plays the part of the video corresponding to the selected topic of discussion.
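The ordering rule of function (2) above, sorting root comments by the lecture time they refer to rather than by posting date, can be sketched as follows; the field and function names are illustrative assumptions, not taken from the prototype.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative sketch; all names are assumptions.
// A root comment carries both its posting date and the lecture time
// (via its videomark) of the part of the video it refers to.
struct RootComment {
    std::string title;
    long postedAt;       // posting date, e.g., a Unix timestamp
    double lectureTime;  // seconds into the lecture video
};

// Unlike a conventional BBS, which orders comments by posting date,
// the Discussion Form sorts root comments by lecture time.
void sortByLectureTime(std::vector<RootComment>& comments) {
    std::sort(comments.begin(), comments.end(),
              [](const RootComment& a, const RootComment& b) {
                  return a.lectureTime < b.lectureTime;
              });
}
```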

Our prototype system was developed on an IBM-PC compatible machine with a Linux operating system. C++ was used for programming.


1. Relation between discussion and lecture: If participants want a deeper understanding of Petri Net, they can easily select the corresponding discussion. Of course, all comments about the example of Petri Net are managed hierarchically; thus, the thread of the discussion can be comprehended clearly. Conversely, if participants read this discussion before they watch the lecture and want to know its background, the Watch Lecture Button provides that background (see Section 3.4 (4)). This system relates a comment to a specific point in the lecture video. However, some comments need to be related to a longer section of the video or to more than one point. Therefore, this method of relating discussion and video still has room for improvement.

2. Comment management: In this system, participants can read topics together that are based on a closely preceding or succeeding part of the video. This function enables participants to select all comments on a topic appropriately, because topics based on a closely preceding or succeeding part of the video can be considered close in respect to content as well (see Figure 4). For example, when a participant reads a topic about an example of Petri Net ([B] in Figure 5), he or she can also read a topic about the disadvantages of Petri Net ([C] in Figure 5). That is, all topics about Petri Net can be read together. In this way, participants can find various viewpoints on a topic and thus a deep discussion about a topic may be achieved. In contrast, with a conventional BBS, it is difficult to select comments about a topic to read together because topics are managed by the date of the comments.

3. Determining the centre of participants' interests: This system can also determine the centre of the participants' interests. The density of the lines that tie topics to the slider bar indicates the quantity of comments around a specific point in the lecture, whose time corresponds to the location on the slider bar. Therefore, the centre of interest can be determined easily (see Figure 6). For example, in Figure 6, participants' interests are concentrated on the latter half of the form. This feature is especially useful for lecturers, because they can grasp the focus of the discussion and lead it in an appropriate direction.
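The density reading described in item (3) can be computed directly from the videomark times. The sketch below bins the marks along the lecture timeline and reports the busiest bin; the bin width and all names are our assumptions, not the prototype's.

```cpp
#include <vector>

// Illustrative sketch; names and binning scheme are assumptions.
// Count videomarks falling into fixed-width time bins along the
// lecture. The busiest bin shows where interest is concentrated.
std::vector<int> interestHistogram(const std::vector<double>& markTimes,
                                   double videoLength, int numBins) {
    std::vector<int> bins(numBins, 0);
    for (double t : markTimes) {
        const int b = static_cast<int>(t / videoLength * numBins);
        if (b >= 0 && b < numBins) ++bins[b];
    }
    return bins;
}

// Index of the bin with the most videomarks: the centre of interest.
int centreOfInterest(const std::vector<int>& bins) {
    int best = 0;
    for (int i = 1; i < static_cast<int>(bins.size()); ++i)
        if (bins[i] > bins[best]) best = i;
    return best;
}
```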

The prototype system was used experimentally in the author's laboratory. Some students used it to study topics including automata, formal languages, and compiler construction. However, fewer than 10 students used the system, so it was difficult to obtain statistically meaningful results. Currently, our system remains a prototype. As a next step, we should evaluate the system by applying it to a practical situation, and we are now preparing for such practical use. A new version for the Windows operating system is under development and will be used in the author's institution; in this evaluation step, more than 100 students will use the system. Lessons learned from this practical use will be reported later.


A concept of a computer-mediated education system that uses discussion-embedded lecture videos has been proposed in this article. This system manages comments by relating them to the time of the lecture video, not to the date of the comment. Through this discussion management method, participants can understand the content and background of the discussion deeply, can accurately select comments that are relevant to each other, and can easily determine the centre of the participants' interests. These advantages facilitate further discussion among the participants, and thus a more effective distance-education environment can be achieved. As a next step, we will (a) apply this system to a practical situation to obtain statistically meaningful results, and (b) integrate this system into e-learning support tools.

In future work, we will improve the method that relates a comment to a point in the video. More precisely, the ability to relate comments to a longer section of the video and to more than one point is desirable. We should also generalize the methodology of the text-data-embedded video and apply it to other fields for more effective text data management.


Furthermore, we would like to extend the concept of the proposed method to a new data management method. In our proposed system, the lecture video is treated as an index of the discussions and the discussion board is treated as an index of the lecture video. In other words, the video and the discussion board are mutual indexes. We will be able to extend this concept to be a new indexing method for video data.


Eisenstadt, M. & Vincent, T. (Eds.) (1998). The Knowledge Web: Learning and Collaborating on the Net. London: Kogan Page Ltd.

Furht, B., Smoliar, S.W., & Zhang, H. (1995). Video and Image Processing in Multimedia Systems. Dordrecht, Netherlands: Kluwer Academic Publishers.

Haga, H. (1997). Classification framework of information based on two criteria. In Proceedings of the 14th IASTED International Conference on Applied Informatics (pp. 246-249). ACTA Press.

Harasim, L., Hiltz, S.R., Teles, L., & Turoff, M. (1995). Learning Networks. London: The MIT Press.

Hazemi, R., Hailes, S., & Wilbur, S. (1998). The Digital University. London: Springer-Verlag London Limited.

Vince, J. & Earnshaw, R. (Eds.) (2001). Digital Content Creation. London: Springer-Verlag.

Wolfe, C.R. (Ed.) (2000). Learning and Teaching on the World Wide Web. San Diego, CA: Academic Press.


COPYRIGHT 2004 Association for the Advancement of Computing in Education (AACE)

Article Details
Author: Haga, Hirohide
Publication: International Journal on E-Learning
Date: Jul 1, 2004
