IUCS Tables of Contents: 09

Proceedings of the 3rd International Universal Communication Symposium

Fullname: Proceedings of the 3rd International Universal Communication Symposium
Editors: Kazumasa Enami
Location: Tokyo, Japan
Dates: 2009-Dec-03 to 2009-Dec-04
Standard No: ISBN 1-60558-641-2, 978-1-60558-641-0; hcibib: IUCS09
Summary: Today's world has observed the rapid growth of intelligent communication technology as efficient community infrastructure that is capable of distributing massive amounts of information anytime and anywhere. However, before we are able to freely exchange all information with anyone we choose, various communication barriers such as those of language, mistrust and distance need to be overcome. To discuss these issues, and to exchange ideas and results across a broad spectrum of fields related to universal communication technologies, two international symposia have been held in Kyoto and Osaka, respectively, in the past two years.
  1. Keynote papers
  2. Machine translation
  3. Ultra-realistic image technology I
  4. Human analysis
  5. Language resources
  6. Ultra-realistic image technology II
  7. Organized session: Real-world application
  8. Information analysis
  9. Ultra-realistic sound technology
  10. Visual information processing
  11. Search and interface
  12. Multisensory interfaces and cognitive dynamics
  13. Human computer interaction
  14. Poster session

Keynote papers

Language technology infrastructures in support to multilingualism 3-11
  Joseph Mariani
The challenges of multilingualism are many, and needs are important, both in Europe and internationally. Language Technologies can help to meet those needs, but necessitate developing appropriate infrastructures and generating the resources which are mandatory to conduct research for the different languages. Some programs support this area, but suffer from a lack of scale, continuity and cohesion. The effort deserves to be coordinated between nations and international organizations in order to facilitate multilingualism in Europe and globally.
Communications and open systems 12-13
  Mario Tokoro
The ultimate purpose of communications is understanding each other. Natural languages play the central role of communications, but other means such as gestures, facial expression, and gaze in the situations are equally important. Physical and social common sense is indispensable, and the historical backgrounds of nations, regions, families, and individuals of speakers and listeners are never negligible. All of these means, modes, and aspects are mutually dependent and change as time progresses.
   The method of modern science established in the 17th century contributed enormously to scientific advances and technological progress. In the method, we first define the domain of a problem, then reduce the problem in a way that exposes its true nature, and finally discover the underlying principles of the problem domain. When the domain of a problem is too unwieldy and too large for easily reducing the problem, it is broken up into smaller elements that are subjected to the same process. Hence it is called reductionism.
   Nonetheless, there are still plenty of stubborn issues that are not easily resolved. These unsolved issues are complicated ones that could not be addressed simply by reductionism alone. Earth sustainability is an example of such an issue. It involves energy, climate, population, food, biodiversity, safety assurance, etc., which are mutually dependent, and cannot be solved independently from the others. Another example is life and health. Many properties of the human body have been discovered through molecular biology, but real life also seems to be stochastic, contingent, and historical. Yet another example is the safety of gigantic infrastructures connected through networks. These infrastructures grow and change while they continue to function even in the event of various incidents without having any significant effect on the everyday lives of people. All these issues are related to the problems of integrated systems consisting of numerous interrelated subsystems. The solutions of individual problems cannot solve the overall problem and may even cause another problem or worsen the overall problem. Communications issues are such problems and may not be solved independently from the others.
   To solve such problems of integrated complex systems, a new approach called open systems science is proposed. The comparison of closed systems and open systems is presented first, and then the definition of open systems science is given. Some applications of this method to actual important problems are exemplified, and the issues on communications are discussed in depth.
A computer scientist looks at the energy problem 14-21
  Randy H. Katz

Machine translation

Discarding monotone composed rule for hierarchical phrase-based statistical machine translation 25-29
  Zhongjun He; Yao Meng; Hao Yu
Hierarchical phrase-based statistical machine translation systems often suffer from a huge rule table. This paper proposes a basic and efficient method for rule table reduction: discarding monotone composed rules. These rules are redundant because they can be monotonically recreated from minimal rules. Experiments show that the rule table is reduced by 57% to 71% without worsening translation quality.
Keywords: monotone composed rule, rule table reduction, statistical machine translation
Accuracy evaluation of sentences translated to intermediate language in back translation 30-35
  Mai Miyabe; Takashi Yoshino
The back-translation method is used to check the accuracy of a sentence translated to a native language. We believe that there exists a positive correlation between the accuracy of sentences translated to an intermediate language and that of back-translated sentences. However, this has not yet been verified, and some back-translated sentences have high accuracy even if the translated sentence is inaccurate. Therefore, we have to verify the correlation between the accuracy of sentences translated to an intermediate language and that of back-translated sentences. We have evaluated the accuracy of back-translated sentences and that of sentences translated to an intermediate language to establish the correlation between the two accuracies. We have obtained the following results: (1) There exists a positive correlation between the accuracy of sentences translated to an intermediate language and that of back-translated sentences. (2) The occurrence rate of an accuracy mismatch case, wherein a back-translated sentence is accurate but the translated sentence is inaccurate, is less than or equal to 0.5%. (3) Back-translation can be used to check the accuracy of a translated sentence.
Keywords: back translation, machine translation, translation accuracy
Language independent word segmentation for statistical machine translation 36-40
  Michael Paul; Andrew Finch; Eiichiro Sumita
This paper proposes an unsupervised word segmentation algorithm that identifies word boundaries in continuous text in order to optimize the translation quality of statistical machine translation (SMT) approaches. The proposed method is language-independent and uses a parallel corpus to align source language characters to the corresponding word units separated by whitespace in the target language. Successive characters aligned to the same target words are merged to a larger source language unit and a Maximum Entropy (ME) algorithm is applied to learn the word segmentation that optimizes the translation quality of an SMT system trained on the re-segmented bitext. Experimental results translating five Asian languages into English revealed that the proposed method outperforms a baseline system that translates unigram segmented source language sentences.
Automatic extraction of bilingual terms from a Chinese-Japanese parallel corpus 41-45
  Xiaorong Fan; Nobuyuki Shimizu; Hiroshi Nakagawa
This paper proposes a new approach for the automatic extraction of bilingual terms from a domain-specific bilingual parallel corpus. We combine an existing monolingual term extractor and a word alignment tool to extract bilingual terms. Our method differs from past studies in that we simply use a word alignment tool to extract multi-word terms, and we use one monolingual term extractor for both languages to reduce extraction imbalance. We obtained good precision and an improved BLEU score in our experiment based on a Chinese-Japanese parallel corpus.
Keywords: automatic extraction, bilingual corpus, bilingual term, multi-words term, segmentation, word alignment
Utilizing semantic equivalence classes of Japanese functional expressions in machine translation 46-53
  Akiko Sakamoto; Takehito Utsuro; Suguru Matsuyoshi
This paper applies the "Sandglass" machine translation architecture to the task of translating Japanese functional expressions into English. We employ the semantic equivalence classes of a recently compiled large-scale hierarchical lexicon of Japanese functional expressions. We examine whether each class is monosemous. We realize this procedure by empirically studying whether functional expressions within a class can be translated into a single canonical English expression. Furthermore, in order to precisely identify the class of functional expressions to which our translation rule is directly applicable, we introduce two types of ambiguities of functional expressions and identify monosemous functional expressions. We finally show that the proposed framework outperforms commercial machine translation software products.
Keywords: Japanese functional expressions, machine translation, polysemy, sense disambiguation

Ultra-realistic image technology I

3-D display and communication technology 57-63
  Min-Chul Park; Jung-Young Son
In this paper, we describe 3-D display from the viewpoint of communication technology. A 3-D display provides viewers with 3-D images that carry more accurate and realistic information than a 2-D display does. This feature is an essential component of communication technology. Generally, communication technology aims at the exchange and sharing of thoughts, feelings and ideas. 3-D displays are effective contact media for achieving these goals. The concept of the accessible spatial dimension of a person is used to describe 3-D display from the viewpoint of communication technology. It is classified into three dimensions, each of which represents a dimension of contact media. Several research results related to 3-D display and communication technology are introduced based on this concept.
Keywords: 3-D, communication, contact, dimension, display, media
Analysis and compensation of spatial distortion in integral three-dimensional imaging 64-69
  Hisayuki Sasaki; Masahiro Kawakita; Jun Arai; Makoto Okui; Fumio Okano; Yasuyuki Haino; Makoto Yoshimura; Masahito Sato
We have been conducting research on three-dimensional (3D) television using the integral imaging method. To enhance integral 3D image quality, Extremely High-Resolution (EHR) imaging technology would be essential. Currently, projection display systems are practical for EHR images and have some advantages for 3D imaging.
   We theoretically and experimentally analyzed the effects of distorted elemental images on a reconstructed image. We study an image processing method for compensating distorted elemental images in projection-type 3D imaging systems. The experimental results show its effectiveness in eliminating distortion of reconstructed 3D images and in easing the limitation of the viewing zone.
Electronic holography generated from integral photography 70-73
  Ryutaro Oi; Kenji Yamamoto; Tomoyuki Mishina; Takanori Senoh; Taiichiro Kurita
In this paper, we describe electronic holography for non-coherent lighting environments. We use integral photography (IP) to obtain 3D information about the scene. This method requires neither laser beams nor a darkroom for recording. Therefore, living or moving objects can be captured in a hologram. The converter hardware calculates fringe patterns from the IP at 30 frames per second using our previously proposed conversion algorithm. In our experiment, color holograms of 3840x2160 pixels are generated in real time.
Keywords: FFT, electronic holography, holography, integral photography
An improved optical device for floating displays 74-77
  Sandor Markon; Satoshi Maekawa
We propose an improved design of an optical device for projecting floating images. The improved device is a modification of the original design of dihedral corner reflector arrays reported earlier [2], improving its manufacturability while largely maintaining its image forming capability. We describe the construction of the device, and show its properties by mathematical analysis and optical simulation.
Keywords: 3D display, dihedral corner reflector array, floating images

Human analysis

Wearable robotics as a behavioral assist interface like oneness between horse and rider 81-88
  Taro Maeda; Hideyuki Ando; Hiroyuki Iizuka
The Parasitic Humanoid (PH) is a wearable robot for modeling nonverbal human behavior. This anthropomorphic robot senses the behavior of the wearer and has internal models to learn the process of human sensory motor integration; thereafter, it begins to predict the next behavior of the wearer using the learned models. When the reliability of the prediction is sufficient, the PH outputs the errors from the actual behavior as a request for motion to the wearer. Through symbiotic interaction, the internal model and the process of human sensory motor integration approximate each other asymptotically.
Keywords: ability extension, embodiment, human hack, motion induction
Drowsy driving detection based on human pulse wave by photoplethysmography signal processing 89-92
  Hanbit Park; Seungwon Oh; Minsoo Hahn
Driver drowsiness is a major cause of traffic accidents. Therefore, much research has been devoted to detecting and preventing drowsy driving. Recent research has focused on motion detection using cameras to determine drowsy driving. In contrast, we have focused on non-invasive and inexpensive drowsiness detection systems. In our previous research, we suggested a system based on the driver's head movement using infrared sensors. In this paper, we suggest another non-invasive and inexpensive system based on the driver's pulse wave, obtained by photoplethysmography (PPG) signal processing. First, the system collects a pulse wave from a PPG sensor on the steering wheel, and then it processes the signal to analyze the driver's state. In order to evaluate the effectiveness of the human pulse wave for drowsiness detection, we integrated the two systems. The experimental result using the new integrated system showed an 83 percent drowsy driving detection rate under real driving conditions.
Keywords: driving, drowsiness detection, human pulse wave, photoplethysmography (PPG), sensing, signal processing
Use of active RFID and environment-embedded sensors for indoor object location estimation 93-99
  Ming Li; Taketoshi Mori; Hiroshi Noguchi; Masamichi Shimosaka; Tomomasa Sato
This paper describes a method for localizing objects in an actual living environment. We have developed this method by using a complementary combination of 1) received signal strength indicators (RSSIs) and vibration data acquired from active RFID tags, and 2) human behavior detected from various types of sensors embedded in the environment. Regarding the former, we use a pattern recognition method to select features that appear in the RSSIs received by several radio frequency (RF) readers at different places and to classify them into a particular location. In our work, we regard the estimated location as the most probable location where the object is placed. As for the latter, we use the detected human behavior to support the estimation based on the analysis of RSSIs. Experimental results showed that the proposed method improved the estimation performance from about 50% to 95% compared with using only RSSIs to localize objects. Moreover, the results also suggested that we can estimate object location indoors without sensors for detecting human position. This indoor object localization method can contribute to constructing an indoor object management system that improves living comfort.
Keywords: RSSI, active RFID, environment-embedded sensor, indoor localization
3D hand posture estimation with single camera by two-stage searches from database 100-106
  Motomasa Tomida; Kiyoshi Hoshino
Previous systems for human hand posture estimation have adopted a clustered multi-layer large-scale database, narrowing the search space using past estimation results. However, once an estimated result falls outside the search space, the system cannot find the true or optimal value. Our system therefore adopts a non-clustered large-scale database and, rather than narrowing the search space by past results, performs a coarse search at the first stage according to some aspects of the input hand images, and an accurate search at the second stage with low-order image features. The experimental results showed that the averaged estimation error is -2.11 degrees and that the candidates for the accurate search at the second stage are reduced from 28,386 to 137.7 data sets, indicating that our system achieves stable hand posture estimation with accuracy and processing speed as high as the previous system, without using past results.

Language resources

Proposal for a multilanguage text input support system that is easy for beginner language learners 109-114
  Kayo Ikeda; Hideho Numata; Masakatsu Kaneko; Kazuhiko Machida
In this paper, we propose an input support system which supports multiple languages and which will make it possible for users -- even users who are in the midst of learning a foreign language that they wish to use and are not familiar with it yet -- to easily access the information resources in their desired language. As a text input method that is not restricted by the OS or target language, we propose a system which performs input operations using a web browser. In text string input, by using characters within the ASCII domain, all of the text strings can be assigned to keys on the keyboard. For each language (script), a conversion dictionary is available which shows how the key input string and output string correspond. By devising a conversion dictionary, this system can support all languages (scripts). We perform text conversion in incremental search as a method to speed up input for users who are beginner language learners. Detailed Information Display is a function which displays information related to the vocabulary items that are among the conversion candidates. Using the proposed method, we succeeded in creating an environment in which Japanese students of foreign languages and foreigners living in Japan can input text regardless of their computer's environment.
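The conversion-dictionary lookup with incremental search described in this abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the dictionary contents, the name EXAMPLE_DICT, and the function candidates are hypothetical, and prefix matching is assumed as the incremental-search rule.

```python
# Hypothetical conversion dictionary: ASCII key input string -> output string.
# The entries (romanized Russian -> Cyrillic) are made up for illustration.
EXAMPLE_DICT = {
    "privet": "привет",
    "priroda": "природа",
    "spasibo": "спасибо",
}


def candidates(conversion_dict, typed):
    """Return (key, output) pairs whose ASCII key starts with `typed`.

    Calling this after every keystroke gives an incremental search:
    the candidate list narrows as the user types more characters.
    Results are sorted by key so the order is deterministic.
    """
    return sorted(
        (key, output)
        for key, output in conversion_dict.items()
        if key.startswith(typed)
    )


if __name__ == "__main__":
    # Show how the candidate list narrows with each additional keystroke.
    for typed in ("p", "pri", "priv"):
        print(typed, candidates(EXAMPLE_DICT, typed))
```

Supporting another script under this sketch would only require a different dictionary, which matches the abstract's point that the dictionary, not the input mechanism, is language-specific.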
QRpotato: a system that exhaustively collects bilingual technical term pairs from the web 115-119
  Takeshi Abekawa; Kyo Kageura
This paper reports the system QRpotato, which exhaustively collects bilingual technical term pairs from the Web. The system uses bilingual (Japanese-English) term pairs taken from an existing terminological dictionary as seed pairs, searches Web pages using the seed pairs, and extracts bilingual term pair candidates from the retrieved Web pages using relational patterns identified between seed term pairs. We have successfully collected about 2.2 million different term pair candidates by using about 210,000 seed term pairs. Manual evaluation of a part of the candidates shows the effectiveness of the method.
Keywords: automatic term extraction, bilingual term pairs, bilingual terminology, web
Topic relatedness in evaluative information extraction 120-125
  Takuya Kawada; Tetsuji Nakagawa; Kentaro Inui; Sadao Kurohashi
The task of extracting opinions/evaluations related to a given topic from a large number of documents such as Web documents is crucial for developing an automatic evaluation finding system, which can handle a wide variety of topics as input. In this paper, we discuss the topic relatedness of extracted evaluation through analysis of a corpus we developed. We suggest here that the semantic relationship between the target of each extracted evaluation and a given topic helps in judging topic relatedness. In addition, we point out other factors that are beyond the analysis of topic-target relations for judging the topic relatedness of evaluation.
Development of a large-scale web crawler and search engine infrastructure 126-131
  Susumu Akamine; Yoshikiyo Kato; Daisuke Kawahara; Keiji Shinzato; Kentaro Inui; Sadao Kurohashi; Yutaka Kidawara
This paper reports the ongoing development of a large-scale Web crawler and search engine infrastructure at the National Institute of Information and Communications Technology. This infrastructure has the following characteristics: (1) It collects one billion Japanese Web pages while keeping them up-to-date. (2) It selects 100 million pages from among the collected pages and converts them into a standard data format to store the results of morphological analysis, dependency parsing, and synonym augmentation. (3) The selected set of pages is searchable and accessible to the users. (4) The scalability of the system is achieved by using a large-scale cluster machine for distributed data processing.
Keywords: crawler, search engine, web information analysis
A web service for automatic word class acquisition 132-138
  Stijn De Saeger; Jun'ichi Kazama; Kentaro Torisawa; Masaki Murata; Ichiro Yamada; Kow Kuroda
In this paper we present a Web service for building NLP resources to construct semantic word classes in Japanese. The system takes a few seed words belonging to the target class as input and uses automatic class expansion to suggest semantically similar training samples for the user to label. The system automatically generates random negative training samples as well, and then trains a supervised classifier on this labeled data to generate the target word class from 10^7 candidate words extracted from a corpus of 10^8 Web documents. This system eliminates the need for expert machine learning knowledge in creating semantic word classes, and we experimentally show that it significantly reduces the human effort required to build them.
Keywords: lexical acquisition, web service, word class construction

Ultra-realistic image technology II

One-dimensional integral imaging 3D display systems 141-145
  Yuzo Hirayama
We have developed several kinds of autostereoscopic display systems using the one-dimensional integral imaging method. The integral imaging system reproduces light beams similar to those produced by a real object. Therefore, our displays have continuous motion parallax. We have carried out the design, fabrication, and optical evaluation of the displays. Using our proprietary software, fast playback of CG movie content and real-time interaction are also realized with the aid of a graphics card. Ensuring that 3D images are safe for human viewers is very important. We have measured the effects on visual function and evaluated the biological effects. We have found that our displays show better results than a conventional stereoscopic display. Our display architecture is suitable for flatbed configurations because it has a large margin for viewing distance and angle. Mixed reality of virtual 3D objects and real objects is also realized on a flatbed display. The new technology opens up new areas of application for 3D displays, including communications, arcade games, e-learning, simulations of buildings and landscapes, and even 3D menus in restaurants.
Keywords: display, flatbed, integral imaging, three-dimension, visual function
Surrounding image projection with convex mirrors 146-149
  Naoki Hashimoto; Yuki Ishiwata; Makoto Sato
Immersive projection technologies that surround users with large, high-quality images are fundamental elements of our near-future information society. However, such large projection systems frequently depend on large installations and high-cost components such as special projectors and screens. This limits the number of users who can benefit from these technologies. Therefore, in this paper, we propose an effective immersive projection system that uses simple projectors and convex mirrors with everyday surfaces such as a wall in a room. We also introduce a simple calibration method that makes the system easy for many people to use.
Keywords: IPT, convex mirror, multi-projection, virtual reality
Video-based telemedicine with reliable color: field experiments of natural vision technology 150-153
  Masahiro Yamaguchi; Junko Kishimoto; Yasuhiro Komiya; Yoshifumi Kanno; Yuri Murakami; Hiroyuki Hashizume; Ryouji Yamada; Kosuke Miyajima; Hideaki Haneishi
High-fidelity color imaging technology that incorporates a spectrum-based color reproduction system, called "natural vision" (NV), is applied to field experiments in telemedicine. The experiments comprise two main parts: 1) high-fidelity color video of open surgery was captured by a six-band multispectral camera, and the image quality was visually evaluated by medical doctors; 2) a video-based teleconsultation experiment between a regional general hospital and a clinic on an island near the hospital was conducted using the natural vision system.
Keywords: color, image reproduction, multispectral imaging, natural vision, telemedicine, video transmission
Modeling the spatial behavior of virtual agents in groups for non-verbal communication in virtual worlds 154-159
  Hamid Laga; Toshitaka Amaoka
In this paper we propose a mathematical model for the concept of Personal Space (PS) and apply it to simulate the non-verbal communication between agents in virtual worlds. Persons within a group tend to maintain the distances between each other within a certain range that maximizes their degree of comfort. These distances reflect the type of their relationship, and changes in these distances reflect the evolution of their relationship over time. Human-like autonomous virtual agents should also be equipped with such a capability in order to simulate natural interactions in virtual worlds. First we model the space around an agent as a probability distribution function which reflects, at each point in the space, the importance of that point to the agent. The agent dynamically updates this function according to (1) his relation and distance to other agents in the virtual space, (2) his face orientation, and (3) the evolution of the relationship over time, as a stranger agent may become a friend. We demonstrate the concept on a multi-agent platform and show that space-aware agents exhibit more natural behavior.
Keywords: personal space, proxemics

Organized session: Real-world application

Implicit interaction with daily objects: applications and issues 163-168
  Kaori Fujinami
This paper describes augmentation of daily objects as a means to interact with a ubiquitous/pervasive computing environment. A daily object employs a context-aware capability, where a user's specific context is captured implicitly and naturally by sensors from its original usage because such an everyday object has inherent roles and functionalities. Also, information is presented naturally and effectively during the utilization. A user does not need to learn how to get information, which fills the gap between a user and a complex ubiquitous/pervasive computing environment.
   In this paper, we present some projects on augmenting daily objects, covering possible applications and a technique to complement a missing piece of context that is obtained only from an instrumental object. We also propose assuring sensor placement for reliable sensing by a daily object.
Keywords: context-awareness, implicit interaction, information presentation, smart object
A preliminary exploration of augmented social landscapes 169-171
  Shin'ichi Konomi
The ubiquity of sensing devices, including location-aware, sensor-enabled mobile phones, creates an opportunity to design a novel digital layer of a city, which senses and shapes the experiences of urban inhabitants. This paper explores the potential of ubiquitous sensing devices to generate alternative social landscapes of a city and to facilitate universal communication. Sensors have critical dual roles in this process: (1) analyzing existing social relations, and (2) providing resources for establishing new relations. Several examples are discussed in relation to the latter role of sensors in shaping social landscapes, suggesting the possibility to create various representations that could support novel communication and collaboration practices.
Keywords: augmented social landscapes, connectability, context awareness, geo-social networking, urban sensing
Network management architecture toward universal communication 172-175
  Yoshihiro Kawahara; Ahmad Kamil Abdul Hamid; TaeYoung Song; Kei Wada; Tohru Asami
Ubiquity of networked devices is one of the first steps toward realization of universal communication services. However, not much attention has been paid to the management architecture of the mashed-up services provided across network domains. The absence of a scalable cross-domain network management architecture restricts the availability and penetration of such services. In this paper, we propose the Tambourine framework, which defines a web-service-based network management API. Tambourine allows applications to access the management and control information of networked devices across domains.
Keywords: network management, new generation network, service composition, smart environment, webservice

Information analysis

People, clouds, and interaction for information access 179-180
  Tetsuya Sakai
Microsoft Research Asia (MSRA) currently has nineteen research groups that cover various areas in computer science. The Web InTelligence (WIT) Group, led by Chin-Yew Lin, is a recent spin-off from the Natural Language Computing Group, and tackles problems in sentiment analysis, expert and social search, social question answering and summarisation, user intent/activity recognition and prediction, assisting inarticulate users, and information access evaluation. In this talk, I will try to illustrate current strategies and future visions of the WIT group by discussing human-human interaction, computer-computer interaction, human-computer interaction and "evaluation evolution," each in turn.
Keywords: evaluation, information access, natural language processing, question answering, search, web intelligence
Using web page layout for extraction of sender names 181-186
  Rintaro Miyazaki; Ryo Momose; Hideyuki Shibuki; Tatsunori Mori
Recently, the credibility of information available on the Web has been regarded as an important issue. The sender name is one of the important indicators of the credibility of information. In this paper, we propose a new method for extracting sender names. The proposed method uses named entity recognition and, as preprocessing, reduces the DOM nodes using the Web page layout. Experimental results show that our proposed method can effectively extract sender names when the preprocessing is successful.
Keywords: information credibility, natural language processing, sender name, web page layout
Summarizing evaluative information on the web for information credibility analysis BIBAFull-Text 187-192
  Daisuke Kawahara; Tetsuji Nakagawa; Takuya Kawada; Kentaro Inui; Sadao Kurohashi
The World Wide Web comprises a wide variety of evaluative information. It consists of positive and negative opinions on innumerable topics from various perspectives, thus proving to be a useful information source for information credibility analysis. To present an informative and at-a-glance summary of any topic that a user of such an analysis system searches for, it is important to summarize many diverse evaluative expressions on the topic. In this paper, we describe a method for summarizing an extensive variety of evaluative expressions that are automatically extracted.
Web information credibility analysis by geographical social support BIBAKFull-Text 193-196
  Hiroaki Ohshima; Satoshi Oyama; Hiroyuki Kondo; Katsumi Tanaka
Since our daily lives strongly depend on information obtained by Web search, the credibility of Web search results has become crucial. An important aspect of the credibility of search results is the regionality of Web pages. In this paper, we propose a system for helping users assess the credibility of search results by measuring and presenting the regionality of support for Web pages. We conceive two different types of measures for evaluating "geographical social support": the uniformity of support and the proximity of support. The uniformity of geographical support (US) indicates the uniformity of the geographic distribution of Web pages linking to a Web page. It is calculated by using the Kullback-Leibler (KL) divergence. The proximity of geographical support (PS) expresses how strongly a page is supported by pages geographically located close to it. We describe our implemented prototype system, which shows the two measures for Web search results.
Keywords: information credibility, local web search, social support
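The abstract above states only that US is calculated using the KL divergence; it does not give the exact formula. As a minimal sketch, assuming the measure compares the regional distribution of linking pages against an even spread, the divergence could be computed as follows. The region counts, the uniform reference distribution, and the function names are illustrative assumptions, not the authors' definitions.

```python
import math

def normalize(counts):
    """Turn raw per-region link counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D_KL(p || q) in nats; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical data: number of linking pages per region for one target page.
link_counts = [40, 35, 25]          # illustrative regions A, B, C
reference = [1 / 3, 1 / 3, 1 / 3]   # assumed reference: an even spread

divergence = kl_divergence(normalize(link_counts), reference)
print(f"KL divergence from the reference: {divergence:.4f}")
```

Under this assumption, a divergence near zero would mean that support is spread evenly across regions, and larger values would mean that support is concentrated in a few regions.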

Ultra-realistic sound technology

Application of 3D sound technology to intelligent robots BIBAKFull-Text 199-204
  Youngjin Park
Various high-fidelity VAD systems have been developed for many practical application fields, including games, home theatre, virtual reality, and military simulators. The head-related transfer function is one of the key functions widely used in VAD systems.
   We developed robot auditory systems for sound source localization to achieve effective human-robot interaction. The developed robot auditory system, which includes an artificial ear, a MEMS sensor, and an SoC (system-on-chip) for sound localization, can be used by intelligent robots to process speech and acoustic signals.
Keywords: acoustic MEMS sensor, human robot interaction, robot artificial ear, sound direction estimation, sound source localization, spatially mapped GCC function
Headphone calibration for 3D-audio listening BIBAFull-Text 205-210
  Ryouichi Nishimura; Parham Mokhtari; Hironori Takemoto; Hiroaki Kato
This paper proposes a new headphone calibration function for precise reproduction of 3D audio generated using simulated head-related transfer functions (HRTFs) or binaural recordings. In order to compensate for individual characteristics of the earcanal transfer functions and the eardrum impedance, which generally differ from person to person, the method consists of two steps: measuring sound pressure with blocked earcanals and measuring it with open earcanals. The vibration of the eardrum can thereby be precisely reproduced as if the listener were in the original sound scene. Results of experiments using a head and torso simulator (HATS) revealed that sound pressure is correctly reproduced at the position of the eardrum as well as at the entrance of the earcanal within a certain wide frequency range.
Representation and comparison of HRTF in spatio-temporal frequency domain BIBAKFull-Text 211-214
  Yasuko Morimoto; Takanori Nishino; Kazuya Takeda
We represent a head-related transfer function (HRTF) in the spatio-temporal frequency domain. Since an HRTF is defined as an acoustic function of time and of the location of the sound source, the spatio-temporal frequency characteristics of HRTFs can be visualized and analyzed by a multi-dimensional Fourier transform in time and space. In our experiments, we investigate a basic property of the spatio-temporal frequency characteristics and the difference between HRTFs obtained by numerical analysis and by actual measurements. The influence of the pinnae on the spatio-temporal frequency characteristics is also examined. It is found that the spatio-temporal spectral components are mostly concentrated in specific frequency bands, and that these components differ in each measurement condition.
Keywords: Fourier transform, head-related transfer function, spatio-temporal frequency analysis, visualization
Subjective effect of synthesis conditions in 3D sound field reproduction system using a few transducers and wave field synthesis BIBAKFull-Text 215-220
  Toshiyuki Kimura; Munenori Naoe; Yoko Yamakata; Michiaki Katsumoto
In a conventional 3D sound field reproduction system using wave field synthesis, numerous loudspeakers are placed around the listener. However, since such a system is very expensive and the loudspeakers are in the listener's field of vision, it is very difficult to construct an audio-visual virtual reality system. We have proposed a 3D sound field reproduction system using wave field synthesis and eight transducers, which are placed at the vertices of a cube. In this study, the effect of synthesis conditions on localized perception was evaluated by varying the synthesis conditions, namely the directivity of the microphones and the size of the cubic array. As a result, the performance of localized perception was good when shotgun microphones were used and the array was a cube measuring 0.4 m on each side.
Keywords: microphone directivity, sound field reproduction, wave field synthesis

Visual information processing

Multi-sensor based human activity detection for smart homes BIBAFull-Text 223-229
  Liyanage C. De Silva
At the University of Brunei Darussalam, we have designed and built a prototype smart home that monitors human activities to improve energy efficiency and support elderly people. In this paper we present some of our early work related to smart monitoring, control and communication, along with a review of related research initiatives by researchers around the world. In particular, we looked at research work carried out in Singapore, Japan and New Zealand. Here our main objective was to look into research work that enhances energy efficiency and eldercare with the use of a multitude of sensors. With our simple prototype implementation we have also demonstrated the use of smart home technologies to reduce energy consumption in an average house.
Human shape reconstruction via graph cuts for voxel-based markerless motion capture in intelligent environment BIBAFull-Text 230-236
  Masamichi Shimosaka; Kazuhiko Murasaki; Taketoshi Mori; Tomomasa Sato
In this paper, we propose a robust, real-time 3D human shape reconstruction method for daily life spaces to make voxel-based motion capture systems practical. Our algorithm extracts the human silhouette and reconstructs the human shape via volume intersection from multi-viewpoint images. The method presented in this paper is based on energy minimization via graph cuts, and its main features are: 1) to reduce the background subtraction errors caused by background clutter, 2) to be robust to the influence of shadows, and 3) to segment the foreground region even when moving objects other than humans are present. The precise human shape reconstructed by the method improves the accuracy of human pose estimation. In particular, feature 3) broadens the range of applications of voxel-based human pose estimation. We demonstrate the effectiveness of our approach in terms of both quantitative and qualitative performance in an intelligent environment where strong shadows appear and moving objects are present.
Stereo camera model of feed horns in focal plane array BIBAKFull-Text 237-240
  Jung-Young Son; Seokwon Yeom; V. P. Guschin; Yuriy Vashpanov; Dong-Su Lee; Shin-Hwan Kim
The equivalent camera model of two feed horns in a millimeter-wave imaging system is a radial-type stereo camera with diverging axes. The stereo image characteristics of the camera are analyzed. This camera model allows the distance between cameras to be minimized.
Keywords: feed horn, imaging system, radial type stereo camera with diverging axes, radiometry
A context-adaptive haptic interaction and its application BIBAKFull-Text 241-244
  Youngjae Kim; Youmin Kim; Minsoo Hahn
Haptics is a promising interface for the next-generation ubiquitous computing environment. Most haptics-related studies are limited to first-person human-computer interaction [5] rather than human-to-human communication. The proposed system focuses on personal communication such as chatting or text messaging. Our system is designed to manipulate multiple sensors and multiple actuators within a single framework. Our contribution can be summarized in three parts: (1) the design of a framework for multi-sensor and multi-actuator interaction; (2) an XML-based data structure for haptic description; and (3) context-adaptive actuation control using a feedback mechanism.
Keywords: communication, haptic, interaction, location, spatial

Search and interface

Normalization on the modulation spectrum of the subband temporal envelopes for automatic speech recognition in reverberant environments BIBAKFull-Text 247-254
  X. Lu; M. Unoki; S. Nakamura
In this study, we proposed a feature extraction method based on the subband temporal envelopes (STEs) and their normalization for reverberated speech recognition. The STEs were extracted by using a series of constant-bandwidth band-pass filters with a Hilbert transform followed by low-pass filtering. In the normalization, the modulation spectra (MS) of the subband temporal envelopes of both the clean and the reverberated speech are normalized to a reference MS calculated from a clean speech data set. Based on the normalized subband MS, the inverse Fourier transform was used to restore the subband temporal envelopes. We tested the proposed method on speech recognition in a reverberant room with different speaker-to-microphone distances (SMDs). For comparison, the recognition performance obtained with the traditional Mel-cepstral coefficients with mean and variance normalization was used as the baseline. Experimental results showed that, averaged over SMDs from 50 cm to 400 cm, there was a 44.96% relative improvement from using subband temporal envelope processing alone, and a further 15.68% relative improvement from using the normalization of the subband modulation spectrum. In total, there was about a 53.59% relative improvement, which was better than that obtained with other temporal filtering and normalization methods.
Keywords: automatic speech recognition, dereverberation, subband temporal envelope, temporal modulation
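The three percentages in the abstract above are mutually consistent if each is read as a relative reduction of the remaining error, with the second step measured against the result of the first. That reading is an assumption; the abstract does not state which metric the improvements refer to. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def compose_relative_improvements(*steps):
    """Combine successive relative improvements, each applied to what remains."""
    remaining = 1.0
    for step in steps:
        remaining *= 1.0 - step
    return 1.0 - remaining

# Figures quoted in the abstract: STE processing, then MS normalization.
total = compose_relative_improvements(0.4496, 0.1568)
print(f"{total:.2%}")  # prints 53.59%
```

The result matches the overall figure of about 53.59% reported in the abstract, so the two steps appear to compose multiplicatively rather than additively.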
Evaluation for WFST-based dialog management BIBAKFull-Text 255-260
  Chiori Hori; Kiyonori Ohtake; Teruhisa Misu; Hideki Kashioka; Satoshi Nakamura
To construct an expandable and adaptable dialog system that handles multiple tasks, we propose a dialog system using a weighted finite-state transducer (WFST) in which user concept tags and system action tags are the input and output of the transducer, respectively. To test the potential of the WFST-based dialog management (DM) platform using statistical DM models, we construct a dialog system using a human-to-human spoken dialog corpus for hotel reservation, which is annotated with Interchange Format (IF). Scenario, Spoken Language Understanding (SLU), and Sentence Generation (SG) WFSTs are obtained from the corpus and then composed together and optimized to generate a Dialog Management (DM) WFST. We evaluate the detection accuracy of the system's next actions using Mean Reciprocal Ranking (MRR). We evaluated how WFST optimization operations contribute to dialog systems and confirmed that the optimization improves the accuracy of next-action detection.
Keywords: WFST optimization operation, interchange format (IF), spoken dialog, statistical dialog management, weighted finite-state transducer (WFST)
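The abstract above evaluates next-action detection with Mean Reciprocal Ranking. As a minimal sketch of how that metric is commonly computed, assuming the system returns a ranked list of candidate action tags per turn and that a missing correct tag scores zero, consider the following. The action tag names and the data are hypothetical and are not taken from the paper's corpus.

```python
def mean_reciprocal_rank(ranked_candidates, correct_actions):
    """Average of 1/rank of the correct action; 0 when it is not in the list."""
    total = 0.0
    for candidates, correct in zip(ranked_candidates, correct_actions):
        if correct in candidates:
            total += 1.0 / (candidates.index(correct) + 1)
    return total / len(correct_actions)

# Hypothetical ranked system outputs for three dialog turns.
ranked = [
    ["confirm", "ask_date"],   # correct action ranked 1st -> 1.0
    ["ask_date", "confirm"],   # correct action ranked 2nd -> 0.5
    ["greet"],                 # correct action missing    -> 0.0
]
gold = ["confirm", "confirm", "ask_room"]

mrr = mean_reciprocal_rank(ranked, gold)
print(f"MRR = {mrr:.3f}")  # prints MRR = 0.500
```

A higher value means the correct next action tends to appear nearer the top of the ranked list.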
SOBEX: distributed service search engine that exploits service collaboration context BIBAFull-Text 261-268
  Rong Zhang; Koji Zettsu; Takafumi Nakanishi; Yutaka Kidawara; Yasushi Kiyoki
Service-oriented architecture (SOA) is emerging as a paradigm for developing distributed applications. With the development of hardware and software technology, the rapid increase in peers and services has raised critical problems for the popularity of SOA. One is system scalability and robustness, and the other is service location validation. In this paper, we introduce SOBEX, a web service search engine that builds on a distributed indexing structure, SIKA, and proposes a proactive web service reuse mechanism by introducing a service context model, SPOT.
   SIKA is a community-oriented virtual hierarchical distributed indexing structure based on the classic Chord algorithm. Although it groups nodes into interest-based communities, it is completely distributed and has no central management. It thus promises search efficiency together with scalability and robustness. The growing number of web services available within an organization and on the web raises a new problem: locating the desired web services. In general, keyword-based search yields high recall but low precision. In order to improve search efficiency, SOBEX proposes to qualify services using a service usage context model, which tries to reduce the gap in concept understanding between human and computer by proactively assigning each service its own story background. In addition to traditional keyword-based methods, it introduces context-based queries to improve service reusability.
Towards moving phenomena data management BIBAKFull-Text 269-272
  Koji Zettsu; Kyoung-Sook Kim; Yutaka Kidawara; Yasushi Kiyoki
With the spread of the Geoweb, people can more easily create and exchange geo-spatiotemporal contents on the Web. Consequently, managing the exploding amount of Geoweb contents more efficiently than traditional approaches for mapping contents onto a digital earth and/or a time line has become an emerging issue. In this paper, we propose a novel approach for managing Geoweb contents based on the idea of moving phenomena such as a typhoon or a price escalation. Our moving phenomenon DBMS defines data types and predicates of moving phenomena, and the visual query interface, called Sticker, allows users to aggregate Geoweb contents with a 3D view of the moving phenomena. Our framework works well for obtaining comprehensive knowledge about the situations or influences of real-world phenomena from the Geoweb contents.
Keywords: Geoweb, Sticker, event aggregate, moving phenomena, spatiotemporal databases
Support tools for literature-based information access in molecular biology BIBAFull-Text 273-277
  Fabio Rinaldi; Dietrich Rebholz-Schuhmann
The fast production of information in molecular biology, driven by high-throughput experiments, leads to strong ongoing demands for the integration of the literature into the information and knowledge discovery channels of the biomedical research domain. This paper describes tools developed by the authors with the aim of supporting professional biologists in accessing the information contained in the scientific literature.

Multisensory interfaces and cognitive dynamics

Usage of change-related non-invasive imaging paradigms to investigate the representation of sound in the human brain BIBAFull-Text 281-284
  Christian F. Altmann
To efficiently recognize and localize sounds is of paramount importance for our everyday life. However, the computational processes that underlie these capabilities in the human brain are still not fully understood. Change-related paradigms are a powerful tool for studying the representation and transformation of sensory information in the human brain. This text reviews three recent examples from our lab that employed change-related paradigms with different brain imaging modalities to characterize the representation of sounds in the human brain. Specifically, a first experiment used functional magnetic resonance imaging and signal response suppression after stimulus repetition to characterize the representation of natural sounds. A second, magnetoencephalographic experiment used a two-tone paradigm to describe the time-course of adaptation to natural sounds. In a third experiment, we employed a spatial mismatch negativity oddball paradigm during electroencephalography to test for head-related versus allocentric representation of sound sources in the human brain.
Measurements of vergence and accommodation while viewing a real 3D scene and its 2D image on a display BIBAFull-Text 285-288
  Haruki Mizushina; Hiroshi Ando; Takanori Kochiyama; Shinobu Masaki
It is widely thought that conflict between vergence and accommodation may be a major factor in the visual fatigue and discomfort caused by viewing stereoscopic images on 3D displays. However, few studies have measured vergence and accommodation simultaneously while viewing a real 3D scene and its 2D image on a traditional display.
   In this study we measured vergence and accommodation responses simultaneously while participants viewed a real 3D object located at various distances and its 2D image (a photograph), including the background scene, presented on a display located at a fixed distance. The results show that vergence and accommodation varied with changing target distance while viewing the real 3D object, as expected. On the other hand, changing the target distance depicted in the photographic image while viewing the 2D display evoked no systematic change in vergence and accommodation. Some participants showed noticeable accommodation lag and fixation disparity. In addition, we observed considerable conflicts between vergence and accommodation in both the 3D and 2D conditions, but no one reported perceived defocus and/or double vision. There was a variety of individual differences in the pattern of the conflicts.
Method for identifying aromas using adjective characteristics to enhance the reality of visual images BIBAKFull-Text 289-295
  Chika Oshima; Koichi Nakayama; Hiroshi Ando
Some works have suggested that certain aromas can enhance the reality of visual images of distant locations on the basis of the CONTENTS constituting the images; and these CONTENTS can be referred to using nouns. Aromatic materials need to be classified in order to identify the ones corresponding to each CONTENT. In this paper, we conducted two experiments. The subjects first rated the extent to which the aromas enhanced the reality of the visual images. CONTENTS of these visual images were different kinds of trees. Nine aromatic materials received high ratings. The subjects then rated the aromas using adjectives. The nine aromatic materials were classified into two clusters on the basis of the adjectives. These results showed that the use of adjectives to describe aromas is practical for such a classification.
Keywords: adjective, aroma, recommendation system
Neural correlates of externalized auditory motion perception under reverberation BIBAKFull-Text 296-299
  Akiko Callan; Hiroshi Ando
Using functional magnetic resonance imaging, we investigated neural substrates of realistic auditory motion perception. "Realistic" here means experiencing the sound as located outside the head instead of originating inside the head. In order to examine neural effects of moving sounds and neural effects of externalized sounds separately, we included two experimental factors in our design: whether auditory stimuli were externalized or not (externalizability factor) and whether auditory stimuli were moving or not (motion factor). Externalized sounds activated planum temporale (PT) more than non-externalized sounds. Moving sounds activated posterior middle temporal gyrus (pMTG) more than stationary sounds. An interaction effect was found in the right PT. Our results indicate that the PT and pMTG are involved in realistic auditory motion perception. The fidelity of auditory space presentation may be evaluated by observing neural activity change in the PT and pMTG.
Keywords: auditory motion, externalized, fMRI, neuroimaging

Human computer interaction

Eye-gaze experiments for conversation monitoring BIBAKFull-Text 303-308
  Kristiina Jokinen; Masafumi Nishida; Seiichi Yamamoto
Eye-tracking technology has recently matured to the point that it has become easier to use in unobtrusive and natural user experiments. Simultaneously, human-computer interactions have become more conversational in style, and more challenging in that they require various human conversational strategies, such as giving feedback and managing turn-taking. In this paper, we focus on eye-gaze in order to investigate turn-taking signals and conversation monitoring in naturally occurring dialogues. We seek to build models that deal with the important aspects of which interlocutor the speaker is talking to and what kind of turn-taking signals the partners elicit, and we report the first results of our eye-tracking experiments.
Keywords: eye-tracking, human-human interaction, multiparty conversation
A speech-driven embodied entrainment wall picture system for supporting virtual communication BIBAKFull-Text 309-314
  Yoshihiro Sejima; Tomio Watanabe
We have developed a speech-driven embodied entrainment system called "InterPicture" and have demonstrated the effectiveness of the system using an embodied virtual communication system. InterPicture is an image containing flowers that react to the speech input of talkers. We confirmed the importance of providing a communication environment in which not only avatars but also CG objects placed around the avatars are related to virtual communication. In this study, we have developed an advanced speech-driven embodied entrainment system called "InterWall". This system projects a wide wall picture onto the wall surrounding the avatars and behaves as a listener by producing nodding and body movements on the basis of the speech input of a talker. Further, a communication experiment has been performed, and the effectiveness of "InterWall" has been demonstrated by carrying out a sensory evaluation and a speech-overlap analysis for 20 pairs (40 talkers).
Keywords: embodied communication, entrainment, human interaction, nonverbal communication, virtual communication
Sensing web: to globally share sensory data avoiding privacy invasion BIBAKFull-Text 315-318
  Ikuhisa Mitsugami; Michihiko Minoh; Tsuneo Ajisaka; Noboru Babaguchi
This paper gives an overview of the Sensing Web project, launched in 2007 in Japan. The project's aim is to open the data obtained by the sensors existing in our daily living environment for various purposes. Since the data obtained by observing the real world directly with sensors include real-world information different from that on the Web, a new worldwide social information infrastructure -- Sensing Web -- is realized. In this article, we discuss the research issues arising in connection with the Sensing Web.
Keywords: information infrastructure, privacy-invasion-free, sensory data, symbolization
An agent-based management scheme of context information for context-aware service BIBAKFull-Text 319-324
  Hideyuki Takahashi; Takuo Suganuma; Norio Shiratori
This paper describes a scheme to increase the availability of context-aware services in ubiquitous computing environments by managing context information effectively where computational and network resources are insufficient. For a context-aware service, the function of the overall system and the quality of service must also be maintained by circulating context information of adequate quality. This scheme manages the context information based on the relationship between quality of context and quality of service. It can avoid the degradation of available network bandwidth and computational resources caused by the circulation of excessive context information. From the initial experimental results using a prototype system of a ubiquitous live streaming video service, we confirm that the available network bandwidth and computational resources are effectively maintained and recovered by controlling the update frequency of the user's location information according to the situation.
Keywords: context-aware service, multiagent systems, quality of context, quality of service
The utilization method of idle PC resources BIBAKFull-Text 325-328
  Yutaka Hirakawa; Yoshifumi Matsuda
Few of the large number of personal computers (PCs) in homes and small offices are used continuously. This article discusses a method of utilizing idle PC resources by assigning download jobs to idle PCs and distributing the jobs among them. The requirements for the utilization method are as follows:
   R1: When a user suddenly starts using an idle PC, he/she must be able to work effectively.
   R2: When a user shuts down a PC abruptly, the system must continue to operate without any interruption.
   The proposed method of resource utilization monitors bandwidth usage and avoids inefficiency in users' work. The evaluation results of the experiment system are described.
   The proposed method requires the existence of a leader PC in a network. The evaluation results of a new effective leader election method that assumes the existence of network attached storage (NAS) are also described.
Keywords: distributed systems, leader election, utilization of idle PCs

Poster session

Analysis of hand movement variation related to speed in Japanese sign language BIBAKFull-Text 331-334
  Yuta Yasugahira; Yasuo Horiuchi; Shingo Kuroiwa
To achieve greater accessibility for deaf people, sign language recognition systems and sign language animation systems must be developed. In Japanese sign language (JSL), previous studies have suggested that emphasis and emotion cause changes in hand movements. However, the relationship between emphasis and emotion and the signing speed has not been sufficiently studied. In this study, we analyzed the hand movement variation in relation to the signing speed. First, we recorded 20 signed sentences at three speeds (fast, normal, and slow) using a digital video recorder and a 3D position sensor. Second, we segmented sentences into three types of components (sign words, transitions, and pauses). In our previous study, we analyzed hand movement variations of sign words in relation to the signing speed. In this study, we analyzed transitions between adjacent sign words by a method similar to that in the previous study. As a result, sign words and transitions showed a similar tendency, and we found that the variation in signing speed mainly caused changes in the distance the hands moved. Furthermore, we compared transitions with sign words and found that transitions were slower than sign words.
Keywords: Japanese sign language, hand movement, transition
High resolution computer-generated cylindrical hologram BIBAKFull-Text 335-338
  Tomohisa Ito; Takeshi Yamaguchi; Hiroshi Yoshikawa
We investigate the computer-generated cylindrical hologram. Since the general flat-format hologram has a limited viewable area, we usually cannot see the other side of the reconstructed object. Some types of hologram solve this problem. A cylindrical-type hologram is well known as the 360-deg viewable hologram. There are two kinds of cylindrical holograms: a multiplex hologram and a laser reconstruction 360-deg hologram. Since the multiplex hologram consists of many 2-D pictures, the reconstructed image is not truly 3-D. In contrast, a laser reconstruction 360-deg hologram has a true 3-D effect. In our previous study, the computer-generated cylindrical hologram was realized as a Fresnel hologram. However, the spatial resolution and pitch of the output device were not sufficient: the hologram was made using a Liquid Crystal on Silicon panel (panel size 14.5 mm x 10.9 mm, resolution 1,400 x 1,050 pixels, pixel pitch 10.4 μm) reduced to 1/12. In this report, the hologram is made using a Liquid Crystal on Silicon panel (panel size 13.8 mm x 7.56 mm, resolution 1,920 x 1,080 pixels, pixel pitch 7 μm) reduced to 1/16. To scale up the reconstructed image size, we calculated a high-resolution computer-generated cylindrical hologram. Then, we printed these fringes with the improved output device. As a result, we obtained a good reconstructed image from a computer-generated cylindrical hologram.
Keywords: computer-generated hologram, cylindrical hologram, fringe printer, holography
Spatial memorization aid system: registration of mental memory space BIBAKFull-Text 339-343
  Ken Ishigaki; Yasushi Ikei
The present paper proposes a novel approach to augmenting human memory based on spatial and graphic information mediated by an electronic device. A technique for memorizing a number of unstructured items is called mnemonics. Although its advantage is excellent, acquiring the skill to use mnemonics is generally difficult. The new spatial mnemonics system presented in this paper resolves the problem by facilitating the acquisition of the skill. A virtual memory peg, based on images of real spaces and objects, is introduced for this purpose. The characteristics of the virtual memory peg were investigated in terms of the length of the peg that was created in the real physical environment. Another creation process, based on photographs provided by the experimenter, was examined to show that the peg could be built without walking through the real environment. Both results clearly demonstrated the effectiveness of the proposed method.
Keywords: imagery, memory peg, mnemonics, photo-montage
Searching for comparison points between two objects from the web BIBAFull-Text 344-349
  Shinya Aoki; Takayuki Yumoto; Manabu Nii; Yutaka Takahashi
Search engines are now often used to compare two objects. However, users tend to browse the Web pages ranked highly by search engines, which may give biased information. We propose a method for searching for Web pages in which two objects are compared using a search engine, extracting comparison points from those Web pages, and showing these points to users. Comparison points are keywords for comparing objects. The proposed method can be used to extract points for efficient comparison by using comparison expressions such as "Liquid Crystal TVs are better ..." and "... than Plasma TVs."
How-to information search by lightweight analysis of web pages BIBAFull-Text 350-354
  Ryouji Nonaka; Takayuki Yumoto; Manabu Nii; Yutaka Takahashi
We propose a method for searching for comprehensible how-to information on the Web. In our how-to information search, we use lightweight analysis of Web pages to extract how-to information from Web pages obtained by conventional Web search engines and rank them according to their easily-viewable-degree. In the extraction process, we focus on expressions in Web page text blocks that describe procedures. In the ranking process, we focus on images, the effect of letter string and the length of the how-to information.
Linking Wikipedia entries to blog feeds by machine learning BIBAKFull-Text 355-362
  Mariko Kawaba; Hiroyuki Nakasaki; Daisuke Yokomoto; Takehito Utsuro; Tomohiro Fukuhara
This paper studies the issue of conceptually indexing the blogosphere through the whole hierarchy of Wikipedia entries. This paper proposes how to link Wikipedia entries to blog feeds in the Japanese blogosphere by machine learning, where about 300,000 Wikipedia entries are used for representing a hierarchy of topics. In our experimental evaluation, we achieved over 80% precision in the task.
Keywords: Wikipedia, blog feed retrieval, blogosphere, topics
JSPad: a sign language writing tool using SignWriting BIBAKFull-Text 363-367
  Tadahiro Matsumoto; Mihoko Kato; Takashi Ikeda
SignWriting is a practical writing system for sign languages. In this paper we present a software program, JSPad, for writing Japanese sign language (JSL) with SignWriting. SignWriting has a large set of visually iconic symbols that represent handshapes, movements, locations and facial expressions; it can take a lot of time to find and choose appropriate symbols from the set to compose signs, particularly for novice users of SignWriting. JSPad can generate SignWriting signs from JSL text written in the JJS notation, which is a gloss-based notation system we proposed. This facility allows users to write JSL sentences in SignWriting in shorter time.
Keywords: SignWriting, deaf education, notation system, sign language, sign language text, writing system
Resources for Mongolian language BIBAKFull-Text 368-371
  Purev Jaimai; Odbayar Chimeddorj
The Mongolian language is spoken by about 8 million people. This paper summarizes the current status of its resources in Mongolia.
Keywords: Mongolian language resources, corpus for Mongolian, natural language processing, tools for Mongolian language
Dialogue act annotation for consulting dialogue corpus BIBAFull-Text 372-378
  Kiyonori Ohtake; Teruhisa Misu; Chiori Hori; Hideki Kashioka; Satoshi Nakamura
This paper introduces a new corpus of consulting dialogues, which is designed for training a dialogue manager that can handle consulting dialogues through spontaneous interactions from the tagged dialogue corpus. We have collected 130 h of consulting dialogues in the tourist guidance domain. This paper outlines our taxonomy of dialogue act annotation that can describe two aspects of an utterance: the communicative function (speech act), and the semantic content of the utterance. We provide an overview of the Kyoto tour guide dialogue corpus and a preliminary analysis using the dialogue act tags.
Unit selection using k-nearest neighbor search for concatenative speech synthesis BIBAKFull-Text 379-382
  Hideyuki Mizuno; Satoshi Takahashi
We propose a new approach to rapidly identifying adequate synthesis units in extremely large speech corpora. Our aim is to develop a concatenative speech synthesis system with high performance (both speech quality and throughput) for various practical applications. Utilizing very large speech corpora allows more natural sounding synthesized speech to be created; the downside is an increase in the time taken to locate the synthesis units needed. The key to overcoming this problem is introducing state-of-the-art database retrieval technologies. The first selection step, based on simple hash search, tabulates all synthesis unit candidates. The second step selects N best candidates using nearest neighbor search, a typical database retrieval technique. Finally, the best sequence of synthesis units is determined by Viterbi search. A runtime measurement test and subjective experiment are carried out. Their results confirm that the proposed approach reduces the runtime by about 40% compared to using only hash search with no degradation in the quality of synthesized speech for a 15 hour corpus.
Keywords: concatenative speech synthesis, nearest neighbor search, synthesis unit selection, text to speech
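The three selection steps in the abstract above (hash lookup, N-best nearest neighbor pruning, Viterbi search) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the unit records, the one-dimensional feature vectors, the Euclidean target and join costs, and the function names `n_nearest` and `select_units`. The nearest neighbor step is a brute-force sort, not an indexed search of the kind a large corpus would need.

```python
import math

def n_nearest(target, candidates, n):
    """Keep the n candidates whose feature vectors are closest to the target."""
    return sorted(candidates, key=lambda c: math.dist(target, c["feat"]))[:n]

def select_units(targets, inventory, n=2):
    """targets: list of (label, feature vector); inventory: dict label -> units."""
    # Step 1 (hash search): dictionary lookup of all candidates by unit label.
    # Step 2 (nearest neighbor search): prune each list to the n closest units.
    lattice = [n_nearest(feat, inventory[label], n) for label, feat in targets]

    # Step 3 (Viterbi search): minimise target cost plus join cost.
    # best[k] holds (cumulative cost, path) ending at candidate k of this step.
    best = [(math.dist(targets[0][1], c["feat"]), [c["id"]]) for c in lattice[0]]
    for t in range(1, len(lattice)):
        new_best = []
        for c in lattice[t]:
            target_cost = math.dist(targets[t][1], c["feat"])
            # Hypothetical join cost: feature distance between adjacent units.
            cost, path = min(
                ((prev_cost + math.dist(p["feat"], c["feat"]), prev_path)
                 for (prev_cost, prev_path), p in zip(best, lattice[t - 1])),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost, path + [c["id"]]))
        best = new_best
    return min(best, key=lambda item: item[0])[1]

# Made-up inventory: two unit labels with a single pitch-like feature each.
inventory = {
    "a": [{"id": "a1", "feat": (100.0,)},
          {"id": "a2", "feat": (120.0,)},
          {"id": "a3", "feat": (200.0,)}],
    "b": [{"id": "b1", "feat": (110.0,)},
          {"id": "b2", "feat": (180.0,)}],
}
targets = [("a", (118.0,)), ("b", (115.0,))]
chosen = select_units(targets, inventory, n=2)
print(chosen)
```

Pruning before the Viterbi pass keeps the lattice width at n per position, which is where a runtime saving of the kind the abstract reports would come from.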
Dynamic selection method of the best search engine for a user's query BIBAKFull-Text 383-388
  Kodai Mizuno; Kyoji Kawagoe
In this paper, we propose a new method for dynamically selecting the best search engine for a user's query. When searching the Internet, expert users manually select the best search engine for their queries. However, novice users do not understand the features of all search engines. Consequently, such users cannot select the best search engine and cannot obtain the best retrieval results. In this paper, we focus on the number of retrieval results and use it to calculate matching scores that indicate how suitable each search engine is for the user's query. As a result, novice users can select the best search engine using the scores calculated by our system.
Keywords: information retrieval, query, search engine selection, web search
Hyperbolic structure of fundamental frequency contour BIBAKFull-Text 389-394
  Jinfu Ni; Shinsuke Sakai; Hisashi Kawai; Satoshi Nakamura
In this paper, we propose an approach to the transformation of fundamental frequency (F0) contours for conversational speech synthesis. The graph of F0 in relation to the period of the sound wave cycle is one branch of a rectangular hyperbola. Based on a few symmetry assumptions on the hyperbolic property, we achieve a generalized hyperbolic structure so as to aggressively manipulate F0 contours. The modeling proves an equivalent expression of the resonance mechanism capable of dealing with the interaction of tone and intonation. Also, it is language-independent because no language-dependent hypothesis is necessary. This paper describes two applications of the hyperbolic structures of F0 contours to prosodic information processing. One modulates the baseline F0 contours when fusing additional makeup information onto them without altering the underlying linguistic information. The other separates local rise/fall F0 movements and global scale component from observed F0 contours, both being useful for estimating dynamical F0 variation. Our experimental results are very positive.
Keywords: F0 control, intonation, speech prosody, speech synthesis
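The hyperbola mentioned in the abstract above follows from the standard definition of frequency as the reciprocal of the period. Writing T for the period of one cycle (a symbol introduced here, not used in the abstract), the relation is:

```latex
F_0 \cdot T = 1
\quad\Longleftrightarrow\quad
F_0 = \frac{1}{T}, \qquad T > 0
```

Because T is positive, only the first-quadrant branch of the rectangular hyperbola is traced. The generalized hyperbolic structure the authors derive from this relation is not given in the abstract and is not reconstructed here.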
Automatic plagiarism detection among term papers BIBAKFull-Text 395-399
  Takahisa Ota; Shigeru Masuyama
Recently, plagiarized term papers have become a serious problem. Therefore, we propose, in this paper, a method to detect plagiarized parts between two term papers. Our method is based on the Smith-Waterman algorithm, which detects similar regions between two molecular sequences. We evaluated our method on a document set consisting of actually submitted term papers and artificially produced ones that plagiarized a paper written on the same theme. Experimental results show that our method attains higher accuracy than conventional ones.
Keywords: Smith-Waterman algorithm, dynamic programming, partial text alignment, plagiarism detection, term papers
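As a minimal sketch of how Smith-Waterman local alignment can be applied to text rather than biological sequences, the example below aligns two lists of word tokens. The word-level tokenization, the scoring values (match 2, mismatch -1, gap -1), and the function name `smith_waterman` are assumptions for illustration; the abstract does not state the units or scores the authors used.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment of token lists a and b.

    Returns (score, (start_a, end_a), (start_b, end_b)) with half-open spans.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming table; cells never drop below zero,
    # which is what makes the alignment local rather than global.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    end_i, end_j = i, j
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best, (i, end_i), (j, end_j)

paper_a = "the quick brown fox jumps".split()
paper_b = "a lazy dog saw the quick brown fox".split()
result = smith_waterman(paper_a, paper_b)
print(result)
print(paper_a[result[1][0]:result[1][1]])
```

The returned spans mark the shared passage in each document, which is the kind of partial text alignment named in the keywords.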
Spoken document retrieval using topic models BIBAKFull-Text 400-403
  Xinhui Hu; Ryosuke Isotani; Satoshi Nakamura
In this paper, we propose a document topic model (DTM) based on the non-negative matrix factorization (NMF) approach to explore spontaneous spoken document retrieval. The model uses latent semantic indexing to detect underlying semantic relationships within documents. Each document is interpreted as a generative topic model belonging to many topics. The relevance of a document to a query is expressed by the probability of a query being generated by the model. The term-document matrix used for NMF is built stochastically from the speech recognition N-best results, so that multiple recognition hypotheses can be utilized to compensate for the word recognition errors. Using this approach, experiments are conducted on a test collection from the Corpus of Spontaneous Japanese (CSJ), with 39 queries for over 600 hours of spontaneous Japanese speech. The retrieval performance of this model is shown to be superior to the conventional vector space model (VSM) when the dimension or topic number exceeds a certain threshold. Moreover, in terms of both retrieval performance and the ability of topic expression, the NMF-based topic model is shown to surpass another latent indexing method that is based on the singular value decomposition (SVD). The extent to which this topic model can resist speech recognition error, which is a special problem of spoken document retrieval, is also investigated.
Keywords: NMF, document topic model, spoken document retrieval
Soft margin estimation on improving environment structures for ensemble speaker and speaking environment modeling BIBAKFull-Text 404-408
  Yu Tsao; Jinyu Li; Chin-Hui Lee; Satoshi Nakamura
Recently, we proposed an ensemble speaker and speaking environment modeling (ESSEM) approach to enhance the robustness of automatic speech recognition (ASR) under adverse conditions. The ESSEM framework comprises two phases: offline and online. In the offline phase, we prepare an environment structure that is formed by multiple sets of hidden Markov models (HMMs). Each HMM set represents a particular speaker and speaking environment. In the online phase, ESSEM estimates a mapping function to transform the prepared environment structure to a set of HMMs for the unknown testing condition. In this study, we incorporate soft margin estimation (SME) to increase the discriminative power of the environment structure in the offline phase and therefore enhance the overall ESSEM performance. We evaluated the performance on the Aurora-2 connected digit database. With the SME-refined environment structure, ESSEM provides better performance than the original framework. By using our best online mapping function, ESSEM achieves a word error rate (WER) of 4.62%, corresponding to a 14.60% relative WER reduction over the best baseline performance of 5.41% WER.
Keywords: ASR, ESSEM, SME, model adaptation, noise robustness
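The relative WER reduction quoted in the abstract above can be checked with a short calculation. The function name `relative_reduction` is chosen here for illustration; the two WER values are the ones reported in the abstract.

```python
def relative_reduction(baseline, improved):
    """Relative reduction, in percent, from a baseline error rate to an improved one."""
    return (baseline - improved) / baseline * 100.0

# WER values reported in the abstract: 5.41% baseline, 4.62% with SME.
reduction = relative_reduction(5.41, 4.62)
print(f"{reduction:.2f}%")  # prints 14.60%
```

The absolute improvement is 0.79 percentage points; dividing by the 5.41% baseline gives the 14.60% relative figure.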
A method for helpdesk-oriented question answering BIBAKFull-Text 409-415
  Satoru Sasaki; Atsushi Fujii
We propose a Question Answering (QA) method that answers a how-question with actions. We model an action as a verb phrase consisting of a main verb and its governing noun phrase. Existing QA methods resemble consulting dictionaries and encyclopedias, in which users satisfy their intellectual cravings. In contrast, our method is a step toward automation of a helpdesk or a call center, which suggests solutions to alleviate users' problems. We show the effectiveness of our method experimentally.
Keywords: question answering, web retrieval