Personalized Summarization of Broadcasted Soccer Videos with Adaptive Fast-Forwarding | | BIBAK | Full-Text | 1-11 | |
Fan Chen; Christophe De Vleeschouwer | |||
We propose a hybrid personalized summarization framework that combines
adaptive fast-forwarding and content truncation to generate comfortable and
compact video summaries. We formulate video summarization as a discrete
optimization problem, where the optimal summary is determined by adopting
Lagrangian relaxation and convex-hull approximation to solve a resource
allocation problem. Subjective experiments are performed to demonstrate the
relevance and efficiency of the proposed method. Keywords: Personalized Video Summarization; Adaptive Fast-Forwarding; Soccer Video
Analysis |
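
The resource-allocation formulation lends itself to a compact illustration.
Below is a minimal Python sketch (not the authors' code) that bisects a
Lagrange multiplier to select segments under a duration budget; the benefit
and cost figures are hypothetical, and the paper's convex-hull approximation
is omitted.

    # Minimal sketch of Lagrangian relaxation for summary selection.
    # Each segment i has a benefit b[i] (relevance) and a cost c[i]
    # (duration in seconds); we bisect the Lagrange multiplier so the
    # selected segments fit the duration budget.
    def summarize(benefit, cost, budget, iters=50):
        lo, hi = 0.0, max(b / c for b, c in zip(benefit, cost))
        for _ in range(iters):
            lam = (lo + hi) / 2.0
            # For a fixed multiplier the relaxed problem decouples:
            # keep every segment whose benefit outweighs lam * cost.
            pick = [i for i, (b, c) in enumerate(zip(benefit, cost))
                    if b - lam * c > 0]
            if sum(cost[i] for i in pick) > budget:
                lo = lam    # summary too long: penalize duration more
            else:
                hi = lam    # within budget: try a smaller penalty
        # Return the selection at the feasible end of the bracket.
        return [i for i, (b, c) in enumerate(zip(benefit, cost))
                if b - hi * c > 0]

    segments = summarize(benefit=[5.0, 2.0, 8.0, 1.0],
                         cost=[30.0, 10.0, 45.0, 5.0], budget=60.0)
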
Real-Time GPU-Based Motion Detection and Tracking Using Full HD Videos | | BIBAK | Full-Text | 12-21 | |
Sidi Ahmed Mahmoudi; Michal Kierzynka; Pierre Manneback | |||
Video processing algorithms are an essential tool for various computer
vision domains such as motion tracking, video indexing and event detection.
However, the new video standards, especially high-definition ones, mean that
current implementations, even running on modern hardware, can no longer meet
the requirements of real-time processing. Several solutions exploiting
graphics processing units (GPUs) have been proposed to overcome this
constraint. Although they demonstrate the high potential of GPUs, none of
them can process high-definition videos efficiently. In this work, we
propose a development scheme enabling an efficient exploitation of GPUs in
order to achieve real-time processing of Full HD videos. Based on this
scheme, we developed GPU implementations of several methods related to
motion tracking, such as silhouette extraction, corner detection, and
tracking using optical flow estimation. These implementations are exploited
to improve the performance of a real-time motion detection application
using a mobile camera. Keywords: GPU; CUDA; video processing; motion tracking; real-time |
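
As a rough illustration of the per-pixel GPU work involved in silhouette
extraction, the sketch below uses Python's numba.cuda as a stand-in for the
authors' CUDA/C implementation (which is not reproduced here) to threshold
frame differences on a Full HD frame; the threshold value is arbitrary.

    import numpy as np
    from numba import cuda

    @cuda.jit
    def silhouette(prev, curr, mask, thresh):
        # One GPU thread per pixel: mark pixels whose grey-level
        # change between consecutive frames exceeds the threshold.
        i, j = cuda.grid(2)
        if i < curr.shape[0] and j < curr.shape[1]:
            mask[i, j] = 1 if abs(curr[i, j] - prev[i, j]) > thresh else 0

    prev = np.random.rand(1080, 1920).astype(np.float32)  # Full HD frames
    curr = np.random.rand(1080, 1920).astype(np.float32)
    mask = cuda.device_array((1080, 1920), dtype=np.uint8)
    threads = (16, 16)
    blocks = ((1080 + 15) // 16, (1920 + 15) // 16)
    silhouette[blocks, threads](cuda.to_device(prev),
                                cuda.to_device(curr), mask, 0.1)
    result = mask.copy_to_host()
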
Feeling Something without Knowing Why: Measuring Emotions toward Archetypal Content | | BIBAK | Full-Text | 22-31 | |
Huang-Ming Chang; Leonid Ivonin; Wei Chen; Matthias Rauterberg | |||
To enhance communication among users through technology, we propose a
framework that communicates 'pure experience.' This framework can be achieved
by providing emotionally charged communication. To initiate this undertaking,
we propose to explore materials for communicating human emotions. Research on
emotion mainly focuses on emotions that are relevant to utilitarian concerns.
Besides the commonly-known emotions like joy and fear, there are
non-utilitarian emotions, such as aesthetic emotions, which are essential to
our daily lives. Based on Jung's theory of the collective unconscious, we
consider archetypal content as a new category of affective stimuli for
non-utilitarian emotions. We collected pictures and sounds of the archetype
of the self, and conducted an experiment together with existing affective
stimuli of utilitarian emotions. The results showed that archetypal content
has the potential to become a new category of affective content, and
exploring further affective content is a promising direction for future
studies. Keywords: affective computing; non-utilitarian emotion; archetypal content |
Web and TV Seamlessly Interlinked: LinkedTV | | BIBAK | Full-Text | 32-42 | |
Lyndon Nixon | |||
This paper reports on the vision of LinkedTV driven by the EU project of the
same name, and the work done in its first year. LinkedTV is a new type of
television (or audio-visual) experience where Web and TV content can be
seamlessly interlinked based on the concepts present within that content. The
project addresses how the Web and TV are converging in end devices; this
paper focuses in particular on how we intend to answer the research
challenges that the LinkedTV vision raises. Keywords: Smart TV; Networked Media; semantic multimedia; media annotation; Connected
TV; future TV |
VideoHypE: An Editor Tool for Supervised Automatic Video Hyperlinking | | BIBAK | Full-Text | 43-48 | |
Lotte Belice Baltussen; Jaap Blom; Roeland Ordelman | |||
Video hyperlinking is regarded as a means to enrich interactive television
experiences. Creating links manually, however, has limitations. In order to
automate video hyperlinking and increase its potential, we need a better
understanding of how both the broadcasters that supply interactive
television and the end users approach and perceive hyperlinking. In this paper
we report on the development of an editor tool for supervised automatic video
hyperlinking that will allow us to investigate video hyperlinking in a
real-life scenario. Keywords: video hyperlinking; interactive television; video analysis; user studies;
information extraction |
Interactive TV Potpourris: An Overview of Designing Multi-screen TV Installations for Home Entertainment | | BIBAK | Full-Text | 49-54 | |
Radu-Daniel Vatavu; Matei Mancas | |||
Home entertainment systems comprising multiple TV screens offer new
opportunities to display more content, accommodate more viewers, and deliver
enriched user experiences. In many cases, such installations take the form of
mixed-reality environments, in which video projections coexist with physical TV
sets. We refer to such installations as interactive TV potpourris, due to their
composite nature of hybridizing individual TV screens of different natures,
form factors, and potential to render different multimedia types. This work
discusses current implementations for interactive TV potpourris, identifies
technical and interaction challenges, and pinpoints future research and
development directions. It is our hope that this work will encourage new
explorations and developments of TV potpourris. Keywords: interactive TV; TV potpourris; interaction techniques; multiple displays;
home entertainment |
3D Head Pose Estimation for TV Setups | | BIBAK | Full-Text | 55-64 | |
Julien Leroy; Francois Rocca; Matei Mancas; Bernard Gosselin | |||
In this paper, we present the architecture of a system that aims to
personalize TV content according to viewer reactions. The focus of the
paper is on a subset of this system that identifies moments of attentive
focus in a non-invasive and continuous way. The attentive focus is used to
dynamically improve the user profile by detecting which displayed media or
links have drawn the user's attention. Our method is based on the detection
and estimation of face pose in 3D using a consumer depth camera. Two
preliminary experiments were carried out to test the method and to show its
link to viewer interest. This study is realized in the scenario of TV
viewing with second-screen interaction (tablet, smartphone), a behaviour
that has become common among viewers. Keywords: attention; head pose estimation; second screen interaction; eye tracking;
Facelab; future TV; personalization |
Visualizing Rembrandt | | BIBAK | Full-Text | 65-70 | |
Tamara Pinos Cisneros; Andrés Pardo Rodríguez | |||
Visualizing Rembrandt is a web application that helps users to view
connections between Rembrandt and other artists with whom he had a professional
relationship. These connections can be made by choosing from different
criteria: teachers, pupils, influenced by, influence on, human figure,
landscape, drawing and paintings. The data for this project was provided by the
RKD (Rijksbureau voor Kunsthistorische Documentatie) and an application built
with Java and Javascript was used for its display. This application is an
innovative tool that is helpful for displaying museum data in an efficient
fashion, and can be a good support for visualizing and connecting data in
museums and exhibitions (and can be used with data on different artists). Keywords: Rembrandt; data visualization; art; museum |
Stylistic Walk Synthesis Based on Fourier Decomposition | | BIBAK | Full-Text | 71-79 | |
Joelle Tilmanne; Thierry Dutoit | |||
We present a stylistic walk modeling and synthesis method based on frequency
analysis of motion capture data. We observe that two peaks corresponding to the
walk cycle fundamental frequency and its first harmonic can easily be found for
most walk styles in the Fourier transform. Hence, a second-order Fourier series
efficiently represents most styles, as assessed in the subjective user
evaluation procedure, even though it results in a strong filtering of the
original signals and hence a strong smoothing of the resulting motion
sequences. Keywords: motion capture; synthesis; Fourier transform |
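
The observation above suggests that very little code is needed for such a
model. A minimal sketch (not the authors' pipeline): fit a second-order
Fourier series to a joint-angle trajectory by least squares, given the
walk-cycle fundamental; the signal below is synthetic.

    import numpy as np

    def fourier2_fit(theta, t, f0):
        # Least-squares fit of a second-order Fourier series:
        # theta(t) ~ a0 + a1 cos(wt) + b1 sin(wt) + a2 cos(2wt) + b2 sin(2wt)
        # where w = 2*pi*f0 is the walk-cycle fundamental.
        w = 2 * np.pi * f0
        basis = np.column_stack([np.ones_like(t),
                                 np.cos(w * t), np.sin(w * t),
                                 np.cos(2 * w * t), np.sin(2 * w * t)])
        coeffs, *_ = np.linalg.lstsq(basis, theta, rcond=None)
        return coeffs, basis @ coeffs  # coefficients and smoothed signal

    # Hypothetical joint-angle trajectory sampled at 120 Hz:
    t = np.arange(0, 4, 1 / 120.0)
    theta = (10 * np.sin(2 * np.pi * 1.0 * t)
             + 3 * np.sin(2 * np.pi * 2.0 * t)
             + np.random.randn(t.size))  # noise is filtered out by the fit
    coeffs, smooth = fourier2_fit(theta, t, f0=1.0)
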
Automatically Mapping Human Skeletons onto Virtual Character Armatures | | BIBAK | Full-Text | 80-89 | |
Andrea Sanna; Fabrizio Lamberti; Gianluca Paravati; Gilles Carlevaris; Paolo Montuschi | |||
Motion capture systems provide an efficient and interactive solution for
extracting information related to a human skeleton, which is often exploited to
animate virtual characters. When the character cannot be assimilated to an
anthropometric shape, the task of mapping motion capture data onto the
armature to be animated can be extremely challenging. This paper presents a
novel methodology for the automatic mapping of a human skeleton onto virtual
character armatures. By extending the concept of graph similarity, joints
and bones of the tracked human skeleton are mapped onto an arbitrarily
shaped armature. A prototype implementation has been developed using the
Microsoft Kinect as the body tracking device. Preliminary results show that the proposed
solution can already be used to animate truly different characters such as a
Pixar-like lamp, a fish or a dog. Keywords: virtual character animation; automatic armature mapping; motion capture;
graph similarity |
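
The paper's graph-similarity extension is not reproduced here, but the core
intuition can be sketched: match joints across skeletons by simple
topological signatures such as depth from the root and subtree size. Both
skeletons below are hypothetical parent maps, and this greedy version only
illustrates the signature idea (a real mapping would also enforce
one-to-one assignments and weigh bone lengths).

    # Minimal sketch: map human joints onto armature joints by comparing
    # simple topological signatures (depth from root, subtree size).
    def signatures(parents):
        children = {}
        for j, p in parents.items():
            children.setdefault(p, []).append(j)
        def depth(j):
            return 0 if parents[j] is None else 1 + depth(parents[j])
        def size(j):
            return 1 + sum(size(c) for c in children.get(j, []))
        return {j: (depth(j), size(j)) for j in parents}

    def map_skeleton(human, armature):
        hs, am = signatures(human), signatures(armature)
        # Greedy: each human joint goes to the armature joint with the
        # closest (depth, subtree-size) signature.
        return {h: min(am, key=lambda a: abs(hs[h][0] - am[a][0])
                                         + abs(hs[h][1] - am[a][1]))
                for h in hs}

    human = {"hip": None, "spine": "hip", "head": "spine",
             "l_arm": "spine", "r_arm": "spine"}
    lamp = {"base": None, "arm": "base", "shade": "arm"}
    print(map_skeleton(human, lamp))
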
KinectBalls: An Interactive Tool for Ball Throwing Games | | BIBAK | Full-Text | 90-95 | |
Jonathan Schoreels; Romuald Deshayes; Tom Mens | |||
We present a tool that was developed in the context of the first author's
master's project. The tool implements an interactive computer game combining
the real and the virtual world in a seamless way. The player interacts with
the game by throwing balls towards a wall on which a virtual 3D scene is
projected. Using the Kinect 3D sensor, we compute and predict the
trajectory, speed and position of the ball. Upon impact with the screen, a
virtual ball continues its trajectory in the virtual scene and interacts
with the objects around it, using a physics engine (Bullet) and a graphical
3D engine (Ogre3D). The prototype game has been successfully tested with a
large number of people of varying ages. Keywords: Kinect; HCI; virtual reality; object tracking |
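
In its simplest form, trajectory prediction of this kind reduces to fitting
a ballistic model to a few 3D samples. A hedged sketch follows (the
prototype's actual filtering is not described in the abstract); the axis
conventions and sample values are assumptions.

    import numpy as np

    def predict_impact(times, points, wall_z):
        # Fit a ballistic model to sampled 3D ball positions and
        # extrapolate the impact point on the wall plane z = wall_z.
        # Assumes y is the vertical (gravity) axis and z points at the wall.
        t = np.asarray(times)
        p = np.asarray(points)              # shape (n, 3)
        x_fit = np.polyfit(t, p[:, 0], 1)   # x and z are linear in t;
        y_fit = np.polyfit(t, p[:, 1], 2)   # y is quadratic (gravity)
        z_fit = np.polyfit(t, p[:, 2], 1)
        t_hit = (wall_z - z_fit[1]) / z_fit[0]  # solve z(t) = wall_z
        return np.polyval(x_fit, t_hit), np.polyval(y_fit, t_hit), t_hit

    # Hypothetical Kinect samples (metres, seconds):
    times = [0.00, 0.03, 0.06, 0.09]
    points = [(0.00, 1.50, 0.5), (0.05, 1.52, 0.9),
              (0.10, 1.53, 1.3), (0.15, 1.52, 1.7)]
    x, y, t_hit = predict_impact(times, points, wall_z=3.0)
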
Medianeum: Gesture-Based Ergonomic Interaction | | BIBAK | Full-Text | 96-103 | |
François Zajéga; Cécile Picard-Limpens; Julie René; Antonin Puleo; Justine Decuypere; Christian Frisson; Thierry Ravet; Matei Mancas | |||
The proposed Medianeum system consists of an interactive installation
allowing general audiences to explore a timeline and access informational
multimedia data such as texts, images and video. Through a Microsoft Kinect
depth sensor, users' skeletons are captured and their gestures are tracked
to interact with the data presented on a screen in an ergonomic way. The
graphical user interface is built upon ProcesSwing, our version of the
Processing IDE embedded into a standard Swing Java GUI widget toolkit
application, and the TimelineJS library from Vérité.co/Northwestern
University, which allows the creation of online, personalized and
interactive timelines that mash up historical events sorted into definable
categories. Keywords: timeline; Kinect; ergonomics; gestures; interface; interaction |
About Experience and Emergence -- A Framework for Decentralized Interactive Play Environments | | BIBAK | Full-Text | 104-113 | |
Pepijn Rijnbout; Linda de Valk; Arnold Vermeeren; Tilde Bekker; Mark de Graaf; Ben Schouten; Berry Eggen | |||
Play is an unpredictable and fascinating activity. Its qualities can serve
as an inspiration for design. In designing for play, we focus on play
environments with players and multiple interactive objects. The current
understanding of how to design these objects and interaction opportunities to
create meaningful interactions and engaging user experiences is limited. In
this paper we introduce a framework focusing on the development of
decentralized interactive play environments for emergent play. This framework
combines knowledge from different fields including play, user experience,
emergent behavior and interactions. Two case studies demonstrate its use as a
tool for analysis. Keywords: Framework; open-ended play; emergence; user experience; interactions |
MashtaCycle: On-Stage Improvised Audio Collage by Content-Based Similarity and Gesture Recognition | | BIBAK | Full-Text | 114-123 | |
Christian Frisson; Gauthier Keyaerts; Fabien Grisard; Stéphane Dupont; Thierry Ravet; François Zajéga; Laura Colmenares Guerra; Todor Todoroff; Thierry Dutoit | |||
In this paper we present the outline of a performance in progress. It
brings together the skilled musical practice of Belgian audio collagist
Gauthier Keyaerts, aka Very Mash'ta, and the realtime, content-based audio browsing
capabilities of the AudioCycle and LoopJam applications developed by the
remaining authors. The tool derived from AudioCycle named MashtaCycle aids the
preparation of collections of stem audio loops before performances by
extracting content-based features (for instance timbre) used for the
positioning of these sounds on a 2D visual map. The tool becomes an embodied
on-stage instrument, based on a user interface which uses a depth-sensing
camera, and augmented with the public projection of the 2D map. The camera
tracks the position of the artist within the sensing area to trigger sounds
similarly to the LoopJam installation. It also senses gestures from the
performer, interpreted with the Full Body Interaction (FUBI) framework,
allowing sound effects to be applied based on bodily movements. MashtaCycle blurs the
boundary between performance and preparation, navigation and improvisation,
installations and concerts. Keywords: Human-music interaction; audio collage; content-based similarity; gesture
recognition; depth cameras; digital audio effects |
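
The exact feature set of AudioCycle is not detailed in this abstract; as a
hedged stand-in, the sketch below summarizes each loop's timbre with MFCC
statistics (via librosa) and projects the collection onto a 2D map with
PCA. The file names are hypothetical.

    import numpy as np
    import librosa
    from sklearn.decomposition import PCA

    # Hypothetical collection of stem audio loops:
    paths = ["loop_drums.wav", "loop_bass.wav", "loop_vox.wav"]

    features = []
    for path in paths:
        y, sr = librosa.load(path)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        # Summarize each loop by the mean and spread of its MFCCs,
        # a crude but common timbre descriptor.
        features.append(np.concatenate([mfcc.mean(axis=1),
                                        mfcc.std(axis=1)]))

    # Project the timbre descriptors onto a 2D visual map.
    xy = PCA(n_components=2).fit_transform(np.array(features))
    for path, (x, y_pos) in zip(paths, xy):
        print(f"{path}: ({x:.2f}, {y_pos:.2f})")
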
DanSync: A Platform to Study Entrainment and Joint-Action during Spontaneous Dance in the Context of a Social Music Game | | BIBAK | Full-Text | 124-135 | |
Michiel Demey; Chris Muller; Marc Leman | |||
This paper presents a social music game, named DanSync, as a platform to
study joint-action. This game context proves to be an effective way to
study the spontaneous dance of players in a laboratory setting. Through the
gameplay, participants are engaged in dancing to music with strong
motivation. The performance of dance synchronization to music is studied
throughout the gameplay. Joint-action in a dyad is quantified in terms of
correlation and phase-locking. Furthermore, entrainment and social bonding
in small groups are studied by introducing perturbations into the music
stimulus. Keywords: Entrainment; gaming; music |
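
Correlation and phase-locking between a dyad's movement signals can be
computed along the following lines; this is a generic sketch with synthetic
signals, not the DanSync code.

    import numpy as np
    from scipy.signal import hilbert

    def phase_locking_value(a, b):
        # Phase-locking value between two movement signals:
        # 1 = perfectly phase-locked, 0 = no consistent phase relation.
        phase_a = np.angle(hilbert(a))
        phase_b = np.angle(hilbert(b))
        return np.abs(np.mean(np.exp(1j * (phase_a - phase_b))))

    # Synthetic dyad: two dancers at 2 Hz with a fixed phase offset.
    t = np.arange(0, 10, 1 / 100.0)
    dancer1 = np.sin(2 * np.pi * 2.0 * t) + 0.2 * np.random.randn(t.size)
    dancer2 = np.sin(2 * np.pi * 2.0 * t + 0.5) \
              + 0.2 * np.random.randn(t.size)
    print("correlation:", np.corrcoef(dancer1, dancer2)[0, 1])
    print("PLV:", phase_locking_value(dancer1, dancer2))
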
Graphical Spatialization Program with Real Time Interactions (GASPR) | | BIBAK | Full-Text | 136-145 | |
Thierry Dilger | |||
There is a dominant paradigm for the playback of audio files in a
multi-channel sound system. It consists of a "top view" (or 3D)
representation of the listening room with speakers set virtually in this
space. The main drawback of this paradigm is the lack of order and harmony
in the representation of trajectories, leading to highly complex systems.
In addition, it is very difficult to get an overview of the whole
spatialization process of a sound piece. The GASPR software gives the
composer a new graphical representation of trajectories in space and time.
It is based on a programmable behavioral video game engine, and it is
possible to use any kind of sensor to control it live. GASPR relies on RGB
(red, green, blue) color coding working on three axes: time (x), sound
setup (y) and intensity of each sound (z). This paradigm opens
up new doors for interactive surround sound composition. Keywords: surround sound; spatialization; interaction; trajectories; video game
engine; sound behavior; graphical representation; color mapping; sound
installation; sound art |
Accuracy Study of a Real-Time Hybrid Sound Source Localization Algorithm | | BIBAK | Full-Text | 146-155 | |
Fernando A. Escobar; Xin Chang; Christian Ibala; Carlos Valderrama | |||
Sound source localization in real time can be employed in numerous
applications such as filtering, beamforming, security system integration, etc.
Algorithms employed in this field require not only fast processing speed but
also enough accuracy to properly cope with the application requirements. This
work presents accuracy benchmarks of a previously proposed hybrid approach
based on the Generalized Cross-Correlation (GCC) and Delay-and-Sum
beamforming (DSB). Tests were performed on a linear microphone array
simulated in MATLAB. Analyses of variations in array size, number of
microphones, spacing and other characteristics were included. Results obtained
show that the proposed algorithm is as good as the DSB under some conditions
that can be easily met. Keywords: Accuracy; Sound localization; Generalized Cross Correlation; Beamforming;
Computational Complexity; Real Time |
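
The GCC half of such a hybrid is typically implemented with PHAT weighting;
below is a minimal generic sketch of a GCC-PHAT time-delay estimate between
two microphones (synthetic signals, not the benchmarked MATLAB code).

    import numpy as np

    def gcc_phat(sig, ref, fs):
        # Estimate the time delay of arrival between two microphone
        # signals with the PHAT-weighted Generalized Cross-Correlation.
        n = sig.size + ref.size
        S = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
        S /= np.abs(S) + 1e-12          # PHAT weighting: keep phase only
        cc = np.fft.irfft(S, n)
        cc = np.concatenate((cc[-(ref.size - 1):], cc[:sig.size]))
        lag = np.argmax(np.abs(cc)) - (ref.size - 1)
        return lag / fs

    # Synthetic test: the same noise burst reaches mic 2 five samples later.
    fs = 16000
    x = np.random.randn(1024)
    mic1, mic2 = x, np.concatenate((np.zeros(5), x[:-5]))
    print("estimated delay (s):", gcc_phat(mic2, mic1, fs))  # ~5 / 16000
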
Image Surround: Automatic Projector Calibration for Indoor Adaptive Projection | | BIBAK | Full-Text | 156-162 | |
Radhwan Ben Madhkour; Ludovic Burczykowski; Matei Mancas; Bernard Gosselin | |||
In this paper, we present a system able to calibrate projectors, perform 3D
reconstruction, and project shadows and textures generated in real time.
The calibration algorithm is based on Heikkila's camera calibration
algorithm. It combines the projection of Gray-coded structured light
patterns with an RGBD camera. Any projection surface can be used. Intrinsic
and extrinsic parameters are computed without scale-factor uncertainty or
any prior knowledge of the projector and the projection surface. The
projector calibration is used as a basis to augment the scene with
information from the RGBD camera. Shadows are generated with lights whose
position is modified in real time to follow the user's position. The 3D
reconstruction is based on the Kinect Fusion algorithm. The model of the
scene is used to apply textures to the scene and to generate correct
shadows. Keywords: projection; calibration; tracking; scene augmentation |
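
The Gray-coded structured light at the heart of such a calibration can be
illustrated compactly: project bit-plane patterns of the Gray-coded column
index, then invert the code from the binarized camera images to obtain
camera-projector correspondences. A generic sketch with a hypothetical
projector width follows.

    import numpy as np

    def gray_code_patterns(width, n_bits=10):
        # Pattern k is the k-th bit (MSB first) of the Gray code of each
        # projector column; a camera observing the binarized images can
        # recover which projector column lit each pixel.
        cols = np.arange(width)
        gray = cols ^ (cols >> 1)              # binary -> Gray code
        return [((gray >> (n_bits - 1 - k)) & 1).astype(np.uint8)
                for k in range(n_bits)]

    def decode(bits):
        # Invert the Gray code from a stack of observed bit images.
        gray = np.zeros_like(bits[0], dtype=np.int64)
        for b in bits:
            gray = (gray << 1) | b
        binary = gray.copy()
        shift = gray >> 1
        while shift.any():                     # Gray -> binary
            binary ^= shift
            shift >>= 1
        return binary

    patterns = gray_code_patterns(1024)   # hypothetical projector width
    assert (decode(patterns) == np.arange(1024)).all()
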
EGT: Enriched Guitar Transcription | | BIBAK | Full-Text | 163-168 | |
Loïc Reboursière; Stéphane Dupont | |||
EGT (Enriched Guitar Transcription) is a real-time, automatic guitar
playing transcription software. Unlike most automatic score transcription
software, it performs not only note-on and note-off event detection and
pitch tracking, but also detects all the main guitar playing techniques,
providing a more complete transcription of the playing. These detections
are made possible by the use of a hexaphonic pickup (one pickup per string)
enabling string-by-string analysis. The transcriptions can then be used in
many different contexts and/or embedded in different tools in order to
obtain high-level information on the instrumentalist's playing. This paper
demonstrates two use cases: a complete realtime tablature writer and a 3D
neck model controlled by the detected guitar playing events. Keywords: Guitar playing techniques; hexaphony; music information retrieval; automatic
score transcription; augmented guitar; guitar controller |
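
Because the hexaphonic pickup isolates each string, per-string pitch
tracking needs no polyphonic separation; even a plain autocorrelation
estimate works as a baseline. The sketch below is such a baseline on a
synthetic signal, not EGT's actual detector.

    import numpy as np

    def string_pitch(frame, fs, fmin=60.0, fmax=1000.0):
        # Crude autocorrelation pitch estimate for one string's signal.
        frame = frame - frame.mean()
        ac = np.correlate(frame, frame, mode="full")[frame.size - 1:]
        lo, hi = int(fs / fmax), int(fs / fmin)
        lag = lo + np.argmax(ac[lo:hi])   # strongest periodicity in range
        return fs / lag

    # Synthetic open low E string (~82.4 Hz) at 44.1 kHz:
    fs = 44100
    t = np.arange(0, 0.05, 1 / fs)
    frame = (np.sin(2 * np.pi * 82.4 * t)
             + 0.3 * np.sin(2 * np.pi * 164.8 * t))
    print(f"estimated pitch: {string_pitch(frame, fs):.1f} Hz")
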
Performative Voice Synthesis for Edutainment in Acoustic Phonetics and Singing: A Case Study Using the "Cantor Digitalis" | | BIBAK | Full-Text | 169-178 | |
Lionel Feugère; Christophe d'Alessandro; Boris Doval | |||
A real-time, gesture-controlled voice synthesis system is applied to
edutainment in the field of voice pedagogy. The main goals are to teach how
the voice works and what makes the differences between voices from an interactive,
real-time and audio-visual perspective. The project is based on "Cantor
Digitalis", a singing vowel digital instrument, featuring an improved formant
synthesizer controlled by a stylus and touch graphic tablet. Demonstrated in
various pedagogical situations, this application allows for simple and
interactive explanation of difficult and/or abstract voice related phenomena,
such as source-filter theory, vocal formants, effect of the vocal tract size,
voice categories, voice source parameters, intonation and articulation, etc.
This is achieved by systematic and interactive listening and playing with the
sound of a virtual voice, related to the hand motions and dynamics on the
tablet. Keywords: edutainment; voice synthesis; performative synthesis; graphic tablet |
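
The source-filter idea demonstrated by the instrument can be illustrated
with a toy synthesizer: a glottal-like impulse train filtered through
second-order resonators placed at vowel formants. The formant values below
are textbook approximations for the vowel /a/, not Cantor Digitalis
parameters.

    import numpy as np
    from scipy.signal import lfilter

    def resonator(f, bw, fs):
        # Second-order IIR resonator realizing one formant.
        r = np.exp(-np.pi * bw / fs)
        theta = 2 * np.pi * f / fs
        a = [1.0, -2 * r * np.cos(theta), r * r]
        b = [sum(a)]                    # unit gain at DC
        return b, a

    fs = 16000
    f0 = 120.0                          # glottal pitch
    source = np.zeros(fs)               # 1 s impulse-train source
    source[::int(fs / f0)] = 1.0

    # Cascade resonators at rough formant frequencies of /a/.
    signal = source
    for f, bw in [(700, 80), (1200, 90), (2600, 120)]:
        b, a = resonator(f, bw, fs)
        signal = lfilter(b, a, signal)
    signal /= np.abs(signal).max()      # normalize for playback
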
MAGEFACE: Performative Conversion of Facial Characteristics into Speech Synthesis Parameters | | BIBAK | Full-Text | 179-182 | |
Nicolas d'Alessandro; Maria Astrinaki; Thierry Dutoit | |||
In this paper, we illustrate the use of the MAGE performative speech
synthesizer through its application to the conversion of realtime-measured
facial features with FaceOSC into speech synthesis features such as vocal tract
shape or intonation. MAGE is a new software library for using HMM-based speech
synthesis in reactive programming environments. MAGE uses a rewritten version
of the HTS engine enabling the computation of speech audio samples on a
two-label window instead of the whole sentence. It is this feature that
enables the realtime mapping of facial attributes to synthesis parameters. Keywords: speech synthesis; software library; performative media; streaming
architecture; HTS; MAGE; realtime audio software; face tracking; mapping |
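
The glue between FaceOSC and a reactive synthesizer can be sketched as an
OSC listener; the address follows FaceOSC's published namespace and 8338 is
its default port, but the scaling and the use of python-osc are assumptions
here, not the authors' setup.

    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def on_mouth_height(address, height):
        # Rescale FaceOSC's mouth-opening measure into a hypothetical
        # pitch control in semitones around a base note.
        pitch_offset = (height / 6.0) * 12.0
        print(f"{address}: {height:.2f} -> +{pitch_offset:.1f} semitones")
        # A real system would push this value to the synthesizer's
        # intonation stream instead of printing it.

    dispatcher = Dispatcher()
    dispatcher.map("/gesture/mouth/height", on_mouth_height)
    server = BlockingOSCUDPServer(("127.0.0.1", 8338), dispatcher)
    server.serve_forever()              # blocks, handling FaceOSC messages
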
Multimodal Analysis of Laughter for an Interactive System | | BIBAK | Full-Text | 183-192 | |
Jérôme Urbain; Radoslaw Niewiadomski; Maurizio Mancini; Harry Griffin; Hüseyin Çakmak; Laurent Ach; Gualtiero Volpe | |||
In this paper, we focus on the development of new methods to detect and
analyze laughter, in order to enhance human-computer interactions. First, the
general architecture of such a laughter-enabled application is presented. Then,
we propose the use of two new modalities, namely body movements and
respiration, to enrich the audiovisual laughter detection and classification
phase. These additional signals are acquired using easily constructed
affordable sensors. Features to characterize laughter from body movements are
proposed, as well as a method to detect laughter from a measure of thoracic
circumference. Keywords: laughter; multimodal; analysis |
@scapa: A New Media Art Installation in the Context of Physical Computing and AHRI Design | | BIBAK | Full-Text | 193-198 | |
Andreas Gernemann-Paulsen; Claudia Robles Angel; Lüder Schmidt; Uwe Seifert | |||
In this paper, @scapa, an installation developed in the context of Artistic
Human-Robot Interaction design (AHRI design), is introduced. AHRI design is
a methodological approach to realizing cognitive science's research
paradigm of situated or embodied cognition within cognitive musicology, in
order to investigate social interaction in artistic contexts [11], [12],
[13] using structured observation [2], [6]. Here we focus on design aspects
in the course of the development of @scapa using procedures of Physical
Computing, and on the aspects of developing such a New Media Art
installation in the framework of AHRI design [5]. Keywords: situated cognition; cognitive science; research methodology; New Media Art;
Physical Computing; artistic human-robot interaction design; structured
observation |