
Proceedings of the 2014 International Workshop on Crowdsourcing for Multimedia

Fullname: Proceedings of the 3rd International Workshop on Crowdsourcing for Multimedia
Editors: Judith Redi; Mathias Lux
Location: Orlando, Florida
Standard No: ISBN: 978-1-4503-3128-9; ACM DL: Table of Contents; hcibib: CrowdMM14
Links: Workshop Website | Conference Website
  1. Keynote Address
  2. Affect
  3. Crowdworkers' Motivation
  4. Poster Session
  5. Annotation

Keynote Address

Microworkers Crowdsourcing Approach, Challenges and Solutions BIBAFull-Text 1
  Nhatvi Nguyen
Founded in May 2009, Microworkers.com is an international crowdsourcing platform focusing on microtasks. At present, more than 600,000 users from over 190 countries have registered on our platform. This extensively diverse workforce is the key to the current success of Microworkers, as it gives our clients the opportunity to draw on the widely varying experience and knowledge of a large, heterogeneous audience in arriving at innovative solutions. With the explosion of social media, mobile apps and online digital technology, the communication channels used by modern-day workers and tech-savvy consumers have profoundly changed. With that, how businesses communicate with their consumers and their employees has also been greatly transformed. While innovation remains the hallmark of staying competitive, the power of crowdsourcing is becoming more widely recognized because of the broad participation it enables at relatively minimal cost. This mass-collaboration approach allows companies to generate solutions from freelance professionals who get paid only if their ideas are utilized. Crowdsourcing lets any business, of any size and nature, tap into the collective intelligence of global crowds to complete business-related tasks that a company would normally either perform itself or outsource to a third-party provider. It has become possible to optimize multimedia systems more rapidly and to address human factors more effectively. Not only does it allow businesses to expand the size of their talent pool, it is also a time- and resource-efficient method to gain deeper insight into what consumers really want.
   In crowdsourcing platforms there is perfect meritocracy. Especially in systems like Microworkers, age, gender, race, education, and job history do not matter: the quality of work is all that counts, and every task is available to Users of every imaginable background. If you are capable of completing the required microtask, you've got the job. For the past five years, Microworkers has given opportunities to countless individuals across the globe who are looking for a temporary source of income, supplemental income or, in many instances, their main source of livelihood. By making opportunities available to eager, talented Workers while providing cost-effective solutions to job providers, Microworkers creates a win-win structure for anyone who believes they can take advantage of its system. Apart from serving as a platform that connects Workers and Employers, over time Microworkers Users have formed communities that provide support and assistance to fellow Users. Though a large, diverse workforce is the framework for delivering solutions to our clients, it also poses challenges for our underlying infrastructure as well as for providing support. Many other challenges arise in crowdsourcing setups because a community of users (or Microworkers) is a complex and dynamic system, highly sensitive to changes in the form and the parameterization of its activities. Microworkers' present approaches to dealing with these challenges include identifying optimal crowd members, ensuring clear directions and requirements, and designing incentive structures that are not conducive to cheating, among many others.


Affect

A Protocol for Cross-Validating Large Crowdsourced Data: The Case of the LIRIS-ACCEDE Affective Video Dataset BIBAFull-Text 3-8
  Yoann Baveye; Christel Chamaret; Emmanuel Dellandréa; Liming Chen
Recently, we released a large affective video dataset, namely LIRIS-ACCEDE, which was annotated through crowdsourcing along both the induced valence and arousal axes using pairwise comparisons. In this paper, we design an annotation protocol which enables the scoring of induced affective feelings for cross-validating the annotations of the LIRIS-ACCEDE dataset and identifying any potential bias. We have collected, in a controlled setup, the ratings of 28 users on a subset of video clips carefully selected from the dataset by computing the inter-observer reliabilities on the crowdsourced data. In contrast to the crowdsourced rankings gathered in unconstrained environments, users were asked to rate each video using the Self-Assessment Manikin tool. The significant correlation between crowdsourced rankings and controlled ratings validates the reliability of the dataset for future uses in affective video analysis and paves the way for the automatic generation of ratings over the whole dataset.
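
The cross-validation step described above amounts to correlating crowdsourced rankings with controlled ratings. A minimal sketch in plain Python, using Spearman rank correlation (the implementation and the sample data below are illustrative assumptions, not taken from the paper):

```python
def rank(values):
    """Assign 1-based average ranks to values (ties share the mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation between two paired samples."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical data: crowdsourced valence ranks vs. controlled SAM ratings
crowd_ranks = [1, 2, 3, 4, 5, 6]
sam_ratings = [1.2, 2.8, 2.5, 5.1, 6.0, 7.3]
print(round(spearman(crowd_ranks, sam_ratings), 3))  # 0.943
```

A high correlation between the two orderings would support the reliability of the crowdsourced annotations, as the abstract reports.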
Modeling Image Appeal Based on Crowd Preferences for Automated Person-Centric Collage Creation BIBAFull-Text 9-15
  Vassilios Vonikakis; Ramanathan Subramanian; Jonas Arnfred; Stefan Winkler
This paper attempts to model image appeal (IA) in personal photo collections from a user-centric perspective. To understand what makes users deem an image more or less appealing, an extensive crowdsourcing experiment was conducted with 350 workers and five different albums. The significant variance in selection probabilities for the most and least appealing images indicated that images were not selected randomly: underlying factors influenced some images to be selected more often than others. We then employed nine low-level image attributes to model the image selection process and trained support vector regressors (SVRs) which could adequately predict image selections under album-specific conditions. However, a generic SVR failed to model the selection patterns as adequately as the album-specific SVRs, suggesting that context greatly influences the categorization of what is more and less appealing. Experimental results demonstrate that our approach is promising; however, more attributes (related to image semantics) are needed to accurately model image selection characteristics.
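
The selection probabilities mentioned above can be estimated directly from worker picks; a non-trivial variance across the album then indicates non-random selection. A small sketch (function names and data are hypothetical, not from the paper):

```python
from collections import Counter

def selection_probabilities(picks, album):
    """Fraction of workers who picked each image as the most appealing."""
    counts = Counter(picks)
    n = len(picks)
    return {img: counts[img] / n for img in album}

# Hypothetical worker picks over a five-image album
album = ["a", "b", "c", "d", "e"]
picks = ["a", "a", "a", "b", "c", "a", "b", "a", "a", "c"]
probs = selection_probabilities(picks, album)

# Variance of the selection probabilities: near zero would mean random picks
mean = sum(probs.values()) / len(probs)
variance = sum((p - mean) ** 2 for p in probs.values()) / len(probs)
print(probs["a"], round(variance, 4))
```

These per-image probabilities would then serve as regression targets for the low-level-attribute SVR models the abstract describes.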
A Multi-task Learning Framework for Time-continuous Emotion Estimation from Crowd Annotations BIBAFull-Text 17-23
  Mojtaba Khomami Abadi; Azad Abad; Ramanathan Subramanian; Negar Rostamzadeh; Elisa Ricci; Jagannadan Varadarajan; Nicu Sebe
We propose multi-task learning (MTL) for time-continuous or dynamic emotion (valence and arousal) estimation in movie scenes. Since compiling annotated training data for dynamic emotion prediction is tedious, we employ crowdsourcing for this purpose. Even though the crowdworkers come from various demographics, we demonstrate that MTL can effectively discover (1) consistent patterns in their dynamic emotion perception, and (2) the low-level audio and video features that contribute to their valence and arousal (VA) elicitation. Finally, we show that MTL-based regression models, which simultaneously learn the relationship between low-level audio-visual features and high-level VA ratings from a collection of movie scenes, can predict VA ratings for time-contiguous snippets from each scene more effectively than scene-specific models.

Crowdworkers' Motivation

Crowdsourcing for Rating Image Aesthetic Appeal: Better a Paid or a Volunteer Crowd? BIBAFull-Text 25-30
  Judith Redi; Isabel Povoa
Crowdsourcing has the potential to become a preferred tool to study image aesthetic appeal preferences of users. Nevertheless, some reliability issues still exist, partially due to the sometimes doubtful commitment of paid workers to perform the rating task properly. In this paper we compare the reliability in scoring image aesthetic appeal of both a paid and a volunteer crowd. We recruit our volunteers through Facebook and our paid users via Microworkers. We conclude that, whereas volunteer participants are more likely to leave the rating task unfinished, when they complete it they do so more reliably than paid users.
Development and Validation of Extrinsic Motivation Scale for Crowdsourcing Micro-task Platforms BIBAFull-Text 31-36
  Babak Naderi; Ina Wechsung; Tim Polzehl; Sebastian Möller
In this paper, we introduce a scale for measuring the extrinsic motivation of crowd workers. The new questionnaire is closely based on the Work Extrinsic Intrinsic Motivation Scale (WEIMS) [17] and theoretically follows the Self-Determination Theory (SDT) of motivation. The questionnaire has been applied and validated on a crowdsourcing micro-task platform. This instrument can be used for studying the dynamics of extrinsic motivation while taking individual differences into account, and provides meaningful insights that will help to design proper incentive frameworks for individual crowd workers, eventually leading to better performance, increased well-being, and higher overall quality.
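
Scales of this kind are typically scored by averaging Likert responses within each SDT subscale. A minimal sketch of such scoring; the item-to-subscale groupings below are purely illustrative, not the actual layout of WEIMS or of the scale introduced in the paper:

```python
# Hypothetical scoring of a short extrinsic-motivation questionnaire.
# The subscale item groupings are illustrative only.
SUBSCALES = {
    "external_regulation": [0, 3],
    "introjected_regulation": [1, 4],
    "identified_regulation": [2, 5],
}

def subscale_scores(responses):
    """Average the 1-7 Likert responses belonging to each subscale."""
    return {
        name: sum(responses[i] for i in items) / len(items)
        for name, items in SUBSCALES.items()
    }

# One worker's answers to six items
answers = [6, 3, 5, 7, 2, 6]
print(subscale_scores(answers))
```

Comparing such subscale profiles across workers is one way the "individual differences" mentioned above could feed into incentive design.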

Poster Session

Is That a Jaguar?: Segmenting Ancient Maya Glyphs via Crowdsourcing BIBAFull-Text 37-40
  Gulcan Can; Jean-Marc Odobez; Daniel Gatica-Perez
Crowdsourcing is popular in multimedia research for obtaining image annotation and segmentation data at scale. In the context of the analysis of cultural heritage materials, we propose a novel crowdsourced task, namely the segmentation of ancient Maya hieroglyph blocks by non-experts. This task is highly perceptual and thus potentially feasible even though the crowd is not likely to have prior specialized knowledge about hieroglyphics. Based on a new data set of glyph-block line drawings for which ground-truth segmentation exists, we study how non-experts perceive glyph blocks (e.g., whether they see closed contours as separate glyphs, or how they combine visual components under plausible hypotheses of the number of glyphs present in a block). Using Amazon Mechanical Turk as the platform, we perform block-based and worker-based objective analyses to assess the difficulty of glyph blocks and the performance of workers. The results suggest that a crowdsourced approach is promising for glyph blocks of moderate degrees of complexity.
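
A worker-based analysis against ground-truth segmentations is commonly scored with intersection-over-union (IoU). A plain-Python sketch under that assumption (the paper does not specify its metric; the pixel sets below are hypothetical):

```python
def iou(mask_a, mask_b):
    """Intersection-over-union between two pixel-index sets."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0

def worker_scores(submissions, ground_truth):
    """Mean best-match IoU of each worker's glyph segments vs. ground truth."""
    scores = {}
    for worker, segments in submissions.items():
        scores[worker] = sum(
            max(iou(seg, gt) for gt in ground_truth) for seg in segments
        ) / len(segments)
    return scores

# Hypothetical pixel sets for a two-glyph block
gt = [{1, 2, 3, 4}, {10, 11, 12}]
subs = {
    "w1": [{1, 2, 3, 4}, {10, 11}],   # close to ground truth
    "w2": [{1, 2, 10, 11, 12}],       # merged both glyphs into one segment
}
print(worker_scores(subs, gt))
```

Averaging these scores per block rather than per worker gives the block-based difficulty analysis the abstract mentions.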
Making use of Semantic Concept Detection for Modelling Human Preferences in Visual Summarization BIBAFull-Text 41-44
  Stevan Rudinac; Marcel Worring
In this paper we investigate whether and how the human choice of images for summarizing a visual collection is influenced by the semantic concepts depicted in them. More specifically, by analysing a large collection of human-created visual summaries obtained through crowdsourcing, we aim at automatically identifying the objects, settings, actions and events that make an image a good candidate for inclusion in a visual summary. Informed by the outcomes of this analysis, we show that the distribution of semantic concepts can be successfully utilized for learning to rank the images based on their likelihood of inclusion in the summary by a human, and that it can be easily combined with other features related to image content, context, aesthetic appeal and sentiment. Our experiments demonstrate the promise of using semantic concept detectors for automatically analysing crowdsourced user preferences at a large scale.
A Crowdsourcing Procedure for the Discovery of Non-Obvious Attributes of Social Images BIBAFull-Text 45-48
  Mark Melenhorst; María Menéndez Blanco; Martha Larson
Research on mid-level image representations has conventionally concentrated on relatively obvious attributes and overlooked non-obvious attributes, i.e., characteristics that are not readily observable when images are viewed independently of their context or function. Non-obvious attributes are not necessarily easily nameable, but they nonetheless play a systematic role in people's interpretation of images. Clusters of related non-obvious attributes, called interpretation dimensions, emerge when people are asked to compare images, and provide important insight into the aspects of social images that are considered relevant. In contrast to aesthetic or affective approaches to image analysis, non-obvious attributes are not related to the personal perspective of the viewer. Instead, they encode a conventional understanding of the world, which is tacit rather than explicitly expressed. This paper provides an introduction to the notion of non-obvious attributes of social images and introduces a procedure for discovering them using crowdsourcing.
A Crowdsourced Data Set of Edited Images Online BIBAFull-Text 49-52
  Valentina Conotter; Duc-Tien Dang-Nguyen; Michael Riegler; Giulia Boato; Martha Larson
We present a crowdsourcing approach to tackle the challenge of collecting hard-to-find data. Our immediate need for the data arises because we are studying edited images in context online, and the way that this use impacts users' perceptions. Study of this topic cannot advance without a large, diverse data set of image/context pairs. The image in the pair should be suspected of having been edited, and the context is the place (e.g., website or social media post) in which it has been used online. Such pairs are hard to find, and could not be collected, due to techno-practical constraints, without the support of crowdsourcing. This paper describes a three-step approach to data set creation involving mining social data, applying image analysis techniques, and, finally, making use of the crowd to complete the necessary information. We close with a discussion of the potential and limitations of the data set collected.
Click'n'Cut: Crowdsourced Interactive Segmentation with Object Candidates BIBAFull-Text 53-56
  Axel Carlier; Vincent Charvillat; Amaia Salvador; Xavier Giro-i-Nieto; Oge Marques
This paper introduces Click'n'Cut, a novel web tool for interactive object segmentation designed for crowdsourcing tasks. Click'n'Cut combines bounding boxes and clicks generated by workers to obtain accurate object segmentations. These segmentations are created by combining precomputed object candidates in a light computational fashion that allows an immediate response from the interface. Click'n'Cut has been tested with a crowdsourcing campaign to annotate images from publicly available datasets. Results are competitive with state-of-the-art approaches, especially in terms of time needed to converge to a high quality segmentation.
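
The core combination step above, fusing worker clicks with precomputed object candidates, can be sketched as a simple filter: keep only the candidates that contain every foreground click and no background click. This is a minimal sketch under that assumption, not the paper's actual algorithm; the pixel sets are hypothetical:

```python
def select_candidates(candidates, fg_clicks, bg_clicks):
    """Keep precomputed candidate masks consistent with the worker's clicks:
    each kept candidate must contain all foreground clicks and no
    background click. Masks and clicks are sets of pixel indices."""
    return [
        c for c in candidates
        if fg_clicks <= c and not (bg_clicks & c)
    ]

# Hypothetical precomputed object candidates and worker clicks
candidates = [{1, 2, 3}, {1, 2, 3, 4, 5}, {4, 5, 6}]
fg = {2, 3}   # clicks the worker placed on the object
bg = {5}      # click the worker placed on the background
print(select_candidates(candidates, fg, bg))
```

Because this is just set filtering over precomputed masks, it is cheap enough to run at interactive speed, which fits the "immediate response from the interface" the abstract describes.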


Annotation

Users Tagging Visual Moments: Timed Tags in Social Video BIBAFull-Text 57-62
  Peng Xu; Martha Larson
A timed tag is a tag that a user has assigned to a specific time point in a video. Although timed tags are supported by an increasing number of social video platforms on the Internet, multimedia research remains focused on conventional tags, here called "timeless tags", which users assign to the video as a whole, rather than to a specific moment. This paper presents a video data set consisting of social videos and user-contributed timed tags. A large crowdsourcing experiment was used to annotate this data set. The annotations allow us to better understand the phenomenon of timed tagging. We describe the design of the crowdsourcing experiment, and how it was executed. Then we present results of our analysis, which reveal the properties of timed tags, and their differences from timeless tags. The results suggest that the two differ with respect to what the user is attempting to express about the video. We close with an outlook that lays the groundwork for further study of timed tags in social video within the research community.
Crowd-based Semantic Event Detection and Video Annotation for Sports Videos BIBAFull-Text 63-68
  Fabio Sulser; Ivan Giangreco; Heiko Schuldt
Recent developments in sports analytics have heightened interest in collecting data on the behavior of individuals and of entire teams in sports events. Rather than using dedicated sensors to record the data, the detection of semantic events reflecting a team's behavior, and the subsequent annotation of video data, is nowadays mostly performed by paid experts. In this paper, we present an approach to generating such annotations by leveraging the wisdom of the crowd. We present the CrowdSport application, which allows us to collect data for soccer games. It presents crowd workers with short video snippets of soccer matches and allows them to annotate these snippets with event information. Finally, the various annotations collected from the crowd are automatically disambiguated and integrated into a coherent data set. To improve the quality of the data entered, we have implemented a rating system that assigns each worker a trustworthiness score denoting the confidence in newly entered data. Using the DBSCAN clustering algorithm and the confidence score, the integration ensures that the generated event labels are of high quality despite the heterogeneity of the participating workers. These annotations finally serve as the basis for a video retrieval system that allows users to search for video sequences based on a graphical specification of team behavior or of the motion of individual players. Our evaluations of the crowd-based semantic event detection and video annotation using the Microworkers platform have shown the effectiveness of the approach and have led to results that are in most cases close to the ground truth and can successfully be used for various retrieval tasks.
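
The DBSCAN-plus-trust integration step can be illustrated with a minimal one-dimensional version over event timestamps: annotations of the same event cluster together, isolated annotations are dropped as noise, and each cluster is fused into a trust-weighted mean time. This is a sketch under those assumptions, not the paper's implementation; the parameters and data are hypothetical:

```python
def cluster_events(events, eps=2.0, min_pts=2):
    """DBSCAN-style clustering of annotated event times; each event is a
    (timestamp, worker_trust) pair. Returns the trust-weighted mean
    timestamp per cluster; isolated annotations are treated as noise."""
    n = len(events)
    visited, assigned = [False] * n, [False] * n
    clusters = []

    def neighbors(i):
        return [j for j in range(n) if abs(events[j][0] - events[i][0]) <= eps]

    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        nb = neighbors(i)
        if len(nb) < min_pts:
            continue  # noise point (may still be claimed by a later cluster)
        members, queue = [], nb[:]
        while queue:
            j = queue.pop()
            if not assigned[j]:
                assigned[j] = True
                members.append(j)
            if not visited[j]:
                visited[j] = True
                nb_j = neighbors(j)
                if len(nb_j) >= min_pts:
                    queue.extend(nb_j)  # j is a core point: expand cluster
        clusters.append(members)

    fused = []
    for members in clusters:
        total = sum(events[j][1] for j in members)
        fused.append(sum(events[j][0] * events[j][1] for j in members) / total)
    return sorted(fused)

# Hypothetical goal annotations: (seconds into the match, worker trust)
events = [(12.0, 0.9), (12.5, 0.5), (13.0, 0.8),
          (47.0, 0.7), (48.0, 0.9), (90.0, 0.3)]
print(cluster_events(events))  # two fused events; 90.0 is discarded as noise
```

Weighting by the trustworthiness score lets annotations from reliable workers pull the fused event time toward their estimates.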
Getting by with a Little Help from the Crowd: Practical Approaches to Social Image Labeling BIBAFull-Text 69-74
  Babak Loni; Jonathon Hare; Mihai Georgescu; Michael Riegler; Xiaofei Zhu; Mohamed Morchid; Richard Dufour; Martha Larson
Validating user tags helps to refine them, making them more useful for finding images. In the case of interpretation-sensitive tags, however, automatic (i.e., pixel-based) approaches cannot be expected to deliver optimal results. Instead, human input is key. This paper studies how crowdsourcing-based approaches to image tag validation can achieve parsimony in their use of human input from the crowd, in the form of votes collected from workers on a crowdsourcing platform. Experiments in the domain of social fashion images are carried out using the dataset published by the Crowdsourcing Task of the MediaEval 2013 Multimedia Benchmark. Experimental results reveal that when a larger number of crowd-contributed votes is available, it is difficult to beat a majority vote. However, additional information sources, i.e., crowdworker history and visual image features, allow us to maintain similar validation performance while using less crowd-contributed input. Further, investing in "expensive" experts who collaborate to create definitions of interpretation-sensitive concepts does not necessarily pay off. Instead, experts can cause interpretations of concepts to drift away from conventional wisdom. In short, validation of interpretation-sensitive user tags for social images is possible, with "just a little help from the crowd".
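
Using worker history to stretch fewer votes further can be illustrated by weighting each vote by the worker's historical agreement rate. This is a sketch of the general idea, not the paper's method; the weights and default below are illustrative assumptions:

```python
def validate_tag(votes, history):
    """Weighted vote on whether a tag applies to an image. Each vote is a
    (worker, bool) pair; a worker's weight is their historical agreement
    rate, defaulting to 0.5 for workers with no history."""
    yes = sum(history.get(w, 0.5) for w, v in votes if v)
    no = sum(history.get(w, 0.5) for w, v in votes if not v)
    return yes > no

# Hypothetical historical accuracy rates per worker
history = {"w1": 0.95, "w2": 0.40, "w3": 0.85}
votes = [("w1", True), ("w2", False), ("w3", True), ("w4", False)]
print(validate_tag(votes, history))  # True: the reliable workers agree
```

With reliable-worker weighting, a small number of votes can match the decision an unweighted majority would reach with many more.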