Crowdsourced Data Management: Industry and Academic Perspectives
Author(s): Adam Marcus; Aditya Parameswaran
Source: Journal: Foundations and Trends® in Databases, ISSN Print: 1931-7883, ISSN Online: 1931-7891, Publisher: Now Publishers, Volume 6, Number 1-2, Pages: 161 (1-161), DOI: 10.1561/1900000044, Keywords: Crowdsourcing
Abstract:
Crowdsourcing and human computation enable organizations to accomplish tasks that are currently not possible for fully automated techniques to complete,
or require more flexibility and scalability than traditional employment relationships can facilitate. In the area of data processing, companies have benefited
from crowd workers on platforms such as Amazon’s Mechanical Turk or Upwork to complete tasks as varied as content moderation, web content extraction, entity resolution,
and video/audio/image processing. Several academic researchers from diverse areas ranging from the social sciences to computer science have embraced crowdsourcing as a
research area, resulting in algorithms and systems that improve crowd work quality, latency, or cost. Given the relative nascence of the field, the academic and the
practitioner communities have largely operated independently of each other for the past decade, rarely exchanging techniques and experiences. In this monograph,
we aim to narrow the gap between academics and practitioners. On the academic side, we summarize the state of the art in crowd-powered algorithms and system design
tailored to large-scale data processing. On the industry side, we survey 13 industry users (e.g., Google, Facebook, Microsoft) and 4 marketplace providers of crowd work
(e.g., CrowdFlower, Upwork) to identify how hundreds of engineers and tens of millions of dollars are invested in various crowdsourcing solutions. Through the monograph,
we hope to simultaneously introduce academics to real problems that practitioners encounter every day, and provide a survey of the state of the art for practitioners to
incorporate into their designs. Through our surveys, we also highlight the fact that crowd-powered data processing is a large and growing field. Over the next decade,
we believe that most technical organizations will in some way benefit from crowd work, and hope that this monograph can help guide the effective adoption of crowdsourcing
across these organizations.