Algorithmic Crowdsourcing

Established: February 1, 2012

To build an intelligent system based on machine learning, we often need to collect training labels and feed them into the system. A useful lesson in machine learning is that “more data beats a clever algorithm”. Today, a commercial crowdsourcing platform makes it easy to collect a large number of labels at a cost of pennies per label.

However, labels obtained from crowdsourcing can be highly noisy, and training a machine learning model on highly noisy labels can be misleading. This is widely known as “garbage in, garbage out”. There are two main reasons for label noise. One is that crowdsourcing workers may lack expertise in the labeling task; the other is that they may have no incentive to produce high-quality labels.

Our goal in this project is to develop principled inference algorithms and incentive mechanisms that guarantee high-quality labels from crowdsourcing in practice.

Contact person: Denny Zhou

People

John Platt

Principal Scientist

Google

Xi Chen

Intern

CMU

Nihar Shah

Intern

UC Berkeley

Qiang Liu

Visiting Scholar

Dartmouth

Chao Gao

Intern

Yale

Tengyu Ma

Visiting Scholar

Princeton